Title: ALGORITHM DESIGN
Description: This lecture note deals with how to design good algorithms: the mathematical theory behind the design of good programs. It studies algorithms as pure mathematical objects, ignoring issues such as programming language, machine, and operating system. This has the advantage of clearing away the messy details that affect implementation, but those details may still be very important. The note consists of three major sections. The first covers the mathematical tools necessary for the analysis of algorithms, focusing on asymptotics, summations, and recurrences. The second deals with one particularly important algorithmic problem: sorting a list of numbers. We show a number of different strategies for sorting and use this problem as a case study in different techniques for designing and analyzing algorithms. The final third of the course deals with a collection of various algorithmic problems and solution techniques, closing with a very brief introduction to the theory of NP-completeness. NP-complete problems are those for which no efficient algorithms are known, but no one knows for sure whether efficient solutions might exist.

Document Preview



CMSC 251: Algorithms
Spring 1998
Dave Mount
Lecture 1: Course Introduction
(Tuesday, Jan 27, 1998)
Read: Course syllabus and Chapter 1 in CLR (Cormen, Leiserson, and Rivest)
...
Like a cooking recipe, an algorithm
provides a step-by-step method for solving a computational problem
...
Unlike a program, an algorithm is a mathematical entity, which is
independent of a specific programming language, machine, or compiler
...

Why study algorithm design? There are many facets to good program design
...
To be a really complete algorithm designer, it is important to be
aware of programming and machine issues as well
...

Macro issues involve elements such as how does one coordinate the efforts of many programmers
working on a single piece of software, and how does one establish that a complex programming system
satisfies its various requirements
...

A great deal of the programming effort on most complex software systems consists of elements whose
programming is fairly mundane (input and output, data conversion, error checking, report generation)
...
(Or as the old adage goes:
80% of the execution time takes place in 20% of the code
...

It may be very important for the success of the overall project that these sections of code be written
in the most efficient manner possible
...
The problem is
that if the underlying design is bad, then often no amount of fine-tuning is going to make a substantial
difference
...
The system was running unacceptably slowly in spite of the efforts of a large team of
programmers and the biggest supercomputer available
...
It turns out that the system was based on rendering hundreds of millions of polygonal elements, most of which were invisible to the viewer. Recognizing this source of inefficiency, he redesigned the algorithms and data structures, recoded the inner loops, so that the algorithm concentrated its efforts on eliminating many invisible elements, and just drawing the few thousand visible elements.
...

[Footnote 1: Copyright, David M. Mount, Dept. of Computer Science, University of Maryland, College Park, MD, 20742. Permission to use, copy, modify, and distribute these notes for educational purposes and without fee is hereby granted, provided that this copyright notice appear in all copies.]

This may seem like a simple insight, but it is remarkable how many times the clever efforts of a single
clear-sighted person can eclipse the efforts of larger groups, who are not paying attention to the basic
principle that we will stress in this course:
Before you implement, first be sure you have a good design
...
Because the lesson cannot be taught in just
one course, there are a number of companion courses that are important as well
...
This is not really an independent issue, because most of the fastest
algorithms are fast because they use fast data structures, and vice versa
...
In fact, many of the
courses in the computer science department deal with efficient algorithms and data structures, but just
as they apply to various applications: compilers, operating systems, databases, artificial intelligence,
computer graphics and vision, etc
...

Implementation Issues: One of the elements that we will focus on in this course is to try to study algorithms
as pure mathematical objects, and so ignore issues such as programming language, machine, and operating system
...

But these details may be very important
...
Frequently accessed data can be stored in registers or cache memory
...
But a good algorithm designer can work within the realm of mathematics, but still keep an open eye to implementation issues down the line that will be important for final
implementation
...
From our mathematical analysis, all have equal running times
...
Why? It is the best from the perspective of locality of reference
...

Thus this course is not the last word in good program design, and in fact it is perhaps more accurately just the first word in good program design
...
Next prune this selection by consideration of practical matters (like locality of reference)
...
Also, be sure to use whatever development tools that you have, such as profilers (programs which pin-point the sections of the code that are
responsible for most of the running time)
...
The first is on the mathematical tools
necessary for the analysis of algorithms
...

The second element will deal with one particularly important algorithmic problem: sorting a list of
numbers
...
The final third of the course will
deal with a collection of various algorithmic problems and solution techniques
...
NP-complete problems
are those for which no efficient algorithms are known, but no one knows for sure whether efficient
solutions might exist
...

Analyzing Algorithms: In order to design good algorithms, we must first agree on the criteria for measuring
algorithms
...

These resources include mostly running time and memory
...

In practice there are many issues that need to be considered in the design of algorithms
...
Also,
one of the luxuries we will have in this course is to be able to assume that we are given a clean, fully-specified mathematical description of the computational problem
...
Thus,
in practice it is often necessary to design algorithms that are simple, and easily modified if problem
parameters and specifications are slightly modified
...

Model of Computation: Another goal that we will have in this course is that our analyses be as independent
as possible of the variations in machine, operating system, compiler, or programming language
...
...
This
gives us quite a bit of flexibility in how we present our algorithms, and many low-level details may be
omitted (since it will be the job of the programmer who implements the algorithm to fill them in)
...
Ideally this model should be a reasonable abstraction of a
standard generic single-processor machine
...

A RAM is an idealized machine with an infinitely large random-access memory
...
Each instruction involves performing some basic operation
on two values in the machines memory (which might be characters or integers; let’s avoid floating
point for now)
...
We
assume that each basic operation takes the same constant time to execute
...
It does not model some elements, such as efficiency due to locality of reference, as described
in the previous lecture
...
For example, the model would allow you to add two numbers that contain a billion digits in constant
time
...
Nonetheless, the RAM model seems
to be fairly sound, and has done a good job of modeling typical machine technology since the early
60’s
...
To motivate the
problem, suppose that you want to buy a car
...
But cars are expensive, and since you’re a swinger on
a budget, you want the cheapest
...
We say that the fast, cheap car dominates the slow, expensive car relative to your selection
criteria
...

Here is how we might model this as a formal problem. A point p is said to be dominated by point q if p.x ≤ q.x and p.y ≤ q.y. Given a set of points P = {p1, p2, . . . , pn} in 2-space, a point is said to be maximal if it is not dominated by any other point in P.
...
If for each car we associated (x, y) values where x is the speed of the car, and y is the negation of the price (thus high y values mean cheap cars), then the maximal points correspond to the fastest and cheapest cars.
...
Given a set of n points P = {p1, p2, . . . , pn} in 2-space, each represented by its x and y integer coordinates, output the set of the maximal points of P, that is, those points pi such that pi is not dominated by any other point of P.
...

[Figure: a set of 12 points in the plane — (2,5), (4,11), (4,4), (5,1), (7,13), (7,7), (9,10), (11,5), (12,12), (13,3), (14,10), (15,7) — with the maximal points (7,13), (12,12), (14,10), (15,7) highlighted.]
Figure 1: Maximal Points
...
For example,
we have intentionally not discussed issues as to how points are represented (e.g., using a structure with
records for the x and y coordinates, or a 2-dimensional array) nor have we discussed input and output
formats
...
However, we would like to
keep as many of the messy issues out since they would just clutter up the algorithm
...
Here is the simplest one that I can imagine
...
Let P = {p1, p2, . . . , pn} be the
initial set of points
...
If pi is not dominated by any
other point, then output it
...
However, if you want to be a bit more formal, it could be written in pseudocode as follows:
Brute Force Maxima
Maxima(int n, Point P[1..n]) {
    for i = 1 to n {
        maximal = true;                                     // P[i] is maximal by default
        for j = 1 to n {
            if (i != j) and (P[i].x <= P[j].x) and (P[i].y <= P[j].y) {
                maximal = false;                            // P[i] is dominated by P[j]
                break;
            }
        }
        if (maximal) output P[i];                           // no one dominated
    }
}
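
The pseudocode above is close to real code already. The following is a minimal C rendering of it (my own sketch, not part of the notes): the struct name Point, the 0-based indexing, and the sample input taken from Figure 1 are implementation choices.

#include <stdio.h>

typedef struct { int x, y; } Point;

void maxima(int n, const Point P[]) {
    for (int i = 0; i < n; i++) {
        int maximal = 1;                                  /* P[i] is maximal by default */
        for (int j = 0; j < n; j++) {
            if (i != j && P[i].x <= P[j].x && P[i].y <= P[j].y) {
                maximal = 0;                              /* P[i] is dominated by P[j] */
                break;
            }
        }
        if (maximal) printf("(%d,%d)\n", P[i].x, P[i].y); /* no one dominated */
    }
}

int main(void) {
    Point P[] = {{7,13},{12,12},{4,11},{9,10},{14,10},{7,7},
                 {2,5},{15,7},{11,5},{4,4},{13,3},{5,1}};
    maxima(12, P);                                        /* the points from Figure 1 */
    return 0;
}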
...
In particular, do not assume that more
detail is better
...
Of course, the appropriate level of detail is a judgement call
...
When writing pseudocode, you should omit details that detract from the main ideas of the
algorithm, and just go with the essentials
...
For example, I assumed that
the points in P are all distinct
...
(Can you see why?) Again, these are important considerations for implementation, but
we will often omit error checking because we want to see the algorithm in its simplest form
...
If the algorithm is tricky, then this proof should contain the explanations of why the tricks work
...
We simply implemented
the definition: a point is maximal if no other point dominates it
...
Obviously the running time of an implementation of
the algorithm would depend on the speed of the machine, optimizations of the compiler, etc
...
This implies that we cannot really make distinctions between algorithms
whose running times differ by a small constant factor, since these algorithms may be faster or slower
depending on how well they exploit the particular machine and compiler
...

We’ll see later why even with this big assumption, we can still make meaningful comparisons between
algorithms
...

Running time depends on input size
...
Formally, the input size is defined to be the number of characters in the input file, assuming some
reasonable encoding of inputs (e.g., numbers are represented in base 10 and separated by a space)
...

Also, different inputs of the same size may generally result in different execution times
...
) There are two common criteria used in measuring
running times:
Worst-case time: is the maximum running time over all (legal) inputs of size n. Let I denote a legal input instance, let |I| denote its length, and let T(I) denote the running time of the algorithm on input I. Then

    T_worst(n) = max_{|I| = n} T(I).

Average-case time: is the average running time over all inputs of size n. More generally, for each input I, let p(I) denote the probability of seeing this input. Then

    T_avg(n) = sum_{|I| = n} p(I) T(I)
...
This is because for many of the problems
we will work with, average-case running time is just too difficult to compute, and it is difficult to specify
a natural probability distribution on inputs that are really meaningful for all applications
...

Running Time of the Brute Force Algorithm: Let us agree that the input size is n, and for the running time we will count the number of times that any element of P is accessed.
...
The condition in the if-statement makes four accesses to P.
...
The output statement makes two accesses (to P[i].x and P[i].y) for each point that is output.
...
Thus we might express the worst-case running time as a pair of nested summations, one for the i-loop and the other for the j-loop:

    T(n) = sum_{i=1}^{n} ( 2 + sum_{j=1}^{n} 4 ).

The inner summation sum_{j=1}^{n} 4 is just 4n, and so

    T(n) = sum_{i=1}^{n} (4n + 2) = (4n + 2)n = 4n^2 + 2n
...
Also, we are most interested in
what happens as n gets large
...

It is only for large values of n that running time becomes an important issue
...
We will sum this
analysis up by simply saying that the worst-case running time of the brute force algorithm is Θ(n2 )
...
Later we will discuss more formally what
this notation means
...
) We saw that this analysis involved computing a summation
...
Given a finite sequence of values a1, a2, . . . , an, their sum a1 + a2 + · · · + an is expressed in summation notation as

    sum_{i=1}^{n} a_i.

If n = 0, then the value of the sum is the additive identity, 0
...
These are easy to verify by simply writing out the summation and applying simple high school algebra.
...

There are some particularly important summations, which you should probably commit to memory (or
at least remember their asymptotic growth rates)
...

Arithmetic Series: For n ≥ 0,

    sum_{i=1}^{n} i = 1 + 2 + · · · + n = n(n + 1)/2 = Θ(n^2).

Geometric Series: Let x ≠ 1 be any constant (independent of n), then for n ≥ 0,

    sum_{i=0}^{n} x^i = 1 + x + x^2 + · · · + x^n = (x^(n+1) − 1)/(x − 1).

If 0 < x < 1 then this is Θ(1), and if x > 1, then this is Θ(x^n).

Harmonic Series: For n ≥ 0,

    H_n = sum_{i=1}^{n} 1/i = 1 + 1/2 + 1/3 + · · · + 1/n ≈ ln n = Θ(ln n)
...
3 in CLR
...
Recall that the algorithm consisted of two nested loops
...
n]) {
for i = 1 to n {

...


...
The stuff in the “
...

Last time we counted the number of times that the algorithm accessed a coordinate of any point
...
) We showed that as a function of n
in the worst case this quantity was
T (n) = 4n2 + 2n
...
Also, we do not care about constant factors (because we wanted
simplicity and machine independence, and figured that the constant factors were better measured by
implementing the algorithm)
...

In this and the next lecture we will consider the questions of (1) how is it that one goes about analyzing
the running time of an algorithm as a function such as T(n) above, and (2) how does one arrive at a
simple asymptotic expression for that running time
...
Again, we will ignore stuff that takes constant time
(expressed as “...” in the code).

A Not-So-Simple Example:
for i = 1 to n {
    ...
    for j = 1 to 2*i {
        ...
        k = j;
        while (k >= 0) {
            ...
            k = k - 1;
        }
    }
}

Let I(), M(), T() be the running times for (one full execution of) the inner loop, middle loop, and the entire program.
...
Let's consider one pass through the innermost loop.
...
It is executed for k = j, j − 1, j − 2, . . . , 0, a total of j + 1 times.
...
We could attempt to arrive at this more formally by expressing this as a summation:

    I(j) = sum_{k=0}^{j} 1 = j + 1.

Why the “1”? Because the stuff inside this loop takes constant time to execute
...

Now let us consider one pass through the middle loop
...
Using
the summation we derived above for the innermost loop, and the fact that this loop is executed for j
running from 1 to 2i, it follows that the execution time is
    M(i) = sum_{j=1}^{2i} I(j) = sum_{j=1}^{2i} (j + 1).

Our sum is not quite of the right form, but we can split it into two sums:

    M(i) = sum_{j=1}^{2i} j + sum_{j=1}^{2i} 1.

The latter sum is clearly 2i. The former is an arithmetic series, and so we can plug in 2i for n, and j for i in the formula above to yield the value:

    M(i) = 2i(2i + 1)/2 + 2i = (4i^2 + 2i + 4i)/2 = 2i^2 + 3i
...


For the outermost loop, i runs from 1 to n, so the total running time is

    T(n) = sum_{i=1}^{n} M(i) = sum_{i=1}^{n} (2i^2 + 3i).

Splitting this up (by the linearity of addition) we have

    T(n) = 2 sum_{i=1}^{n} i^2 + 3 sum_{i=1}^{n} i.
...
The former summation sum_{i=1}^{n} i^2 is not one that we have seen before.
...
Quadratic Series: For n ≥ 0,

    sum_{i=1}^{n} i^2 = 1 + 4 + 9 + · · · + n^2 = (2n^3 + 3n^2 + n)/6.

Assuming this fact for now, we conclude that the total running time is:

    T(n) = 2 (2n^3 + 3n^2 + n)/6 + 3 n(n + 1)/2,

which after some algebraic manipulations gives

    T(n) = (4n^3 + 15n^2 + 11n)/6.
...
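
As a sanity check (my addition, not in the notes), one can simulate the loop nest above, count how many times the innermost constant-time body executes, and compare against the closed form (4n^3 + 15n^2 + 11n)/6 just derived. The middle-loop bound j = 1 to 2i is taken from the analysis above.

#include <stdio.h>

int main(void) {
    for (long n = 1; n <= 20; n++) {
        long count = 0;
        for (long i = 1; i <= n; i++)
            for (long j = 1; j <= 2 * i; j++)
                for (long k = j; k >= 0; k--)
                    count++;                           /* the innermost "..." */
        long formula = (4*n*n*n + 15*n*n + 11*n) / 6;
        printf("n=%2ld  count=%8ld  formula=%8ld\n", n, count, formula);
    }
    return 0;
}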

Solving Summations: In the example above, we saw an unfamiliar summation, sum_{i=1}^{n} i^2, which we claimed could be solved in closed form as:

    sum_{i=1}^{n} i^2 = (2n^3 + 3n^2 + n)/6.
...
In general, when you are presented with an
unfamiliar summation, how do you approach solving it, or if not solving it in closed form, at least
getting an asymptotic approximation
...



Use crude bounds: One of the simplest approaches, which usually works for arriving at asymptotic bounds, is to replace every term in the summation with a simple upper bound.
...
This would give

    sum_{i=1}^{n} i^2 ≤ sum_{i=1}^{n} n^2 = n^3.

This technique works pretty well with relatively slow growing functions (e.g., anything growing more slowly than a polynomial, that is, i^c for some constant c)
...

Approximate using integrals: Integration and summation are closelyated. ... Here is a handy formula. Let f(x) be any monotonically increasing function (the function increases as x increases):

    integral_0^n f(x) dx  ≤  sum_{i=1}^{n} f(i)  ≤  integral_1^(n+1) f(x) dx.

[Figure: the summation interpreted as the total area of unit-width rectangles of heights f(1), f(2), . . . , f(n), sandwiched between the two integrals.]

Most running times are increasing functions of input size, so this formula is useful in analyzing algorithm running times.
...
In this case, f(x) = x^2, and the bounds evaluate to

    n^3/3 ≤ sum_{i=1}^{n} i^2 ≤ ((n+1)^3 − 1)/3.
Note that the constant factor on the leading term of n3 /3 is equal to the exact formula
...
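
Here is a small numeric check (my own addition, not in the notes) of the integral bounds quoted above for f(x) = x^2, namely n^3/3 ≤ sum_{i=1}^{n} i^2 ≤ ((n+1)^3 − 1)/3.

#include <stdio.h>

int main(void) {
    for (long n = 1; n <= 10; n++) {
        long sum = 0;
        for (long i = 1; i <= n; i++) sum += i * i;            /* the summation */
        double lower = n*n*n / 3.0;                            /* integral of x^2 from 0 to n */
        double upper = ((n+1.0)*(n+1.0)*(n+1.0) - 1.0) / 3.0;  /* integral of x^2 from 1 to n+1 */
        printf("n=%2ld  %8.2f <= %6ld <= %8.2f\n", n, lower, sum, upper);
    }
    return 0;
}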

Use constructive induction: This is a fairly good method to apply whenever you can guess the general
form of the summation, but perhaps you are not sure of the various constant factors
...
However, we believe that they are constants (i.e., they are independent of n)
...

Since this is the first induction proof we have done, let us recall how induction works
...
Let n be the
integer variable on which we are performing the induction
...
There is some smallest value
n0 for which IH(n0) is supposed to hold
...

Prove IH(n0);
for n = n0+1 to infinity do
Prove IH(n), assuming that IH(n’) holds for all n’ < n;

This is sometimes called strong induction, because we assume that the hypothesis holds for all
n' < n
...

Basis Case: (n = 0) Recall that an empty summation is equal to the additive identity, 0
...
For this to be true, we must have
d = 0
...

The structure of proving summations by induction is almost always the same
...
Here we go.
...
Induction Step: (n > 0) Assume the hypothesis holds for all n' < n. Then

    sum_{i=1}^{n} i^2 = n^2 + sum_{i=1}^{n-1} i^2
                      = n^2 + a(n−1)^3 + b(n−1)^2 + c(n−1) + d
                      = an^3 + (−3a + b + 1)n^2 + (3a − 2b + c)n + (−a + b − c + d).

To complete the proof, we want this to be equal to an^3 + bn^2 + cn + d. This gives us the following constraints:

    a = a,    b = −3a + b + 1,    c = 3a − 2b + c,    d = −a + b − c + d.

From the second constraint above we can cancel b from both sides, implying that a = 1/3. From the third constraint we have b = 3a/2 = 1/2. Finally from the last constraint we have c = −a + b = 1/6.
...
Plugging in these values gives the final formula

    sum_{i=1}^{n} i^2 = n^3/3 + n^2/2 + n/6 = (2n^3 + 3n^2 + n)/6.

As desired, all of the values a through d are constants, independent of n
...

Notice that constructive induction gave us the exact formula for the summation
...


In summary, there is no one way to solve a summation
...
The ultimate goal is to
come up with a closed-form solution
...


Lecture 4: 2-d Maxima Revisited and Asymptotics
(Thursday, Feb 5, 1998)
Read: Chapts
...

2-dimensional Maxima Revisited: Recall the max-dominance problem from the previous lectures. A point p is said to be dominated by point q if p.x ≤ q.x and p.y ≤ q.y. Given a set of n points, P = {p1, p2, . . . , pn} in 2-space, the problem is to output all the maximal points of P
...
The question we consider today is whether there is an approach that is
significantly better?
The problem with the brute-force algorithm is that it uses no intelligence in pruning out decisions
...
Any point that pi dominates will also be dominated by pj
...
) This observation
by itself, does not lead to a significantly faster algorithm though
...

Plane-sweep Algorithm: The question is whether we can make a significant improvement in the running
time. Here is an idea for how we might do it
...
As we sweep this line, we will build a structure holding the maximal points lying to the left of
the sweep line
...
This approach of solving geometric problems by sweeping a line across
the plane is called plane sweep
...
To do this, we will begin by sorting the points in increasing order of
their x-coordinates
...
(This
limiting assumption is actually easy to overcome, but it is good to work with the simpler version, and
save the messy details for the actual implementation
...
As we encounter each new point, we will update the current list of maximal
points
...
But the
bottom line is that there exist any number of good sorting algorithms whose running time to sort n
values is Θ(n log n)
...

So the only remaining problem is, how do we store the existing maximal points, and how do we update
them when a new point is processed? We claim that as each new point is added, it must be maximal for
the current set
...
) However, this new point may
dominate some of the existing maximal points, and so we may need to delete them from the list of
maxima
...
) Consider the figure below
...
Notice that since the pi has greater x-coordinate
than all the existing points, it dominates an existing point if and only if its y-coordinate is also larger.

[Figure: two snapshots, (a) and (b), of the sweep line passing over the point set, with the current maximal points to the left of the line chained together.]
Figure 3: Plane sweep algorithm for 2-d maxima
...
Thus, among the existing maximal points, we want to find those having smaller (or equal)
y-coordinate, and eliminate them
...
As we read maximal points from left to right (in order of increasing
x-coordinates) the y-coordinates appear in decreasing order
...
Suppose to the contrary that among the current maxima there were two points p and q with p.x ≥ q.x and p.y ≥ q.y.
...
Then it would follow that q is
dominated by p, and hence it is not maximal, a contradiction
...
So we have to scan this list to find the
breakpoint between the maximal and dominated points
...
But we must do the scan in the proper direction for
the algorithm to be efficient
...
If you can answer this question
correctly, then it says something about your intuition for designing efficient algorithms
...

The correct answer is to scan the list from left to right
...
If you only encounter one point
in the scan, then the scan will always be very efficient
...
If we scan the list from left to right, then every point that we
encounter whose y-coordinate is less than pi ’s will be dominated, and hence it will be eliminated from
the computation forever
...
On the other hand, if we scan
from left to right, then in the worst case (consider when all the points are maximal) we may rescan the
same points over and over again
...
Since we add maximal points
onto the end of the list, and delete them from the end of the list, we can use a stack to store the maximal
points, where the top of the stack contains the point with the highest x-coordinate
...
The top element of the stack is denoted S.top().
...
Popping the stack means removing the top element.
...

Plane-Sweep Maxima
Maxima2(int n, Point P[1..n]) {
    Sort P in ascending order by x-coordinate;
    S = empty;                                      // initialize stack of maxima
    for i = 1 to n do {                             // add points in order of x-coordinate
        while (S is not empty and S.top().y <= P[i].y)
            Pop(S);                                 // pop dominated maxima off the stack
        Push(S, P[i]);                              // P[i] is maximal for now
    }
    output the contents of S;                       // the points on the stack are the maxima
}
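
For concreteness, here is a compilable C sketch of the plane-sweep algorithm (my own translation, not from the notes), using an array-based stack and 0-based indexing. The notes assume no two points share an x-coordinate; the tie-break on y in the comparator is a small addition of mine so the sketch also tolerates equal x-values.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int x, y; } Point;

static int by_x(const void *a, const void *b) {
    const Point *p = a, *q = b;
    if (p->x != q->x) return p->x - q->x;   /* ascending x-coordinate */
    return p->y - q->y;                     /* equal x: lower point first, so it gets popped */
}

void plane_sweep_maxima(int n, Point P[]) {
    qsort(P, n, sizeof(Point), by_x);
    Point *S = malloc(n * sizeof(Point));   /* stack of current maxima */
    int top = 0;
    for (int i = 0; i < n; i++) {
        while (top > 0 && S[top - 1].y <= P[i].y)
            top--;                          /* pop maxima dominated by P[i] */
        S[top++] = P[i];                    /* P[i] is maximal for now */
    }
    for (int i = 0; i < top; i++)           /* the stack holds the maximal points */
        printf("(%d,%d)\n", S[i].x, S[i].y);
    free(S);
}

int main(void) {
    Point P[] = {{7,13},{12,12},{4,11},{9,10},{14,10},{7,7},
                 {2,5},{15,7},{11,5},{4,4},{13,3},{5,1}};
    plane_sweep_maxima(12, P);              /* prints (7,13) (12,12) (14,10) (15,7) */
    return 0;
}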
The most
important element was that since the current maxima appear on the stack in decreasing order of xcoordinates (as we look down from the top of the stack), they occur in increasing order of y-coordinates
...

Analysis: This is an interesting program to analyze, primarily because the techniques that we discussed in
the last lecture do not apply readily here
...
In particular, we have two nested
loops
...
The inner while-loop could be iterated up to n − 1
times in the worst case (in particular, when the last point added dominates all the others)
...

However, this is a good example of how not to be fooled by analyses that are too simple minded
...
Why is this? First observe that the total number of elements that have ever been
pushed onto the stack is at most n, since we execute exactly one Push for each time through the outer
for-loop
...
It is impossible to pop more elements off the stack than are ever pushed on
...
(Make
sure that you believe the argument before going on
...

Is this really better? How much of an improvement is this plane-sweep algorithm over the brute-force algorithm? Probably the most accurate way to find out would be to code the two up, and compare their
running times
...
(We have ignored
constant factors, but we’ll see that they cannot play a very big role
...
What is the base of the logarithm? It turns out that it will not matter
for the asymptotics (we’ll show this later), so for concreteness, let’s assume logarithm base 2, which
we’ll denote as lg n
...

The ratio of the two running times is n^2/(n lg n) = n/(lg n).

For relatively small values of n (e.g., less than 100), both algorithms are probably running fast enough
that the difference will be practically negligible
...
Of course, we have not
considered the constant factors
...
For even larger inputs, say, n = 1,000,000, we are looking at a ratio of roughly 1,000,000/20 = 50,000
...

For example, suppose that there was a constant factor difference of 10 to 1, in favor of the brute-force
algorithm
...
If the plane-sweep algorithm
took, say 10 seconds to execute, then the brute-force algorithm would take 14 hours
...
It tells us which algorithm is
better for large values of n
...
But efficient algorithm design is most important for large inputs, and the general rule of
computing is that input sizes continue to grow until people can no longer tolerate the running times
...

Asymptotic Notation: We continue to use the notation Θ() but have never defined it
...

Definition: Given any function g(n), we define Θ(g(n)) to be a set of functions that are asymptotically
equivalent to g(n), or put formally:
Θ(g(n)) = {f (n)

| there exist positive constants c1 , c2 , and n0 such that
0 ≤ c1 g(n) ≤ f (n) ≤ c2 g(n) for all n ≥ n0 }
...
It seems that the reasonably simple
concept of “throw away all but the fastest growing term, and ignore constant factors” should have a
simpler and more intuitive definition than this
...

First off, we can see that we have been misusing the notation
...
This cannot be true
...
This should
properly be written as T (n) ∈ Θ(n2 )
...

Going back to an earlier lecture, recall that we argued that the brute-force algorithm for 2-d maxima
had a running time of T (n) = 4n2 + 2n, which we claimed was Θ(n2 )
...
In
this case g(n) = n2
...


There are really three inequalities here
...
The next is:

    c1 n^2 ≤ 4n^2 + 2n.
...
The other inequality is

    4n^2 + 2n ≤ c2 n^2
...

We have two constraints on n, n ≥ 0 and n ≥ 1
...

Thus, we have given a formal proof that 4n2 + 2n ∈ Θ(n2 ), as desired
...


Lecture 5: Asymptotics
(Tuesday, Feb 10, 1998)
Read: Chapt
...
The Limit Rule is not really covered in the text
...
4 for next time
...

Today, we will explore this and other asymptotic notations in greater depth, and hopefully give a better
understanding of what they mean
...

Definition: Given any function g(n), we define Θ(g(n)) to be a set of functions:
Θ(g(n)) = {f (n)

| there exist strictly positive constants c1 , c2 , and n0 such that
0 ≤ c1 g(n) ≤ f (n) ≤ c2 g(n) for all n ≥ n0 }
...
Intuitively, what we want to say with “f (n) ∈ Θ(g(n))” is that f (n) and
g(n) are asymptotically equivalent
...
For example, functions like 4n2 , (8n2 + 2n − 3), (n2 /5 + n − 10 log n), and n(n − 3) are all
intuitively asymptotically equivalent, since as n becomes large, the dominant (fastest growing) term is
some constant times n2
...
The portion of the definition
that allows us to select c1 and c2 is essentially saying “the constants do not matter because you may
pick c1 and c2 however you like to satisfy these conditions
...

An example: Consider the function f (n) = 8n2 + 2n − 3
...
Let’s see
why the formal definition bears out this informal observation
...
We’ll do both very carefully
...
Consider the following (almost correct) reasoning:

    f(n) = 8n^2 + 2n − 3 ≥ 8n^2 − 3 = 7n^2 + (n^2 − 3) ≥ 7n^2.

But in the above reasoning we have implicitly made
the assumptions that 2n ≥ 0 and n^2 − 3 ≥ 0
...
In particular, if n ≥ 3, then both are true
...

Upper bound: f (n) grows asymptotically no faster than n2 : This is established by the portion of
the definition that reads “there exist positive constants c2 and n0 , such that f (n) ≤ c2 n2 for all
n ≥ n0
...

This means that if we let c2 = 10, then we are done
...
This is not true for all n, but it is true for all n ≥ 1
...




From the lower bound, we have n0 ≥ 3 and from the upper bound we have n0 ≥ 1, and so combining

these we let n0 be the larger of the two: n0 = 3
...
Since we have shown (by construction) the existence of
constants c1, c2, and n0, we have established that f(n) ∈ Θ(n^2)
...
)
Now let’s show why f (n) is not in some other asymptotic class
...
First, let's show that f(n) ∉ Θ(n). If this were true, then we would have to satisfy both the upper and lower bounds.
...
But the upper bound is false. The upper bound requires that there exist positive constants c2 and n0 such that f(n) ≤ c2 n for all n ≥ n0. Informally, we know that as n becomes large enough f(n) = 8n^2 + 2n − 3 will eventually exceed c2 n no matter how large we make c2 (since f(n) is growing quadratically and c2 n is only growing linearly).
...
Since this is true for all sufficiently large n then it must be true in the limit as n tends to infinity. If we divide both sides by n we have:

    lim_{n→∞} (8n + 2 − 3/n) ≤ c2.
n
It is easy to see that in the limit the left side tends to ∞, and so no matter how large c2 is, this statement
is violated
...

Let's show that f(n) ∉ Θ(n^3). The lower bound requires that there exist positive constants c1 and n0 such that c1 n^3 ≤ f(n) for all n ≥ n0. Informally this is true because f(n) is growing quadratically, and eventually any cubic function will exceed it.
...
Since this is true for all sufficiently large n then it must be true in the limit as n tends to infinity. If we divide both sides by n^3 we have:

    lim_{n→∞} (8/n + 2/n^2 − 3/n^3) ≥ c1.
It is easy to see that in the limit the left side tends to 0, and so the only way to satisfy this requirement
/
is to set c1 = 0, but by hypothesis c1 is positive
...

O-notation and Ω-notation: We have seen that the definition of Θ-notation relies on proving both a lower
and upper asymptotic bound
...
The
O-notation allows us to state asymptotic upper bounds and the Ω-notation allows us to state asymptotic
lower bounds
...


Definition: Given any function g(n),

    Ω(g(n)) = { f(n) | there exist positive constants c and n0 such that 0 ≤ c g(n) ≤ f(n) for all n ≥ n0 }
...
You will see that O-notation only enforces the upper bound of
the Θ definition, and Ω-notation only enforces the lower bound
...
Intuitively, f (n) ∈ O(g(n)) means that f (n) grows
asymptotically at the same rate or slower than g(n)
...

For example f (n) = 3n2 + 4n ∈ Θ(n2 ) but it is not in Θ(n) or Θ(n3 )
...
Finally, f (n) ∈ Ω(n2 ) and in Ω(n) but not in Ω(n3 )
...

Limit Rule for Θ-notation: Given positive functions f(n) and g(n), if

    lim_{n→∞} f(n)/g(n) = c,

for some constant c > 0 (strictly positive but not infinity), then f(n) ∈ Θ(g(n)).
...
Limit Rule for Ω-notation: Given positive functions f(n) and g(n), if

    lim_{n→∞} f(n)/g(n) ≠ 0

(either a strictly positive constant or infinity) then f(n) ∈ Ω(g(n))
...
The only exceptions that I
know of are strange instances where the limit does not exist (e.g., f(n) = n^(1+sin n))
...

You may recall the important rules from calculus for evaluating limits
...
) Most of the rules are pretty self evident (e
...
, the limit of a finite sum is
the sum of the individual limits)
...

Polynomial Functions: Using the Limit Rule it is quite easy to analyze polynomial functions
...
Then f (n) ∈ Θ(n4 )
...
Using the limit rule we have:

    lim_{n→∞} f(n)/n^4 = 2 − 0 − 0 + 0 − 0 = 2,

since each of the lower-order terms, divided by n^4, tends to 0 in the limit
...


In fact, it is easy to generalize this to arbitrary polynomials
...
Then p(n) ∈ Θ(nd )
...

Exponentials and Logarithms: Exponentials and logarithms are very important in analyzing algorithms
...
The terminology lg^b n means (lg n)^b.
...
In particular, for any positive constants a > 1, b, and c:

    lim_{n→∞} n^b / a^n = 0,        lim_{n→∞} (lg^b n) / n^c = 0.

We won't prove these, but they can be shown by taking appropriate powers, and then applying L'Hôpital's rule.
...
For example:

    n^500 ∈ O(2^n)
...
Conversely, logarithmic
powers (sometimes called polylogarithmic functions) grow more slowly than any polynomial
...

For this reason, we will usually be happy to allow any number of additional logarithmic factors, if it
means avoiding any additional powers of n
...
They
are true in the limit for large n, but you should be careful just how high the crossover point is
...
Thus, you should take this with a grain of salt
...
For example lg2 n ≤ n for all n ≥ 16
...

• Θ(1): Constant time; you can't beat it!

• Θ(log n): This is typically the speed that most efficient data structures operate in for a single access (e.g., ...). Also it is the time to find an object in a sorted list of length n by binary search.

• Θ(n log n): This is the running time of the best sorting algorithms.

• Θ(n^2), Θ(n^3), . . .: These running times are acceptable either when the exponent is small or when the data size is not too large (e.g., n ≤ 1,000).
...
This is only acceptable when either (1) you know that your inputs will be of very small size (e.g., n ≤ 50), or (2) you know that this is a worst-case running time that will rarely occur in practical instances.

• Θ(n!), Θ(n^n): Acceptable only for really small inputs (e.g., n ≤ 20)
...
You betcha! For example, if you want to see a function that grows
inconceivably fast, look up the definition of Ackerman’s function in our book
...
1 (on MergeSort) and Chapt
...

Divide and Conquer: The ancient Roman politicians understood an important principle of good algorithm
design (although they were probably not thinking about algorithms at the time)
...
This is called
divide-and-conquer
...
But once you have broken the problem into pieces, how do you solve
these pieces? The answer is to apply divide-and-conquer to them, thus further breaking them down
...

Summarizing, the main elements to a divide-and-conquer solution are
• Divide (the problem into a small number of pieces),
• Conquer (solve each piece, by applying divide-and-conquer recursively to it), and
• Combine (the pieces together into a global solution)
...
In fact the technique is so powerful, that when someone first suggests a problem to me,
the first question I usually ask (after what is the brute-force solution) is “does there exist a divide-andconquer solution for this problem?”
Divide-and-conquer algorithms are typically recursive, since the conquer part involves invoking the
same technique on a smaller subproblem
...

For the next couple of lectures we will discuss some examples of divide-and-conquer algorithms, and
how to analyze them using recurrences
...
This is a simple and very efficient algorithm for sorting a list of numbers, called MergeSort
...
n]
...
This is normally done
by permuting the elements within the array A
...

Divide: Split A down the middle into two subsequences, each of size roughly n/2.
...
Conquer: Sort each subsequence by calling MergeSort recursively on each.
...
Combine: Merge the two sorted subsequences into a single sorted list
...
A sequence
of length one is trivially sorted
...
It turns out that the merging process is
quite easy to implement
...
The “divide” phase is shown on the left
...
The “conquer and combine” phases are
shown on the right
...

[Figure: MergeSort on the input 7 5 2 4 1 6 3 0 — the divide phase splits the array down to single elements, and the merge phase combines them back into the sorted output 0 1 2 3 4 5 6 7.]
Figure 4: MergeSort
...
We’ll assume that the procedure that merges two sorted
list is available to us
...
Because the algorithm is called recursively on sublists,
in addition to passing in the array itself, we will pass in two indices, which indicate the first and last
indices of the subarray that we are to sort
...
r] and return the sorted result in the same subarray
...
If r = p, then this means that there is only one element to sort, and we may return
immediately
...
We find the index q, midway between p and r, namely q = (p + r)/2 (rounded down to the
nearest integer)
...
q] and A[q + 1
...
(We need to be careful
here
...
q − 1] and A[q
...
) Call MergeSort
recursively to sort each subarray
...

MergeSort
MergeSort(array A, int p, int r) {
    if (p < r) {                         // we have at least 2 items
        q = (p + r)/2
        MergeSort(A, p, q)               // sort A[p..q]
        MergeSort(A, q+1, r)             // sort A[q+1..r]
        Merge(A, p, q, r)                // merge everything together
    }
}
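
For readers who want to run it, here is a 0-indexed C translation (my own, not from the notes) of MergeSort together with the Merge procedure that is described next; passing the temporary array B in from the caller is an implementation choice of mine.

#include <stdio.h>

void merge(int A[], int B[], int p, int q, int r) {
    int i = p, j = q + 1, k = p;                /* initialize pointers */
    while (i <= q && j <= r)                    /* while both subarrays are nonempty */
        B[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
    while (i <= q) B[k++] = A[i++];             /* copy any leftover to B */
    while (j <= r) B[k++] = A[j++];
    for (i = p; i <= r; i++) A[i] = B[i];       /* copy B back to A */
}

void merge_sort(int A[], int B[], int p, int r) {
    if (p < r) {                                /* we have at least 2 items */
        int q = (p + r) / 2;
        merge_sort(A, B, p, q);                 /* sort A[p..q] */
        merge_sort(A, B, q + 1, r);             /* sort A[q+1..r] */
        merge(A, B, p, q, r);                   /* merge everything together */
    }
}

int main(void) {
    int A[] = {7, 5, 2, 4, 1, 6, 3, 0}, B[8];   /* the input from Figure 4 */
    merge_sort(A, B, 0, 7);
    for (int i = 0; i < 8; i++) printf("%d ", A[i]);
    printf("\n");                               /* prints 0 1 2 3 4 5 6 7 */
    return 0;
}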

Merging: All that is left is to describe the procedure that merges two sorted lists
...
q], and the right subarray, A[q + 1
...

We merge these two subarrays by copying the elements to a temporary working array called B
...
r]
...
) We have to indices i and j, that point to the current elements of
each subarray
...
When we run out of elements in one array, then
we just copy the rest of the other array into B
...

(The use of the temporary array is a bit unpleasant, but this is impossible to overcome entirely
...
)
In case you are not aware of C notation, the operator i++ returns the current value of i, and then
increments this variable by one
...
q] with A[q+1
...
r]
i = k = p
j = q+1
while (i <= q and j <= r) {
if (A[i] <= A[j]) B[k++] = A[i++]
else
B[k++] = A[j++]
}
while (i <= q) B[k++] = A[i++]
while (j <= r) B[k++] = A[j++]
for i = p to r do A[i] = B[i]

// initialize pointers
// while both subarrays are nonempty
// copy from left subarray
// copy from right subarray
// copy any leftover to B
// copy B back to A

}

This completes the description of the algorithm
...
(Do you see why?)
If you find the recursion to be a bit confusing
...
Convince yourself
that as you unravel the recursion you are essentially walking through the tree (the recursion tree) shown
in the figure
...
(We have drawn two trees in the figure, but this is just to make the distinction
between the inputs and outputs clearer
...
This is often handled in the implementation by using two arrays,
both of equal size
...
At
even levels we merge from from B to A
...
Of course, this
only improves the constant factors; it does not change the asymptotic running time
...
g
...
Then invoke a simple Θ(n2 ) algorithm, like insertion sort
on these small lists
...
Note that since they are running on subsequences of size at
most 20, the running times is Θ(202 ) = Θ(1)
...

It might seem at first glance that it should be possible to merge the lists “in-place”, without the need
for additional temporary storage
...
It turns out that there are faster ways to sort numbers in-place,
e
...
using either HeapSort or QuickSort
...
Suppose that in the ifstatement above, we have A[i] = A[j]
...
Would
it have mattered if instead we had copied from the right sublist? The simple answer is no—since the
elements are equal, they can appear in either order in the final sublist
...
Many times we are sorting data that does not have a single attribute,
but has many attributes (name, SSN, grade, etc
...
If we sort on a second attribute (say, grade), then it would be nice if people with
same grade are still sorted by name
...
This is a nice property for a sorting algorithm to have
...
It can be shown
that as a result, MergeSort is a stable sorting algorithm
...
)


Analysis: What remains is to analyze the running time of MergeSort
...
Let n = r − p + 1 denote the total length of both the left
and right subarrays
...
It is easy to see that each loop can be executed at most n times
...
) Thus the running time to Merge n items is Θ(n)
...
(We’ll see later why we do this
...
To avoid
circularity, the recurrence for a given value of n is defined in terms of values that are strictly smaller
than n
...
g
...

Let’s see how to apply this to MergeSort
...
For concreteness we could count whatever we like: number of lines of pseudocode,
number of comparisons, number of array accesses, since these will only differ by a constant factor
...

First observe that if we call MergeSort with a list containing a single element, then the running time is a
constant
...
When we call MergeSort
with a list of length n > 1, e
...
Merge(A, p, r), where r −p+1 = n, the algorithm first computes
q = (p + r)/2
...
q], which contains q − p + 1 elements
...
Thus the remaining subarray A[q + 1
...
How long does it take
to sort the left subarray? We do not know this, but because n/2 < n for n > 1, we can express this
as T ( n/2 )
...

Finally, to merge both sorted lists takes n time, by the comments made above
...


Lecture 7: Recurrences
(Tuesday, Feb 17, 1998)
Read: Chapt
...
Skip Section 4
...

Divide and Conquer and Recurrences: Last time we introduced divide-and-conquer as a basic technique
for designing efficient algorithms
...
We also described MergeSort, a sorting
algorithm based on divide-and-conquer
...
To do this, we introduced the notion of a recurrence, that is, a recursively
defined function
...

MergeSort Recurrence: Here is the recurrence we derived last time for MergeSort
...
We argued that if the list is of length 1, then the total sorting
time is a constant Θ(1)
...
Thus, the total running time for MergeSort could be described by
the following recurrence:
    T(n) = 1                                 if n = 1,
    T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + n           otherwise
...
This
is done to make the recurrence more concrete
...
g
...
The analysis
would have been a bit more complex, but we would arrive at the same asymptotic formula
...

    T(1)  = 1                                        (by the basis case)
    T(2)  = T(1) + T(1) + 2   = 1 + 1 + 2     = 4
    T(4)  = T(2) + T(2) + 4   = 4 + 4 + 4     = 12
    T(8)  = T(4) + T(4) + 8   = 12 + 12 + 8   = 32
    T(16) = T(8) + T(8) + 16  = 32 + 32 + 16  = 80
    T(32) = T(16) + T(16) + 32 = 80 + 80 + 32 = 192
...
Since the recurrence divides by 2 each time,
let’s consider powers of 2, since the function will behave most regularly for these values
...


This suggests that for powers of 2, T (n)/n = (lg n) + 1, or equivalently, T (n) = (n lg n) + n which
is Θ(n log n)
...

Logarithms in Θ-notation: Notice that I have broken away from my usual convention of say lg n and just
said log n inside the Θ()
...

Recall the change of base formula:
    log_b n = (log_a n) / (log_a b).

Consequently log_b n and log_a n differ only by a constant factor
...
Henceforth, I will not be
fussy about the bases of logarithms if asymptotic results are sufficient
...
So whenever it is reasonable to do so, we will just forget about them, and make
whatever simplifying assumptions we like about n to make things work out
...
Notice that this means that our analysis will

only be correct for a very limited (but infinitely large) set of values of n, but it turns out that as long as
the algorithm doesn’t act significantly different for powers of 2 versus other numbers, the asymptotic
analysis will hold for all n as well
...


Verification through Induction: We have just generated a guess for the solution to our recurrence
...
The proof will be by strong induction on n
...
Instead we use strong induction
...

Proof: (By strong induction on n
...

Induction step: Let n > 1, and assume that the formula T(n') = (n' lg n') + n' holds whenever
n' < n
...
To do this, we need to express T (n)
in terms of smaller values
...

Now, n/2 < n, so we can apply the induction hypothesis here, yielding T (n/2) = (n/2) lg(n/2)+
(n/2)
...

The Iteration Method: The above method of “guessing” a solution and verifying through induction works
fine as long as your recurrence is simple enough that you can come up with a good guess
...
The following method is quite powerful
...
By and large, summations are
easier to solve than recurrences (and if nothing else, you can usually approximate them by integrals)
...
Let’s start expanding out the definition until we see a pattern developing
...
This has a recursive formula inside T (n/2)
which we can expand, by filling in the definition but this time with the argument n/2 rather than n
...
We then simplify and repeat
...

    T(n) = 2T(n/2) + n
         = 2(2T(n/4) + n/2) + n     = 4T(n/4) + n + n
         = 4(2T(n/8) + n/4) + n + n = 8T(n/8) + n + n + n
         = 8(2T(n/16) + n/8) + n + n + n = 16T(n/16) + n + n + n + n
         = ...
         = 2^k T(n/(2^k)) + (n + n + · · · + n)        (k copies of n)
         = 2^k T(n/(2^k)) + kn.
...
Here's how we can do that.
...
Thus, let us select k to be a value which forces the term n/(2^k) = 1, that is, k = lg n. If we substitute this value of k into the equation we get

    T(n) = 2^(lg n) T(n/(2^(lg n))) + (lg n)n
         = 2^(lg n) T(1) + n lg n = 2^(lg n) + n lg n = n + n lg n
...
Thus we have arrived at the same conclusion, but this time no guesswork was involved
...
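
A quick way to convince yourself of this closed form (my addition, not in the notes) is to evaluate the recurrence directly and compare it against n lg n + n at powers of two.

#include <stdio.h>

long T(long n) {                       /* T(1) = 1, T(n) = 2T(n/2) + n */
    return (n == 1) ? 1 : 2 * T(n / 2) + n;
}

int main(void) {
    for (long n = 1, lg = 0; n <= (1L << 20); n *= 2, lg++)
        printf("n=%8ld  T(n)=%10ld  n*lg(n)+n=%10ld\n", n, T(n), n * lg + n);
    return 0;
}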

Let’s try a messier recurrence:
    T(n) = 1                if n = 1,
    T(n) = 3T(n/4) + n      otherwise
...
As before, the idea is to repeatedly apply the definition, until a pattern emerges
...

    T(n) = 3T(n/4) + n
         = ...
         = 3^k T(n/4^k) + 3^(k−1)(n/4^(k−1)) + · · · + 9(n/16) + 3(n/4) + n
         = 3^k T(n/4^k) + sum_{i=0}^{k−1} 3^i (n/4^i).
...
To get rid of it we recall that we know the value of T(1), and so we set n/4^k = 1, implying that 4^k = n, that is, k = log_4 n. Substituting this value of k gives

    T(n) = 3^(log_4 n) T(1) + sum_{i=0}^{(log_4 n)−1} 3^i (n/4^i) = n^(log_4 3) + sum_{i=0}^{(log_4 n)−1} 3^i (n/4^i).

Again, in the last step we used the formula a^(log_b n) = n^(log_b a) where a = 3 and b = 4, and the fact that T(1) = 1
...
) By the way, log_4 3 ≈ 0.79
...
First observe that the value n remains constant throughout the sum, and so we can pull it out front, writing 3^i/4^i as (3/4)^i:

    T(n) = n^(log_4 3) + n sum_{i=0}^{(log_4 n)−1} (3/4)^i.
...
We may apply the formula for the geometric series, which we gave in an earlier lecture. For x ≠ 1,

    sum_{i=0}^{m} x^i = (x^(m+1) − 1)/(x − 1).

In this case x = 3/4 and m = log_4 n − 1, so

    T(n) = n^(log_4 3) + n · ((3/4)^(log_4 n) − 1)/((3/4) − 1).

Applying our favorite log identity once more to the expression in the numerator (with a = 3/4 and b = 4) we get

    (3/4)^(log_4 n) = n^(log_4 (3/4)) = n^(log_4 3 − log_4 4) = n^(log_4 3 − 1) = n^(log_4 3)/n.
...
So the final result (at last!) is

    T(n) = 4n − 3n^(log_4 3) ≈ 4n − 3n^0.79
...

It is interesting to note the unusual exponent of log_4 3 ≈ 0.79. We have seen that two nested loops typically lead to Θ(n^2) time, and three nested loops typically lead to Θ(n^3) time, so it seems remarkable that we could generate a strange exponent like 0.79.
...
However, as we shall
see, this is often the case in divide-and-conquer recurrences
...
4 on recurrences, skip Section 4
...

Recap: Last time we discussed recurrences, that is, functions that are defined recursively
...
We also discussed two methods for solving recurrences, namely guess-and-verify (by induction), and iteration
...
Today we will discuss two more techniques for solving recurrences
...


Visualizing Recurrences Using the Recursion Tree: Iteration is a very powerful technique for solving recurrences
...
Here is a nice way to visualize what is going on in iteration
...

Recall the recurrence for MergeSort (which we simplified by assuming that n is a power of 2, and hence could drop the floors and ceilings):

    T(n) = 1                if n = 1,
    T(n) = 2T(n/2) + n      otherwise
...
Recall that to
merge two lists of size m/2 to a list of size m takes Θ(m) time, which we will just write as m
...

[Figure: the recursion tree for T(n) = 2T(n/2) + n. The root contributes n, the two nodes at the next level contribute n/2 each, the four below contribute n/4 each, and so on down to the leaves; every one of the lg n + 1 levels contributes a total of n, for a grand total of n(lg n + 1).]
Figure 5: Using the recursion tree to visualize a recurrence
...
At the
second level we have two merges, each taking n/2 time, for a total of 2(n/2) = n
...
This continues until the bottommost
level of the tree
...
, lg n), and each level contributes a
total of n time, the total running time is n(lg n + 1) = n lg n + n
...

This can be used for a number of simple recurrences
...
The tree is illustrated below
...


Again, we label each node with the amount of work at that level
...

For the top level (or 0th level) the work is n2
...
This can be written as n2 (3/4)
...
In general it is easy to extrapolate to see that at the level i, we have 3i
nodes, each involving (n/2i )2 work, for a total of 3i (n/2i )2 = n2 (3/4)i
...
Note that we have not determined where the tree bottoms out,
so we have left off the upper bound on the sum
...



[Figure: the recursion tree for T(n) = 3T(n/2) + n^2. The root contributes n^2, each of its three children contributes (n/2)^2, the nine grandchildren contribute (n/4)^2 each, and in general level i contributes a total of n^2 (3/4)^i.]
Figure 6: Another recursion tree example
...
Why? The
summation is a geometric series, and the base (3/4) is less than 1
...
Thus the running time is Θ(n2 )
...
(It is easy to be off by ±1 here, but this sort of small error will not affect the asymptotic
result
...
) So, we can plug in lg n for the “?” in the above summation
...


If we wanted to get a more exact answer, we could plug the summation into the formula for the geometric series and simplify.
...
This will take some work to simplify, but at this point it is all just tedious algebra to get the formula into simple form.
...

    T(n) = n^2 ((3/4)^(lg n + 1) − 1) / ((3/4) − 1)
         = −4n^2 ((3/4)^(lg n + 1) − 1)
         = 4n^2 (1 − (3/4)^(lg n + 1)) = 4n^2 (1 − (3/4)(3/4)^(lg n))
         = 4n^2 (1 − (3/4) n^(lg(3/4))) = 4n^2 (1 − (3/4) n^(lg 3 − lg 4))
         = 4n^2 (1 − (3/4) n^(lg 3 − 2)) = 4n^2 (1 − (3/4)(n^(lg 3)/n^2))
         = 4n^2 − 3n^(lg 3).

Note that lg 3 ≈ 1.58, so the whole expression is Θ(n^2)
...

(Simplified) Master Theorem: If you analyze many divide-and-conquer algorithms, you will see that the
same general type of recurrence keeps popping up
...
For example, in
MergeSort, a = 2, b = 2, and k = 1
...
This result is called the Master
Theorem, because it can be used to “master” so many different types of recurrence
...
We will give a simpler version, which is general enough for
most typical situations
...
If the one from
the book doesn’t apply, then you will probably need iteration, or some other technique
...
(As usual let us assume that n is a power of b.
...
) Then

    Case 1: if a > b^k then T(n) ∈ Θ(n^(log_b a)).
    Case 2: if a = b^k then T(n) ∈ Θ(n^k log n).
    Case 3: if a < b^k then T(n) ∈ Θ(n^k).

For the MergeSort recurrence we have a = 2, b = 2, and k = 1. Thus, a = b^k (2 = 2^1) and so Case 2 applies, giving T(n) ∈ Θ(n^k log n) = Θ(n log n).

In the recurrence above, T(n) = 3T(n/2) + n^2, we have a = 3, b = 2 and k = 2. Here a < b^k (3 < 2^2), so Case 3 applies. From this we have T(n) ∈ Θ(n^2).

Finally, consider the recurrence T(n) = 4T(n/3) + n, where a = 4, b = 3 and k = 1. In this case we have a > b^k (4 > 3^1), and so Case 1 applies, giving T(n) ∈ Θ(n^(log_3 4)) = Θ(n^1.26)
...
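
The case analysis is mechanical enough to put in a few lines of code. The following toy helper (my own, not from the notes or CLR) classifies a recurrence of the form T(n) = a T(n/b) + c n^k by the simplified Master Theorem; the exact floating-point comparison is fine here because the test values of b^k are small integers.

#include <stdio.h>
#include <math.h>

void master(double a, double b, double k) {
    double bk = pow(b, k);
    if (a > bk)       printf("a=%g b=%g k=%g:  Theta(n^%.3f)\n", a, b, k, log(a) / log(b));
    else if (a == bk) printf("a=%g b=%g k=%g:  Theta(n^%g log n)\n", a, b, k, k);
    else              printf("a=%g b=%g k=%g:  Theta(n^%g)\n", a, b, k, k);
}

int main(void) {
    master(2, 2, 1);   /* MergeSort:          Case 2, Theta(n log n) */
    master(3, 2, 2);   /* T(n)=3T(n/2)+n^2:   Case 3, Theta(n^2)     */
    master(4, 3, 1);   /* T(n)=4T(n/3)+n:     Case 1, Theta(n^1.262) */
    return 0;
}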

There are many recurrences that cannot be put into this form
...


This solves to T (n) = Θ(n log2 n), but the Master Theorem (neither this form nor the one in CLR)
will tell you this
...

Recursion Trees Revisited: The recursion trees offer some intuition about why it is that there are three cases
in the Master Theorem
...

For example, in the MergeSort recurrence (which corresponds to Case 2 in the Master Theorem) every
level of the recursion tree provides the same total work, namely n
...

Next consider the earlier recurrence T (n) = 3T (n/2) + n2 (which corresponds to Case 3 in the Master
Theorem)
...
Each level of the
tree provided a smaller fraction of work
...
Even with an infinite number of levels, the geometric series that result
will converge to a constant value
...
A common way to design
the most efficient divide-and-conquer algorithms is to try to arrange the recursion so that most of the
work is done at the root, and at each successive level of the tree the work at this level reduces (by some
constant factor)
...


Finally, in the recurrence T (n) = 4T (n/3) + n (which corresponds to Case 1), most of the work is
done at the leaf level of the recursion tree
...


(You might try this to see if you get the same result
...
The largest
contribution will be from the leaf level
...
2 and 10
...
You are not responsible for the randomized
analysis of Section 10
...
Our presentation of the partitioning algorithm and analysis are somewhat different
from the ones in the book
...
Today we will give a rather surprising (and very tricky) algorithm which shows
the power of these techniques
...

Suppose that you are given a set of n numbers
...
Since duplicate elements make our life more
complex (by creating multiple elements of the same rank), we will make the simplifying assumption
that all the elements are distinct for now
...
Thus, the
rank of an element is its final position if the set is sorted
...

Of particular interest in statistics is the median
...
When n is even there are two natural choices, namely the elements of ranks n/2
and (n/2) + 1
...
We will define
the median to be either of these elements
...
For example, the median income in a community is likely to be more meaningful
measure of the central tendency than the average is, since if Bill Gates lives in your community then
his gigantic income may significantly bias the average, whereas it cannot have a significant influence
on the median
...
Today we will focus on the
following generalization, called the selection problem
...

The selection problem can easily be solved in Θ(n log n) time, simply by sorting the numbers of A,
and then returning A[k]
...
In particular, is it possible
to solve this problem in Θ(n) time? We will see that the answer is yes, and the solution is far from
obvious
...
We think of divide-and-conquer as breaking the problem into a small number of smaller subproblems, which are then solved recursively
...


The sieve technique works in phases as follows
...
We do not know which item is of interest, however
after doing some amount of analysis of the data, taking say Θ(nk ) time, for some constant k, we find
that we do not know what the desired item is, but we can identify a large enough number of elements
that cannot be the desired value, and can be eliminated from further consideration
...
g
...
0001n)
...
Each of the resulting
recursive solutions then do the same thing, eliminating a constant fraction of the remaining set
...
Recall that we are given an array A[1
...
Since the algorithm will be applied inductively, we will assume that we
are given a subarray A[p
...
The initial call will be to the entire array A[1
...

There are two principal algorithms for solving the selection problem, but they differ only in one step,
which involves judiciously choosing an item from the array, called the pivot element, which we will
denote by x
...

We then partition A into three parts
...
q − 1] will contain all
the elements that are less than x, and A[q + 1
...

(Recall that we assumed that all the elements are distinct
...
This is illustrated below
...
q−1] < x
A[q+1
...

It is easy to see that the rank of the pivot x is q −p+1 in A[p
...
Let x rnk = q −p +1
...
If k < x rnk , then we know that we need
to recursively search in A[p
...
r]
...

Here is the complete pseudocode.

Select(array A, int p, int r, int k) {               // return kth smallest of A[p..r]
    if (p == r) return A[p]                          // only 1 item left, return it
    else {
        x = Choose_Pivot(A, p, r)                    // choose the pivot element
        q = Partition(A, p, r, x)                    // partition <A[p..q-1], x, A[q+1..r]>
        x_rnk = q - p + 1                            // rank of the pivot
        if (k == x_rnk) return x                     // the pivot is the kth smallest
        else if (k < x_rnk)
            return Select(A, p, q-1, k)              // select from left subarray
        else
            return Select(A, q+1, r, k-x_rnk)        // select from right subarray
    }
}

Notice that this algorithm satisfies the basic form of a sieve algorithm
...

When k = x rnk then we get lucky and eliminate everything
...

We will discuss the details of choosing the pivot and partitioning later, but assume for now that they
both take Θ(n) time
...
In fact, if x is one of the smallest elements of A or one of the largest, then we get
into trouble, because we may only eliminate it and the few smaller or larger elements of A
...

Let us suppose for now (optimistically) that we are able to design the procedure Choose Pivot in
such a way that is eliminates exactly half the array with each phase, meaning that we recurse on the
remaining n/2 elements
...

    T(n) = 1                if n = 1,
    T(n) = T(n/2) + n       otherwise.
...
If we expand this recurrence level by level we see that we get the summation

    T(n) = n + n/2 + n/4 + · · · ≤ sum_{i=0}^{∞} n/2^i = n sum_{i=0}^{∞} (1/2)^i.

For any c such that |c| < 1, the infinite geometric series sums to sum_{i=0}^{∞} c^i = 1/(1 − c). Using this we have

    T(n) ≤ 2n ∈ O(n)
...


(This only proves the upper bound on the running time, but it is easy to see that it takes at least Ω(n)
time, so the total running time is Θ(n)
...
Normally you would think that in order to design a Θ(n) time algorithm
you could only make a single, or perhaps a constant number of passes over the data set
...
However, because we eliminate a constant fraction
of elements with each phase, we get this convergent geometric series in the analysis, which shows that
the total running time is indeed linear in n
...
It is often possible
to achieve running times in ways that you would not expect
...
If we eliminated even one per cent, then
the recurrence would have been T (n) = T (99n/100)+n, and we would have gotten a geometric series
involving 99/100, which is still less than 1, implying a convergent series
...


Choosing the Pivot: There are two issues that we have left unresolved. The first is how to choose the pivot element, and the second is how to partition the array. Both need to be solved in Θ(n) time. The second problem is a rather easy programming exercise; later, when we discuss QuickSort, we will discuss partitioning in detail.

For the rest of the lecture, let's concentrate on how to choose the pivot. Recall that before we said that we might think of the pivot as a random element of A. Actually this is not such a bad idea. Let's see why.

The key is that we want the pivot to eliminate at least some constant fraction of the array after each partitioning step. Let's consider the top of the recurrence, when we are given A[1..n]. Suppose that the pivot x turns out to be of rank q in the array. The partitioning algorithm will split the array into A[1..q-1] < x, A[q] = x and A[q+1..n] > x. If k = q, then we are done. Otherwise, we need to search one of the two subarrays. They are of sizes q - 1 and n - q, respectively. The subarray that contains the kth smallest element will generally depend on what k is, so in the worst case, k will be chosen so that we have to recurse on the larger of the two subarrays. Thus if q > n/2, then we may have to recurse on the left subarray of size q - 1, and if q < n/2, then we may have to recurse on the right subarray of size n - q. In either case, we are in trouble if q is very small or very large.

If we could select q so that it is roughly of middle rank, then we will be in good shape. For example, if n/4 <= q <= 3n/4, then the larger subarray has size at most 3n/4, and we eliminate at least n/4 of the elements. Earlier we said that we might think of the pivot as a random element of the array A; actually this works pretty well in practice. The reason is that roughly half of the elements lie between ranks n/4 and 3n/4, so picking a random element as the pivot will succeed about half the time in eliminating at least n/4 of the array. Here, however, we want a deterministic guarantee, so we describe a clever way of choosing the pivot; we will return to the randomized idea later.
Recall that we are given an array A[1..n], and we want to compute a pivot element whose rank is guaranteed to be neither too small nor too large. We will have to describe this algorithm at a very high level, since the details are rather involved. The procedure is as follows.

Groups of 5: Partition A into groups of 5 elements, e.g. A[1..5], A[6..10], A[11..15], and so on. There will be m = ceil(n/5) such groups (the last one may have fewer than 5 elements). This can easily be done in Θ(n) time.

Group medians: Compute the median of each group. There will be m group medians. Since each group has only a constant number of elements, any method will do; for example, we could just BubbleSort each group and take the middle element. Each group takes Θ(1) time, and repeating this for all ceil(n/5) groups takes Θ(n) total time. Copy the group medians to a new array B.

Median of medians: Compute the median of the group medians. For this, we will have to call the selection algorithm recursively on B, e.g. Select(B, 1, m, k), where m = ceil(n/5) and k = floor((m + 1)/2). Let x be this median of medians. Return x as the desired pivot.
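To make these three steps concrete, here is a compact Python sketch (my own, not code from the notes) that folds Choose_Pivot and Select into a single recursive function. It works on Python lists rather than subarray indices, assumes the elements are distinct, and uses brute-force sorting only on the constant-size groups.

def mom_select(A, k):
    # Return the k-th smallest element of the list A (k = 1 means smallest).
    if len(A) <= 5:
        return sorted(A)[k - 1]                   # tiny list: sort directly
    # Groups of 5: take each group's median by brute force.
    medians = [sorted(A[i:i + 5])[len(A[i:i + 5]) // 2]
               for i in range(0, len(A), 5)]
    # Median of medians, computed by a recursive call; this is the pivot x.
    x = mom_select(medians, (len(medians) + 1) // 2)
    less = [a for a in A if a < x]
    greater = [a for a in A if a > x]
    x_rnk = len(less) + 1                         # rank of the pivot in A
    if k == x_rnk:
        return x
    elif k < x_rnk:
        return mom_select(less, k)                # answer is among the smaller items
    else:
        return mom_select(greater, k - x_rnk)     # adjust rank for the right side

Calling mom_select(list(range(100, 0, -1)), 10) returns 10, for instance.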
To establish the correctness of this procedure, we need to argue that x satisfies the desired rank properties.

Lemma: The element x chosen above is of rank at least n/4 and at most 3n/4 in A.

Proof: We will show that x is of rank at least n/4; the other bound is proved symmetrically. To do this, we need to show that there are at least n/4 elements that are less than or equal to x. To avoid cluttering the argument with floors and ceilings, assume that n is a multiple of 5. Consider the groups shown in tabular form in the figure below. Observe that at least half of the group medians are less than or equal to x (because x is their median).
Figure 8: Choosing the pivot. (Each column of the table is a group of 5 elements; the middle row holds the group medians, and x is chosen as the median of these medians. The sorting of the group medians is not really performed.)
Furthermore, each group median is itself greater than or equal to 3 elements of its group (itself and the two smaller elements of its group of 5). Therefore, there are at least 3 * ceil((n/5)/2) >= 3n/10 >= n/4 elements that are less than or equal to x in the entire array. This completes the proof of the lemma.

We have thus achieved the main goal, namely that of eliminating a constant fraction (at least 1/4) of the remaining list at each stage of the algorithm.
However, in order to achieve this, within Choose_Pivot() we needed to make a recursive call to Select() on an array B consisting of ceil(n/5) elements. Everything else (grouping, computing group medians, partitioning, computing ranks) takes only Θ(n) time, and after partitioning we recurse on a subarray of at most 3n/4 elements. As usual, we will ignore floors and ceilings, and write the Θ(n) term simply as n for concreteness. This gives the recurrence

    T(n) = T(n/5) + T(3n/4) + n.

This is a very strange recurrence because it involves a mixture of different fractions (n/5 and 3n/4). This mixture makes it impossible to apply the Master Theorem, and difficult to apply iteration. However, this is a good place to apply constructive induction. We know we want an algorithm that runs in Θ(n) time, so we posit a linear bound and see what constraint this places on the constant.

Theorem: There is a constant c such that T(n) <= cn.

Proof: (by constructive induction on n.)

Basis: (n = 1.) In this case T(1) = 1, and so T(n) <= cn as long as c >= 1.

Step: We assume that T(n') <= cn' for all n' < n. We will then show that T(n) <= cn. By definition we have

    T(n) = T(n/5) + T(3n/4) + n
         <= c(n/5) + c(3n/4) + n            (by the induction hypothesis)
         = cn(4/20 + 15/20) + n
         = (19cn/20) + n.

This last expression will be <= cn provided that we select c such that c >= (19c/20) + 1. Solving for c we see that this is true provided that c >= 20. Combining the constraints that c >= 1 and c >= 20, we see that by letting c = 20, we are done.

A natural question is why we chose groups of size 5. If you look at the proof above, you will see that the argument goes through for any group size that is strictly greater than 4. (You might try replacing the 5 with 3, 4, or 6 and see what happens.)
Office hours: The TA, Kyongil, will have extra office hours on Monday before the midterm, from 1:00-2:00.

Long Integer Multiplication: The following little algorithm shows a bit more about the surprising applications of divide-and-conquer. The problem that we want to consider is how to perform arithmetic on long integers, and multiplication in particular. The reason for doing arithmetic on long numbers stems from cryptography. Most techniques for encryption are based on number-theoretic techniques. For example, the character string to be encrypted is converted into a sequence of numbers, and encryption keys are stored as long integers. Efficient encryption and decryption depend on being able to perform arithmetic on long numbers, typically containing hundreds of digits.

Addition and subtraction on large numbers are relatively easy: if n is the number of digits, then these algorithms run in Θ(n) time. (Go back and analyze your solution to the problem on Homework 1.) But the standard grade-school algorithm for multiplication runs in Θ(n^2) time, which can be quite costly when lots of long multiplications are needed.

This raises the question of whether there is a more efficient way to multiply two very large numbers. It would seem surprising if there were, since for centuries people have used the same algorithm that we all learn in grade school. In fact, we will see that it is possible to do better.

The key idea is divide-and-conquer. We normally think of the multiplication algorithm as applying on a digit-by-digit basis, but if we partition an n-digit number into two "super digits" with roughly n/2 digits each, the same multiplication rule still applies.
Let A and B be the two numbers to multiply, and assume for simplicity that the number of digits n is even. Because of the way we write numbers, it is more natural to think of the elements of A as being indexed in decreasing order from left to right, as A[n-1..0], and similarly B[n-1..0]. Let m = n/2, and let

    w = A[n-1..m]    x = A[m-1..0]
    y = B[n-1..m]    z = B[m-1..0].

If we think of w, x, y and z as n/2-digit numbers, we can express A and B as

    A = w * 10^m + x
    B = y * 10^m + z,

and their product is

    mult(A, B) = mult(w, y) * 10^(2m) + (mult(w, z) + mult(x, y)) * 10^m + mult(x, z).
...
The multiplications by 10^m and 10^(2m) should be thought of as simply shifting digits, not as real multiplications. Observe that all the additions involve numbers with roughly n/2 digits, and so they take Θ(n) time each. Thus the only nontrivial operations are the four recursive multiplications of n/2-digit numbers. This suggests that if we were to implement this algorithm directly, its running time would be given by the following recurrence

    T(n) = 1                  if n = 1,
    T(n) = 4T(n/2) + n        otherwise.

Applying the Master Theorem with a = 4, b = 2 and k = 1 (so a > b^k) gives T(n) in Θ(n^(lg 4)) = Θ(n^2). Unfortunately, this is no better than the standard algorithm.

However, the exercise is not wasted. It shows that the critical element is the number of multiplications on numbers of size n/2: the number of additions (as long as it is a constant) does not affect the asymptotic running time. So, if we could find a way to arrive at the same result algebraically, but by trading off multiplications in favor of additions, then we would have a more efficient algorithm. (Of course we cannot replace multiplication by repeated addition, since the number of additions must remain a constant, independent of n.)
The key turns out to be an algebraic "trick". The quantities that we need to compute are C = wy, D = xz, and E = wz + xy. Above, it took us four multiplications to compute these. However, observe that if instead we compute the following quantities, three multiplications suffice (at the cost of a few extra additions and subtractions):

    C = mult(w, y)
    D = mult(x, z)
    E = mult((w + x), (y + z)) - C - D = (wy + wz + xy + xz) - wy - xz = wz + xy.

Finally we have

    mult(A, B) = C * 10^(2m) + E * 10^m + D.

Altogether we perform 3 multiplications, 4 additions, and 2 subtractions, all on numbers with roughly n/2 digits. The additions, subtractions, and shifts take Θ(n) time in total, so the running time is given by the recurrence

    T(n) = 1                  if n = 1,
    T(n) = 3T(n/2) + n        otherwise.

Now when we apply the Master Theorem, we have a = 3, b = 2 and k = 1, yielding T(n) in Θ(n^(lg 3)) which is approximately Θ(n^1.585).
Is this really an improvement? This algorithm carries a larger constant factor because of the overhead of recursion and the additional arithmetic operations, but asymptotics tell us that for large enough n it must win. For example, if we assume that the clever algorithm has overheads that are 5 times greater than the simple algorithm (e.g. 5n^1.585 versus n^2), then the clever algorithm wins once roughly n >= 50. If the overhead were 10 times larger, then the crossover would occur for n >= 260.

Review for the first midterm: Generally you are responsible for anything discussed in class, and anything appearing on the homeworks. Here is a brief rundown of the main topics.
Worst-case, Average-case: Recall that worst-case means that we consider the highest running time over all inputs of size n, while average-case means that we average running times over all inputs of size n (generally weighting each input by its probability of occurring).

General analysis methods: Be sure you understand the induction proofs given in class and on the homeworks, including constructive induction, where the constants are derived as part of the proof.

Summations: Write down (and practice recognizing) the basic formulas for summations: the arithmetic, geometric, and harmonic series. Practice simplifying the summations that arise from nested loops and from expanding recurrences. Also be sure you can apply the integration rule to summations. (Chapt. 3 of CLR.)

Asymptotics: Know the definitions of O, Θ, and Omega and how to apply them. Know what the other forms, o and omega, mean informally. Practice comparing the growth rates of functions; I'll be happy to check any of your answers. For example, which is larger, lg sqrt(n) or sqrt(lg n)? (It is the former; can you see why?) Remember the following rule and know how to use it:

    lim_{n -> infinity} (lg^b n) / n^c = 0    for any constants b, c > 0,

that is, polynomials grow strictly faster than polylogarithms. (Chapt. 2 of CLR.)

Recurrences: Know how to analyze the running time of a recursive program by expressing it as a recurrence, and know how to solve recurrences by expansion, by induction, and by the simplified Master Theorem given in class. (You are NOT responsible for the more complex version given in the text.) (Chapt. 4 of CLR.)

Sorting and selection: Understand the divide-and-conquer algorithm for MergeSort, and be able to work an example by hand. Understand the sieve technique and how it is used in the selection algorithm. (Chapt. 10 on medians; skip the randomized analysis.)

Lecture 11: First Midterm Exam
(Tuesday, March 3, 1998)
First midterm exam today
...


Lecture 12: Heaps and HeapSort
(Thursday, Mar 5, 1998)
Read: Chapt 7 in CLR.

Sorting: For the next series of lectures we will focus on sorting. The reasons for studying sorting algorithms in detail are twofold. First, sorting is a very important algorithmic problem. Procedures for sorting are parts of many large software systems, either explicitly or implicitly, so the design of efficient sorting algorithms is important for the overall efficiency of these systems. The other reason is more pedagogical. There are many sorting algorithms, some slow and some fast; some possess certain desirable properties, and others do not. Sorting therefore forms an interesting case study in algorithm theory.

In the sorting problem we are given an array A[1..n] of n numbers, and are asked to reorder these elements into increasing order. More generally, the elements of A are records, and we sort them according to one of the fields, called the key value. The key value need not be a number; it can be any object from a totally ordered domain. Totally ordered means that for any two elements x and y of the domain, either x < y, x = y, or x > y.

Some domains are only partially ordered. For example, sets can be partially ordered under the subset relation, but this is not a total order: it is not true that for any two sets either x is a subset of y, x = y, or x is a superset of y. There are interesting questions about sorting under partial orders, which we may discuss later.
Slow Sorting Algorithms: There are a number of well-known, simple sorting algorithms. These include the following:

Bubblesort: Scan the array. Whenever two consecutive items are found that are out of order, swap them. Repeat until all consecutive items are in order.

Insertion sort: Assume that A[1..i-1] have already been sorted. Insert A[i] into its proper position in this subarray, by shifting the larger elements over by one position.

Selection sort: Assume that A[1..i-1] contain the i-1 smallest elements in sorted order. Find the smallest element in A[i..n], and then swap it with A[i].

These algorithms are all easy to implement, but they run in Θ(n^2) time in the worst case. We have already seen that MergeSort sorts in Θ(n log n) time. We will study two others, HeapSort and QuickSort.
Priority Queues: A heap is a concrete implementation of an abstract data type called a priority queue. A priority queue stores elements, each with an associated numeric key value (its priority). A simple priority queue supports three basic operations:

Create: Create an empty queue.

Insert: Insert an element into the queue.

ExtractMax: Return (and remove) the element with maximum key value from the queue. (We have described the maximum version; it is easy to modify the implementation, by reversing < and >, to extract the minimum instead.)

Adjust Priority: Change the priority of an item in the queue. (This and other operations, such as deleting an arbitrary item, can also be supported, but we will not need them here.)
Heaps: A heap is a data structure that supports the main priority queue operations (insert and extract max) in Θ(log n) time each. For now we will think of a heap as a binary tree; later we will see that it can be stored in an array.

By a binary tree we mean a data structure which is either empty or else consists of three things: a root node, a left subtree and a right subtree. The left and right subtrees are themselves binary trees; their roots are called the left child and right child of the root node. A node with no children is a leaf; a nonleaf node is called an internal node.

The depth of a node in a binary tree is its distance from the root: the root is at depth 0, its children at depth 1, and so on. The height of a binary tree is its maximum depth. A binary tree is complete if every internal node has two children and all leaves have the same depth. An important fact about complete binary trees is that a complete binary tree of height h has

    n = 1 + 2 + 4 + ... + 2^h = 2^(h+1) - 1

nodes altogether. Solving for h in terms of n, the height of a complete binary tree with n nodes is h = lg(n + 1) - 1, which is approximately lg n, and hence in Θ(log n).

A heap is stored as a left-complete binary tree. This means that all the levels of the tree are full except the bottommost level, which is filled from left to right.

The keys of a heap are stored in the tree in something called heap order. This means that for each node u, other than the root, key(Parent(u)) >= key(u). This implies that as you follow any path from a leaf to the root the keys appear in (nonstrict) increasing order, and in particular that the maximum key appears at the root. This is illustrated below.
Figure 11: Heap. (Left: a left-complete binary tree. Right: heap ordering of the keys 16, 14, 10, 8, 7, 9, 3, 2, 4, 1, with the maximum key, 16, at the root.)

Lecture 13: HeapSort
(Tuesday, Mar 10, 1998)
Read: Chapt 7 in CLR.

Heaps: Recall that a heap is a data structure that supports the main priority queue operations (insert and extract max) in Θ(log n) time each. It consists of a left-complete binary tree (meaning that all levels of the tree, except possibly the bottommost, are full, and the bottommost level is filled from left to right). The keys of the heap are stored in the tree in heap order: for each nonroot node u, key(Parent(u)) >= key(u). From this it follows that the largest key in the heap appears at the root.

Array Storage: An important consequence of this structure is that a heap can be stored in an array, without any explicit pointers. The reason for this is the left-complete nature of the tree. We will store the heap in an array A[1..n]. In general the heap will not occupy the entire array, so we maintain a variable m <= n which keeps track of the current number of elements that are actively stored in the heap, namely A[1..m]. The root is A[1], its children are A[2] and A[3], their children are A[4..7], and in general the nodes of each level occupy consecutive positions of the array. Because the binary tree is left-complete, we know exactly how many elements each level will supply: the first level supplies 1 node, the next 2, then 4, then 8, and so on. Only the bottommost level may supply fewer than the appropriate power of 2, and then we can use the value of m to determine where the last element lies. This is illustrated below.

We should emphasize that this works only because the tree is left-complete; it cannot be used for general binary trees.
Figure 12: Storing a heap in an array. (Left: the heap viewed as a binary tree with keys 16, 14, 10, 8, 7, 9, 3, 2, 4, 1. Right: the same heap stored in positions 1 through m = 10 of an array of size n = 13.)

In particular it is easy to see the following:

    Left(i):   return 2i.
    Right(i):  return 2i + 1.
    Parent(i): return floor(i/2).
    IsLeaf(i): return Left(i) > m.   (That is, i's left child lies beyond the last active element.)
    IsRoot(i): return i == 1.

For example, the heap ordering property can be stated as: for all i, 1 <= i <= m, if not IsRoot(i) then A[Parent(i)] >= A[i].
So is a heap a binary tree or an array? The answer is that from a conceptual standpoint it is a binary tree; however, it is (typically) implemented as an array for space efficiency.

Maintaining the Heap Property: There is one principal operation for maintaining the heap property. It is called Heapify. (In other books it is sometimes called sifting down.) The idea is that we are given an element of the heap which we suspect may not be in valid heap order, but we assume that both of its subtrees are in heap order. In particular this element may be too small for its position. To fix this we "sift" it down the tree by swapping it with one of its children. Which child? We should take the larger of the two children, to satisfy the heap ordering property. This continues recursively until the element is at least as large as both its children, or until it falls all the way to the leaf level. Here is the pseudocode. It is given the heap in the array A, the index i of the suspected element, and m, the current active size of the heap. The index max is set to the position of the maximum of A[i] and its two children. If max != i then we swap A[i] and A[max] and recurse on position max.
Heapify
Heapify(array A, int i, int m) {                 // sift down A[i] in A[1..m]
    l = Left(i)
    r = Right(i)
    max = i
    if (l <= m and A[l] > A[max]) max = l
    if (r <= m and A[r] > A[max]) max = r
    if (max != i) {
        swap A[i] with A[max]
        Heapify(A, max, m)
    }
}
See Figure 7.2 on page 143 of CLR for an example of how Heapify works (in the case where m = 10). You might try simulating this same algorithm directly on the array, to see how it works at a finer level of detail.

Notice that the recursive implementation of Heapify is not the most efficient possible. We have used recursion because many algorithms on trees are most naturally written recursively, so it is nice to practice this here. It is possible to write Heapify iteratively, replacing the recursive call with a loop; this is left as an exercise.
The HeapSort algorithm will consist of two major parts: first building a heap, and then repeatedly extracting the maximum element from the heap. We will see how to use Heapify for both tasks.

How long does Heapify take to run? Observe that we perform a constant amount of work at each level of the tree until we make a call to Heapify at the next lower level of the tree. Since there are Θ(log n) levels altogether in the tree, the total time for Heapify is O(log n). (It is not Θ(log n), since, for example, a call on a node whose subtree is already in heap order terminates after a constant amount of work.)
Building a Heap: We can use Heapify to build a heap as follows. First we start with a heap in which the elements are not in heap order; they are just in the same order that they were given to us in the array A. We build the heap by starting at the leaf level and invoking Heapify on each node, working upward. (Note: we cannot start at the top of the tree, because the precondition for Heapify is that both subtrees of the node are already in heap order.) Actually, we can be a bit more efficient. Since each leaf is trivially in heap order, we may as well skip the leaves and start with the last nonleaf node, which is in position floor(n/2). (Can you see why?)

Here is the code. Since we work with the entire array, the parameter m for Heapify, which indicates the current heap size, will be equal to n, the size of array A, in all the calls.

BuildHeap
BuildHeap(int n, array A[1..n]) {      // build heap from A[1..n]
    for i = n/2 downto 1 {
        Heapify(A, i, n)
    }
}
An example of BuildHeap is shown in Figure 7.3 on page 146 of CLR. Since each call to Heapify takes O(log n) time, and we make roughly n/2 such calls, the total running time is O((n/2) log n) = O(n log n). Next time we will show that this analysis is not tight, and in fact BuildHeap runs in Θ(n) time.

HeapSort: We can now give the HeapSort algorithm. The idea is that we need to repeatedly extract the maximum item from the heap. As we mentioned earlier, the maximum item is at the root. But once we remove it we are left with a hole in the tree. To fix this we replace it with the last leaf in the tree (the element in position A[m]). But now the heap order will very likely be destroyed, so we apply Heapify to the root to fix everything back up.
HeapSort
HeapSort(int n, array A[1..n]) {       // sort A[1..n]
    BuildHeap(n, A)                    // build the heap
    m = n                              // initially heap contains all elements
    while (m >= 2) {
        swap A[1] with A[m]            // extract the m-th largest
        m = m - 1                      // unlink A[m] from heap
        Heapify(A, 1, m)               // fix things up
    }
}
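The same three pieces fit together naturally in Python. The following sketch (function names mine) mirrors the pseudocode above, using 0-based indices and an explicit heap size m.

def heapify(A, i, m):
    # Sift A[i] down within the heap A[0..m-1] (0-based indices).
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < m and A[l] > A[largest]:
        largest = l
    if r < m and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, m)

def build_heap(A):
    # Heapify every internal node, bottom-up.
    n = len(A)
    for i in range(n // 2 - 1, -1, -1):
        heapify(A, i, n)

def heap_sort(A):
    # Sort A in place: build a heap, then repeatedly move the max to the end.
    build_heap(A)
    for m in range(len(A) - 1, 0, -1):
        A[0], A[m] = A[m], A[0]        # extract the current maximum
        heapify(A, 0, m)               # restore heap order on A[0..m-1]

For example, heap_sort applied to [16, 4, 10, 14, 7, 9, 3, 2, 8, 1] leaves the list sorted in increasing order.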

An example of HeapSort is shown in Figure 7.4 of CLR. In total we make n - 1 calls to Heapify, each of which takes O(log n) time, so the extraction phase takes O(n log n) time.


Lecture 14: HeapSort Analysis and Partitioning
(Thursday, Mar 12, 1998)
Read: Chapt 7 and 8 in CLR
...

HeapSort Analysis: Last time we presented HeapSort
...
One clever aspect of the data structure is that it resides inside the
array to be sorted
...

Based on this we can see that (1) that it takes O(n log n) time to build a heap, because we need
to apply Heapify roughly n/2 times (to each of the internal nodes), and (2) that it takes O(n log n)
time to extract each of the maximum elements, since we need to extract roughly n elements and each
extraction involves a constant amount of work and one Heapify
...

Is this tight? That is, is the running time Θ(n log n)? The answer is yes
...
However, it turns out that the first part of the analysis is not tight
...
Although in the wider context of the HeapSort
algorithm this is not significant (because the running time is dominated by the Θ(n log n) extraction
phase)
...
For example, it
is common to extract some unknown number of the smallest elements until some criterion (depending
on the particular application) is met
...

BuildHeap Analysis: Let us consider the running time of BuildHeap more carefully. As usual, it will make our lives simpler if we make some assumptions about n. In this case the most convenient assumption is that n is of the form n = 2^(h+1) - 1, where h is the height of the tree, so that the tree is complete. This assumption will save us from worrying about floors and ceilings.

With this assumption, level j of the tree has 2^j nodes, and all the leaves reside on level h. When Heapify is called on a node, in the worst case the element might sift down all the way to the leaf level; the number of levels it can sift down is the distance from the node's level to level h.

At the bottommost level there are 2^h nodes, but we do not call Heapify on any of these, so the work is 0. At the next-to-bottommost level there are 2^(h-1) nodes, and each might sift down 1 level. At the 3rd level from the bottom there are 2^(h-2) nodes, and each might sift down 2 levels. In general, at the j-th level from the bottom there are 2^(h-j) nodes, and each might sift down j levels. Counting a unit of work for each level of sifting, the total work is

    T(n) = sum_{j=0}^{h} j * 2^(h-j) = sum_{j=0}^{h} 2^h * (j / 2^j).

If we factor out the 2^h term, we have

    T(n) = 2^h * sum_{j=0}^{h} j / 2^j.

This is a sum that we have not seen before. We could try to approximate it by an integral, which would involve integration by parts, but it turns out that there is a very cute solution to this particular sum. First, write down the infinite general geometric series, for any constant x with |x| < 1:

    sum_{j=0}^{infinity} x^j = 1 / (1 - x).

Then take the derivative of both sides with respect to x, and multiply by x, giving:

    sum_{j=0}^{infinity} j * x^(j-1) = 1 / (1 - x)^2,        sum_{j=0}^{infinity} j * x^j = x / (1 - x)^2,

and if we plug in x = 1/2, then voila! we have the desired formula:

    sum_{j=0}^{infinity} j / 2^j = (1/2) / (1 - 1/2)^2 = (1/2) / (1/4) = 2.

Using this we have

    T(n) = 2^h * sum_{j=0}^{h} j / 2^j <= 2^h * sum_{j=0}^{infinity} j / 2^j <= 2^h * 2 = 2^(h+1),

which is O(n) (recall that n = 2^(h+1) - 1). Clearly the algorithm takes at least Omega(n) time (since it must access every element of the array at least once), so the total running time for BuildHeap is Θ(n).
This is the second time we have seen a relatively complex structured algorithm, with doubly nested loops, come out with a running time of Θ(n). (The other example was the sieve-based selection algorithm.) Actually, if you think deeply about this, there is a sense in which a parallel version of BuildHeap can be viewed as operating like a sieve, but maybe this is getting too philosophical.

The reason the analysis works out is that the vast majority of the nodes lie at the lowest levels of the tree. In a complete binary tree, about n/2 of the nodes are at the bottommost level, about 0.75n are in the bottom two levels, and about 0.875n are in the bottom three levels. Thus the lesson to be learned is that when designing algorithms that operate on trees, it is important to be most efficient on the bottommost levels of the tree (as BuildHeap is), since that is where most of the weight of the tree resides.
QuickSort: Our next sorting algorithm is QuickSort. QuickSort is interesting in a number of respects. First off, (as we will present it) it is a randomized algorithm, which means that it makes use of a random number generator. We will show that in the worst case its running time is O(n^2), but its expected-case running time is O(n log n); moreover, the constant factors hidden in the O-notation are quite small. In addition, QuickSort has better locality-of-reference behavior than either MergeSort or HeapSort, and thus it tends to run fastest of all three algorithms in practice. This is how it got its name. QuickSort (and its variants) are considered the methods of choice for most standard library sorting algorithms.

Next time we will discuss QuickSort itself. Today we will discuss one aspect of QuickSort, namely the partitioning algorithm. This is the same partitioning algorithm which we discussed when we talked about the selection (median) problem.
Partitioning: We are given an array A[p..r] and a pivot element x chosen from the array. Recall that the partitioning algorithm is supposed to partition A into three subarrays: A[p..q-1], whose elements are all less than x; A[q], which holds x itself; and A[q+1..r], whose elements are greater than or equal to x. We will assume that x is the first element of the subarray, that is, x = A[p]. If a different rule is used for selecting x, this is easily handled by swapping that element with A[p] before calling this procedure.

We will present a different partitioning algorithm from the one given in the text (in Section 8.1), because this one is a bit easier to analyze. (But I suspect that the one in the text is probably a bit more efficient for actual implementation.)

The algorithm works by maintaining the following invariant condition. The subarray is broken into four segments:

(1) A[p] = x is the pivot value,
(2) A[p+1..q] contains items that are strictly less than x,
(3) A[q+1..s-1] contains items that are greater than or equal to x, and
(4) A[s..r] contains items whose values are currently unknown.

This is illustrated below.
With each step through the algorithm we test the value of A[s] against x. If A[s] >= x, then we simply increment s. Otherwise we increment q, swap A[s] with A[q], and then increment s. Notice that in either case the invariant still holds: in the first case this is obvious, and in the second case the element swapped into position q is less than x while the element swapped out to position s - 1 is at least x. The algorithm ends when the unknown segment is empty, that is, when s has advanced past r, meaning that all of the elements have been processed.
Figure 14: Partitioning invariant. (Initial configuration: the pivot x in A[p] followed by unknown items. Intermediate configuration: x, then the items < x ending at position q, then the items >= x ending at position s - 1, then the unknown items. Final configuration: x swapped into position q, with the smaller items to its left and the larger items to its right.)
To finish things off we swap A[p] (the pivot) with A[q], and return the value of q. Here is the pseudocode.

Partition
Partition(array A, int p, int r) {     // partition A[p..r]
    x = A[p]                           // pivot item in A[p]
    q = p
    for s = p+1 to r do {
        if (A[s] < x) {
            q = q + 1
            swap A[q] with A[s]
        }
    }
    swap A[p] with A[q]                // put the pivot into final position
    return q                           // return location of pivot
}
It is worth working through an example by hand to see how the invariant is maintained.

Lecture 15: QuickSort
(Tuesday, Mar 17, 1998)
Revised: fixed a bug in the analysis. My presentation and analysis are somewhat different than the text's.
Today we will study QuickSort, one of the fastest sorting algorithms known for general comparison-based sorting.

We will present QuickSort as a randomized algorithm, that is, an algorithm which makes random choices. There are two common types of randomized algorithms:

Monte Carlo algorithms: These algorithms may produce an incorrect result, but the probability of this happening can be made arbitrarily small by the user. Usually the lower you make this probability, the longer the algorithm takes to run.

Las Vegas algorithms: These algorithms always produce the correct result, but the running time is a random variable; here the expected running time, over all random choices, is the quantity of interest.

The most well known Monte Carlo algorithm is one for determining whether a number is prime, an important problem in cryptography. The QuickSort algorithm that we will discuss today is an example of a Las Vegas algorithm.

QuickSort Overview: QuickSort is also based on the divide-and-conquer design paradigm, but unlike MergeSort, where most of the work happens after the recursive calls (in merging), in QuickSort the work happens before the recursive calls (in partitioning). Here is an overview of QuickSort. Let A[p..r] be the (sub)array to be sorted; the initial call is to A[1..n].

Basis: If the list contains 0 or 1 elements, then return.

Select pivot: Pick a random element x of the subarray to serve as the pivot.

Partition: Partition the array into three subarrays: the elements less than x, the pivot x itself (in position q), and the elements greater than or equal to x.

Recurse: Recursively sort A[p..q-1] and A[q+1..r].

The pseudocode for QuickSort is given below. The Partition routine was discussed last time. Since we want a random pivot, we pick a random index i from p to r, and then swap A[i] with A[p], so that the pivot sits where Partition expects it.

QuickSort
QuickSort(array A, int p, int r) {        // sort A[p..r]
    if (r <= p) return                    // 0 or 1 items, return
    i = a random index from [p..r]        // pick a random element
    swap A[i] with A[p]                   // swap it with the first element
    q = Partition(A, p, r)                // partition about the pivot A[p]
    QuickSort(A, p, q-1)                  // sort A[p..q-1]
    QuickSort(A, q+1, r)                  // sort A[q+1..r]
}
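A runnable Python rendering of this scheme (my own sketch, combining the random pivot selection with the partitioning loop from the previous lecture) looks as follows.

import random

def quicksort(A, p, r):
    # Randomized QuickSort of A[p..r] in place.
    if r <= p:
        return                               # 0 or 1 items: already sorted
    i = random.randint(p, r)                 # choose a random pivot index
    A[p], A[i] = A[i], A[p]                  # move the pivot to the front
    x = A[p]
    q = p
    for s in range(p + 1, r + 1):            # partition around x = A[p]
        if A[s] < x:
            q += 1
            A[q], A[s] = A[s], A[q]
    A[p], A[q] = A[q], A[p]                  # pivot lands in its final position q
    quicksort(A, p, q - 1)                   # sort the smaller elements
    quicksort(A, q + 1, r)                   # sort the larger elements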
However its analysis is not so obvious. It turns out that the running time depends heavily on how well the pivot partitions the array. In particular, if the rank of the pivot (recall that this means its position in the final sorted list) is very large or very small, then the partition will be unbalanced and the running time will suffer. However, if the rank of the pivot is anywhere near the middle portion of the array, then the split will be reasonably well balanced, and the overall running time will be good. Since the pivot is random, both situations can occur. We will see that the expected running time is O(n log n), while the worst-case running time is Θ(n^2).

Worst-case Analysis: Since this is a recursive program, it is natural to use a recurrence to describe its running time, but unlike MergeSort, where we had control over the sizes of the recursive calls, here we do not: it depends on how the pivot is chosen. Suppose that we are sorting an array of size n, A[1..n], and further suppose that the pivot that we select is of rank q, for some q in the range 1 to n. Partitioning and the other overhead take Θ(n) time, and we make two recursive calls. The first is to the subarray A[1..q-1], which has q - 1 elements, and the other is to the subarray A[q+1..n], which has r - (q + 1) + 1 = n - q elements. Treating the constant factors as 1, we get

    T(n) = T(q - 1) + T(n - q) + n.

This depends on the value of q. To get the worst case, we maximize over all possible values of q. As a basis we have T(0) = T(1) = 1, giving

    T(n) = 1                                                   if n <= 1,
    T(n) = max_{1 <= q <= n} ( T(q - 1) + T(n - q) + n )        otherwise.

Recurrences that have max's and min's embedded in them are very messy to solve. The key is determining which value of q gives the maximum. (A rule of thumb of algorithm analysis is that the worst case tends to happen either at the extremes or in the middle, so try those first.) In this case, the worst case happens at either of the extremes (but see the book for a more careful analysis based on the second derivative). Taking q = 1 and expanding the recurrence gives

    T(n) <= T(n - 1) + n
         = T(n - 2) + (n - 1) + n
         = T(n - 3) + (n - 2) + (n - 1) + n
         ...
         <= 1 + 2 + ... + (n - 1) + n + (n + 1)
         = sum_{i=1}^{n+1} i = (n + 1)(n + 2)/2, which is in O(n^2).

Average-case Analysis: Next we show that in the average case QuickSort runs in Θ(n log n) time. Normally average-case analysis depends on some assumption about the distribution of inputs. However, in this case, the analysis does not depend on the input distribution at all—it only depends on the random choices that the algorithm makes. This is an advantage of randomized algorithms: the analysis applies to every input. Here the average is computed over all possible random choices that the algorithm might make for the pivot index in the second step of the QuickSort procedure above.

Let T(n) denote the average running time of QuickSort on a list of size n. It will simplify the analysis to assume that all of the elements are distinct. Since the pivot index is chosen at random, the pivot is equally likely to be of any rank q, 1 <= q <= n, each with probability 1/n. So we can modify the above recurrence to compute an average rather than a max, giving:

    T(n) = 1                                                             if n <= 1,
    T(n) = (1/n) * sum_{q=1}^{n} ( T(q - 1) + T(n - q) + n )              otherwise.

This is a complicated recurrence, and the Master Theorem does not apply. Expansion is possible, but rather tricky. Instead we will use constructive induction. We know that we want a Θ(n log n) running time, so we will guess a bound of this form and derive the constant.

Theorem: There exists a constant c such that T(n) <= c n ln n, for all n >= 2.

(Notice that we have replaced lg n by the natural log, ln n. This has been done to make the proof easier, as we shall see; lg n and ln n differ only by a constant factor, so the Θ(n log n) bound is unaffected.)
Proof: (by constructive induction on n.) For the basis case n = 2 we have

    T(2) = (1/2) * sum_{q=1}^{2} ( T(q - 1) + T(2 - q) + 2 )
         = (1/2) * ( (T(0) + T(1) + 2) + (T(1) + T(0) + 2) ) = 8/2 = 4.

We want this to be at most c * 2 ln 2, which holds provided that c >= 4/(2 ln 2), which is approximately 2.885.

For the induction step, we assume that T(q) <= c q ln q for all 2 <= q < n, and we want to prove it is true for T(n). By the recurrence,

    T(n) = (1/n) * sum_{q=1}^{n} ( T(q - 1) + T(n - q) ) + n.

Observe that if we split the sum into two sums, they both add the same values T(0) + T(1) + ... + T(n - 1); one just counts up and the other counts down. Thus we can replace the pair of sums with 2 * sum_{q=0}^{n-1} T(q). If we make this substitution and apply the induction hypothesis to the remaining sum (which we can, because q < n), we have

    T(n) = (2/n) * sum_{q=0}^{n-1} T(q) + n
         = (2/n) * ( T(0) + T(1) + sum_{q=2}^{n-1} T(q) ) + n
         <= (2/n) * ( 1 + 1 + sum_{q=2}^{n-1} (c q ln q) ) + n
         = (2c/n) * sum_{q=2}^{n-1} (q ln q) + n + 4/n.

Later we will show that

    S(n) = sum_{q=2}^{n-1} q ln q <= (n^2/2) ln n - n^2/4.

Assuming this for now, we have

    T(n) <= (2c/n) * ( (n^2/2) ln n - n^2/4 ) + n + 4/n = c n ln n - cn/2 + n + 4/n.

To finish the proof, we want all of this to be at most c n ln n. Cancelling the common c n ln n term, we need cn/2 >= n + 4/n. After some simple manipulations we see that this is equivalent to

    c >= 2 + 8/n^2.

Since the induction step is only applied for n >= 3, it suffices to have c >= 2 + 8/9. From the basis case we have c >= 2.885. Combining the two constraints, choosing any c >= 3 (say c = 3) completes the proof.

The Leftover Sum: The only missing element of the proof is the bound on the sum

    S(n) = sum_{q=2}^{n-1} q ln q.

To bound this, we apply the integration rule for summations: for any monotonically increasing function f(x),

    sum_{i=a}^{b-1} f(i) <= integral from a to b of f(x) dx.

In our case f(x) = x ln x, a = 2 and b = n. If you are a calculus macho man, then you can integrate this by parts, and if you are a calculus wimp (like me) then you can look it up in a book of integrals:

    integral from 2 to n of x ln x dx = [ (x^2/2) ln x - x^2/4 ] evaluated from x = 2 to n
                                      = ( (n^2/2) ln n - n^2/4 ) - (2 ln 2 - 1)
                                      <= (n^2/2) ln n - n^2/4.

This is exactly the bound on S(n) used above, completing the proof.

Summary: So even though the worst-case running time of QuickSort is Θ(n^2), the average-case running time is Θ(n log n). Although we did not show it, it turns out that this is not just the average: for large values of n, the running time is Θ(n log n) with high probability. In order for the algorithm to take much longer, the random pivot choices would have to be consistently poor; poor choices are rare, and continuously making poor choices is very rare.

You might ask why we do not simply use the deterministic selection algorithm from earlier to pick the exact median as the pivot, and thereby guarantee Θ(n log n) worst-case time. The answer is that this would work, but the resulting algorithm would be so slow practically that no one would ever use it.

QuickSort also has practical advantages. It sorts essentially in place; the only extra storage is the recursion stack, and in MergeSort and in the expected case for QuickSort, the size of the stack is O(log n), so this is not really a problem. It also has good locality of reference. Unlike HeapSort, which can make large jumps around in the array, the main work in QuickSort (in partitioning) spends most of its time accessing elements that are close to one another, which interacts well with caches and virtual memory. In MergeSort we are always comparing two array elements against each other, and these (e.g. elements from the two halves being merged) may be far apart in memory; MergeSort also requires an extra array for merging.


Lecture 16: Lower Bounds for Sorting
(Thursday, Mar 19, 1998)
Read: Chapt
...

Review of Sorting: So far we have seen a number of algorithms for sorting a list of numbers in ascending
order
...
A sorting algorithm is stable if duplicate elements remain in the same relative
position after sorting
...
These are all simple Θ(n2 )
in-place sorting algorithms
...

Mergesort: Mergesort is a stable Θ(n log n) sorting algorithm
...

Quicksort: Widely regarded as the fastest of the fast algorithms
...
The probability that the algorithm takes asymptotically longer (assuming that the pivot is chosen randomly) is extremely small for large n
...

Heapsort: Heapsort is based on a nice data structure, called a heap, which is a fast priority queue
...
(It is also easy to set up a heap for extracting the smallest item
...
It is an
in-place algorithm, but it is not stable
...
Such an algorithm is called a
comparison-based sorting algorithm, and includes all of the algorithms given above
...
This does not preclude the possibility of a sorting algorithm whose
actions are determined by other types of operations, for example, consulting the individual bits of
numbers, performing arithmetic operations, indexing into an array based on arithmetic operations on
keys
...
, an must
make at least Ω(n log n) comparisons in the worst-case
...

It is easy to show that a problem can be solved fast (just give an algorithm)
...
In fact, it seems surprising that you could even hope to prove such a thing
...

Decision Tree Argument: In order to prove lower bounds, we need an abstract way of modeling “any possible” comparison-based sorting algorithm, we model such algorithms in terms of an abstract model
called a decision tree
...
Let a1 , a2 ,
...
Given two elements, ai and
aj , their relative order can only be determined by the results of comparisons like ai < aj , ai ≤ aj ,
ai = aj , ai ≥ aj , and ai > aj
...
Each
node of the decision tree represents a comparison made in the algorithm (e
...
, a4 : a7 ), and the two
branches represent the possible results, for example, the left subtree consists of the remaining comparisons made under the assumption that a4 ≤ a7 and the right subtree for a4 > a7
...
)
Observe that once we know the value of n, then the “action” of the sorting algorithm is completely
determined by the results of its comparisons
...
But the bottom-line is that at the end of the algorithm, the keys will be permuted in the final array
53

Lecture Notes

CMSC 251

in some way
...

To make this more concrete, let us assume that n = 3, and let’s build a decision tree for SelectionSort
...
The first finds the smallest element of the entire list,
and swaps it with the first element
...
Here is the decision tree (in outline form)
...
The possible results are:
a1 ≤ a2 : Then a1 is the current minimum
...
Then we pass to phase 2 and compare a2 with a3
...

a2 > a3 : These two are swapped and the final output is a1 , a3 , a2
...
The we pass to phase 2 and compare a2 with a1 (which is now in the third position of
the array) yielding either:
a2 ≤ a1 : Final output is a3 , a2 , a1
...

a1 > a2 : Then a2 is the current minimum
...
We swap a2 with a1 , and then pass to
phase 2, and compare the remaining items a1 and a3
...

a1 > a3 : These two are swapped and the final output is a2 , a3 , a1
...
We pass to phase 2 and compare a2 with a1 (which is now in the third position of the
array) yielding either:
a2 ≤ a1 : Final output is a3 , a2 , a1
...

The final decision tree is shown below
...
For example, in order to reach the fourth leaf from the left it must be that a1 ≤ a2 and a1 > a2 , which cannot
both be true
...
) As you can see, converting a complex
sorting algorithm like HeapSort into a decision tree for a large value of n will be very tedious and
complex, but I hope you are convinced by this exercise that it can be done in a simple mechanical way
...



Using Decision Trees for Analyzing Sorting: Consider any comparison-based sorting algorithm, and let T(n) denote the maximum number of comparisons it makes on any input of size n. Notice that the running time of the algorithm must be at least as large as T(n), since we are not counting data movement or other computations at all. We can build a decision tree for the algorithm as described above. Observe that the height of the decision tree is exactly equal to T(n), because any path from the root to a leaf corresponds to a sequence of comparisons made by the algorithm.

A binary tree of height T(n) has at most 2^T(n) leaves. This means that this sorting algorithm can distinguish between at most 2^T(n) different final actions. Let A(n) denote the number of different final actions of the algorithm. Each action can be thought of as a specific way of permuting the original input to get the sorted output. There are n! permutations of n keys, and for each different permutation the algorithm must "unscramble" the numbers in an essentially different way, that is, it must take a different action, implying that A(n) >= n!. (This does not preclude some actions being unreachable; it only says that at least n! distinct actions are needed.)

Since A(n) <= 2^T(n), we have 2^T(n) >= n!, implying that

    T(n) >= lg(n!).

By Stirling's approximation, n! >= (n/e)^n, so lg(n!) >= n lg n - n lg e, which is in Omega(n log n). Thus we have the following theorem.

Theorem: Any comparison-based sorting algorithm makes Omega(n log n) comparisons, and hence takes Omega(n log n) time, in the worst case.

This can be generalized to show that the average-case time to sort is also Omega(n log n) (by arguing about the average depth of a leaf in a binary tree with at least n! leaves).


Lecture 17: Linear Time Sorting
(Tuesday, Mar 31, 1998)
Read: Chapt
...

Linear Time Sorting: Last time we presented a proof that it is not possible to sort faster than Ω(n log n)
time assuming that the algorithm is based on making 2-way comparisons
...

This lower bound implies that if we hope to sort numbers faster than in O(n log n) time, we cannot
do it by making comparisons alone
...
They answer is yes, but only under very restrictive circumstances
...
g
...
We present three algorithms based on the theme of speeding up
sorting in special cases, by not making comparisons
...
The algorithm
sorts in Θ(n + k) time
...

The basic idea is to determine, for each element in the input array, its rank in the final sorted array
...

Notice that once you know the rank of every element, you sort by simply copying each element to the
appropriate location of the final sorted output array
...

As usual A[1
...
Recall that although we usually think of A as just being a list of
numbers, it is actually a list of records, and the numeric value is the key on which the list is being
sorted
...
key
...
n] : Holds the initial input
...
A[j]
...

B[1
...

R[1
...
R[x] is the rank of x in A, where x ∈ [1
...

The algorithm is remarkably simple, but deceptively clever. It operates by first constructing R and then using R to move each element of A directly into its final position in B. We do this in two steps. First we set R[x] to be the number of elements of A whose key is equal to x. We can do this by initializing R to zero, and then, for each j from 1 to n, incrementing R[A[j].key] by 1. Thus, if A[j].key = 5, then the 5th entry of R is incremented, indicating that we have seen one more 5. Second, to determine the number of elements that are less than or equal to x, we replace R[x] with the sum of the elements in the subarray R[1..x]. This is done by just keeping a running total of the elements of R. Now R[x] is the rank of x, that is, the number of keys of A that are less than or equal to x.

Now we move the elements of A into their final sorted positions in B. If x = A[j].key, then the final position of A[j] should be position R[x] of B, so we set B[R[x]] = A[j]. There is a subtlety here, however: there may be duplicate keys, and we cannot put two records into the same position of B. To handle this, we decrement R[x] after copying, so that the next record with the same key value goes into the position just before it. Here is the complete pseudocode.

Counting Sort
CountingSort(int n, int k, array A, array B) {    // sort A[1..n] to B[1..n]
    for x = 1 to k do R[x] = 0                    // initialize R
    for j = 1 to n do R[A[j].key]++               // R[x] = #(A[j].key == x)
    for x = 2 to k do R[x] += R[x-1]              // R[x] = rank of x
    for j = n downto 1 do {                       // move each element of A to B
        x = A[j].key                              // x = key value
        B[R[x]] = A[j]                            // R[x] is where to put it
        R[x]--                                    // leave space for duplicates
    }
}

The first and third loops take Θ(k) time and the second and fourth take Θ(n) time, so the total running time is Θ(n + k).
...
You should trace through a few examples, to convince
yourself how it works
...

Obviously this not an in-place sorting algorithm (we need two additional arrays)
...
I’ll leave it as an exercise to prove this
...
It would not be stable if the loop were running the other way
...
RadixSort: The main shortcoming of CountingSort is that it is only really useful when the range of keys, k, is small. If the integers are in the range from 1 to 1 million, we may not want to allocate an array of a million counters. RadixSort provides a way around this by sorting the numbers one "digit" at a time. (Actually, what constitutes a "digit" is up to the implementor; there is a tradeoff between the space and the time.)

Let's think of our list as being composed of n numbers, each having d decimal digits (or digits in any base). To sort these numbers we can simply sort repeatedly, one digit at a time, starting at the lowest order digit and finishing with the highest order digit, using a stable sorting algorithm (such as CountingSort) for each pass. As usual, let A[1..n] be the array to sort, and let d denote the number of digits of each element of A. We will not discuss how A is broken into digits; this might be done through bit manipulations (shifting and masking off bits) or by accessing elements byte-by-byte, etc.

RadixSort
RadixSort(int n, int d, array A) {     // sort A[1..n] with d digits
    for i = 1 to d do {
        Sort A (stably) with respect to the i-th lowest order digit;
    }
}
Here is an example
...
This is usually a constant, but the algorithm’s running time
will be Θ(dn) as long as k ∈ O(n)
...
It can be any number in the
range from 1 to cn for some constant c, and we will still have an Θ(n) time algorithm
...
In
general, this can be used to sort numbers in the range from 1 to nd in Θ(dn) time
...
But you should be
careful about attempting to make generalizations when the sizes of the numbers are not bounded
...
Then it follows that you need
at least B = lg n bits to store these values
...
Thus, if you were
to apply radix sort in this situation, the running time would be Θ(dn) = Θ(n log n)
...
Furthermore, the locality of reference behavior of Counting Sort (and
hence of RadixSort) is not as good as QuickSort
...
This is at a level of similarity, where it would probably be best to implement
both algorithms on your particular machine to determine which is really faster
...
Recall that our goal is
to provide you with the necessary tools for designing and analyzing efficient algorithms
...
However it is always a good idea to see the text to get a better insight into some of the topics
we have covered
...

• Review Chapts 1: InsertionSort and MergeSort
...
Look at Section 7
...

• Chapt 8: QuickSort
...
Section 8
...

• Chapt 9 (skip 9
...


58

Lecture Notes

CMSC 251

• Chapt
...
1): Selection
...
It is similar
to the QuickSort analysis
...

Cheat Sheets: The exam is closed-book, closed-notes, but you are allowed two sheets of notes (front and
back)
...
Also add Stirling’s approximation (page 35), and the
integration formula for summations (page 50)
...
But it is a good idea to write down a brief description of each algorithm
...

Keep track of algorithm running times and their limitations
...
You
can sort short integers in Θ(n) time through CountingSort, but you cannot use this algorithm to sort
arbitrary numbers, such as reals
...

Slow Algorithms: BubbleSort, InsertionSort, SelectionSort are all simple Θ(n2 ) algorithm
...
They are all in-place sorting algorithms (they use no additional array storage), and BubbleSort and InsertionSort are stable sorting algorithms (if implemented carefully)
...
It is stable, but requires additional array
storage for merging, and so it is not an in-place algorithm
...
Heaps are a
nice way of implementing a priority queue data structure, allowing insertions, and extracting the
maximum in Θ(log n) time, where n is the number of active elements in the heap
...
HeapSort is not stable, but it is an in-place sorting algorithm
...
If chosen randomly, then the expected
time is Θ(n log n), but the worst-case is Θ(n2 )
...
This algorithm is not stable, but it is considered an in-place
sorting algorithm even though it does require some additional array storage
...

Lower bounds: Assuming comparisons are used, you cannot sort faster than Ω(n log n) time
...

Counting sort: If you are sorting n small integers (in the range of 1 to k) then this algorithm will sort
them in Θ(n + k) time
...
In this way it circumvents the lower bound argument
...

What sort of questions might there be? Some will ask you to about the properties of these sorting
algorithms, or asking which algorithm would be most appropriate to use in a certain circumstance
...
Finally, there may be problems asking you to devise algorithms to
solve some sort of novel sorting problem
...
No lecture
...
4, 5
...

Graph Algorithms: For the next few weeks, we will be discussing a number of various topics
...
Intuitively, a graph is a collection of vertices or nodes, connected by a collection
of edges
...
Basically, any time you have a set of objects, and there is some
“connection” or “relationship” or “interaction” between pairs of objects, a graph is a good way to model
this
...
The list of application
is almost too long to even consider enumerating it
...
Furthermore, many of these problems form the
basic building blocks from which more complex algorithms are then built
...
e
...

For example, the figure below (left) shows a directed graph
...
Some definitions of graphs disallow this
...
This shows the graph G = (V, E) where V = {1, 2, 3} and
E = {(1, 1), (1, 2), (2, 3), (3, 2), (1, 3)}
...

In an undirected graph (or just graph) G = (V, E) the edge set consists of unordered pairs of distinct
vertices (thus self-loops are not allowed)
...

We say that vertex w is adjacent to vertex v if there is an edge from v to w
...
In a directed graph we
will often say that an edge either leaves or enters a vertex
...
The
meaning of the weight is dependent on the application, e
...
distance between vertices or flow capacity
through the edge
...

Certain notions (such as path) are defined for both, but other notions (such as connectivity) are only
defined for one
...
In an undirected graph we just talk about the
degree of a vertex, as the number of edges which are incident on this vertex
...

In a directed graph, each edge contributes 1 to the in-degree of one vertex and 1 to the out-degree of another vertex, and thus we have

Observation: For a digraph G = (V, E),
    sum over v in V of in-deg(v) = sum over v in V of out-deg(v) = |E|.

(|E| denotes the cardinality of the set E, i.e. the number of edges, which we also write as e.)

In an undirected graph each edge contributes 1 to the degree of each of its two endpoints, and thus we have

Observation: For an undirected graph G = (V, E),
    sum over v in V of deg(v) = 2|E|.
4, 5
...

Graphs: Last time we introduced the notion of a graph (undirected) and a digraph (directed)
...
Today we continue this discussion
...
For graphs edges
are undirected and for graphs they are directed
...
A path in a directed graph is a
sequence of vertices v0 , v1 ,
...
, k
...
We say that w is reachable from u if there is a path from u to w
...
A path is simple if all
vertices (except possibly the first and last) are distinct
...
A cycle is simple if,
in addition, v1 ,
...
(Note: A self-loop counts as a simple cycle of length 1)
...
This is to rule out the degenerate cycle
u, w, u , which simply jumps back and forth along a single edge
...
A Hamiltonian cycle is a cycle that visits every vertex in a
graph exactly once
...
(By the way, this is pronounced “Oiler-ian” and not “Yooler-ian”
...


61

Lecture Notes

CMSC 251

One of the early problems which motivated interest in graph theory was the Königsberg Bridge Problem. In this city a river, containing two islands, was spanned by seven bridges connecting the various land masses. The question is whether it is possible to cross all 7 bridges without visiting any bridge twice.

Figure 19: The Bridges of Königsberg. (The four land masses, numbered 1-4, and the seven bridges, together with the corresponding multigraph in which each land mass is a vertex and each bridge is an edge.)
In this graph, all 4 of the vertices have odd degree
...
A graph is
connected if every vertex can reach every other vertex
...
The term “free” is intended to emphasize the fact that the tree has no root, in
contrast to a rooted tree, as is usually seen in data structures
...
Furthermore, there is a unique path between any two vertices in a
free tree
...

The “reachability” relation is an equivalence relation on vertices, that is, it is reflexive (a vertex is
reachable from itself), symmetric (if u is reachable from v, then v is reachable from u), and transitive
(if u is reachable from v and v is reachable from w, then u is reachable from w)
...
These are called
connected components
...
An acyclic graph (which is not necessarily
connected) consists of many free trees, and is called (what else?) a forest
...
(There
is another type of connectivity in digraphs called weak connectivity, but we will not consider it
...
These are called the strongly connected components of the digraph
...
Note that it is different
from a directed tree
...

Isomorphic graphs are essentially “equal” except that their vertices have been given different names
...
For example,
consider the graphs in the figure
...
Observe that in all three cases all the vertices have degree 3, so that
is not much of a help
...
This implies that (a) cannot be isomorphic to either (b) or (c)
...

out that (b) and (c) are isomorphic
...
The notation
(u → v) means that vertex u from graph (b) is mapped to vertex v in graph (c)
...

{(1 → 1), (2 → 2), (3 → 3), (4 → 7), (5 → 8), (6 → 5), (7 → 10), (8 → 4), (9 → 6), (10 → 9)}
...
Given a subset V ⊆ V , the subgraph induced by V is the graph G = (V , E ) where
E = {(u, v) ∈ E | u, v ∈ V }
...

An undirected graph that has the maximum possible number of edges is called a complete graph
...
For example, K5 is the complete graph on 5
vertices
...
In other words, all the vertices of V are adjacent to one another
...
For example, in the
figure below (a), the subset {1, 2, 4, 6} is a clique, and {3, 4, 7, 8} is an independent set
...

A bipartite graph is an undirected graph in which the vertices can be partitioned into two sets V1 and V2
such that all the edges go between a vertex in V1 and V2 (never within the same group)
...

63

Lecture Notes

CMSC 251

¯
The complement of a graph G = (V, E), often denoted G is a graph on the same vertex set, but in
which the edge set has been complemented
...
This may also be
called the transpose and denoted GT
...
Planar
graphs are important special cases of graphs, since they arise in applications of geographic information
systems (as subdivisions of region into smaller subregions), circuits (where wires cannot cross), solid
modeling (for modeling complex surfaces as collections of small triangles)
...
For example, the figure below shows
two essentially different drawings of the same graph
...

The neighbors of a vertex are the vertices that it is adjacent to
...
For example, in the embedding
on the left, the neighbors of vertex 1 in counterclockwise order are 2, 3, 4, 5 , but on the right the order
is 2, 5, 4, 3
...


Figure 22: Planar embeddings. (Two different planar drawings of the same graph on vertices 1-5; in the left embedding the neighbors of vertex 1 appear in counterclockwise order 2, 3, 4, 5, while in the right embedding they appear in order 2, 5, 4, 3.)
...
For example, in the figure on the left, the triangular region bounded by vertices
1, 2, 5 is a face
...

This embedding has 6 faces (including the unbounded face)
...
Is it possible that two different embeddings have different numbers of faces? The answer is
no
...

Euler’s Formula: A connected planar embedding of a graph with V vertices, E edges, and F faces,
satisfies:
V − E + F = 2
...

Size Issues: When referring to graphs and digraphs we will always let n = |V | and e = |E|
...
Beware, the sometimes V is a set, and
sometimes it is a number
...
)
Because the running time of an algorithm will depend on the size of the graph, it is important to know
how n and e relate to one another
...
For an undirected graph e ≤
O(n2 )
...
One interesting question is what the minimum
number of edges that a connected graph must have
...

For example, the important class of planar graphs (graphs which can be drawn on the plane so that no
two edges cross over one another) e = O(n)
...
This is important to keep in mind when designing graph algorithms, because when n is really
large and O(n2 ) running time is often unacceptably large for real-time response
...
1 through 23
...

Representations of Graphs and Digraphs: We will describe two ways of representing graphs and digraphs
...
Let G = (V, E) be a digraph with n = |V | and let e = |E|
...
Adjacency Matrix: An n x n matrix A, where we assume that the vertices are numbered V = {1, 2, ..., n}, defined by

    A[v, w] = 1 if (v, w) is in E,   and   A[v, w] = 0 otherwise.
...
For example if (v, w) ∈ E then
A[v, w] = W (v, w) (the weight on edge (v, w))
...
g
...
(By ∞
we mean (in practice) some number which is larger than any allowable weight
...
)
Adjacency List: An array Adj[1
...
e
...
If the edges have weights then these weights may also be stored in the linked list
elements
...

We can represent undirected graphs using exactly the same representation, but we will store each edge
twice
...
Notice that even though we represent undirected graphs in the same way that we

65

Lecture Notes

CMSC 251

represent digraphs, it is important to remember that these two classes of objects are mathematically
distinct from one another
...
For example, suppose you write an algorithm that operates by
marking edges of a graph
...
When dealing with adjacency
lists, it may not be convenient to walk down the entire linked list, so it is common to include cross links
between corresponding edges
...

An adjacency matrix requires Θ(n2 ) storage and an adjacency list requires Θ(n + e) storage (one entry
for each vertex in Adj and each list has outdeg(v) entries, which when summed is Θ(e)
...

Shortest Paths: To motivate our first algorithm on graphs, consider the following problem
...
The length of a path in a graph
(without edge weights) is the number of edges on the path
...
If there are ties (two shortest paths of the same length) then either path
may be chosen arbitrarily
...
For each vertex v ∈ V , we will store d[v]
which is the distance (length of the shortest path) from s to v
...
We will also store a
predecessor (or parent) pointer π[v], which indicates the first vertex along the shortest path if we walk
from v backwards to s
...

It may not be obvious at first, but these single predecessor pointers are sufficient to reconstruct the
shortest path to any vertex
...
For a path to be a shortest
path, every subpath of the path must be a shortest path
...
)
Using this observation, we know that the last edge on the shortest path from s to v is the edge (u, v),
then the first part of the path must consist of a shortest path from s to u
...

Obviously, there is simple brute-force strategy for computing shortest paths
...

However, since there can be as many as n! simple paths in a graph (consider a complete graph), then
this strategy is clearly impractical
...
Start with the source vertex s
...
Label all of them with this distance
...

neighbors of these neighbors
...
Next consider the unvisited neighbors
of the neighbors of the neighbors, and so on
...
This algorithm can be visualized as simulating a wave propagating outwards from s, visiting
the vertices in bands at ever increasing distances from s
...
Define the distance between a vertex v and s to be the
minimum number of edges on a path from s to v
...
At any given
time there is a “frontier” of vertices that have been discovered, but not yet processed
...

Initially all vertices (except the source) are colored white, meaning that they have not been seen. When a vertex is first discovered it is colored gray (it joins the frontier). When a gray vertex is processed, it becomes black. Here is the pseudocode.

Breadth-First Search
BFS(graph G=(V,E), vertex s) {
    int d[1..size(V)]                    // vertex distances
    int color[1..size(V)]                // vertex colors
    vertex pred[1..size(V)]              // predecessor pointers
    queue Q = empty                      // FIFO queue
    for each u in V {                    // initialization
        color[u] = white
        d[u] = INFINITY
        pred[u] = NULL
    }
    color[s] = gray                      // initialize source s
    d[s] = 0
    enqueue(Q, s)                        // put source in the queue
    while (Q is nonempty) {
        u = dequeue(Q)                   // u is the next vertex to visit
        for each v in Adj[u] {
            if (color[v] == white) {     // if neighbor v undiscovered
                color[v] = gray          // ...mark it discovered
                d[v] = d[u]+1            // ...set its distance
                pred[v] = u              // ...set its predecessor
                enqueue(Q, v)            // ...put it in the queue
            }
        }
        color[u] = black                 // we are done with u
    }
}
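In Python, with the graph given as adjacency lists (a dict mapping each vertex to its neighbor list), the same search can be sketched as follows; the names bfs, adj, d and pred are mine.

from collections import deque

def bfs(adj, s):
    # Breadth-first search from source s; returns distances and predecessors.
    d = {u: None for u in adj}        # None plays the role of INFINITY
    pred = {u: None for u in adj}
    d[s] = 0
    Q = deque([s])                    # FIFO queue holding the gray frontier
    while Q:
        u = Q.popleft()
        for v in adj[u]:
            if d[v] is None:          # v is still undiscovered (white)
                d[v] = d[u] + 1
                pred[v] = u
                Q.append(v)
    return d, pred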

The search makes use of a queue, a first-in first-out list, where elements are removed in the same order
they are inserted
...

We will also maintain arrays color [u] which holds the color of vertex u (either white, gray or black),
pred[u] which points to the predecessor of u (i
...
the vertex who first discovered u, and d[u], the
distance from s to u
...


Figure 26: Breadth-first search example. (Successive snapshots of the distance labels on the vertices s, u, v, w, x, t and of the contents of the queue Q, beginning with Q: s and d[s] = 0 and ending when the queue is empty, with every reachable vertex labeled by its distance, 0 through 3.)
...
If we reverse these
edges we get a rooted unordered tree called a BFS tree for G
...
) These edges of G are called tree edges and the remaining edges of G are called cross edges
...
The reason is that if any cross edge spanned two or
more levels, then when the vertex at the higher level (closer to the root) was being processed, it would
have discovered the other vertex, implying that the other vertex would appear on the very next level of
the tree, a contradiction
...
)
Analysis: The running time analysis of BFS is similar to the running time analysis of many graph traversal algorithms. Let n = |V| and e = |E|. Observe that the initialization portion requires Θ(n) time. The real work is in the traversal loop. Since we never visit a vertex twice, the number of times we go through the while loop is at most n (exactly n assuming each vertex is reachable from the source). The number of iterations of the inner for loop for vertex u is proportional to deg(u) + 1. (The +1 is because even if deg(u) = 0, we need to spend a constant amount of time to set up the loop.) Summing over all vertices, the total running time is

    Θ(n) + sum over u in V of (deg(u) + 1) = Θ(n) + 2e + n = Θ(n + e).
Figure 27: The BFS tree for the example. (Vertices are arranged by level, 0 through 3, with the source s at the root. Tree edges connect each vertex to its predecessor; the remaining edges of G are cross edges, which span at most one level of the tree.)
...


Lecture 23: All-Pairs Shortest Paths
(Tuesday, April 21, 1998)
Read: Chapt 26 (up to Section 26
...

All-Pairs Shortest Paths: Last time we showed how to compute shortest paths starting at a designated
source vertex, and assuming that there are no weights on the edges
...
First, we compute shortest paths not from a single vertex, but
from every vertex in the graph
...

Let G = (V, E) be a directed graph with edge weights
...
Intuitively, this weight denotes the distance of the road from u to v, or
more generally the cost of traveling from u to v
...
Intuitively a negative weight means that you get paid for traveling from u to v
...
, uk , the cost of this path is the sum of the edge weights:
k

cost(π) = W (u0 , u1 ) + W (u1 , u2 ) + · · · W (uk−1 , uk ) =

W (ui−1 , ui )
...
)
The distance between two vertices is the cost of the minimum cost path between them
...
We will present two algorithms for this problem
...
The latter is called the Floyd-Warshall
algorithm
...

For these algorithms, we will assume that the digraph is represented as an adjacency matrix, rather than
the more common adjacency list
...
However, storing all the distance information between
each pair of vertices, will quickly yield a dense digraph (since typically almost every vertex can reach
almost every other vertex)
...

Because both algorithms are matrix-based, we will employ common matrix notation, using i, j and k
to denote vertices rather than u, v, and w as we usually do
...
The edge weights may be positive, zero, or negative, but we assume that
69

Lecture Notes

CMSC 251

there are no cycles whose total weight is negative
...
If the
shortest path ever enters such a cycle, it would never exit
...
Thus, the shortest path would have a weight
of −∞, and would consist of an infinite number of edges
...

Input Format: The input is an n x n matrix W of edge weights, which are based on the edge weights in the digraph. We let w_ij denote the entry in row i and column j of W:

    w_ij = 0            if i = j,
    w_ij = W(i, j)      if i != j and (i, j) is in E,
    w_ij = +infinity    if i != j and (i, j) is not in E.
...
The reason for setting wii = 0 is that intuitively the cost
of getting from any vertex to should be 0, since we have no distance to travel
...
Notice that it cannot
be negative (otherwise we would have a negative cost cycle consisting of this single edge)
...

The output will be an n × n distance matrix D = dij where dij = δ(i, j), the shortest path cost from
vertex i to j
...
To help us do this, we will also
compute an auxiliary matrix pred [i, j]
...
If the shortest path travels directly from i to j without passing through
any other vertices, then pred [i, j] will be set to null
...

Dynamic Programming for Shortest Paths: The algorithm is based on a technique called dynamic programming
...
The technique is related to divide-and-conquer, in the
sense that it breaks problems down into smaller problems that it solves recursively
...
The basic elements that characterize a dynamic programming algorithm are:
Substructure: Decompose the problem into smaller subproblems
...

Bottom-up computation: Combine solutions on small subproblems to solve larger subproblems, and
eventually to arrive at a solution to the complete problem
...

There is one very natural way to do this
...
First we will introduce the natural decomposition, and later present the Floyd-Warshall algorithm
makes use of a different, but more efficient dynamic programming formulation
...
Let us first sketch the natural way to break the problem into subproblems
...
At first the estimates will be crude
...
A
natural way to do this is to restrict the number of edges that are allowed to be in the shortest path
...
Let D(m) denote the matrix whose entries are these values
...
Since we know that no shortest path can use more
than n − 1 edges (for otherwise it would have to repeat a vertex), we know that D(n−1) is the final
distance matrix
...

1
(1)

d1,3 = INF (no path)

4

1

8
2

4

j

(m−1)

dij

(2)

d1,3 = 9 (using: 1,2,3)

2

i

wkj

(3)

d1,3 = 4 (unsing: 1,4,2,3)

9

(m−1)

dik

1

k

3

(a)

(b)

Figure 28: Dynamic Programming Formulation
...
However, observe that D(1) = W , since the edges of the
digraph are just paths of length 1
...

So as our basis case we have
(1)
dij = wij
...

(m)
Consider how to compute the quantity dij
...
There are two cases:
(m−1)

Case 1: If the shortest path uses strictly fewer than m edges, then its cost is just dij


...
The path from
i to k should be shortest (by the principle of optimality) so the length of the resulting path is
(m−1)
dik
+ wij
...
So we minimize over all possible choices
...

This suggests the following rule:
(m−1)

(m)

dij

= min

dij

(m−1)

min1≤k≤n dik

+ wkj


...
In the second case, we consider
all vertices k, and consider the length of the shortest path from i to k, using m − 1 edges, and then the
single edge length cost from k to j
...

We can simplify this formula a bit by observing that since wjj = 0, we have dij
This term occurs in the second case (when k = j)
...
This gives
(m)

dij

= min

1≤k≤n

(m−1)

dik

+ wkj ,

The next question is how shall we implement this rule
...
Here is a possible implementation
...
The array of weights
71

Lecture Notes

CMSC 251

Recursive Shortest Paths
Dist(int m, int i, int j) {
if (m == 1) return W[i,j]
best = INF
for k = 1 to n do
best = min(best, Dist(m-1, i, k) + w[k,j])
return best
}

// single edge case

// apply the update rule

Unfortunately this will be very slow
...
The algorithm makes n calls to itself with the first argument
of m − 1
...
Otherwise, we make n
recursive calls to T (m − 1, n)
...


The total running time is T (n − 1, n)
...
The result will
be O(nn ), a huge value
...
If you unravel the recursion, you will see that this
algorithm is just blindly trying all possible paths from i to j
...

So how do we make this faster? The answer is to use table-lookup
...
Observe that there are only O(n3 ) different possible numbers dij that we have to
compute
...
Then if we want this value
again, rather than recompute it, we will simply look its value up in the table
...
The main procedure ShortestPath(n, W) is given the number of vertices n and the matrix of edge weights W. It computes a sequence of matrices D(1), D(2), ..., D(n-1). For each m, D[m] is a 2-dimensional matrix, so D as a whole is a 3-dimensional array. We initialize D(1) = W, and then each call to ExtendShortestPath() computes D(m) from D(m-1), using the update rule above.

Dynamic Programming Shortest Paths
ShortestPath(int n, int W[1..n, 1..n]) {
    array D[1..n-1][1..n, 1..n]
    copy W to D[1]                               // initialize D[1]
    for m = 2 to n-1 do
        D[m] = ExtendShortestPath(n, D[m-1], W)  // compute D[m] from D[m-1]
    return D[n-1]
}

ExtendShortestPath(int n, int d[1..n, 1..n], int W[1..n, 1..n]) {
    matrix dd[1..n, 1..n] = d[1..n, 1..n]        // copy d to temp matrix
    for i = 1 to n do                            // start from i
        for j = 1 to n do                        // ...to j
            for k = 1 to n do                    // ...passing through k
                dd[i,j] = min(dd[i,j], d[i,k] + W[k,j])
    return dd                                    // return matrix of distances
}
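For concreteness, here is a small Python sketch of this table-based formulation (names mine); INF plays the role of the infinite entries of W.

INF = float('inf')

def extend_shortest_paths(d, W):
    # One application of the update rule: allow paths with one more edge.
    n = len(W)
    dd = [row[:] for row in d]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                dd[i][j] = min(dd[i][j], d[i][k] + W[k][j])
    return dd

def shortest_paths(W):
    # W[i][j]: 0 if i == j, the edge weight if (i, j) is an edge, INF otherwise.
    n = len(W)
    D = W
    for m in range(2, n):              # after this pass, D allows up to m edges
        D = extend_shortest_paths(D, W)
    return D                           # distances using at most n-1 edges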

The procedure ExtendShortestPath() consists of 3 nested loops, and so its running time is Θ(n^3). It is called n - 2 times by the main procedure, and so the total running time is Θ(n^4). Next time we will see that we can improve on this.

Figure 29: Shortest path example. (A 4-vertex weighted digraph and the successive matrices W = D(1), D(2) and D(3) computed by the algorithm; '?' denotes an infinite entry.)
...
Read: Section 26.2 in CLR.

Floyd-Warshall Algorithm: The Floyd-Warshall algorithm dates back to the early 60's. Warshall was interested in the weaker question of reachability: determining, for each pair of vertices u and v, whether u can reach v. Floyd realized that the same technique could be used to compute shortest paths with only minor variations. The genius of the Floyd-Warshall algorithm is in finding a different formulation for the shortest path subproblem than the path-length formulation introduced earlier.

As before, we will compute a set of matrices whose entries are d_ij^(k); the difference lies in the meaning of the superscript. For a path p = v1, v2, ..., vl, we say that the vertices v2, v3, ..., v(l-1) are the intermediate vertices of this path. We define d_ij^(k) to be the length of the shortest path from i to j such that any intermediate vertices on the path are chosen from the set {1, 2, ..., k}. In other words, we consider a path from i to j which either consists of the single edge (i, j), or it visits some intermediate vertices along the way, but these intermediate vertices can only be chosen from {1, 2, ..., k}. The path is free to visit any subset of these vertices, and to do so in any order. For example, in the digraph shown in the following figure, notice how the value of d_3,2^(k) changes as k varies.

Figure 30: Floyd-Warshall Formulation. (a) d_ij^(k) is built from d_ik^(k-1) and d_kj^(k-1), whose intermediate vertices come from {1, ..., k-1}. (b) In the example digraph: d_3,2^(0) = INF (no path), d_3,2^(1) = 12 (3,1,2), d_3,2^(2) = 12 (3,1,2), d_3,2^(3) = 12 (3,1,2), d_3,2^(4) = 7 (3,1,4,2).
To compute d_ij^(k), there are two cases, depending on whether the shortest path with intermediate vertices in {1, ..., k} passes through vertex k or not.

If the path does not pass through k, then its intermediate vertices are drawn from {1, ..., k - 1}, and hence the length of the shortest path is d_ij^(k-1).

If the path does pass through k, then it visits k exactly once. (The assumption that there are no negative cost cycles is being used here.) In order for the overall path to be as short as possible, we should take the shortest path from i to k and the shortest path from k to j. Each of these paths uses intermediate vertices only from {1, ..., k - 1}, so the combined length is d_ik^(k-1) + d_kj^(k-1).

This leads to the following rule:

    d_ij^(0) = w_ij
    d_ij^(k) = min( d_ij^(k-1),  d_ik^(k-1) + d_kj^(k-1) )    for k >= 1.

The final answer is d_ij^(n), because this allows all possible vertices as intermediate vertices. As before, we do not evaluate this rule by blind recursion. Instead, we compute it by storing the values in a table, and looking the values up as we need them. We have also included predecessor pointers, pred[i, j], for extracting the final shortest paths later.

Floyd-Warshall Algorithm
Floyd_Warshall(int n, int W[1..n, 1..n]) {
    array d[1..n, 1..n]
    for i = 1 to n do {                        // initialize
        for j = 1 to n do {
            d[i,j] = W[i,j]
            pred[i,j] = null
        }
    }
    for k = 1 to n do                          // use intermediates {1..k}
        for i = 1 to n do                      // ...from i
            for j = 1 to n do                  // ...to j
                if (d[i,k] + d[k,j] < d[i,j]) {
                    d[i,j] = d[i,k] + d[k,j]   // new shorter path through k
                    pred[i,j] = k
                }
    return d                                   // matrix of final distances
}
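For comparison, here is a minimal Python sketch of the same procedure (again an illustrative transcription, assuming a weight matrix with float("inf") for absent edges, as in the earlier sketch):

def floyd_warshall(W):
    # All-pairs shortest path costs; pred[i][j] records an intermediate vertex, or None.
    n = len(W)
    d = [row[:] for row in W]                     # d starts out as W
    pred = [[None] * n for _ in range(n)]
    for k in range(n):                            # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]   # shorter path found through k
                    pred[i][j] = k
    return d, pred

Running it on the example weight matrix from the earlier sketch gives d[2][1] == 7, matching the value d_3,2^(4) = 7 shown in Figure 30.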
Clearly the running time is Θ(n^3), since the algorithm consists of three nested loops over the vertices. The space used by the algorithm is Θ(n^2). Note that a single distance matrix d is updated in place, rather than keeping a separate matrix for each value of k; it is left as an exercise that this does not affect the correctness of the algorithm. An example is shown in the following figure.

Figure 31: Floyd-Warshall Example. (The matrices D(0) through D(4) for the example digraph; entries marked ? are infinity.)
The predecessor pointers pred[i, j] can be used to extract the final paths. Here is the idea: whenever we discover that the shortest path from i to j passes through an intermediate vertex k, we set pred[i, j] = k. To find the shortest path from i to j, we consult pred[i, j]. If it is null, then the shortest path is just the edge (i, j). Otherwise, we recursively compute the shortest path from i to pred[i, j] and the shortest path from pred[i, j] to j, and concatenate them.
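A small recursive sketch of this extraction step, using the pred matrix returned by the Floyd-Warshall sketch above (vertices are 0-based there):

def path(pred, i, j):
    # Vertices strictly between i and j on a shortest i-j path.
    k = pred[i][j]
    if k is None:                      # shortest path is the single edge (i, j)
        return []
    return path(pred, i, k) + [k] + path(pred, k, j)

# Full path from i to j: [i] + path(pred, i, j) + [j]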
Read: Section 16.3 in CLR.
One important area of algorithm design is the study of algorithms on character strings. There are a number of important problems here. One is substring matching: finding occurrences of a pattern in a text. (This is what text editors and functions like "grep" do when you perform a search.) Another is measuring the similarity of two strings. This arises for example in genetics research: DNA strands can be broken down into long sequences, each element of which is one of four basic types: C, G, T, A. Exact substring search will only find exact matches, whereas a useful measure of string similarity should be insensitive to random insertions and deletions of characters from some originating string. There are several such measures. The first is the edit distance, that is, the minimum number of single character insertions, deletions, or transpositions necessary to convert one string into another. The one we study today is the longest common subsequence.

Longest Common Subsequence: Let us think of character strings as sequences of characters. Given two sequences X = x1, x2, ..., xm and Z = z1, z2, ..., zk, we say that Z is a subsequence of X if there is a strictly increasing sequence of k indices i1, i2, ..., ik (1 <= i1 < i2 < ... < ik <= m) such that Z = X_{i1}, X_{i2}, ..., X_{ik}.

Given two strings X and Y, the longest common subsequence of X and Y is a longest sequence Z which is both a subsequence of X and of Y. For example, if X = ABRACADABRA and Y = YABBADABBADOO, then the longest common subsequence is Z = ABADABA.

The Longest Common Subsequence problem (LCS) is the following: given two sequences X = x1, ..., xm and Y = y1, ..., yn, determine a longest common subsequence. Note that it is not always unique. For example the LCS of ABC and BAC is either AC or BC.

A brute-force solution would try every subsequence of one string and check it against the other, but there are exponentially many subsequences, so this is hopeless. Instead, we will derive a dynamic programming solution. The first step is to choose a set of subproblems. There are many ways to do this for strings, but it turns out for this problem that considering all pairs of prefixes will suffice for us. For 0 <= i <= m, let Xi = x1, x2, ..., xi denote the prefix consisting of the first i characters of X (X0 is the empty string), and define Yj analogously.

The idea will be to compute the longest common subsequence for every possible pair of prefixes. Let c[i, j] denote the length of the longest common subsequence of Xi and Yj. Eventually we are interested in c[m, n], since this will be the length of the LCS of the two entire strings.
We begin with some observations.

Basis: c[i, 0] = c[0, j] = 0. If either sequence is empty, then the longest common subsequence is empty.

Last characters match: Suppose xi = yj. Example: Let Xi = ABCA and let Yj = DACA. Since both end in A, we claim that the LCS must also end in A. (We will explain why later.) So we can remove the final A from both strings, take the LCS of Xi-1 = ABC and Yj-1 = DAC, which is AC, and then add the A back to the end, giving ACA. (At first you might object: but how did you know that these two A's matched with each other? The answer is that we don't, but matching them cannot make the LCS any shorter.) Thus, if xi = yj then c[i, j] = c[i - 1, j - 1] + 1.

Last characters do not match: Suppose xi != yj. In this case xi and yj cannot both be in the LCS (since they would each have to be the last character of the LCS). Thus either xi is not part of the LCS, or yj is not part of the LCS (possibly both). In the first case the LCS of Xi and Yj is the LCS of Xi-1 and Yj, which is c[i - 1, j]. In the second case the LCS is the LCS of Xi and Yj-1, which is c[i, j - 1]. We do not know which is the case, so we try both and take the one that gives us the longer LCS. Thus, if xi != yj then c[i, j] = max(c[i - 1, j], c[i, j - 1]).

Combining these observations gives the rule:

    c[i, j] = 0                                if i = 0 or j = 0
    c[i, j] = c[i - 1, j - 1] + 1              if i, j > 0 and xi = yj
    c[i, j] = max(c[i - 1, j], c[i, j - 1])    if i, j > 0 and xi != yj

We left undone the business of showing that if both strings end in the same character, then the LCS must also end in this same character. Suppose both strings end in A, but some longest common subsequence Z does not end in A. Because A is the last character of both strings, this particular instance of the character A cannot be used anywhere else in the LCS, so we could append it to Z and obtain a longer common subsequence. But this would contradict the definition of the LCS as being longest.


Implementing the Rule: The task now is to simply implement this rule. We fill in the table c[0..m, 0..n] bottom-up: row 0 and column 0 are set to zero, and each remaining entry is computed from the rule above. To make it easy to extract the actual sequence later, we also store some helpful back pointers in a parallel array b[0..m, 0..n], recording which of the three cases applied at each entry (ADDXY, SKIPX, or SKIPY). A sketch of the table-building step appears below.

The running time of the algorithm is clearly O(mn), since there are two nested loops with m and n iterations, respectively.
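Here is one way the table-building step might look in Python. This is a sketch under the assumptions stated above (the constants ADDXY/SKIPX/SKIPY and the function name lcs_table are illustrative choices, matching the getLCS routine below):

ADDXY, SKIPX, SKIPY = 0, 1, 2          # which of the three cases applied

def lcs_table(x, y):
    # Build the c (lengths) and b (back pointer) tables for LCS(x, y).
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[SKIPX] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:               # last characters match
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = ADDXY
            elif c[i - 1][j] >= c[i][j - 1]:       # drop x[i]
                c[i][j] = c[i - 1][j]
                b[i][j] = SKIPX
            else:                                  # drop y[j]
                c[i][j] = c[i][j - 1]
                b[i][j] = SKIPY
    return c, b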

Extracting the Actual Sequence: Extracting the final LCS is done by using the back pointers stored in b[0..m, 0..n]. Intuitively, b[i, j] = ADDXY means that X[i] and Y[j] together form the last character of the LCS. So we take this common character, and continue with entry b[i - 1, j - 1] to the northwest. If b[i, j] = SKIPX, then X[i] is not in the LCS, so we skip it and go to b[i - 1, j] above us. Similarly, if b[i, j] = SKIPY, then we know that Y[j] is not in the LCS, and so we skip it and go to b[i, j - 1] to the left. Starting from b[m, n] and outputting a character with each ADDXY step produces the LCS (in reverse, which is why the procedure below adds characters to the front).

Print Subsequence
getLCS(char x[1..m], char y[1..n], int b[0..m, 0..n]) {
    LCS = empty string
    i = m
    j = n
    while (i != 0 && j != 0) {
        switch b[i,j] {
            case ADDXY:
                add x[i] (or equivalently y[j]) to front of LCS
                i--; j--; break
            case SKIPX:
                i--; break
            case SKIPY:
                j--; break
        }
    }
    return LCS
}
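For completeness, the same extraction in Python, matching the lcs_table sketch given earlier (the constants are the ones defined there):

def get_lcs(x, y, b):
    # Walk the back pointers from (m, n) and collect the LCS.
    out = []
    i, j = len(x), len(y)
    while i != 0 and j != 0:
        if b[i][j] == ADDXY:
            out.append(x[i - 1])       # this character is part of the LCS
            i -= 1
            j -= 1
        elif b[i][j] == SKIPX:
            i -= 1
        else:                          # SKIPY
            j -= 1
    return "".join(reversed(out))

c, b = lcs_table("ABRACADABRA", "YABBADABBADOO")
print(get_lcs("ABRACADABRA", "YABBADABBADOO", b))   # one LCS of length 7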


Lecture 26: Chain Matrix Multiplication
(Thursday, April 30, 1998)
Read: Section 16.1 in CLR.

Chain Matrix Multiplication: This problem involves the question of determining the optimal sequence for performing a series of operations. We will study the problem in a very restricted instance, where the dynamic programming issues are easiest to see.

Suppose that we wish to multiply a series of matrices A1 A2 ... An. Matrix multiplication is an associative but not a commutative operation, so we are free to parenthesize the multiplication however we like, but we may not reorder the matrices. Also recall that when two (nonsquare) matrices are being multiplied, there are restrictions on the dimensions. You can multiply a p x q matrix A times a q x r matrix B, and the result will be a p x r matrix C. In particular, for 1 <= i <= p and 1 <= j <= r,

    C[i, j] = sum_{k=1}^{q} A[i, k] B[k, j].

Observe that there are p·r total entries in C and each takes O(q) time to compute; thus the total time (e.g. the number of multiplications) to multiply these two matrices is proportional to the product of the dimensions, pqr.

Figure 33: Matrix Multiplication. (Multiplying a p x q matrix A by a q x r matrix B produces a p x r matrix C in time pqr.)
...
Consider the case of 3 matrices: A1 is 5 x 4, A2 is 4 x 6, and A3 is 6 x 2. Multiplying as ((A1 A2) A3) costs (5·4·6) + (5·6·2) = 120 + 60 = 180 multiplications, whereas multiplying as (A1 (A2 A3)) costs (4·6·2) + (5·4·2) = 48 + 40 = 88. Even for this small example, considerable savings can be achieved by reordering the evaluation sequence.

The Chain Matrix Multiplication problem is the following: given a sequence of matrices A1, A2, ..., An and dimensions p0, p1, ..., pn, where Ai has dimension p_{i-1} x p_i, determine the order of multiplication that minimizes the number of operations.
in which to perform the multiplications
...
Unfortunately, the
number of ways of parenthesizing an expression is very large
...
If you have n items, then there are n − 1 places where you could break
the list with the outermost pair of parentheses, namely just after the 1st item, just after the 2nd item,
etc
...
When we split just after the kth item, we create two sublists to
be parenthesized, one with k items, and the other with n − k items
...
Since these are independent choices, if there are L ways to parenthesize
the left sublist and R ways to parenthesize the right sublist, then the total is L · R
...


Summing over all splits gives the recurrence P(1) = 1 and, for n >= 2, P(n) = sum_{k=1}^{n-1} P(k) P(n-k), where P(n) is the number of ways to parenthesize n items. This is related to a famous function in combinatorics called the Catalan numbers (which in turn is related to the number of different binary trees on n nodes). In particular P(n) = C(n - 1), where

    C(n) = (1/(n+1)) * binomial(2n, n).

Applying Stirling's formula, we find that C(n) is in Omega(4^n / n^{3/2}), which grows exponentially fast. Thus, exhaustive search will not be practical except for very small n.
We want to break the problem into subproblems whose solutions can be combined to solve the global problem. For 1 <= i <= j <= n, define A_{i..j} to be the product of matrices i through j. It is easy to see that A_{i..j} is a p_{i-1} x p_j matrix. Now consider the outermost level of parenthesization. At this level we are simply multiplying two matrices together: for some k, 1 <= k <= n - 1, we have A_{1..n} = A_{1..k} · A_{k+1..n}. So the problem splits into two questions: where should the final split k be, and how should the subchains A_{1..k} and A_{k+1..n} themselves be parenthesized? The subchains are handled by applying the same scheme recursively, and the former question can be answered by just considering all possible values of k. Note that the principle of optimality applies: in an optimal ordering for A_{1..n}, we must use the optimal orderings for A_{1..k} and A_{k+1..n}.

We will store the solutions to the subproblems in a table, and build the table in a bottom-up manner. For 1 <= i <= j <= n, let m[i, j] denote the minimum number of multiplications needed to compute A_{i..j}. As a basis, observe that if i = j then the sequence contains only one matrix, and so the cost is 0. (There is nothing to multiply.) Thus, m[i, i] = 0. If i < j, the product A_{i..j} can be split, for any k with i <= k < j, as A_{i..k} times A_{k+1..j}. The optimum time to compute A_{i..k} is m[i, k], and the optimum time to compute A_{k+1..j} is m[k + 1, j]. Since A_{i..k} is a p_{i-1} x p_k matrix and A_{k+1..j} is a p_k x p_j matrix, the time to multiply them is p_{i-1} · p_k · p_j. This gives the recurrence

    m[i, i] = 0
    m[i, j] = min_{i <= k < j} ( m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j )    for i < j.
The only tricky part is arranging the order in which to compute the values: when computing m[i, j] we need the values m[i, k] and m[k + 1, j] for all k between i and j. This suggests that we should organize our computation according to the number of matrices in the subchain. Let L = j - i + 1 denote the length of the subchain. The subchains of length 1 (m[i, i]) are trivial; we then build up the subchains of lengths 2, 3, ..., n, and the final answer is m[1, n]. We need to be a little careful in setting up the loops: if a subchain of length L starts at position i, then j = i + L - 1. Since we want j <= n, this means that i + L - 1 <= n, or in other words, i <= n - L + 1.
The procedure Matrix-Chain takes the array of dimensions p and the number of matrices n, and fills in the cost table m[1..n, 1..n] together with a table s[1..n-1, 2..n] of split markers; s is used to extract the actual multiplication sequence. The running time is Θ(n^3). We'll leave this as an exercise in solving sums, but the key is that there are three nested loops, and each can iterate at most n times.
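For concreteness, here is a compact Python sketch of this table-building step and of reading an order back out of s. This is an illustrative transcription (the function names are my own); the dimension list is 0-indexed, with matrix A_i of size p[i-1] x p[i], and the example dimensions 5, 4, 6, 2, 7 are the ones used in the example discussed below.

import math

def matrix_chain_order(p):
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]      # m[i][j]: min cost for A_i..A_j
    s = [[0] * (n + 1) for _ in range(n + 1)]      # s[i][j]: best split point k
    for L in range(2, n + 1):                      # L = length of the subchain
        for i in range(1, n - L + 2):
            j = i + L - 1
            m[i][j] = math.inf
            for k in range(i, j):                  # try every split A_i..k * A_k+1..j
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s

def show_order(s, i, j):
    # Fully parenthesized expression for A_i..A_j, following the split markers.
    if i == j:
        return "A" + str(i)
    k = s[i][j]
    return "(" + show_order(s, i, k) + " " + show_order(s, k + 1, j) + ")"

m, s = matrix_chain_order([5, 4, 6, 2, 7])
print(m[1][4], show_order(s, 1, 4))                # 158 ((A1 (A2 A3)) A4)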
The basic idea is
to leave a split marker indicating what the best split is, that is, what value of k lead to the minimum
value of m[i, j]
...
For example, suppose that s[i, j] = k
...
j is to first multiply the subchain Ai
...
j , and
finally multiply these together
...
Note that
we only need to store s[i, j] when we have at least two matrices, that is, if j > i
...

Assume that the matrices are stored in an array of matrices A[1
...
The procedure returns a matrix
...
A[k]
// Y = A[k+1]
...


In the figure below we show an example
...
The initial set of dimensions are 5, 4, 6, 2, 7
meaning that we are multiplying A1 (5 × 4) times A2 (4 × 6) times A3 (6 × 2) times A4 (2 × 7)
...


Lecture 27: NP-Completeness: General Introduction
(Tuesday, May 5, 1998)
Read: Chapt 36, up through section 36
...

Easy and Hard Problems: At this point of the semester hopefully you have learned a few things about what it means for an algorithm to be efficient, and how to design algorithms and determine their efficiency asymptotically. The question that often arises in practice is that you have tried every trick in the book, and still your best algorithm is not fast enough: it handles small inputs acceptably, but on the really large inputs you care about it seems to run forever. When you analyze its running time, you realize that it is running in exponential time, perhaps n^sqrt(n), or 2^n, or 2^(2^n), or n!, or worse.

By the early 1970's many important problems had been given efficient algorithmic solutions, but at the same time there was also a growing list of problems for which there seemed to be no known efficient algorithmic solutions. People began to wonder whether there was some unknown paradigm that would lead to a solution to these problems, or perhaps some proof that these problems are inherently hard to solve and no algorithmic solutions exist that run in less than exponential time. Then a remarkable discovery was made: it turns out that many of these “hard” problems were interrelated, in the sense that if you could solve any one of them in polynomial time, then you could solve all of them in polynomial time.

Polynomial Time: We need some way to separate the class of efficiently solvable problems from inefficiently solvable problems.

We have measured the running time of algorithms using worst-case complexity, as a function of n, the size of the input. We take the input size to be the number of bits needed to represent the input under any reasonably efficient encoding. (By a reasonably efficient encoding, we assume that there is not some significantly shorter way of providing the same information.) We have also assumed that basic operations on numbers can be performed in constant time.

Up until now all the algorithms we have seen have had the property that their worst-case running times are bounded above by some polynomial in the input size, n. A polynomial time algorithm is one that runs in time O(n^k) for some constant k independent of n, and a problem is said to be solvable in polynomial time if there is a polynomial time algorithm that solves it.

Some running times that do not look polynomial still qualify. For example, a running time of O(n log n) does not look like a polynomial, but it is bounded above by the polynomial O(n^2), so it is considered to be in polynomial time. Conversely, some running times that look polynomial do not qualify. For example, a running time of O(n^k) is not considered polynomial time if k is an input parameter that could vary as a function of n; the exponent must be a constant independent of n.

Decision Problems: Many of the problems that we have discussed involve optimization of one form or another: find the shortest path, find the minimum cost spanning tree, find the knapsack packing of greatest value. For rather technical reasons, the theory of NP-completeness is phrased in terms of decision problems. A problem is called a decision problem if its output is a simple “yes” or “no” (or you may think of this as True/False, 0/1, accept/reject). Optimization problems are rephrased accordingly. For example, rather than asking what is the minimum number of colors needed to color a graph, we would instead phrase this as a decision problem: given a graph G and an integer k, is it possible to color G with k colors?

One historical artifact of NP-completeness is that problems are often stated in terms of language-recognition problems, because the theory grew out of automata and formal language theory. We will not be taking this approach, but you should be aware that if you look in the book, it will often describe NP-complete problems as languages.

NP and Polynomial Time Verification: Before talking about the class of NP-complete problems, it is important to introduce the notion of a verification algorithm.

Consider the following problem, called the undirected Hamiltonian cycle problem (UHC): given an undirected graph G, does G have a cycle that visits every vertex exactly once?

An interesting aspect of this problem is that if the graph did contain a Hamiltonian cycle, then it would be easy for someone to convince you that it did: they would simply present the vertices in the order they appear along such a cycle. You could then inspect the graph, and check that this is indeed a legal cycle and that it visits all the vertices of the graph exactly once. Thus, even though we know of no efficient way to solve the Hamiltonian cycle problem, there is a very efficient way to verify that a given graph has one.

Figure 35: Undirected Hamiltonian cycle. (One graph that is nonhamiltonian and one that is Hamiltonian.)
The given cycle is called a certificate: a piece of information which allows us to verify that a given instance is a “yes” instance. If it is possible to verify the accuracy of a certificate for a problem in polynomial time, we say that the problem is polynomial time verifiable.

Note that not every problem appears to have this property. For example, consider the problem of determining whether a graph has exactly one Hamiltonian cycle. Presenting one cycle shows that at least one exists, but it is not clear what short certificate would convince you that no other cycle exists.

Definition: Define NP to be the set of all decision problems that can be verified by a polynomial time algorithm.

The Hamiltonian cycle problem is NP-complete, and so it is widely believed that there is no polynomial time solution to the problem. Why is this class called NP rather than something like VP (verifiable in polynomial time)? The original term NP stood for nondeterministic polynomial time. This referred to a program running on a nondeterministic computer that can make guesses; such a computer could guess the certificate and then verify it in polynomial time. We have avoided introducing nondeterminism here; it is covered in courses on automata and complexity theory.
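As a concrete illustration of polynomial time verification, here is a small Python sketch that checks a claimed Hamiltonian-cycle certificate. The encoding (vertex list plus a set of frozenset edges) is just one reasonable assumption, not anything prescribed by the notes:

def verify_hamiltonian_cycle(vertices, edges, cycle):
    # Check that `cycle` lists every vertex exactly once and that consecutive
    # vertices (wrapping around) are joined by edges -- a polynomial time check.
    if sorted(cycle) != sorted(vertices):
        return False
    n = len(cycle)
    return all(frozenset((cycle[i], cycle[(i + 1) % n])) in edges for i in range(n))

V = [1, 2, 3, 4]
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]}
print(verify_hamiltonian_cycle(V, E, [1, 2, 3, 4]))   # True: a valid certificate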

Like P, NP is a set of languages based on some complexity measure (the complexity of verification). Observe that P is a subset of NP: in other words, if we can solve a problem in polynomial time, then we can certainly verify that an answer is correct in polynomial time (we do not even need to look at a certificate).

However, it is not known whether P = NP. It seems unreasonable to think that they should be equal: in other words, just being able to verify that you have a correct solution does not seem to help you much in finding the actual solution. Most people believe that P is not equal to NP, but no one has a proof of this.

NP-Completeness: We will not give a formal definition of NP-completeness just yet (that comes next lecture). For now, think of the set of NP-complete problems as the “hardest” problems to solve in the entire class NP. There may be even harder problems to solve that are not in the class NP; these are called NP-hard problems. The figure below illustrates how P, NP, NP-complete, and NP-hard are believed to relate. What we still need is a way of relating the hardness of one problem to that of another. This is where the concept of a reduction comes in.

Reductions: Before discussing reductions, let us first consider the following example. Suppose that there are two problems, A and B. You know (or you strongly believe at least) that it is impossible to solve problem A in polynomial time.

Figure 36: Relationship between P, NP, and NP-complete. (P lies inside NP; the NP-complete problems are the hardest problems within NP; NP-hard problems may lie outside NP. Difficulty increases from “easy” problems in P toward the “harder” NP-hard problems.)
You want to prove that B cannot be solved in polynomial time. That is, you want to show that (A not in P) implies (B not in P). To do this, we could prove the contrapositive, (B in P) implies (A in P): if B could be solved in polynomial time, then so could A.

How do we do this? Suppose that we have a subroutine that can solve any instance of problem B in polynomial time. Then all we need to do is to show how to use this subroutine to solve problem A in polynomial time. Thus we have “reduced” problem A to problem B. We know (or strongly believe) that A cannot be solved in polynomial time; thus we are essentially proving that the subroutine cannot exist, implying that B cannot be solved in polynomial time.
It is a fact that the problem of determining whether an
undirected graph has a Hamiltonian cycle (UHC) is an NP-complete problem
...

Suppose that boss of yours tells you that he wants you to find a polynomial time solution to a different problem, namely the problem of finding a Hamiltonian cycle in a directed graph (DHC). You suspect this is just as hard as the undirected version. Suppose you and your boss both agree that the UHC problem (for undirected graphs) is NP-complete, and so it would be unreasonable for him to expect you to solve that one. He might argue, however, that the directed version could be easier: after all, by adding directions to the edges you eliminate the ambiguity of which direction to travel along each edge. How do you convince him that DHC is just as hard?

You explain to your boss: “Suppose I could find an efficient (i.e., polynomial time) solution to the DHC problem; then I'll show you that it would then be possible to solve UHC in polynomial time.” Since you both agree that UHC is not efficiently solvable, this means that such an efficient subroutine for DHC must not exist.
85

Lecture Notes

CMSC 251

Here is how you might do this. Given the undirected graph G, create a directed graph G' on the same vertices by replacing each undirected edge {u, v} with the two directed edges (u, v) and (v, u). Now, every simple path in G is a simple path in G', and vice versa; in particular, G has a Hamiltonian cycle if and only if G' does. So if you could develop an efficient solution to the DHC problem, you could use this algorithm and this little transformation to solve the UHC problem: take the undirected graph G, convert it to the equivalent directed graph G' (by edge-doubling), and then call your (supposed) algorithm for directed Hamiltonian cycles on G'.
UHC to DHC Reduction
bool Undir_Ham_Cycle(graph G) {
create digraph G’ with the same number of vertices as G
for each edge {u,v} in G {
add edges (u,v) and (v,u) to G’
}
return Dir_Ham_Cycle(G’)
}

You would now have a polynomial time algorithm for UHC, which you both agreed should not exist.

Figure 37: Directed Hamiltonian cycle reduction. (An undirected graph and the directed graph obtained by edge-doubling; the nonhamiltonian and Hamiltonian cases are preserved by the transformation.)

Notice that you have not solved either problem. You have just shown how to convert a solution to DHC into a solution for UHC; since UHC is believed to have no efficient solution, neither can DHC.


Lecture 28: NP-Completeness and Reductions
(Thursday, May 7, 1998)
Read: Chapt 36, through Section 36
...

Summary: Last time we introduced a number of concepts, on the way to defining NP-completeness
...

Decision Problems: are problems for which the answer is either “yes” or “no
...

P: is the class of all decisions problems that can be solved in polynomial time (that is, O(nk ) for some
constant k)
...
This
means that if the answer to the problem is “yes” then it is possible give some piece of information
that would allow someone to verify that this is the correct answer in polynomial time
...
)
Reductions: Last time we introduced the notion of a reduction. We say that a problem A is polynomially reducible to a problem B, written A ≤P B, if A can be solved in polynomial time given a subroutine that solves B, using only polynomially many calls to the subroutine and a polynomial amount of additional work. (Note: This definition differs somewhat from the definition in the text, but it is good enough for our purposes.)

The operative word in the definition is “if”: we do not need to actually possess a subroutine for B, only to reason about what we could do if we had one.

Some important facts about reductions are:

Lemma: If A ≤P B and B ∈ P then A ∈ P.

Lemma: If A ≤P B and A ∉ P then B ∉ P.

Lemma: (Transitivity) If A ≤P B and B ≤P C then A ≤P C.

The first lemma is immediate from the definition. To see the second lemma, observe that B cannot be in P, since otherwise A would be in P by the first lemma, giving a contradiction. The transitivity lemma says that if you can use a subroutine for B to solve A in polynomial time, and you can use a subroutine for C to solve B in polynomial time, then you can use the subroutine for C to solve A in polynomial time.

NP-completeness: Last time we gave the informal definition that the NP-complete problems are the “hardest” problems in NP. Here is a more formal definition.

Definition: A decision problem B ∈ NP is NP-complete if A ≤P B for all A ∈ NP.

Thus, if you could solve B in polynomial time, then you could solve every problem A in NP in polynomial time. We can use transitivity to simplify the task of proving new problems NP-complete: if some known NP-complete problem A satisfies A ≤P B and B ∈ NP, then every problem in NP reduces to A, and hence (by transitivity) to B, so B is NP-complete as well.

Example: 3-Coloring and Clique Cover: Let us consider an example to make this clearer. Consider the following two problems.

3-coloring (3COL): Given a graph G, can each of its vertices be labeled with one of 3 different “colors”, such that no two adjacent vertices have the same label?

Clique Cover (CC): Given a graph G = (V, E) and an integer k, can the vertex set be partitioned into k subsets V1, ..., Vk, such that the union of the Vi is V, and each Vi is a clique of G?

Recall that a clique is a subset of vertices such that every pair of vertices in the subset is adjacent to each other.
Your boss tells you that he wants you to solve the CC problem. Suspecting that CC is NP-complete (and knowing that 3COL is), you set out to prove it. How do you prove this to your boss? There are two things to be shown. The first is that the CC problem is in NP. (In particular, to convince someone that a graph has a clique cover of size k, just specify what the k subsets are; it is then easy to check in polynomial time that each subset is a clique and that together they cover all the vertices.)

The second item is to show that a known NP-complete problem (we will choose 3COL) is polynomially reducible to CC. To do this, assume we have access to a subroutine CliqueCover(G, k). Given a graph G and an integer k, this subroutine returns true if G has a clique cover of size k and false otherwise, and furthermore, this subroutine runs in polynomial time. We want to use it to solve 3COL in polynomial time.

Let's see in what respect the two problems are similar. In the 3-coloring problem, two vertices may be placed in the same color group only if they are not adjacent to each other. In the clique cover problem, for two vertices to be in the same group they must be adjacent to each other. In some sense, the problems are almost the same, but the requirement adjacent/non-adjacent is exactly reversed. This suggests considering the complement graph G', the graph on the same vertex set in which two vertices are adjacent exactly when they are not adjacent in G. The main observation is that a graph G is 3-colorable if and only if its complement G' has a clique cover of size k = 3: each color class of G becomes a clique of G', and vice versa.

Using this fact, we can reduce the 3-coloring problem to the clique cover problem as follows: given a graph G whose 3-colorability we want to determine, compute its complement and ask for a clique cover of size 3.

3COL to CC Reduction
bool 3Colorable(graph G) {
let G’ = complement(G)
return CliqueCover(G’,3)
}

There are a few important things to observe here
...
Remember, these are all “what if” games
...
But since we know that 3COL is hard to solve, this means that CC is
also hard to solve
...

A second thing to observe is the direction of the reduction. In ordinary algorithm design, you reduce the problem you do not know how to solve to one that you do know how to solve. But in NP-completeness we do not know how to solve either problem (and indeed we are trying to show that an efficient solution does not exist), so the direction is reversed: you reduce the known problem to the problem you want to show is NP-complete.

Remember: Always reduce the known NP-complete problem to the problem you want to prove is NP-complete.
...
It just
tried to make one problem look more like the other problem
...
It has just been dressed up to look differently
...
We have seen the Hamiltonian Cycle (HC) problem (given a graph, does it have a cycle that visits every vertex exactly once?). A closely related problem is the Hamiltonian Path (HP) problem: given a graph, does it have a simple path that visits every vertex exactly once? Suppose we know that HC is NP-complete and want to show that HP is NP-complete. How would we do this? Besides showing that HP is in NP (which is easy), we need to reduce HC to HP. That is, HC ≤P HP. So suppose we are given a polynomial time subroutine HamPath(G) that reports whether G has a Hamiltonian path. How could we use it to solve HC in polynomial time?

Here is a first attempt (that doesn't work). If a graph has a Hamiltonian cycle, then it certainly has a Hamiltonian path (just drop one edge of the cycle). So if we just invoke the HamPath subroutine on the graph and it returns “no”, then we can safely answer “no” for HamCycle. But if it returns “yes” we can conclude nothing, since a graph may have a Hamiltonian path without having a Hamiltonian cycle. Thus this will not do the job. The problem is that cycles and paths are different things.
Suppose that the
89

Lecture Notes

CMSC 251

graph G has a Hamiltonian cycle
...
There must be an edge {u, v}
in the graph
...
How do we know which edge to delete? We don’t so
we could try them all
...

However, there is a problem here as well
...
But when we call the HP subroutine, we have no way to enforce this condition
...
We cannot look inside the
subroutine or modify the subroutine
...
) We can only call it and check
its answer
...

So is there a way to force the HP subroutine to start the path at u and end it at v? The answer is yes,
but we will need to modify the graph to make this happen
...
Because
these vertices have degree one, if a Hamiltonian path exists, it must start at x and end at y
...
Here is how it works
...
For each edge
{u, v} (hoping that it will be the last edge on a Hamiltonian cycle) we create a new graph by deleting
this edge and adding vertex x onto u and vertex y onto v
...
Then
we invoke our Hamiltonian Path subroutine to see whether G has a Hamiltonian path
...
Then we know that the original graph had
a Hamiltonian cycle (starting at u and ending at y)
...

90

Lecture Notes

CMSC 251

HC to HP Reduction
bool HamCycle(graph G) {
for each edge {u,v} in G {
copy G to a new graph G’
delete edge {u,v} from G’
add new vertices x and y to G’
add new edges {x,u} and {y,v} to G’
if (HamPath(G’)) return true
}
return false
// failed for every edge
}

This is a rather inefficient reduction, but it does work
...
Can you see how to do it with fewer calls? (Hint: Consider applying this to the
edges coming out of just one vertex
...
)
As before, notice that we didn’t really attempt to solve either problem
...
Since HC is NP-complete, this means that there is not likely to be an efficient solution to HP
either
...
I would estimate that about 50–70% of the exam will cover material since the
last midterm, and the remainder will be comprehensive
...

Overview: This semester we have discussed general approaches to algorithm design
...
Finding good computational solutions to problems involves
many skills
...
)
Of course, to be a complete programmer, you need to be able to orchestrate all of these elements
...
However, these are
important stages, because a poor initial design is much harder to fix later
...
As we saw with sorting, there may be many ways of solving a problem
...

The intent of the course has been to investigate basic techniques for algorithm analysis and various algorithm design paradigms: divide-and-conquer, graph traversals, dynamic programming, etc.
Here is an overview of the topics that we covered this semester
...
General facts about growth rates of functions
...

Recurrences: Analysis of recursive programs, strong induction, expansion, Master Theorem
...

Heapsort: Nonstable, Θ(n log n), in-place algorithm
...

Quicksort: Nonstable, Θ(n log n) expected case, (almost) in-place sorting algorithm
...

Sorting lower bounds: Any sorting algorithm that is based on comparisons requires Ω(n log n) steps in the worst-case. The argument uses a decision tree: consider the number of possible outcomes, and observe that they form the leaves of the decision tree, whose height is at least lg of the number of leaves. In this case, N = n!, the number of different permutations of n keys, and lg(n!) is Θ(n log n).

Linear-time sorting: Small integers in the range 1 to k can be sorted by counting sort in Θ(n + k) time. If k is too large, then you can try breaking the numbers up into smaller digits, and apply radix sort instead, which runs counting sort one digit at a time. If there are d digits, then its running time is Θ(d(n + k)), where k is the number of different values in each digit.
...
A graph (digraph) consists of a set
of vertices and a set of undirected (directed) edges
...
A graph is sparse if the
number of edges is o(n2 ), and dense otherwise
...
These are simple, but require Θ(n2 ) storage
...

Adjacency list: Adj [u] is a pointer to a linked list containing the neighbors of u
...

Breadth-first search: We discussed one graph algorithm: breadth first search
...
Recall
that it colors vertices (white, gray, black) to indicate their status in the search, and it also uses a
FIFO queue to determine which order it will visit the vertices
...
This runs in Θ(n + e) time
...
) We showed that breadth-first search could be used to compute shortest paths from a
single source vertex in an (unweighted) graph or digraph
...
Its basic elements are those of subdividing large complex problems into
smaller subproblems, solving subproblems in a bottom-up manner (going from smaller to larger)
...
This is not always
the case (e
...
, when there is dependence between the subproblems, it might be better to do worse
and one to get a big savings on the other)
...
2) Shortest paths in a weighted digraph between all
pairs of vertices
...
We gave two algorithms
...
The second
(the Floyd-Warshall algorithm) uses a DP formulation based on considering which vertices
you are allowed to pass through
...

Longest Common Subsequence: (Section 16
...
We showed that the LCS of two sequences of lengths
n and m could be computed in Θ(nm)
...
1) Given a chain of matrices, determine the optimum order in which to multiply them
...

NP-completeness: (Chapt 36
...

NP-completeness reductions: We showed that to prove that a problem is NP-complete you need
to show (1) that it is in NP (by showing that it is possible to verify correctness if the answer is
“yes”) and (2) show that some known NP-complete problem can be reduced to your problem
...

We showed how to reduce 3-coloring to clique cover, and how to reduce Hamiltonian cycle
to Hamiltonian path