Search for notes by fellow students, in your own course and all over the country.

Browse our notes for titles which look like what you need, you can preview any of the notes via a sample of the contents. After you're happy these are the notes you're after simply pop them into your shopping cart.

My Basket

You have nothing in your shopping cart yet.

Title: Algorithms
Description: Brief description of algorithms

Document Preview

Extracts from the notes are below, to see the PDF you'll receive please use the links above


T H O M A S H
...
L E I S E R S O N
R O N A L D L
...
Cormen
Charles E
...
Rivest
Clifford Stein

Introduction to Algorithms
Third Edition

The MIT Press
Cambridge, Massachusetts

London, England

c 2009 Massachusetts Institute of Technology
All rights reserved
...

For information about special quantity discounts, please email special sales@mitpress
...
edu
...

Printed and bound in the United States of America
...
Cormen
...
—3rd ed
...
cm
...

ISBN 978-0-262-03384-8 (hardcover : alk
...
: alk
...
Computer programming
...
Computer algorithms
...
Cormen, Thomas H
...
6
...
1—dc22
2009008593
10 9 8 7 6 5 4 3 2

Contents

Preface

xiii

I Foundations
Introduction

3

1

The Role of Algorithms in Computing 5
1
...
2 Algorithms as a technology 11

2

Getting Started 16
2
...
2 Analyzing algorithms 23
2
...
1 Asymptotic notation 43
3
...
1 The maximum-subarray problem 68
4
...
3 The substitution method for solving recurrences 83
4
...
5 The master method for solving recurrences 93
4
...
1 The hiring problem 114
5
...
3 Randomized algorithms 122
5
...
1 Heaps 151
6
...
3 Building a heap 156
6
...
5 Priority queues 162

154

Quicksort 170
7
...
2 Performance of quicksort 174
7
...
4 Analysis of quicksort 180
Sorting in Linear Time 191
8
...
2 Counting sort 194
8
...
4 Bucket sort 200

179

191

Medians and Order Statistics 213
9
...
2 Selection in expected linear time 215
9
...
1 Stacks and queues 232
10
...
3 Implementing pointers and objects
10
...
1 Direct-address tables 254
11
...
3 Hash functions 262
11
...
5 Perfect hashing 277

241

Contents

12

?
13

14

vii

Binary Search Trees 286
12
...
2 Querying a binary search tree 289
12
...
4 Randomly built binary search trees 299
Red-Black Trees 308
13
...
2 Rotations 312
13
...
4 Deletion 323

308

Augmenting Data Structures 339
14
...
2 How to augment a data structure
14
...
1 Rod cutting 360
15
...
3 Elements of dynamic programming 378
15
...
5 Optimal binary search trees 397

16

Greedy Algorithms 414
16
...
2 Elements of the greedy strategy 423
16
...
4 Matroids and greedy methods 437
16
...
1 Aggregate analysis 452
17
...
3 The potential method 459
17
...
1 Definition of B-trees 488
18
...
3 Deleting a key from a B-tree 499

19

Fibonacci Heaps 505
19
...
2 Mergeable-heap operations 510
19
...
4 Bounding the maximum degree 523

20

van Emde Boas Trees 531
20
...
2 A recursive structure 536
20
...
1 Disjoint-set operations 561
21
...
3 Disjoint-set forests 568
21
...
1 Representations of graphs 589
22
...
3 Depth-first search 603
22
...
5 Strongly connected components 615

23

Minimum Spanning Trees 624
23
...
2 The algorithms of Kruskal and Prim 631

573

Contents

24

ix

Single-Source Shortest Paths 643
24
...
2 Single-source shortest paths in directed acyclic graphs
24
...
4 Difference constraints and shortest paths 664
24
...
1 Shortest paths and matrix multiplication 686
25
...
3 Johnson’s algorithm for sparse graphs 700

26

655

Maximum Flow 708
26
...
2 The Ford-Fulkerson method 714
26
...
4 Push-relabel algorithms 736
26
...
1 The basics of dynamic multithreading 774
27
...
3 Multithreaded merge sort 797

28

Matrix Operations 813
28
...
2 Inverting matrices 827
28
...
1 Standard and slack forms 850
29
...
3 The simplex algorithm 864
29
...
5 The initial basic feasible solution 886

859

x

Contents

30

Polynomials and the FFT 898
30
...
2 The DFT and FFT 906
30
...
1 Elementary number-theoretic notions 927
31
...
3 Modular arithmetic 939
31
...
5 The Chinese remainder theorem 950
31
...
7 The RSA public-key cryptosystem 958
31
...
9 Integer factorization 975

?
?
32

?
33

String Matching 985
32
...
2 The Rabin-Karp algorithm 990
32
...
4 The Knuth-Morris-Pratt algorithm 1002
Computational Geometry 1014
33
...
2 Determining whether any pair of segments intersects
33
...
4 Finding the closest pair of points 1039

34

NP-Completeness 1048
34
...
2 Polynomial-time verification 1061
34
...
4 NP-completeness proofs 1078
34
...
1 The vertex-cover problem 1108
35
...
3 The set-covering problem 1117
35
...
5 The subset-sum problem 1128

1123

1021

Contents

xi

VIII Appendix: Mathematical Background
Introduction
A

1143

Summations 1145
A
...
2 Bounding summations 1149

1145

B

Sets, Etc
...
1 Sets 1158
B
...
3 Functions 1166
B
...
5 Trees 1173

C

Counting and Probability 1183
C
...
2 Probability 1189
C
...
4 The geometric and binomial distributions 1201
C
...
1 Matrices and matrix operations
D
...
But now that there are computers, there are even more algorithms, and algorithms lie at the heart of computing
...
It presents many algorithms and covers them in considerable
depth, yet makes their design and analysis accessible to all levels of readers
...

Each chapter presents an algorithm, a design technique, an application area, or a
related topic
...
The book contains 244
figures—many with multiple parts—illustrating how the algorithms work
...

The text is intended primarily for use in undergraduate or graduate courses in
algorithms or data structures
...

In this, the third edition, we have once again updated the entire book
...

To the teacher
We have designed this book to be both versatile and complete
...
Because we have provided considerably
more material than can fit in a typical one-term course, you can consider this book
to be a “buffet” or “smorgasbord” from which you can pick and choose the material
that best supports the course you wish to teach
...
We have made chapters relatively self-contained, so that you need not worry
about an unexpected and unnecessary dependence of one chapter on another
...
In an undergraduate course,
you might use only the earlier sections from a chapter; in a graduate course, you
might cover the entire chapter
...
Each section ends with exercises, and each chapter ends with problems
...
Some are simple self-check thought
exercises, whereas others are more substantial and are suitable as assigned homework
...

Departing from our practice in previous editions of this book, we have made
publicly available solutions to some, but by no means all, of the problems and exercises
...
mit
...

You will want to check this site to make sure that it does not contain the solution to
an exercise or problem that you plan to assign
...

We have starred (?) the sections and exercises that are more suitable for graduate
students than for undergraduates
...
Likewise, starred exercises may require an advanced background or
more than average creativity
...
We have attempted to make every algorithm accessible and
interesting
...
We also provide careful explanations
of the mathematics needed to understand the analysis of the algorithms
...

This is a large book, and your class will probably cover only a portion of its
material
...


Preface

xv

What are the prerequisites for reading this book?
You should have some programming experience
...

You should have some facility with mathematical proofs, and especially proofs
by mathematical induction
...
Beyond that, Parts I and VIII of this book teach you all
the mathematical techniques you will need
...
Our Web site, http://mitpress
...
edu/algorithms/, links to solutions for
a few of the problems and exercises
...

We ask, however, that you do not send your solutions to us
...
Because each chapter is relatively self-contained, you can focus in on the
topics that most interest you
...
We therefore
address implementation concerns and other engineering issues
...

If you wish to implement any of the algorithms, you should find the translation of our pseudocode into your favorite programming language to be a fairly
straightforward task
...
Consequently, we do not address error-handling and other
software-engineering issues that require specific assumptions about your programming environment
...

We understand that if you are using this book outside of a course, then you
might be unable to check your solutions to problems and exercises against solutions
provided by an instructor
...
mit
...
Please do not send your solutions to us
...

Each chapter ends with a set of chapter notes that give historical details and references
...
Though it may be hard to believe for a book of this size,
space constraints prevented us from including many interesting algorithms
...

Changes for the third edition
What has changed between the second and third editions of this book? The magnitude of the changes is on a par with the changes between the first and second
editions
...

A quick look at the table of contents shows that most of the second-edition chapters and sections appear in the third edition
...

We kept the hybrid organization from the first two editions
...
It contains technique-based chapters on divide-and-conquer,
dynamic programming, greedy algorithms, amortized analysis, NP-Completeness,
and approximation algorithms
...
We find that
although you need to know how to apply techniques for designing and analyzing algorithms, problems seldom announce to you which techniques are most amenable
to solving them
...

We revised the chapter on recurrences to more broadly cover the divide-andconquer technique, and its first two sections apply divide-and-conquer to solve
two problems
...

We removed two chapters that were rarely taught: binomial heaps and sorting
networks
...
The treatment of Fibonacci heaps no longer relies on
binomial heaps as a precursor
...
Dynamic programming now leads off with a more interesting problem, rod cutting,
than the assembly-line scheduling problem from the second edition
...
In our opening example of greedy algorithms, the activity-selection problem, we get to the greedy
algorithm more directly than we did in the second edition
...
In the first two editions, in certain cases, some other node
would be deleted, with its contents moving into the node passed to the deletion
procedure
...

The material on flow networks now bases flows entirely on edges
...

With the material on matrix basics and Strassen’s algorithm moved to other
chapters, the chapter on matrix operations is smaller than in the second edition
...

We corrected several errors
...

Based on many requests, we changed the syntax (as it were) of our pseudocode
...
Likewise, we have eliminated the keywords do and
then and adopted “/ as our comment-to-end-of-line symbol
...
Our pseudocode remains procedural,
rather than object-oriented
...

We added 100 new exercises and 28 new problems
...

Finally, we went through the entire book and rewrote sentences, paragraphs,
and sections to make the writing clearer and more active
...
mit
...
The Web site links to a list of
known errors, solutions to selected exercises and problems, and (of course) a list
explaining the corny professor jokes, as well as other content that we might add
...

How we produced this book
A
Like the second edition, the third edition was produced in LTEX 2"
...
We thank
Michael Spivak from Publish or Perish, Inc
...
, and Tim Tregubov from Dartmouth College for technical support
...
The PDF files for this
book were created on a MacBook running OS 10
...

We drew the illustrations for the third edition using MacDraw Pro, with some
of the mathematical expressions in illustrations laid in with the psfrag package
A
for LTEX 2"
...
Happily, we still have a couple of Macintoshes
that can run the Classic environment under OS 10
...
Even under the Classic environment, we find MacDraw Pro to
be far easier to use than any other drawing software for the types of illustrations
that accompany computer-science text, and it produces beautiful output
...

We were geographically distributed while producing the third edition, working
in the Dartmouth College Department of Computer Science, the MIT Computer

1 We investigated several drawing programs that run under Mac OS X, but all had significant shortcomings compared with MacDraw Pro
...
We found that it took at least five times as long
to produce each illustration as it took with MacDraw Pro, and the resulting illustrations did not look
as good
...


Preface

xix

Science and Artificial Intelligence Laboratory, and the Columbia University Department of Industrial Engineering and Operations Research
...

Julie Sussman, P
...
A
...
Time
and again, we were amazed at the errors that eluded us, but that Julie caught
...
If there is a Hall of Fame
for technical copyeditors, Julie is a sure-fire, first-ballot inductee
...
Thank you, thank you, thank you, Julie! Priya Natarajan also
found some errors that we were able to correct before this book went to press
...

The treatment for van Emde Boas trees derives from Erik Demaine’s notes,
which were in turn influenced by Michael Bender
...

The chapter on multithreading was based on notes originally written jointly with
Harald Prokop
...
The design of the
multithreaded pseudocode took its inspiration from the MIT Cilk extensions to C
and by Cilk Arts’s Cilk++ extensions to C++
...
We corrected all the
bona fide errors that were reported, and we incorporated as many suggestions as
we could
...

Finally, we thank our wives—Nicole Cormen, Wendy Leiserson, Gail Rivest,
and Rebecca Ivry—and our children—Ricky, Will, Debby, and Katie Leiserson;
Alex and Christopher Rivest; and Molly, Noah, and Benjamin Stein—for their love
and support while we prepared this book
...
We affectionately dedicate this book to them
...
C ORMEN
C HARLES E
...
R IVEST
C LIFFORD S TEIN
February 2009

Lebanon, New Hampshire
Cambridge, Massachusetts
Cambridge, Massachusetts
New York, New York

Introduction to Algorithms
Third Edition

I

Foundations

Introduction
This part will start you thinking about designing and analyzing algorithms
...
Later parts of this book will build upon this base
...
This chapter defines what an algorithm is and lists some examples
...

In Chapter 2, we see our first algorithms, which solve the problem of sorting
a sequence of n numbers
...
The sorting algorithms we examine are insertion sort,
which uses an incremental approach, and merge sort, which uses a recursive technique known as “divide-and-conquer
...
We
determine these running times in Chapter 2, and we develop a useful notation to
express them
...
It
starts by defining several asymptotic notations, which we use for bounding algorithm running times from above and/or below
...


4

Part I Foundations

Chapter 4 delves further into the divide-and-conquer method introduced in
Chapter 2
...
Chapter 4 contains methods for solving recurrences, which are useful for describing
the running times of recursive algorithms
...
Although much of Chapter 4 is devoted to proving the correctness of the master method, you may skip this proof yet still employ the master
method
...
We typically use probabilistic analysis to determine the running time of an algorithm in
cases in which, due to the presence of an inherent probability distribution, the
running time may differ on different inputs of the same size
...
In other cases, the probability
distribution comes not from the inputs but from random choices made during the
course of the algorithm
...
We can use randomized algorithms to enforce a probability distribution
on the inputs—thereby ensuring that no particular input always causes poor performance—or even to bound the error rate of algorithms that are allowed to produce
incorrect results on a limited basis
...
You are likely to have seen much of the material in the
appendix chapters before having read this book (although the specific definitions
and notational conventions we use may differ in some cases from what you have
seen in the past), and so you should think of the Appendices as reference material
...
All the chapters in Part I and the Appendices are written with a tutorial
flavor
...


1
...
An algorithm is thus a sequence of computational steps that transform the
input into the output
...
The statement of the problem specifies in general terms the desired
input/output relationship
...

For example, we might need to sort a sequence of numbers into nondecreasing
order
...
Here is how we
formally define the sorting problem:
Input: A sequence of n numbers ha1 ; a2 ; : : : ; an i
...


For example, given the input sequence h31; 41; 59; 26; 41; 58i, a sorting algorithm
returns as output the sequence h26; 31; 41; 41; 58; 59i
...
In general, an instance of a problem
consists of the input (satisfying whatever constraints are imposed in the problem
statement) needed to compute a solution to the problem
...
As a result, we have a large number of good sorting
algorithms at our disposal
...

An algorithm is said to be correct if, for every input instance, it halts with the
correct output
...
An incorrect algorithm might not halt at all on some input instances, or it
might halt with an incorrect answer
...
We shall see
an example of an algorithm with a controllable error rate in Chapter 31 when we
study algorithms for finding large prime numbers
...

An algorithm can be specified in English, as a computer program, or even as
a hardware design
...

What kinds of problems are solved by algorithms?
Sorting is by no means the only computational problem for which algorithms have
been developed
...
) Practical applications of algorithms are ubiquitous and include the following examples:
The Human Genome Project has made great progress toward the goals of identifying all the 100,000 genes in human DNA, determining the sequences of the
3 billion chemical base pairs that make up human DNA, storing this information in databases, and developing tools for data analysis
...
Although the solutions to the various problems involved are beyond the scope of this book, many methods to solve these
biological problems use ideas from several of the chapters in this book, thereby
enabling scientists to accomplish tasks while using resources efficiently
...

The Internet enables people all around the world to quickly access and retrieve
large amounts of information
...
Examples
of problems that make essential use of algorithms include finding good routes
on which the data will travel (techniques for solving such problems appear in

1
...

Electronic commerce enables goods and services to be negotiated and exchanged electronically, and it depends on the privacy of personal information such as credit card numbers, passwords, and bank statements
...

Manufacturing and other commercial enterprises often need to allocate scarce
resources in the most beneficial way
...
A political candidate
may want to determine where to spend money buying campaign advertising in
order to maximize the chances of winning an election
...
An Internet service provider may wish to determine where to place
additional resources in order to serve its customers more effectively
...

Although some of the details of these examples are beyond the scope of this
book, we do give underlying techniques that apply to these problems and problem
areas
...
The number of possible routes can be huge, even if we
disallow routes that cross over themselves
...
We shall see how to solve this problem efficiently in Chapter 24
...
A subsequence of X is just X with some (or possibly all or none) of
its elements removed
...
The length of a longest common subsequence of X
and Y gives one measure of how similar these two sequences are
...
If X has m symbols
and Y has n symbols, then X and Y have 2m and 2n possible subsequences,

8

Chapter 1 The Role of Algorithms in Computing

respectively
...

We shall see in Chapter 15 how to use a general technique known as dynamic
programming to solve this problem much more efficiently
...
If the design comprises n
parts, then there are nŠ possible orders, where nŠ denotes the factorial function
...
This problem is an instance of topological sorting, and we shall see
in Chapter 22 how to solve this problem efficiently
...
The convex hull is the smallest convex polygon containing the
points
...
The convex hull would be represented by a tight
rubber band that surrounds all the nails
...
(See Figure 33
...
) Any of the 2n subsets of the points might be the vertices
of the convex hull
...
There are many choices, therefore, for the vertices of the convex hull
...

These lists are far from exhaustive (as you again have probably surmised from
this book’s heft), but exhibit two characteristics that are common to many interesting algorithmic problems:
1
...
Finding one that does, or one that is “best,” can
present quite a challenge
...
They have practical applications
...
A transportation firm, such as a
trucking or railroad company, has a financial interest in finding shortest paths
through a road or rail network because taking shorter paths results in lower
labor and fuel costs
...
Or a
person wishing to drive from New York to Boston may want to find driving
directions from an appropriate Web site, or she may use her GPS while driving
...
1 Algorithms

9

Not every problem solved by algorithms has an easily identified set of candidate
solutions
...
The discrete Fourier transform converts the time domain to the frequency domain, producing a set of numerical coefficients, so that we can determine
the strength of various frequencies in the sampled signal
...
Chapter 30 gives
an efficient algorithm, the fast Fourier transform (commonly called the FFT), for
this problem, and the chapter also sketches out the design of a hardware circuit to
compute the FFT
...
A data structure is a way to store
and organize data in order to facilitate access and modifications
...

Technique
Although you can use this book as a “cookbook” for algorithms, you may someday
encounter a problem for which you cannot readily find a published algorithm (many
of the exercises and problems in this book, for example)
...

Different chapters address different aspects of algorithmic problem solving
...
Other chapters address techniques,
such as divide-and-conquer in Chapter 4, dynamic programming in Chapter 15,
and amortized analysis in Chapter 17
...
Our usual measure of efficiency
is speed, i
...
, how long an algorithm takes to produce its result
...
Chapter 34 studies
an interesting subset of these problems, which are known as NP-complete
...
In other words, no one knows
whether or not efficient algorithms exist for NP-complete problems
...
This
relationship among the NP-complete problems makes the lack of efficient solutions
all the more tantalizing
...
Computer
scientists are intrigued by how a small change to the problem statement can cause
a big change to the efficiency of the best known algorithm
...
If you are called upon to produce an efficient
algorithm for an NP-complete problem, you are likely to spend a lot of time in a
fruitless search
...

As a concrete example, consider a delivery company with a central depot
...
At the end of the day, each truck must end up back at the depot
so that it is ready to be loaded for the next day
...
This problem is the well-known “traveling-salesman problem,” and
it is NP-complete
...
Under certain assumptions,
however, we know of efficient algorithms that give an overall distance which is
not too far above the smallest possible
...

Parallelism
For many years, we could count on processor clock speeds increasing at a steady
rate
...
In order
to perform more computations per second, therefore, chips are being designed to
contain not just one but several processing “cores
...
” In order to elicit the best performance from multicore
computers, we need to design algorithms with parallelism in mind
...
This model has advantages from a theoretical standpoint, and it forms the
basis of several successful computer programs, including a championship chess
program
...
2 Algorithms as a technology

11

Exercises
1
...

1
...
1-3
Select a data structure that you have seen previously, and discuss its strengths and
limitations
...
1-4
How are the shortest-path and traveling-salesman problems given above similar?
How are they different?
1
...
Then
come up with one in which a solution that is “approximately” the best is good
enough
...
2 Algorithms as a technology
Suppose computers were infinitely fast and computer memory was free
...

If computers were infinitely fast, any correct method for solving a problem
would do
...

Of course, computers may be fast, but they are not infinitely fast
...
Computing time is therefore a bounded
resource, and so is space in memory
...


12

Chapter 1 The Role of Algorithms in Computing

Efficiency
Different algorithms devised to solve the same problem often differ dramatically in
their efficiency
...

As an example, in Chapter 2, we will see two algorithms for sorting
...
That is, it takes time roughly proportional
to n2
...
Insertion sort typically has a smaller constant factor than merge sort, so that c1 < c2
...
Let’s write insertion sort’s running
time as c1 n n and merge sort’s running time as c2 n lg n
...
(For example, when n D 1000, lg n is approximately 10,
and when n equals one million, lg n is approximately only 20
...
n will more than compensate for the difference in constant factors
...

For a concrete example, let us pit a faster computer (computer A) running insertion sort against a slower computer (computer B) running merge sort
...
(Although 10 million numbers might
seem like a lot, if the numbers are eight-byte integers, then the input occupies
about 80 megabytes, which fits in the memory of even an inexpensive laptop computer many times over
...
To make
the difference even more dramatic, suppose that the world’s craftiest programmer
codes insertion sort in machine language for computer A, and the resulting code
requires 2n2 instructions to sort n numbers
...
To sort 10 million
numbers, computer A takes
2
...
5 hours) ;
1010 instructions/second
while computer B takes

1
...

In general, as the problem size increases, so does the relative advantage of merge
sort
...
Total system performance depends on choosing efficient
algorithms as much as on choosing fast hardware
...

You might wonder whether algorithms are truly that important on contemporary
computers in light of other advanced technologies, such as
advanced computer architectures and fabrication technologies,
easy-to-use, intuitive, graphical user interfaces (GUIs),
object-oriented systems,
integrated Web technologies, and
fast networking, both wired and wireless
...
Although some applications do not explicitly require algorithmic content at the application level (such as some simple, Web-based applications),
many do
...
Its implementation would rely on fast hardware, a
graphical user interface, wide-area networking, and also possibly on object orientation
...

Moreover, even an application that does not require algorithmic content at the
application level relies heavily upon algorithms
...
Does the application rely on
graphical user interfaces? The design of any GUI relies on algorithms
...

Was the application written in a language other than machine code? Then it was
processed by a compiler, interpreter, or assembler, all of which make extensive use

14

Chapter 1 The Role of Algorithms in Computing

of algorithms
...

Furthermore, with the ever-increasing capacities of computers, we use them to
solve larger problems than ever before
...

Having a solid base of algorithmic knowledge and technique is one characteristic
that separates the truly skilled programmers from the novices
...

Exercises
1
...

1
...
For inputs of size n, insertion sort runs in 8n2 steps, while merge
sort runs in 64n lg n steps
...
2-3
What is the smallest value of n such that an algorithm whose running time is 100n2
runs faster than an algorithm whose running time is 2n on the same machine?

Problems
1-1 Comparison of running times
For each function f
...
n/ microseconds
...
Some of the more
practical aspects of algorithm design are discussed by Bentley [42, 43] and Gonnet
[145]
...
Overviews of the algorithms used in computational
biology can be found in textbooks by Gusfield [156], Pevzner [275], Setubal and
Meidanis [310], and Waterman [350]
...
It is self-contained, but
it does include several references to material that we introduce in Chapters 3 and 4
...
)
We begin by examining the insertion sort algorithm to solve the sorting problem
introduced in Chapter 1
...
Having specified the insertion sort algorithm, we then argue that it
correctly sorts, and we analyze its running time
...

Following our discussion of insertion sort, we introduce the divide-and-conquer
approach to the design of algorithms and use it to develop an algorithm called
merge sort
...


2
...

0
0
0
Output: A permutation (reordering) ha1 ; a2 ; : : : ; an i of the input sequence such
0
0
0
Ä an
...
Although conceptually we are sorting a sequence, the input comes to us in the form of an array with n
elements
...
If
you have been introduced to any of these languages, you should have little trouble

2
...
1 Sorting a hand of cards using insertion sort
...
What separates pseudocode from “real” code is that in
pseudocode, we employ whatever expressive method is most clear and concise to
specify a given algorithm
...
Another difference between pseudocode and real code
is that pseudocode is not typically concerned with issues of software engineering
...

We start with insertion sort, which is an efficient algorithm for sorting a small
number of elements
...
We start with an empty left hand and the cards face down on the
table
...
To find the correct position for a card, we compare
it with each of the cards already in the hand, from right to left, as illustrated in
Figure 2
...
At all times, the cards held in the left hand are sorted, and these cards
were originally the top cards of the pile on the table
...
(In the code, the number n of elements in A is denoted
by A:length
...
The input array A contains the sorted output sequence when
the I NSERTION -S ORT procedure is finished
...
2 The operation of I NSERTION -S ORT on the array A D h5; 2; 4; 6; 1; 3i
...

(a)–(e) The iterations of the for loop of lines 1–8
...
Shaded arrows show array values moved one position to the right in line 6, and black arrows
indicate where the key moves to in line 8
...


I NSERTION -S ORT
...


Loop invariants and the correctness of insertion sort
Figure 2
...
The index j indicates the “current card” being inserted into the hand
...
In fact,
elements AŒ1 : : j 1 are the elements originally in positions 1 through j 1, but
now in sorted order
...

We use loop invariants to help us understand why an algorithm is correct
...
1 Insertion sort

19

Initialization: It is true prior to the first iteration of the loop
...

Termination: When the loop terminates, the invariant gives us a useful property
that helps show that the algorithm is correct
...
(Of course, we are free to use established facts other than the loop
invariant itself to prove that the loop invariant remains true before each iteration
...
Here, showing that the invariant holds
before the first iteration corresponds to the base case, and showing that the invariant
holds from iteration to iteration corresponds to the inductive step
...
Typically, we use the loop invariant along with the
condition that caused the loop to terminate
...

Let us see how these properties hold for insertion sort
...
1 The subarray AŒ1 : : j 1, therefore, consists
of just the single element AŒ1, which is in fact the original element in AŒ1
...

Maintenance: Next, we tackle the second property: showing that each iteration
maintains the loop invariant
...
The subarray AŒ1 : : j  then consists of the elements
originally in AŒ1 : : j , but in sorted order
...

A more formal treatment of the second property would require us to state and
show a loop invariant for the while loop of lines 5–7
...
In the case of I NSERTION -S ORT , this time is after assigning 2 to the
variable j but before the first test of whether j Ä A: length
...

Termination: Finally, we examine what happens when the loop terminates
...
Because
each loop iteration increases j by 1, we must have j D n C 1 at that time
...
Observing that the subarray AŒ1 : : n is the entire array, we conclude that
the entire array is sorted
...

We shall use this method of loop invariants to show correctness later in this
chapter and in other chapters as well
...

Indentation indicates block structure
...
Our indentation style applies to
if-else statements2 as well
...
3
The looping constructs while, for, and repeat-until and the if-else conditional
construct have interpretations similar to those in C, C++, Java, Python, and
Pascal
...
Thus, immediately
after a for loop, the loop counter’s value is the value that first exceeded the for
loop bound
...
The for loop header in line 1 is for j D 2 to A:length, and so when
this loop terminates, j D A:length C 1 (or, equivalently, j D n C 1, since
n D A:length)
...
Although we omit the
keyword then, we occasionally refer to the portion executed when the test following if is true as a
then clause
...

3 Each

pseudocode procedure in this book appears on one page so that you will not have to discern
levels of indentation in code that is split across pages
...

Python lacks repeat-until loops, and its for loops operate a little differently from the for loops in
this book
...
1 Insertion sort

21

counter in each iteration, and we use the keyword downto when a for loop
decrements its loop counter
...

The symbol “/ indicates that the remainder of the line is a comment
...

Variables (such as i, j , and key) are local to the given procedure
...

We access array elements by specifying the array name followed by the index in square brackets
...
The notation “: :” is used to indicate a range of values within an array
...

We typically organize compound data into objects, which are composed of
attributes
...
For example, we treat an array as an object
with the attribute length indicating how many elements it contains
...

We treat a variable representing an array or object as a pointer to the data representing the array or object
...
Moreover, if we now set x:f D 3, then afterward not
only does x:f equal 3, but y:f equals 3 as well
...

Our attribute notation can “cascade
...
Then the notation
x:f :g is implicitly parenthesized as
...
In other words, if we had assigned
y D x:f , then x:f :g is the same as y:g
...
In this case, we give it the
special value NIL
...
When objects are passed, the pointer to
the data representing the object is copied, but the object’s attributes are not
...
The assignment
x:f D 3, however, is visible
...

A return statement immediately transfers control back to the point of call in
the calling procedure
...
Our pseudocode differs from many programming languages in that
we allow multiple values to be returned in a single return statement
...
That is, when we
evaluate the expression “x and y” we first evaluate x
...

If, on the other hand, x evaluates to TRUE, we must evaluate y to determine the
value of the entire expression
...
Short-circuiting operators
allow us to write boolean expressions such as “x ¤ NIL and x:f D y” without
worrying about what happens when we try to evaluate x:f when x is NIL
...
The calling procedure is responsible for handling the error, and so we do not specify what action to take
...
1-1
Using Figure 2
...

2
...

2
...

Output: An index i such that
appear in A
...
Using a loop invariant, prove that your algorithm is correct
...

2
...
The sum of the two integers should be stored in binary form in

2
...
n C 1/-element array C
...


2
...
Occasionally, resources such as memory, communication bandwidth, or computer hardware are of primary concern, but most often it is computational time that we want to measure
...
Such analysis may
indicate more than one viable candidate, but we can often discard several inferior
algorithms in the process
...
For most of this book, we shall assume a generic oneprocessor, random-access machine (RAM) model of computation as our implementation technology and understand that our algorithms will be implemented as
computer programs
...

Strictly speaking, we should precisely define the instructions of the RAM model
and their costs
...
Yet we must be careful not to abuse the RAM
model
...
Such a RAM would be unrealistic, since real computers
do not have such instructions
...
The RAM model contains instructions commonly found in real computers:
arithmetic (such as add, subtract, multiply, divide, remainder, floor, ceiling), data
movement (load, store, copy), and control (conditional and unconditional branch,
subroutine call and return)
...

The data types in the RAM model are integer and floating point (for storing real
numbers)
...
We also assume a limit on the size
of each word of data
...

We require c 1 so that each word can hold the value of n, enabling us to index the
individual input elements, and we restrict c to be a constant so that the word size
does not grow arbitrarily
...
)

24

Chapter 2 Getting Started

Real computers contain instructions not listed above, and such instructions represent a gray area in the RAM model
...
In restricted situations, however, exponentiation is
a constant-time operation
...
In most
computers, shifting the bits of an integer by one position to the left is equivalent
to multiplication by 2, so that shifting the bits by k positions to the left is equivalent to multiplication by 2k
...
We will endeavor to
avoid such gray areas in the RAM model, but we will treat computation of 2k as a
constant-time operation when k is a small enough positive integer
...
That is, we do not model caches or virtual
memory
...
A
handful of problems in this book examine memory-hierarchy effects, but for the
most part, the analyses in this book will not consider them
...
Moreover, RAM-model analyses are usually
excellent predictors of performance on actual machines
...
The
mathematical tools required may include combinatorics, probability theory, algebraic dexterity, and the ability to identify the most significant terms in a formula
...

Even though we typically select only one machine model to analyze a given algorithm, we still face many choices in deciding how to express our analysis
...

Analysis of insertion sort
The time taken by the I NSERTION -S ORT procedure depends on the input: sorting a
thousand numbers takes longer than sorting three numbers
...
In general, the time taken
by an algorithm grows with the size of the input, so it is traditional to describe the
running time of a program as a function of the size of its input
...


2
...
For many
problems, such as sorting or computing discrete Fourier transforms, the most natural measure is the number of items in the input—for example, the array size n
for sorting
...
Sometimes, it is more appropriate to describe the size of
the input with two numbers rather than one
...
We shall indicate which input size measure is being used with
each problem we study
...
It is convenient to define the notion of step so
that it is as machine-independent as possible
...
A constant amount of time is required to execute each line of our
pseudocode
...
This viewpoint is in keeping with the RAM model, and it also reflects
how the pseudocode would be implemented on most actual computers
...
This
simpler notation will also make it easy to determine whether one algorithm is more
efficient than another
...
For each
j D 2; 3; : : : ; n, where n D A:length, we let tj denote the number of times the
while loop test in line 5 is executed for that value of j
...
e
...
We assume that comments are not executable
statements, and so they take no time
...
Computational steps that we specify in English are often variants
of a procedure that requires more than just a constant amount of time
...
Also, note that a statement that calls a subroutine takes constant time,
though the subroutine, once invoked, may take more
...
—from the process of executing the subroutine
...
A/
1 for j D 2 to A:length
2
key D AŒj 
3
/ Insert AŒj  into the sorted
/
sequence AŒ1 : : j 1
...
t
Pj D2 j
n

...
6 To compute T
...
n/ D c1 n C c2
...
n

1/ C c5

n
X
j D2

C c7

n
X


...
n

tj C c6

n
X

...
For example, in I NSERTION -S ORT, the best
case occurs if the array is already sorted
...
Thus tj D 1 for
j D 2; 3; : : : ; n, and the best-case running time is
T
...
n 1/ C c4
...
n 1/ C c8
...
c1 C c2 C c4 C c5 C c8 /n
...

If the array is in reverse sorted order—that is, in decreasing order—the worst
case results
...
Noting that

6 This characteristic does not necessarily hold for a resource such as memory
...


2
...
n C 1/
2

27

1

and
n
X

...
n

1/
2

(see Appendix A for a review of how to solve these summations), we find that in
the worst case, the running time of I NSERTION -S ORT is
Ã
Â
n
...
n/ D c1 n C c2
...
n 1/ C c5
2
Ã
Ã
Â
Â
n
...
n 1/
C c7
C c8
...
c2 C c4 C c5 C c8 / :
We can express this worst-case running time as an2 C bn C c for constants a, b,
and c that again depend on the statement costs ci ; it is thus a quadratic function
of n
...

Worst-case and average-case analysis
In our analysis of insertion sort, we looked at both the best case, in which the input
array was already sorted, and the worst case, in which the input array was reverse
sorted
...
We give three reasons for this orientation
...
Knowing it provides a guarantee that the algorithm
will never take any longer
...

For some algorithms, the worst case occurs fairly often
...

In some applications, searches for absent information may be frequent
...
Suppose that we
randomly choose n numbers and apply insertion sort
...
On average, therefore, we check half of the subarray AŒ1 : : j 1, and
so tj is about j=2
...

In some particular cases, we shall be interested in the average-case running time
of an algorithm; we shall see the technique of probabilistic analysis applied to
various algorithms throughout this book
...
Often, we shall assume that all inputs of a given size are
equally likely
...
We explore randomized algorithms
more in Chapter 5 and in several other subsequent chapters
...
First, we ignored the actual cost of each statement, using the
constants ci to represent these costs
...
We thus ignored not only the actual statement costs, but also the abstract
costs ci
...
We therefore consider only the leading term of a formula (e
...
, an2 ), since the lower-order terms are
relatively insignificant for large values of n
...
For insertion sort, when
we ignore the lower-order terms and the leading term’s constant coefficient, we are
left with the factor of n2 from the leading term
...
n2 / (pronounced “theta of n-squared”)
...

We usually consider one algorithm to be more efficient than another if its worstcase running time has a lower order of growth
...
3 Designing algorithms

29

order of growth
...
n2 / algorithm, for example, will
run more quickly in the worst case than a ‚
...

Exercises
2
...


2
...
Then find the second smallest
element of A, and exchange it with AŒ2
...
Write pseudocode for this algorithm, which is known as selection
sort
...

2
...
1-3)
...

2
...
3 Designing algorithms
We can choose from a wide range of algorithm design techniques
...

In this section, we examine an alternative design approach, known as “divideand-conquer,” which we shall explore in more detail in Chapter 4
...
One advantage of divide-and-conquer algorithms is
that their running times are often easily determined using techniques that we will
see in Chapter 4
...
3
...
These algorithms typically follow a divide-and-conquer approach: they
break the problem into several subproblems that are similar to the original problem but smaller in size, solve the subproblems recursively, and then combine these
solutions to create a solution to the original problem
...

Conquer the subproblems by solving them recursively
...

Combine the solutions to the subproblems into the solution for the original problem
...
Intuitively, it operates as follows
...

Conquer: Sort the two subsequences recursively using merge sort
...

The recursion “bottoms out” when the sequence to be sorted has length 1, in which
case there is no work to be done, since every sequence of length 1 is already in
sorted order
...
We merge by calling an auxiliary procedure
M ERGE
...
The procedure assumes that the subarrays AŒp : : q and
AŒq C 1 : : r are in sorted order
...

Our M ERGE procedure takes time ‚
...
Returning to our cardplaying motif, suppose we have two piles of cards face up on a table
...
We wish to merge the two piles into a single
sorted output pile, which is to be face down on the table
...
3 Designing algorithms

31

the output pile
...

Computationally, each basic step takes constant time, since we are comparing just
the two top cards
...
n/
time
...

We place on the bottom of each pile a sentinel card, which contains a special value
that we use to simplify our code
...
But once that happens, all the nonsentinel cards
have already been placed onto the output pile
...

M ERGE
...
Line 1 computes the length n1
of the subarray AŒp : : q, and line 2 computes the length n2 of the subarray
AŒq C 1 : : r
...
The for loop of lines 4–5 copies the subarray AŒp : : q into LŒ1 : : n1 ,
and the for loop of lines 6–7 copies the subarray AŒq C 1 : : r into RŒ1 : : n2 
...
Lines 10–17, illus-

32

Chapter 2 Getting Started

8

9

A … 2
k

10 11 12 13 14 15 16 17

4

5

7

1

2

3

8

6 …

1

L

9

A … 1

2

3

4

5

1

2

3

4

2
i

4

5

7 ∞

R 1
j

2

3

6 ∞

5

10 11 12 13 14 15 16 17

4
k

5

8

9

1

L

2

3

2

4
i

5

5
k

4

5

7 ∞

7

1

2

3

6 …

2

3

4

5

1

2

3

4

2
i

4

5

7 ∞

R 1

2
j

3

6 ∞

5

(b)

10 11 12 13 14 15 16 17

2

1

1

L

(a)

A … 1

7

2

3

8

6 …

1

2

3

R 1

2
j

3

4

9

A … 1
5

6 ∞

(c)

1

L

2

3

2

4
i

5

10 11 12 13 14 15 16 17

2

2

4

5

7 ∞

7
k

1

2

3

6 …

1

2

3

R 1

2

3
j

4

5

6 ∞

(d)

Figure 2
...
A; 9; 12; 16/, when the subarray
AŒ9 : : 16 contains the sequence h2; 4; 5; 7; 1; 2; 3; 6i
...
Lightly shaded positions
in A contain their final values, and lightly shaded positions in L and R contain values that have yet
to be copied back into A
...
Heavily shaded positions in A contain values
that will be copied over, and heavily shaded positions in L and R contain values that have already
been copied back into A
...


trated in Figure 2
...
Moreover, LŒi and RŒj  are the smallest
elements of their arrays that have not been copied back into A
...

Initialization: Prior to the first iteration of the loop, we have k D p, so that the
subarray AŒp : : k 1 is empty
...


2
...
3, continued (i) The arrays and indices at termination
...


Maintenance: To see that each iteration maintains the loop invariant, let us first
suppose that LŒi Ä RŒj 
...
Because AŒp : : k 1 contains the k p smallest elements, after
line 14 copies LŒi into AŒk, the subarray AŒp : : k will contain the k p C 1
smallest elements
...
If instead LŒi > RŒj ,
then lines 16–17 perform the appropriate action to maintain the loop invariant
...
By the loop invariant, the subarray
AŒp : : k 1, which is AŒp : : r, contains the k p D r p C 1 smallest
elements of LŒ1 : : n1 C 1 and RŒ1 : : n2 C 1, in sorted order
...
All but the two
largest have been copied back into A, and these two largest elements are the
sentinels
...
n/ time, where n D r p C 1,
observe that each of lines 1–3 and 8–11 takes constant time, the for loops of
lines 4–7 take ‚
...
n/ time,7 and there are n iterations of the for
loop of lines 12–17, each of which takes constant time
...
The procedure M ERGE -S ORT
...
If p
r, the subarray has at most one element and is therefore
already sorted
...
8
M ERGE -S ORT
...
p C r/=2c
3
M ERGE -S ORT
...
A; q C 1; r/
5
M ERGE
...
A; 1; A:length/, where once again A:length D n
...
4 illustrates the operation of the procedure bottom-up when n is a power of 2
...

2
...
2

Analyzing divide-and-conquer algorithms

When an algorithm contains a recursive call to itself, we can often describe its
running time by a recurrence equation or recurrence, which describes the overall
running time on a problem of size n in terms of the running time on smaller inputs
...


7 We

shall see in Chapter 3 how to formally interpret equations containing ‚-notation
...
These notations are defined in Chapter 3
...
p C r/=2c yields subarrays AŒp : : q and AŒq C 1 : : r of sizes dn=2e and bn=2c,
respectively, is to examine the four cases that arise depending on whether each of p and r is odd or
even
...
3 Designing algorithms

35

sorted sequence
1

2

2

3

4

5

6

7

1

2

3

merge
2

4

5

7

merge
2

merge

5

4

merge
5

7

1

merge
2

6

4

3

2

merge
7

1

6
merge

3

2

6

initial sequence

Figure 2
...
The lengths of the
sorted sequences being merged increase as the algorithm progresses from bottom to top
...
As before, we let T
...
If the problem size is small enough, say n Ä c
for some constant c, the straightforward solution takes constant time, which we
write as ‚
...
Suppose that our division of the problem yields a subproblems,
each of which is 1=b the size of the original
...
) It takes
time T
...
n=b/
to solve a of them
...
n/ time to divide the problem into subproblems
and C
...
1/
if n Ä c ;
T
...
n=b/ C D
...
n/ otherwise :
In Chapter 4, we shall see how to solve common recurrences of this form
...
Each divide step then yields two subsequences of size exactly n=2
...

We reason as follows to set up the recurrence for T
...
Merge sort on just one element takes constant
time
...

Divide: The divide step just computes the middle of the subarray, which takes
constant time
...
n/ D ‚
...

Conquer: We recursively solve two subproblems, each of size n=2, which contributes 2T
...

Combine: We have already noted that the M ERGE procedure on an n-element
subarray takes time ‚
...
n/ D ‚
...

When we add the functions D
...
n/ for the merge sort analysis, we are
adding a function that is ‚
...
1/
...
n/
...
n=2/ term from the “conquer”
step gives the recurrence for the worst-case running time T
...
1/
if n D 1 ;
T
...
1)
2T
...
n/ if n > 1 :
In Chapter 4, we shall see the “master theorem,” which we can use to show
that T
...
n lg n/, where lg n stands for log2 n
...
n lg n/ running time, outperforms insertion sort, whose running
time is ‚
...

We do not need the master theorem to intuitively understand why the solution to
the recurrence (2
...
n/ D ‚
...
Let us rewrite recurrence (2
...
n/ D
(2
...
n=2/ C cn if n > 1 ;
where the constant c represents the time required to solve problems of size 1 as
well as the time per array element of the divide and combine steps
...
We can get around this problem by
letting c be the larger of these times and understanding that our recurrence gives an upper bound on
the running time, or by letting c be the lesser of these times and understanding that our recurrence
gives a lower bound on the running time
...
n lg n/ running time
...
3 Designing algorithms

37

Figure 2
...
2)
...
Part (a) of the figure shows T
...
The cn term
is the root (the cost incurred at the top level of recursion), and the two subtrees of
the root are the two smaller recurrences T
...
Part (c) shows this process carried
one step further by expanding T
...
The cost incurred at each of the two subnodes at the second level of recursion is cn=2
...
Part (d) shows the
resulting recursion tree
...
The top level has total
cost cn, the next level down has total cost c
...
n=2/ D cn, the level after
that has total cost c
...
n=4/ Cc
...
n=4/ D cn, and so on
...
n=2i /, so that
the ith level below the top has total cost 2i c
...
The bottom level has n
nodes, each contributing a cost of c, for a total cost of cn
...
5 is lg n C 1, where
n is the number of leaves, corresponding to the input size
...
The base case occurs when n D 1, in which case the
tree has only one level
...
Now assume as an inductive hypothesis that the number of levels
of a recursion tree with 2i leaves is lg 2i C 1 D i C 1 (since for any value of i,
we have that lg 2i D i)
...
A tree with n D 2i C1 leaves has
one more level than a tree with 2i leaves, and so the total number of levels is

...

To compute the total cost represented by the recurrence (2
...
The recursion tree has lg n C 1 levels, each costing cn,
for a total cost of cn
...
Ignoring the low-order term and
the constant c gives the desired result of ‚
...

Exercises
2
...
4 as a model, illustrate the operation of merge sort on the array
A D h3; 41; 52; 26; 38; 57; 9; 49i
...
3-2
Rewrite the M ERGE procedure so that it does not use sentinels, instead stopping
once either array L or R has had all its elements copied back to A and then copying
the remainder of the other array back into A
...
5 How to construct a recursion tree for the recurrence T
...
n=2/ C cn
...
n/, which progressively expands in (b)–(d) to form the recursion tree
...
e
...
The total cost, therefore, is cn lg n C cn, which is ‚
...


Problems for Chapter 2

39

2
...
n/ D
2T
...
n/ D n lg n
...
3-4
We can express insertion sort as a recursive procedure as follows
...
Write a recurrence for the running time of this recursive version of
insertion sort
...
3-5
Referring back to the searching problem (see Exercise 2
...
The binary search algorithm repeats this procedure, halving the size of the remaining portion of the
sequence each time
...
Argue that the worst-case running time of binary search is ‚
...

2
...
1 uses a linear search to scan (backward) through the sorted subarray
AŒ1 : : j 1
...
3-5) instead to improve
the overall worst-case running time of insertion sort to ‚
...
3-7 ?
Describe a ‚
...


Problems
2-1 Insertion sort on small arrays in merge sort
Although merge sort runs in ‚
...
n2 / worst-case time, the constant factors in insertion sort can make it faster
in practice for small problem sizes on many machines
...
Consider a modification to merge sort in
which n=k sublists of length k are sorted using insertion sort and then merged
using the standard merging mechanism, where k is a value to be determined
...
Show that insertion sort can sort the n=k sublists, each of length k, in ‚
...

b
...
n lg
...

c
...
nk C n lg
...
How should we choose k in practice?
2-2 Correctness of bubblesort
Bubblesort is a popular, but inefficient, sorting algorithm
...

B UBBLESORT
...
Let A0 denote the output of B UBBLESORT
...
To prove that B UBBLESORT is
correct, we need to prove that it terminates and that
A0 Œ1 Ä A0 Œ2 Ä

Ä A0 Œn ;

(2
...
In order to show that B UBBLESORT actually sorts, what
else do we need to prove?
The next two parts will prove inequality (2
...

b
...
Your proof should use the structure of the loop invariant
proof presented in this chapter
...
Using the termination condition of the loop invariant proved in part (b), state
a loop invariant for the for loop in lines 1–4 that will allow you to prove inequality (2
...
Your proof should use the structure of the loop invariant proof
presented in this chapter
...
What is the worst-case running time of bubblesort? How does it compare to the
running time of insertion sort?
2-3 Correctness of Horner’s rule
The following code fragment implements Horner’s rule for evaluating a polynomial
P
...
a1 C x
...
an

1

C xan /

// ;

given the coefficients a0 ; a1 ; : : : ; an and a value for x:
1 y D0
2 for i D n downto 0
3
y D ai C x y
a
...
Write pseudocode to implement the naive polynomial-evaluation algorithm that
computes each term of the polynomial from scratch
...
Consider the following loop invariant:
At the start of each iteration of the for loop of lines 2–3,
X

n
...
Following the structure of
the loop invariant proof presented in this chapter, use this loop invariant to show
Pn
that, at termination, y D kD0 ak x k
...
Conclude by arguing that the given code fragment correctly evaluates a polynomial characterized by the coefficients a0 ; a1 ; : : : ; an
...
If i < j and AŒi > AŒj , then the
pair
...

a
...


42

Chapter 2 Getting Started

b
...
What is the relationship between the running time of insertion sort and the
number of inversions in the input array? Justify your answer
...
Give an algorithm that determines the number of inversions in any permutation
on n elements in ‚
...
(Hint: Modify merge sort
...
The first volume ushered in the modern
study of computer algorithms with a focus on the analysis of running time, and the
full series remains an engaging and worthwhile reference for many of the topics
presented here
...

a
ı
Aho, Hopcroft, and Ullman [5] advocated the asymptotic analysis of algorithms—using notations that Chapter 3 introduces, including ‚-notation—as a
means of comparing relative performance
...

Knuth [211] provides an encyclopedic treatment of many sorting algorithms
...
Knuth’s discussion of insertion
sort encompasses several variations of the algorithm
...
L
...

Merge sort is also described by Knuth
...
J
...

The early history of proving programs correct is described by Gries [153], who
credits P
...
Gries attributes loop invariants to
R
...
Floyd
...


3

Growth of Functions

The order of growth of the running time of an algorithm, defined in Chapter 2,
gives a simple characterization of the algorithm’s efficiency and also allows us to
compare the relative performance of alternative algorithms
...
n lg n/ worst-case running time,
beats insertion sort, whose worst-case running time is ‚
...
Although we can
sometimes determine the exact running time of an algorithm, as we did for insertion
sort in Chapter 2, the extra precision is not usually worth the effort of computing
it
...

When we look at input sizes large enough to make only the order of growth of
the running time relevant, we are studying the asymptotic efficiency of algorithms
...

Usually, an algorithm that is asymptotically more efficient will be the best choice
for all but very small inputs
...
The next section begins by defining several types of “asymptotic notation,” of which we have already seen an example in ‚-notation
...


3
...
Such notations are convenient for describing the worst-case
running-time function T
...

We sometimes find it convenient, however, to abuse asymptotic notation in a va-

44

Chapter 3 Growth of Functions

riety of ways
...
We should
make sure, however, to understand the precise meaning of the notation so that when
we abuse, we do not misuse it
...

Asymptotic notation, functions, and running times
We will use asymptotic notation primarily to describe the running times of algorithms, as when we wrote that insertion sort’s worst-case running time is ‚
...

Asymptotic notation actually applies to functions, however
...
By writing that insertion sort’s running time is ‚
...
Because asymptotic notation applies to functions, what we were writing as ‚
...

In this book, the functions to which we apply asymptotic notation will usually
characterize the running times of algorithms
...

Even when we use asymptotic notation to apply to the running time of an algorithm, we need to understand which running time we mean
...
Often, however, we wish to characterize
the running time no matter what the input
...
We shall see
asymptotic notations that are well suited to characterizing running times no matter
what the input
...
n/ D ‚
...
Let us define what this notation means
...
n/,
we denote by ‚
...
n// the set of functions

...
n// D ff
...
n/ Ä f
...
n/ for all n n0 g :1

1 Within

set notation, a colon means “such that
...
1 Asymptotic notation

45

c2 g
...
n/
f
...
n/

f
...
n/

c1 g
...
n/ D ‚
...
n//
(a)

n0

n
f
...
g
...
n/ D


...
n//

(c)

Figure 3
...
In each part, the value of n0 shown
is the minimum possible value; any greater value would also work
...
We write f
...
g
...
n/ always lies between c1 g
...
n/
inclusive
...
We write
f
...
g
...
n/ always lies on or below cg
...
(c) -notation gives a lower bound for a function to within
a constant factor
...
n/ D
...
n// if there are positive constants n0 and c such that at and
to the right of n0 , the value of f
...
n/
...
n/ belongs to the set ‚
...
n// if there exist positive constants c1
and c2 such that it can be “sandwiched” between c1 g
...
n/, for sufficiently large n
...
g
...
n/ 2 ‚
...
n//”
to indicate that f
...
g
...
Instead, we will usually write
“f
...
g
...
You might be confused because
we abuse equality in this way, but we shall see later in this section that doing so
has its advantages
...
1(a) gives an intuitive picture of functions f
...
n/, where
f
...
g
...
For all values of n at and to the right of n0 , the value of f
...
n/ and at or below c2 g
...
In other words, for all n n0 , the
function f
...
n/ to within a constant factor
...
n/ is an
asymptotically tight bound for f
...

The definition of ‚
...
n// requires that every member f
...
g
...
n/ be nonnegative whenever n is sufficiently large
...
) Consequently, the function g
...
g
...
We shall therefore assume that every
function used within ‚-notation is asymptotically nonnegative
...


46

Chapter 3 Growth of Functions

In Chapter 2, we introduced an informal notion of ‚-notation that amounted
to throwing away lower-order terms and ignoring the leading coefficient of the
highest-order term
...
n2 /
...
Dividing by n2 yields

1 3
Ä c2 :
2 n
We can make the right-hand inequality hold for any value of n 1 by choosing any
1=2
...
Thus, by choosing c1 D 1=14,
c2 D 1=2, and n0 D 7, we can verify that 1 n2 3n D ‚
...
Certainly, other
2
choices for the constants exist, but the important thing is that some choice exists
...
n2 / would usually require different constants
...
n2 /
...
But then dividing by n2 yields n Ä c2 =6, which cannot possibly hold
for arbitrarily large n, since c2 is constant
...
When n is large, even a tiny fraction of the highest-order term suffices to dominate the lower-order terms
...
The coefficient of the highest-order term can likewise be ignored, since it
only changes c1 and c2 by a constant factor equal to the coefficient
...
n/ D an2 C bn C c, where
a, b, and c are constants and a > 0
...
n/ D ‚
...
Formally, to show the same thing, we
p
take the constants c1 D a=4, c2 D 7a=4, and n0 D 2 max
...
You
n0
...
n/ D i D0 ai ni , where the ai are constants and ad > 0, we
have p
...
nd / (see Problem 3-1)
...
n0 /, or ‚
...
This latter notation is a minor abuse, however, because the
c1 Ä

3
...
2 We shall often
use the notation ‚
...

O-notation
The ‚-notation asymptotically bounds a function from above and below
...
For a given function g
...
g
...
g
...
n/ W there exist positive constants c and n0 such that
0 Ä f
...
n/ for all n n0 g :
We use O-notation to give an upper bound on a function, to within a constant
factor
...
1(b) shows the intuition behind O-notation
...
n/ is on or below cg
...

We write f
...
g
...
n/ is a member of the
set O
...
n//
...
n/ D ‚
...
n// implies f
...
g
...
Written set-theoretically, we have

...
n// Â O
...
n//
...
n2 / also shows that any such quadratic function is in O
...

What may be more surprising is that when a > 0, any linear function an C b is
in O
...
1; b=a/
...
n2 /
...
In this book, however, when we write f
...
g
...
n/ is an asymptotic upper
bound on f
...
Distinguishing asymptotic upper bounds from asymptotically tight bounds is standard in the
algorithms literature
...
For example, the doubly
nested loop structure of the insertion sort algorithm from Chapter 2 immediately
yields an O
...
1/ (constant), the indices i

2 The

real problem is that our ordinary notation for functions does not distinguish functions from
values
...
Adopting a more rigorous notation, however, would complicate
algebraic manipulations, and so we choose to tolerate the abuse
...

Since O-notation describes an upper bound, when we use it to bound the worstcase running time of an algorithm, we have a bound on the running time of the algorithm on every input—the blanket statement we discussed earlier
...
n2 /
bound on worst-case running time of insertion sort also applies to its running time
on every input
...
n2 / bound on the worst-case running time of insertion sort,
however, does not imply a ‚
...
For example, we saw in Chapter 2 that when the input is already
sorted, insertion sort runs in ‚
...

Technically, it is an abuse to say that the running time of insertion sort is O
...
When we say “the running time is O
...
n/ that is O
...
n/
...
n2 /
...
For a given function g
...
g
...
g
...
n/ W there exist positive constants c and n0 such that
0 Ä cg
...
n/ for all n n0 g :
Figure 3
...
For all values n at or to the
right of n0 , the value of f
...
n/
...
1-5)
...
1
For any two functions f
...
n/, we have f
...
g
...
n/ D O
...
n// and f
...
g
...

As an example of the application of this theorem, our proof that an2 C bn C c D

...
n2 / and an2 C bn C c D O
...
In practice, rather than using
Theorem 3
...


3
...
g
...
n/, for sufficiently
large n
...
For example, the best-case running time of insertion sort is
...
n/
...
n/ and O
...

Moreover, these bounds are asymptotically as tight as possible: for instance, the
running time of insertion sort is not
...
n/ time (e
...
, when the input is already sorted)
...
n2 /, since there exists an input that causes the algorithm to take
...

Asymptotic notation in equations and inequalities
We have already seen how asymptotic notation can be used within mathematical
formulas
...
n2 /
...
n/
...
n2 /, we have
already defined the equal sign to mean set membership: n 2 O
...
In general,
however, when asymptotic notation appears in a formula, we interpret it as standing for some anonymous function that we do not care to name
...
n/ means that 2n2 C 3n C 1 D 2n2 C f
...
n/ is some function in the set ‚
...
In this case, we let f
...
n/
...
For example, in Chapter 2 we expressed the worst-case
running time of merge sort as the recurrence
T
...
n=2/ C ‚
...
n/, there is no point in
specifying all the lower-order terms exactly; they are all understood to be included
in the anonymous function denoted by the term ‚
...

The number of anonymous functions in an expression is understood to be equal
to the number of times the asymptotic notation appears
...
i/ ;

50

Chapter 3 Growth of Functions

there is only a single anonymous function (a function of i)
...
1/ C O
...
n/, which doesn’t really have a clean
interpretation
...
n/ D ‚
...

Thus, our example means that for any function f
...
n/, there is some function g
...
n2 / such that 2n2 C f
...
n/ for all n
...

We can chain together a number of such relationships, as in
2n2 C 3n C 1 D 2n2 C ‚
...
n2 / :
We can interpret each equation separately by the rules above
...
n/ 2 ‚
...
n/ for all n
...
n/ 2 ‚
...
n/ just mentioned), there is some function h
...
n2 / such
that 2n2 C g
...
n/ for all n
...
n2 /, which is what the chaining of equations intuitively gives
us
...
The bound 2n2 D O
...
n2 / is not
...
We formally define o
...
n// (“little-oh of g of n”) as the set
o
...
n// D ff
...
n/ < cg
...
n2 /, but 2n2 ¤ o
...

The definitions of O-notation and o-notation are similar
...
n/ D O
...
n//, the bound 0 Ä f
...
n/ holds for some constant c > 0, but in f
...
g
...
n/ < cg
...
Intuitively, in o-notation, the function f
...
n/ as n approaches infinity; that is,

3
...
n/
D0:
(3
...
n/
Some authors use this limit as a definition of the o-notation; the definition in this
book also restricts the anonymous functions to be asymptotically nonnegative
...
We use
!-notation to denote a lower bound that is not asymptotically tight
...
n/ 2 !
...
n// if and only if g
...
f
...
g
...
g
...
n/ W for any positive constant c > 0, there exists a constant
n0 > 0 such that 0 Ä cg
...
n/ for all n n0 g :
For example, n2 =2 D !
...
n2 /
...
n/ D !
...
n//
implies that
f
...
n/
if the limit exists
...
n/ becomes arbitrarily large relative to g
...

lim

Comparing functions
Many of the relational properties of real numbers apply to asymptotic comparisons
as well
...
n/ and g
...

Transitivity:
f
...
g
...
n/ D ‚
...
n//

imply

f
...
h
...
n/ D O
...
n// and g
...
h
...
n/ D O
...
n// ;

f
...
h
...
n/ D

f
...
g
...
n/ D o
...
n//

imply

f
...
h
...
n/ D !
...
n// and g
...
h
...
n/ D !
...
n// :


...
n// and g
...
n/ D ‚
...
n// ;
f
...
f
...
n/ D

...
n// :


...
n// ;

52

Chapter 3 Growth of Functions

Symmetry:
f
...
g
...
n/ D ‚
...
n// :
Transpose symmetry:
f
...
g
...
n/ D
f
...
g
...
f
...
n/ D !
...
n// :

Because these properties hold for asymptotic notations, we can draw an analogy
between the asymptotic comparison of two functions f and g and the comparison
of two real numbers a and b:
f
...
g
...
n/ D
...
n//
f
...
g
...
n/ D o
...
n//
f
...
g
...
n/ is asymptotically smaller than g
...
n/ D o
...
n//, and f
...
n/ if f
...
g
...

One property of real numbers, however, does not carry over to asymptotic notation:
Trichotomy: For any two real numbers a and b, exactly one of the following must
hold: a < b, a D b, or a > b
...
That is, for two functions f
...
n/, it may be the case
that neither f
...
g
...
n/ D
...
n// holds
...

Exercises
3
...
n/ and g
...
Using the basic definition of ‚-notation, prove that max
...
n/; g
...
f
...
n//
...
1-2
Show that for any real constants a and b, where b > 0,

...
nb / :

(3
...
2 Standard notations and common functions

53

3
...
n2 /,” is
meaningless
...
1-4
Is 2nC1 D O
...
2n /?
3
...
1
...
1-6
Prove that the running time of an algorithm is ‚
...
n// if and only if its worst-case
running time is O
...
n// and its best-case running time is
...
n//
...
1-7
Prove that o
...
n// \ !
...
n// is the empty set
...
1-8
We can extend our notation to the case of two parameters n and m that can go to
infinity independently at different rates
...
n; m/, we denote
by O
...
n; m// the set of functions
O
...
n; m// D ff
...
n; m/ Ä cg
...
g
...
g
...


3
...
It also illustrates the use of the asymptotic
notations
...
n/ is monotonically increasing if m Ä n implies f
...
n/
...
m/
f
...
A
function f
...
m/ < f
...
m/ > f
...


54

Chapter 3 Growth of Functions

Floors and ceilings
For any real number x, we denote the greatest integer less than or equal to x by bxc
(read “the floor of x”) and the least integer greater than or equal to x by dxe (read
“the ceiling of x”)
...
3)

For any integer n,
dn=2e C bn=2c D n ;
and for any real number x 0 and integers a; b > 0,
lx m
dx=ae
D
;
b
ab
jx k
bx=ac
D
;
b
ab
la m
a C
...
b 1/
:
b
b

(3
...
5)
(3
...
7)

The floor function f
...
x/ D dxe
...
8)

It follows that
0 Ä a mod n < n :

(3
...

If
...
b mod n/, we write a Á b
...
In other words, a Á b
...
Equivalently, a Á b
...
We write a 6Á b
...


3
...
n/
of the form
p
...
A polynomial is asymptotically positive if and only if ad > 0
...
n/ of degree d , we have p
...
nd /
...
We say that a
function f
...
n/ D O
...

Exponentials
For all real a > 0, m, and n, we have the following identities:
a0
a1
a 1

...
am /n
am an

D
D
D
D
D
D

1;
a;
1=a ;
amn ;

...
When
convenient, we shall assume 00 D 1
...
For all real constants a and b such that a > 1,
nb
D0;
n!1 a n
from which we can conclude that
lim

(3
...
an / :
Thus, any exponential function with a base strictly greater than 1 grows faster than
any polynomial function
...
11)
e D1CxC



i D0

56

Chapter 3 Growth of Functions

where “Š” denotes the factorial function defined later in this section
...
12)

where equality holds only when x D 0
...
13)
x

When x ! 0, the approximation of e by 1 C x is quite good:
e x D 1 C x C ‚
...
) We have for all x,
x Án
D ex :
(3
...
lg n/k
lg
...


An important notational convention we shall adopt is that logarithm functions will
apply only to the next term in the formula, so that lg n C k will mean
...
n C k/
...

For all real a > 0, b > 0, c > 0, and n,
a D b logb a ;
logc
...
1=a/ D
1
;
logb a D
loga b
alogb c D c logb a ;
where, in each equation above, logarithm bases are not 1
...
15)

(3
...
2 Standard notations and common functions

57

By equation (3
...
Computer scientists find 2 to be the most natural base for logarithms
because so many algorithms and data structures involve splitting a problem into
two parts
...
1 C x/ when jxj < 1:
ln
...
1 C x/ Ä x ;
1Cx

(3
...

We say that a function f
...
n/ D O
...
We can relate the growth of polynomials and polylogarithms
by substituting lg n for n and 2a for a in equation (3
...
2a /lg n
n!1 na
lim

From this limit, we can conclude that
lgb n D o
...
Thus, any positive polynomial function grows faster than
any polylogarithmic function
...
n 1/Š if n > 0 :

0 as

Thus, nŠ D 1 2 3 n
...
Stirling’s approximation,
Â
 ÃÃ
p
n Án
1
;
(3
...
As Exercise 3
...
nn / ;
nŠ D !
...
nŠ/ D ‚
...
19)

where Stirling’s approximation is helpful in proving equation (3
...
The following
equation also holds for all n 1:
p
n Á n ˛n
e
(3
...
21)
12n C 1
12n
Functional iteration
We use the notation f
...
n/ to denote the function f
...
Formally, let f
...
For nonnegative integers i, we recursively define
(
n
if i D 0 ;
f
...
n/ D

...
n// if i > 0 :
f
...
n/ D 2n, then f
...
n/ D 2i n
...
Let lg
...
n/ D lg n
...
i / n is defined only if lg
...

Be sure to distinguish lg
...

Then we define the iterated logarithm function as
˚
«
lg n D min i 0 W lg
...
265536 /

D
D
D
D
D

1;
2;
3;
4;
5:

3
...

Fibonacci numbers
We define the Fibonacci numbers by the following recurrence:
F0 D 0 ;
F1 D 1 ;
Fi D Fi

(3
...
23)

and are given by the following formulas (see Exercise 3
...
2-7)
...
24)

60

Chapter 3 Growth of Functions

Fi D

i
1
p C
2
5

;

(3
...
Thus, Fibonacci numbers grow exponentially
...
2-1
Show that if f
...
n/ are monotonically increasing functions, then so are
the functions f
...
n/ and f
...
n//, and if f
...
n/ are in addition
nonnegative, then f
...
n/ is monotonically increasing
...
2-2
Prove equation (3
...

3
...
19)
...
2n / and nŠ D o
...

3
...
2-5 ?
Which is asymptotically larger: lg
...
lg n/?
3
...


and its conjugate y both satisfy the equation

3
...


;

3
...
n/ implies k D ‚
...


Problems for Chapter 3

61

Problems
3-1 Asymptotic behavior of polynomials
Let
p
...
Use the
definitions of the asymptotic notations to prove the following properties
...
If k

d , then p
...
nk /
...
If k Ä d , then p
...
nk /
...
If k D d , then p
...
nk /
...
If k > d , then p
...
nk /
...
If k < d , then p
...
nk /
...
A; B/ in the table below, whether A is O, o,
, !, or ‚ of B
...
Your answer
should be in the form of the table with “yes” or “no” written in each box
...


lg
...


!

nsin n

d
...


nk
p
n

O

lg
...

b
...
Rank the following functions by order of growth; that is, find an arrangement
g1 ; g2 ; : : : ; g30 of the functions satisfying g1 D
...
g3 /,
...
g30 /
...
n/ and g
...
n/ D ‚
...
n//
...
lg n/

2lg

n

p

...
lg n/Š

n

n1= lg n


...
nŠ/

22

ln ln n

lg n

n 2n

nlg lg n

ln n

2lg n


...
lg n/

2

2 lg n

1
p

...
Give an example of a single nonnegative function f
...
n/ in part (a), f
...
gi
...
gi
...

3-4 Asymptotic notation properties
Let f
...
n/ be asymptotically positive functions
...

a
...
n/ D O
...
n// implies g
...
f
...

b
...
n/ C g
...
min
...
n/; g
...

c
...
n/ D O
...
n// implies lg
...
n// D O
...
g
...
g
...
n/ 1 for all sufficiently large n
...
f
...
g
...
n/ D O 2g
...

e
...
n/ D O
...
n//2 /
...
f
...
g
...
n/ D


...
n//
...
f
...
f
...

h
...
n/ C o
...
n// D ‚
...
n//
...
We say that f
...
g
...
n/ cg
...

a
...
n/ and g
...
n/ D O
...
n// or f
...
g
...


Problems for Chapter 3

63

b
...


1

instead of

to

Some authors also define O in a slightly different manner; let’s use O 0 for the
alternative definition
...
n/ D O 0
...
n// if and only if jf
...
g
...

c
...
1 if we
substitute O 0 for O but still use ?
e
Some authors define O (read “soft-oh”) to mean O with logarithmic factors ignored:
e
O
...
n// D ff
...
n/ Ä cg
...
n/ for all n n0 g :
e
d
...
Prove the corresponding analog to Theorem 3
...

3-6 Iterated functions
We can apply the iteration operator used in the lg function to any monotonically
increasing function f
...
For a given constant c 2 R, we define the
iterated function fc by
˚
«
fc
...
i /
...
In other words, the quantity fc
...

For each of the following functions f
...
n/
...


f
...


lg n

1

c
...


2
2

f
...


n1=3

2

h
...


1

fc
...
Bachmann in 1892
...
Landau in 1909 for his discussion
of the distribution of prime numbers
...
Many people continue to
use the O-notation where the ‚-notation is more technically precise
...

Not all authors define the asymptotic notations in the same way, although the
various definitions agree in most common situations
...

Equation (3
...
Other properties of elementary mathematical functions can be found in any good mathematical reference, such as
Abramowitz and Stegun [1] or Zwillinger [362], or in a calculus book, such as
Apostol [18] or Thomas et al
...
Knuth [209] and Graham, Knuth, and Patashnik [152] contain a wealth of material on discrete mathematics as used in computer
science
...
3
...
Recall that in divide-and-conquer, we solve a problem recursively, applying three steps at each level of the recursion:
Divide the problem into a number of subproblems that are smaller instances of the
same problem
...
If the subproblem sizes are
small enough, however, just solve the subproblems in a straightforward manner
...

When the subproblems are large enough to solve recursively, we call that the recursive case
...
Sometimes, in addition to subproblems that are smaller instances of the same
problem, we have to solve subproblems that are not quite the same as the original
problem
...

In this chapter, we shall see more algorithms based on divide-and-conquer
...

Then we shall see two divide-and-conquer algorithms for multiplying n n matrices
...
n3 / time, which is no better than the straightforward method of
multiplying square matrices
...
n2:81 /
time, which beats the straightforward method asymptotically
...
A recurrence is an equation or inequality that describes a function in terms

66

Chapter 4 Divide-and-Conquer

of its value on smaller inputs
...
3
...
n/ of the M ERGE -S ORT procedure by the recurrence
(

...
n/ D
(4
...
n=2/ C ‚
...
n/ D ‚
...

Recurrences can take many forms
...
If the divide and
combine steps take linear time, such an algorithm would give rise to the recurrence
T
...
2n=3/ C T
...
n/
...
For example, a recursive version of linear search
(see Exercise 2
...
Each recursive call would take constant time plus the time for the recursive calls it makes, yielding the recurrence
T
...
n 1/ C ‚
...

This chapter offers three methods for solving recurrences—that is, for obtaining
asymptotic “‚” or “O” bounds on the solution:
In the substitution method, we guess a bound and then use mathematical induction to prove our guess correct
...
We use techniques
for bounding summations to solve the recurrence
...
n/ D aT
...
n/ ;

(4
...
n/ is a given function
...
A recurrence of the form in equation (4
...
n/ time
...
We will use the master method to determine the running
times of the divide-and-conquer algorithms for the maximum-subarray problem
and for matrix multiplication, as well as for other algorithms based on divideand-conquer elsewhere in this book
...
n/ Ä 2T
...
n/
...
n/, we will couch its solution using O-notation rather than
‚-notation
...
n/ 2T
...
n/,
then because the recurrence gives only a lower bound on T
...

Technicalities in recurrences
In practice, we neglect certain technical details when we state and solve recurrences
...
Neither size is actually n=2,
because n=2 is not an integer when n is odd
...
1/
if n D 1 ;
T
...
3)
T
...
bn=2c/ C ‚
...

Since the running time of an algorithm on a constant-sized input is a constant,
the recurrences that arise from the running times of algorithms generally have
T
...
1/ for sufficiently small n
...
n/ is constant for small n
...
1)
as
T
...
n=2/ C ‚
...
4)

without explicitly giving values for small n
...
1/ changes the exact solution to the recurrence, the solution typically doesn’t change by more than a constant factor, and so the order of growth is
unchanged
...
We forge ahead without these details and later determine whether
or not they matter
...
Experience helps, and so do some theorems stating that these details do not affect the
asymptotic bounds of many recurrences characterizing divide-and-conquer algorithms (see Theorem 4
...
In this chapter, however, we shall address some of these
details and illustrate the fine points of recurrence solution methods
...
1

Chapter 4 Divide-and-Conquer

The maximum-subarray problem
Suppose that you been offered the opportunity to invest in the Volatile Chemical
Corporation
...
You are allowed to buy one unit
of stock only one time and then sell it at a later date, buying and selling after the
close of trading for the day
...
Your goal is to maximize
your profit
...
1 shows the price of the stock over a 17-day period
...
Of course, you would want to “buy low, sell high”—buy at the lowest
possible price and later on sell at the highest possible price—to maximize your
profit
...
In Figure 4
...

You might think that you can always maximize profit by either buying at the
lowest price or selling at the highest price
...
1, we would
maximize profit by buying at the lowest price, after day 7
...
Figure 4
...
1 Information about the price of stock in the Volatile Chemical Corporation after the close
of trading over a period of 17 days
...
The bottom row of the table gives the change in price from the previous day
...
1 The maximum-subarray problem

11
10
9
8
7
6

69

Day
Price
Change

0

1

2

3

0
10

1
11
1

2
7
4

3
10
3

4
6
4

4

Figure 4
...
Again, the horizontal axis indicates the day, and the vertical axis shows
the price
...
The price of $7 after day 2 is not the lowest price overall, and the price of $10
after day 3 is not the highest price overall
...

A brute-force solution
We can easily devise a brute-force solution to this problem: just try every possible
pair of buy and sell dates in which the buy date precedes the sell date
...
Since n is ‚
...
n2 /
time
...
n2 / running time, we will look at the
input in a slightly different way
...
Instead of looking at the
daily prices, let us instead consider the daily change in price, where the change on
day i is the difference between the prices after day i 1 and after day i
...
1 shows these daily changes in the bottom row
...
3, we now want to find the nonempty, contiguous
subarray of A whose values have the largest sum
...
For example, in the array of Figure 4
...
Thus, you would want to buy
the stock just before day 8 (that is, after day 7) and sell it after day 11, earning a
profit of $43 per share
...
We still need to check
n 1
D ‚
...
Exercise 4
...
3 The change in stock prices as a maximum-subarray problem
...


that although computing the cost of one subarray might take time proportional to
the length of the subarray, when computing all ‚
...
1/ time, given the values
of previously computed subarray sums, so that the brute-force solution takes ‚
...

So let us seek a more efficient solution to the maximum-subarray problem
...

The maximum-subarray problem is interesting only when the array contains
some negative numbers
...

A solution using divide-and-conquer
Let’s think about how we might solve the maximum-subarray problem using
the divide-and-conquer technique
...
Divide-and-conquer suggests that we divide
the subarray into two subarrays of as equal size as possible
...
As Figure 4
...

Therefore, a maximum subarray of AŒlow : : high must lie in exactly one of these
places
...
We can find maximum subarrays of AŒlow : : mid and
AŒmidC1 : : high recursively, because these two subproblems are smaller instances
of the problem of finding a maximum subarray
...
1 The maximum-subarray problem

71

crosses the midpoint
low

mid

AŒmid C 1 : : j 
high

low

i

mid

mid C 1

entirely in AŒlow : : mid

entirely in AŒmid C 1 : : high

high
mid C 1

j

AŒi : : mid

(a)

(b)

Figure 4
...
(b) Any subarray of AŒlow : : high crossing
the midpoint comprises two subarrays AŒi : : mid and AŒmid C 1 : : j , where low Ä i Ä mid and
mid < j Ä high
...

We can easily find a maximum subarray crossing the midpoint in time linear
in the size of the subarray AŒlow : : high
...
As Figure 4
...
Therefore, we just need to find maximum
subarrays of the form AŒi : : mid and AŒmid C 1 : : j  and then combine them
...

F IND -M AX -C ROSSING -S UBARRAY
...
max-left; max-right; left-sum C right-sum/

72

Chapter 4 Divide-and-Conquer

This procedure works as follows
...
Since this subarray must contain AŒmid, the for loop of
lines 3–7 starts the index i at mid and works down to low, so that every subarray
it considers is of the form AŒi : : mid
...
Whenever we find, in line 5, a subarray AŒi : : mid with a sum of
values greater than left-sum, we update left-sum to this subarray’s sum in line 6, and
in line 7 we update the variable max-left to record this index i
...
Here, the for loop of lines 10–14
starts the index j at midC1 and works up to high, so that every subarray it considers
is of the form AŒmid C 1 : : j 
...

If the subarray AŒlow : : high contains n entries (so that n D high low C 1),
we claim that the call F IND -M AX -C ROSSING -S UBARRAY
...
n/ time
...
1/
time, we just need to count up how many iterations there are altogether
...
mid

low C 1/ C
...
A; low; high/
1 if high == low
2
return
...
low C high/=2c
4

...
A; low; mid/
5

...
A; mid C 1; high/
6

...
A; low; mid; high/
7
if left-sum right-sum and left-sum cross-sum
8
return
...
right-low; right-high; right-sum/
11
else return
...
1 The maximum-subarray problem

73

The initial call F IND -M AXIMUM -S UBARRAY
...

Similar to F IND -M AX -C ROSSING -S UBARRAY, the recursive procedure F IND M AXIMUM -S UBARRAY returns a tuple containing the indices that demarcate a
maximum subarray, along with the sum of the values in a maximum subarray
...
A subarray with just one element has only one subarray—itself—and so line 2 returns a
tuple with the starting and ending indices of just the one element, along with its
value
...
Line 3 does the divide part, computing the index mid of the midpoint
...
Because we know
that the subarray AŒlow : : high contains at least two elements, each of the left and
right subarrays must have at least one element
...

Lines 6–11 form the combine part
...
(Recall that because line 6 solves a subproblem that is not a smaller
instance of the original problem, we consider it to be in the combine part
...
Otherwise, line 9 tests whether the right
subarray contains a subarray with the maximum sum, and line 10 returns that maximum subarray
...

Analyzing the divide-and-conquer algorithm
Next we set up a recurrence that describes the running time of the recursive F IND M AXIMUM -S UBARRAY procedure
...
3
...
We denote by T
...
For
starters, line 1 takes constant time
...
1/ D ‚
...
5)

The recursive case occurs when n > 1
...
Each
of the subproblems solved in lines 4 and 5 is on a subarray of n=2 elements (our
assumption that the original problem size is a power of 2 ensures that n=2 is an
integer), and so we spend T
...
Because we have
to solve two subproblems—for the left subarray and for the right subarray—the
contribution to the running time from lines 4 and 5 comes to 2T
...
As we have

74

Chapter 4 Divide-and-Conquer

already seen, the call to F IND -M AX -C ROSSING -S UBARRAY in line 6 takes ‚
...
Lines 7–11 take only ‚
...
For the recursive case, therefore, we have
T
...
1/ C 2T
...
n/ C ‚
...
n=2/ C ‚
...
6)

Combining equations (4
...
6) gives us a recurrence for the running
time T
...
1/
if n D 1 ;
T
...
7)
2T
...
n/ if n > 1 :
This recurrence is the same as recurrence (4
...
As we shall
see from the master method in Section 4
...
n/ D ‚
...
You might also revisit the recursion tree in Figure 2
...
n/ D ‚
...

Thus, we see that the divide-and-conquer method yields an algorithm that is
asymptotically faster than the brute-force method
...
Sometimes it will yield the asymptotically fastest
algorithm for a problem, and other times we can do even better
...
1-5
shows, there is in fact a linear-time algorithm for the maximum-subarray problem,
and it does not use divide-and-conquer
...
1-1
What does F IND -M AXIMUM -S UBARRAY return when all elements of A are negative?
4
...
Your procedure should run in ‚
...

4
...
What problem size n0 gives the crossover
point at which the recursive algorithm beats the brute-force algorithm? Then,
change the base case of the recursive algorithm to use the brute-force algorithm
whenever the problem size is less than n0
...
1-4
Suppose we change the definition of the maximum-subarray problem to allow the
result to be an empty subarray, where the sum of the values of an empty subar-

4
...
How would you change any of the algorithms that do not allow empty
subarrays to permit an empty subarray to be the result?
4
...
Start at the left end of the array, and progress toward
the right, keeping track of the maximum subarray seen so far
...
Determine a maximum subarray of the form AŒi : : j C 1 in
constant time based on knowing a maximum subarray ending at index j
...
2 Strassen’s algorithm for matrix multiplication
If you have seen matrices before, then you probably know how to multiply them
...
1 in Appendix D
...
aij / and
B D
...
8)

kD1

We must compute n2 matrix entries, and each is the sum of n values
...
We assume that each matrix has an attribute rows, giving the number
of rows in the matrix
...
A; B/
1 n D A:rows
2 let C be a new n n matrix
3 for i D 1 to n
4
for j D 1 to n
5
cij D 0
6
for k D 1 to n
7
cij D cij C ai k bkj
8 return C
The S QUARE -M ATRIX -M ULTIPLY procedure works as follows
...
Line 5
initializes cij to 0 as we start computing the sum given in equation (4
...
8)
...
n3 / time
...
n3 /
time, since the natural definition of matrix multiplication requires that many multiplications
...
n3 / time
...
It runs in ‚
...
5
...
n2:81 / time, which is asymptotically better than the simple S QUARE -M ATRIX M ULTIPLY procedure
...
We make this assumption because in each divide step, we will
divide n n matrices into four n=2 n=2 matrices, and by assuming that n is an
exact power of 2, we are guaranteed that as long as n 2, the dimension n=2 is an
integer
...
9)
AD
A21 A22
B21 B22
C21 C22
so that we rewrite the equation C D A B as
à Â
à Â
Ã
Â
A11 A12
B11 B12
C11 C12
D
:
C21 C22
A21 A22
B21 B22

(4
...
10) corresponds to the four equations
C11
C12
C21
C22

D
D
D
D

A11
A11
A21
A21

B11 C A12
B12 C A12
B11 C A22
B12 C A22

B21 ;
B22 ;
B21 ;
B22 :

(4
...
12)
(4
...
14)

Each of these four equations specifies two multiplications of n=2 n=2 matrices
and the addition of their n=2 n=2 products
...
2 Strassen’s algorithm for matrix multiplication

77

S QUARE -M ATRIX -M ULTIPLY-R ECURSIVE
...
9)
6
C11 D S QUARE -M ATRIX -M ULTIPLY-R ECURSIVE
...
A12 ; B21 /
7
C12 D S QUARE -M ATRIX -M ULTIPLY-R ECURSIVE
...
A12 ; B22 /
8
C21 D S QUARE -M ATRIX -M ULTIPLY-R ECURSIVE
...
A22 ; B21 /
9
C22 D S QUARE -M ATRIX -M ULTIPLY-R ECURSIVE
...
A22 ; B22 /
10 return C
This pseudocode glosses over one subtle but important implementation detail
...
n2 / time copying entries
...
The trick is to use index calculations
...
We end up representing a submatrix a little differently from
how we represent the original matrix, which is the subtlety we are glossing over
...
1/ time (although we shall see that it makes no
difference asymptotically to the overall running time whether we copy or partition
in place)
...
Let T
...
In the base case, when n D 1, we perform just the
one scalar multiplication in line 4, and so
T
...
1/ :

(4
...
As discussed, partitioning the matrices in
line 5 takes ‚
...
In lines 6–9, we recursively call
S QUARE -M ATRIX -M ULTIPLY-R ECURSIVE a total of eight times
...
n=2/ to
the overall running time, the time taken by all eight recursive calls is 8T
...
We
also must account for the four matrix additions in lines 6–9
...
n2 / time
...
n2 /
...
1/ time per entry
...
n/ D ‚
...
n=2/ C ‚
...
n=2/ C ‚
...
16)

Notice that if we implemented partitioning by copying matrices, which would cost

...

Combining equations (4
...
16) gives us the recurrence for the running
time of S QUARE -M ATRIX -M ULTIPLY-R ECURSIVE:
(

...
17)
T
...
n=2/ C ‚
...
5, recurrence (4
...
n/ D ‚
...
Thus, this simple divide-and-conquer approach is no
faster than the straightforward S QUARE -M ATRIX -M ULTIPLY procedure
...
16) came from
...
1/ time, but we have two matrices to partition
...
2/ time, the constant of 2
is subsumed by the ‚-notation
...
k/ time
...
n2 =4/ time
...
n2 / time
...
4n2 / time, we say that they take ‚
...

(Of course, you might observe that we could say that the four matrix additions
take ‚
...
) Thus, we end up with two terms
of ‚
...

When we account for the eight recursive calls, however, we cannot just subsume the constant factor of 8
...
n=2/ time, rather than just T
...
You can get a feel for why by looking
back at the recursion tree in Figure 2
...
1) (which is identical to
recurrence (4
...
n/ D 2T
...
n/
...
If we were to ignore

4
...
16) or the factor of 2 in recurrence (4
...

Bear in mind, therefore, that although asymptotic notation subsumes constant
multiplicative factors, recursive notation such as T
...

Strassen’s method
The key to Strassen’s method is to make the recursion tree slightly less bushy
...
The cost of eliminating one matrix multiplication will be
several new additions of n=2 n=2 matrices, but still only a constant number of
additions
...

Strassen’s method is not at all obvious
...
) It has four steps:
1
...
9)
...
1/ time by index calculation, just
as in S QUARE -M ATRIX -M ULTIPLY-R ECURSIVE
...
Create 10 matrices S1 ; S2 ; : : : ; S10 , each of which is n=2 n=2 and is the sum
or difference of two matrices created in step 1
...
n2 / time
...
Using the submatrices created in step 1 and the 10 matrices created in step 2,
recursively compute seven matrix products P1 ; P2 ; : : : ; P7
...

4
...
We can compute all four submatrices in ‚
...

We shall see the details of steps 2–4 in a moment, but we already have enough
information to set up a recurrence for the running time of Strassen’s method
...
When
n > 1, steps 1, 2, and 4 take a total of ‚
...
Hence, we obtain the following
recurrence for the running time T
...
1/
if n D 1 ;
(4
...
n/ D
2
7T
...
n / if n > 1 :

80

Chapter 4 Divide-and-Conquer

We have traded off one matrix multiplication for a constant number of matrix additions
...
By the master method
in Section 4
...
18) has the solution T
...
nlg 7 /
...
In step 2, we create the following 10
matrices:
S1
S2
S3
S4
S5
S6
S7
S8
S9
S10

D
D
D
D
D
D
D
D
D
D

B12 B22 ;
A11 C A12 ;
A21 C A22 ;
B21 B11 ;
A11 C A22 ;
B11 C B22 ;
A12 A22 ;
B21 C B22 ;
A11 A21 ;
B11 C B12 :

Since we must add or subtract n=2 n=2 matrices 10 times, this step does indeed
take ‚
...

In step 3, we recursively multiply n=2 n=2 matrices seven times to compute the
following n=2 n=2 matrices, each of which is the sum or difference of products
of A and B submatrices:
P1
P2
P3
P4
P5
P6
P7

D
D
D
D
D
D
D

A11 S1
S2 B22
S3 B11
A22 S4
S5 S6
S7 S8
S9 S10

D
D
D
D
D
D
D

A11
A11
A21
A22
A11
A12
A11

B12 A11
B22 C A12
B11 C A22
B21 A22
B11 C A11
B21 C A12
B11 C A11

B22 ;
B22 ;
B11 ;
B11 ;
B22 C A22 B11 C A22 B22 ;
B22 A22 B21 A22 B22 ;
B12 A21 B11 A21 B12 :

Note that the only multiplications we need to perform are those in the middle column of the above equations
...

Step 4 adds and subtracts the Pi matrices created in step 3 to construct the four
n=2 n=2 submatrices of the product C
...
2 Strassen’s algorithm for matrix multiplication

81

Expanding out the right-hand side, with the expansion of each Pi on its own line
and vertically aligning terms that cancel out, we see that C11 equals
A11 B11 C A11 B22 C A22 B11 C A22 B22
A22 B11
C A22 B21
A11 B22
A12 B22
A22 B22 A22 B21 C A12 B22 C A12 B21
A11 B11

C A12 B21 ;

which corresponds to equation (4
...

Similarly, we set
C12 D P1 C P2 ;
and so C12 equals
A11 B12

A11 B22
C A11 B22 C A12 B22

A11 B12

C A12 B22 ;

corresponding to equation (4
...

Setting
C21 D P3 C P4
makes C21 equal
A21 B11 C A22 B11
A22 B11 C A22 B21
A21 B11

C A22 B21 ;

corresponding to equation (4
...

Finally, we set
C22 D P5 C P1

P3

P7 ;

so that C22 equals
A11 B11 C A11 B22 C A22 B11 C A22 B22
A11 B22
C A11 B12
A22 B11
A21 B11
A11 B11
A11 B12 C A21 B11 C A21 B12
A22 B22

C A21 B12 ;

82

Chapter 4 Divide-and-Conquer

which corresponds to equation (4
...
Altogether, we add or subtract n=2 n=2
matrices eight times in step 4, and so this step indeed takes ‚
...

Thus, we see that Strassen’s algorithm, comprising steps 1–4, produces the correct matrix product and that recurrence (4
...
Since
we shall see in Section 4
...
n/ D ‚
...
The notes at the end of this chapter discuss some
of the practical aspects of Strassen’s algorithm
...
2-3, 4
...
2-5 are about variants on Strassen’s
algorithm, you should read Section 4
...

4
...

4
...

4
...
nlg 7 /
...
2-4
What is the largest k such that if you can multiply 3 3 matrices using k multiplications (not assuming commutativity of multiplication), then you can multiply
n n matrices in time o
...
2-5
V
...
Which
method yields the best asymptotic running time when used in a divide-and-conquer
matrix-multiplication algorithm? How does it compare to Strassen’s algorithm?

4
...
2-6
How quickly can you multiply a k n n matrix by an n k n matrix, using Strassen’s
algorithm as a subroutine? Answer the same question with the order of the input
matrices reversed
...
2-7
Show how to multiply the complex numbers a C bi and c C d i using only three
multiplications of real numbers
...


4
...
We start in this
section with the “substitution” method
...
Guess the form of the solution
...
Use mathematical induction to find the constants and show that the solution
works
...
” This method
is powerful, but we must be able to guess the form of the answer in order to apply it
...
As an example, let us determine an upper bound on the recurrence
T
...
bn=2c/ C n ;

(4
...
3) and (4
...
We guess that the solution is
T
...
n lg n/
...
n/ Ä
cn lg n for an appropriate choice of the constant c > 0
...
bn=2c/ Ä c bn=2c lg
...
Substituting into the recurrence yields
T
...
c bn=2c lg
...
n=2/ C n
cn lg n cn lg 2 C n
cn lg n cn C n
cn lg n ;

84

Chapter 4 Divide-and-Conquer

where the last step holds as long as c 1
...
Typically, we do so by showing that the boundary conditions are suitable as base cases for the inductive proof
...
19),
we must show that we can choose the constant c large enough so that the bound
T
...
This requirement
can sometimes lead to problems
...
1/ D 1 is the sole boundary condition of the recurrence
...
n/ Ä cn lg n yields T
...
1/ D 1
...

We can overcome this obstacle in proving an inductive hypothesis for a specific boundary condition with only a little more effort
...
19),
for example, we take advantage of asymptotic notation requiring us only to prove
T
...
We
keep the troublesome boundary condition T
...
We do so by first observing that for n > 3, the
recurrence does not depend directly on T
...
Thus, we can replace T
...
2/
and T
...
Note that we
make a distinction between the base case of the recurrence (n D 1) and the base
cases of the inductive proof (n D 2 and n D 3)
...
1/ D 1, we derive from
the recurrence that T
...
3/ D 5
...
n/ Ä cn lg n for some constant c
1 by choosing c large enough
so that T
...
3/ Ä c3 lg 3
...
For most of the recurrences
we shall examine, it is straightforward to extend boundary conditions to make the
inductive assumption work for small n, and we shall not always explicitly work out
the details
...

Guessing a solution takes experience and, occasionally, creativity
...
You
can also use recursion trees, which we shall see in Section 4
...

If a recurrence is similar to one you have seen before, then guessing a similar
solution is reasonable
...
n/ D 2T
...
Intuitively, however, this additional term cannot substantially affect the

4
...
When n is large, the difference between bn=2c and
bn=2c C 17 is not that large: both cut n nearly evenly in half
...
n/ D O
...
3-6)
...
For example, we might
start with a lower bound of T
...
n/ for the recurrence (4
...
n/ D O
...
Then, we can gradually lower the upper bound and raise the
lower bound until we converge on the correct, asymptotically tight solution of
T
...
n lg n/
...
The problem
frequently turns out to be that the inductive assumption is not strong enough to
prove the detailed bound
...

Consider the recurrence
T
...
bn=2c/ C T
...
n/ D O
...
n/ Ä cn for
an appropriate choice of the constant c
...
n/ Ä c bn=2c C c dn=2e C 1
D cn C 1 ;
which does not imply T
...
We might be tempted to try
a larger guess, say T
...
n2 /
...
n/ D O
...
In order to show that it is correct,
however, we must make a stronger inductive hypothesis
...
Nevertheless, mathematical induction does not work unless we
prove the exact form of the inductive hypothesis
...
Our new guess is
T
...
We now have
T
...
c bn=2c d / C
...
As before, we must choose the constant c large enough to handle
the boundary conditions
...
After all, if the math does not work out, we should increase our guess, right?
Not necessarily! When proving an upper bound by induction, it may actually be
more difficult to prove that a weaker upper bound holds, because in order to prove
the weaker bound, we must use the same weaker bound inductively in the proof
...
In the above example, we subtracted out the constant d twice, once for the
T
...
dn=2e/ term
...
n/ Ä cn 2d C 1, and it was easy to find values of d to make cn 2d C 1 be
less than or equal to cn d
...
For example, in the recurrence (4
...
n/ D O
...
n/ Ä cn and
then arguing
T
...
c bn=2c/ C n
Ä cn C n
D O
...
The error is that we have not proved the exact form of the
inductive hypothesis, that is, that T
...
We therefore will explicitly prove
that T
...
n/ D O
...

Changing variables
Sometimes, a little algebraic manipulation can make an unknown recurrence similar to one you have seen before
...
n/ D 2T
which looks difficult
...
For convenience, we shall not worry about rounding off values, such
p
as n, to be integers
...
2m / D 2T
...
m/ D T
...
m/ D 2S
...
3 The substitution method for solving recurrences

87

which is very much like recurrence (4
...
Indeed, this new recurrence has the
same solution: S
...
m lg m/
...
m/ to T
...
n/ D T
...
m/ D O
...
lg n lg lg n/ :
Exercises
4
...
n/ D T
...
n2 /
...
3-2
Show that the solution of T
...
dn=2e/ C 1 is O
...

4
...
n/ D 2T
...
n lg n/
...
n lg n/
...
n lg n/
...
3-4
Show that by making a different inductive hypothesis, we can overcome the difficulty with the boundary condition T
...
19) without adjusting
the boundary conditions for the inductive proof
...
3-5
Show that ‚
...
3) for merge sort
...
3-6
Show that the solution to T
...
bn=2c C 17/ C n is O
...

4
...
5, you can show that the solution to the
recurrence T
...
n=3/ C n is T
...
nlog3 4 /
...
n/ Ä cnlog3 4 fails
...

4
...
5, you can show that the solution to the
recurrence T
...
n=2/ C n2 is T
...
n2 /
...
n/ Ä cn2 fails
...


88

Chapter 4 Divide-and-Conquer

4
...
n/ D 3T
...

Your solution should be asymptotically tight
...


4
...
Drawing out a recursion tree, as we did in our analysis of the merge
sort recurrence in Section 2
...
2, serves as a straightforward way to devise a good
guess
...
We sum the costs within
each level of the tree to obtain a set of per-level costs, and then we sum all the
per-level costs to determine the total cost of all levels of the recursion
...
When using a recursion tree to generate a good guess,
you can often tolerate a small amount of “sloppiness,” since you will be verifying
your guess later on
...
In this section, we will use recursion trees to generate
good guesses, and in Section 4
...

For example, let us see how a recursion tree would provide a good guess for
the recurrence T
...
bn=4c/ C ‚
...
We start by focusing on finding an
upper bound for the solution
...
n/ D 3T
...

Figure 4
...
n/ D 3T
...

For convenience, we assume that n is an exact power of 4 (another example of
tolerable sloppiness) so that all subproblem sizes are integers
...
n/, which we expand in part (b) into an equivalent tree representing the
recurrence
...
Part (c) shows this process carried one step further by expanding each
node with cost T
...
The cost for each of the three children of the
root is c
...
We continue expanding each node in the tree by breaking it into
its constituent parts as determined by the recurrence
...
4 The recursion-tree method for solving recurrences

89

cn2

T
...
1/ T
...
1/ T
...
1/ T
...
1/ T
...
1/ T
...
1/ T
...
1/


...
n2 /

Figure 4
...
n/ D 3T
...
Part (a)
shows T
...
The fully expanded
tree in part (d) has height log4 n (it has log4 n C 1 levels)
...
How far from the root do
we reach one? The subproblem size for a node at depth i is n=4i
...

Thus, the tree has log4 n C 1 levels (at depths 0; 1; 2; : : : ; log4 n)
...
Each level has three times
more nodes than the level above, and so the number of nodes at depth i is 3i
...
n=4i /2
...
n=4i /2 D
...
The bottom level, at
depth log4 n, has 3log4 n D nlog4 3 nodes, each contributing cost T
...
1/, which is ‚
...
1/ is a constant
...
nlog4 3 /
T
...
nlog4 3 /
D
16
i D0
D


...
nlog4 3 /

...
5)) :

This last formula looks somewhat messy until we realize that we can again take
advantage of small amounts of sloppiness and use an infinite decreasing geometric
series as an upper bound
...
6), we
have
Ã
log4 n 1 Â
X
3 i 2
cn C ‚
...
n/ D
16
i D0
1 Â
X 3 Ãi
cn2 C ‚
...
nlog4 3 /

...
nlog4 3 /
D
13
D O
...
n/ D O
...
n/ D 3T
...
n2 /
...
6), the sum of these coefficients

4
...
n lg n/

Figure 4
...
n/ D T
...
2n=3/ C cn
...
Since the root’s contribution to the
total cost is cn2 , the root contributes a constant fraction of the total cost
...

In fact, if O
...
Why? The first recursive call contributes
a cost of ‚
...
n2 / must be a lower bound for the recurrence
...
n/ D O
...
n/ D
3T
...
n2 /
...
n/ Ä d n2 for some constant d > 0
...
n/ Ä 3T
...
n=4/2 C cn2
3
d n2 C cn2
D
16
Ä d n2 ;
where the last step holds as long as d
...

In another, more intricate, example, Figure 4
...
n/ D T
...
2n=3/ C O
...
) As before, we let c
represent the constant factor in the O
...
When we add the values across the
levels of the recursion tree shown in the figure, we get a value of cn for every level
...
2=3/n !
...
Since
...

Intuitively, we expect the solution to the recurrence to be at most the number
of levels times the cost of each level, or O
...
n lg n/
...
6
shows only the top levels of the recursion tree, however, and not every level in the
tree contributes a cost of cn
...
If this recursion tree
were a complete binary tree of height log3=2 n, there would be 2log3=2 n D nlog3=2 2
leaves
...
nlog3=2 2 / which, since log3=2 2 is a constant strictly greater than 1,
is !
...
This recursion tree is not a complete binary tree, however, and so
it has fewer than nlog3=2 2 leaves
...
Consequently, levels toward the bottom of the
recursion tree contribute less than cn to the total cost
...
Let us tolerate the sloppiness and attempt
to show that a guess of O
...

Indeed, we can use the substitution method to verify that O
...
We show that T
...
We have
T
...
n=3/ C T
...
n=3/ lg
...
2n=3/ lg
...
d
...
n=3/ lg 3/
C
...
2n=3/ lg n d
...
3=2// C cn
D d n lg n d
...
2n=3/ lg
...
n=3/ lg 3 C
...
2n=3/ lg 2/ C cn
D d n lg n d n
...
lg 3
...
Thus, we did not need to perform a more accurate
accounting of costs in the recursion tree
...
4-1
Use a recursion tree to determine a good asymptotic upper bound on the recurrence
T
...
bn=2c/ C n
...

4
...
n/ D T
...
Use the substitution method to verify your answer
...
5 The master method for solving recurrences

93

4
...
n/ D 4T
...
Use the substitution method to verify your answer
...
4-4
Use a recursion tree to determine a good asymptotic upper bound on the recurrence
T
...
n 1/ C 1
...

4
...
n/ D T
...
n=2/Cn
...

4
...
n/ D T
...
2n=3/Ccn, where c
is a constant, is
...

4
...
n/ D 4T
...
Verify your bound by the substitution method
...
4-8
Use a recursion tree to give an asymptotically tight solution to the recurrence
T
...
n a/ C T
...

4
...
n/ D T
...
1 ˛/n/ C cn, where ˛ is a constant in the range 0 < ˛ < 1
and c > 0 is also a constant
...
5 The master method for solving recurrences
The master method provides a “cookbook” method for solving recurrences of the
form
T
...
n=b/ C f
...
20)

where a
1 and b > 1 are constants and f
...
To use the master method, you will need to memorize three cases, but
then you will be able to solve many recurrences quite easily, often without pencil
and paper
...
20) describes the running time of an algorithm that divides a
problem of size n into a subproblems, each of size n=b, where a and b are positive
constants
...
n=b/
...
n/ encompasses the cost of dividing the problem and combining the
results of the subproblems
...
n/ D ‚
...

As a matter of technical correctness, the recurrence is not actually well defined,
because n=b might not be an integer
...
n=b/ with
either T
...
dn=be/ will not affect the asymptotic behavior of the recurrence, however
...
) We normally
find it convenient, therefore, to omit the floor and ceiling functions when writing
divide-and-conquer recurrences of this form
...

Theorem 4
...
n/ be a function, and let T
...
n/ D aT
...
n/ ;
where we interpret n=b to mean either bn=bc or dn=be
...
n/ has the following asymptotic bounds:
1
...
n/ D O
...
n/ D ‚
...


2
...
n/ D ‚
...
n/ D ‚
...

3
...
n/ D
...
n=b/ Ä cf
...
n/ D ‚
...
n//
...
In each of the three cases, we compare the
function f
...
Intuitively, the larger of the two functions
determines the solution to the recurrence
...
n/ D ‚
...
If, as in case 3, the function f
...
n/ D ‚
...
n//
...
n/ D ‚
...
f
...

Beyond this intuition, you need to be aware of some technicalities
...
n/ be smaller than nlogb a , it must be polynomially smaller
...
5 The master method for solving recurrences

95

That is, f
...
In the third case, not only must f
...
n=b/ Ä cf
...
This condition is satisfied by most of the polynomially bounded
functions that we shall encounter
...
n/
...
n/ is smaller than nlogb a but not polynomially smaller
...
n/ is larger
than nlogb a but not polynomially larger
...
n/ falls into one of these
gaps, or if the regularity condition in case 3 fails to hold, you cannot use the master
method to solve the recurrence
...

As a first example, consider
T
...
n=3/ C n :
For this recurrence, we have a D 9, b D 3, f
...
n2 )
...
n/ D O
...
n/ D ‚
...

Now consider
T
...
2n=3/ C 1;
in which a D 1, b D 3=2, f
...
Case 2
applies, since f
...
nlogb a / D ‚
...
n/ D ‚
...

For the recurrence
T
...
n=4/ C n lg n ;
we have a D 3, b D 4, f
...
n0:793 /
...
n/ D
...
n/
...
n=b/ D 3
...
n=4/ Ä
...
n/ for c D 3=4
...
n/ D ‚
...

The master method does not apply to the recurrence
T
...
n=2/ C n lg n ;
even though it appears to have the proper form: a D 2, b D 2, f
...
You might mistakenly think that case 3 should apply, since

96

Chapter 4 Divide-and-Conquer

f
...
The problem is that it
is not polynomially larger
...
n/=nlogb a D
...
Consequently, the recurrence falls
into the gap between case 2 and case 3
...
6-2 for a solution
...
1
and 4
...
Recurrence (4
...
n/ D 2T
...
n/ ;
characterizes the running times of the divide-and-conquer algorithm for both the
maximum-subarray problem and merge sort
...
) Here, we have a D 2, b D 2, f
...
n/, and
thus we have that nlogb a D nlog2 2 D n
...
n/ D ‚
...
n/ D ‚
...

Recurrence (4
...
n/ D 8T
...
n2 / ;
describes the running time of the first divide-and-conquer algorithm that we saw
for matrix multiplication
...
n/ D ‚
...
Since n3 is polynomially larger than f
...
n/ D O
...
n/ D ‚
...

Finally, consider recurrence (4
...
n/ D 7T
...
n2 / ;
which describes the running time of Strassen’s algorithm
...
n/ D ‚
...
Rewriting log2 7 as lg 7 and
recalling that 2:80 < lg 7 < 2:81, we see that f
...
nlg 7 / for D 0:8
...
n/ D ‚
...

Exercises
4
...

a
...
n/ D 2T
...

p
b
...
n/ D 2T
...

c
...
n/ D 2T
...

d
...
n/ D 2T
...


4
...
5-2
Professor Caesar wishes to develop a matrix-multiplication algorithm that is
asymptotically faster than Strassen’s algorithm
...
n2 / time
...
If his algorithm creates a subproblems, then the recurrence for the running
time T
...
n/ D aT
...
n2 /
...
5-3
Use the master method to show that the solution to the binary-search recurrence
T
...
n=2/ C ‚
...
n/ D ‚
...
(See Exercise 2
...
)
4
...
n/ D 4T
...

4
...
n=b/ Ä cf
...
Give an example of constants a 1
and b > 1 and a function f
...


? 4
...
1)
...

The proof appears in two parts
...
20), under the simplifying assumption that T
...
This part gives all the intuition
needed to understand why the master theorem is true
...

In this section, we shall sometimes abuse our asymptotic notation slightly by
using it to describe the behavior of functions that are defined only over exact
powers of b
...
Since we could make new asymptotic notations that apply only to the set
fb i W i D 0; 1; 2; : : :g, instead of to the nonnegative numbers, this abuse is minor
...
For example, proving that
T
...
n/ when n is an exact power of 2 does not guarantee that T
...
n/
...
n/ could be defined as
(
n if n D 1; 2; 4; 8; : : : ;
T
...
n/ D O
...

Because of this sort of drastic consequence, we shall never use asymptotic notation
over a limited domain without making it absolutely clear from the context that we
are doing so
...
6
...
20)
T
...
n=b/ C f
...
We break the analysis into three lemmas
...
The second determines bounds on this
summation
...

Lemma 4
...
n/ be a nonnegative function defined
on exact powers of b
...
n/ on exact powers of b by the recurrence
(

...
n/ D
aT
...
n/ if n D b i ;
where i is a positive integer
...
n/ D ‚
...
n=b j / :

(4
...
7
...
n/,
and it has a children, each with cost f
...
(It is convenient to think of a as being

4
...
n/

f
...
n=b/



f
...
n=b/

a

af
...
1/ ‚
...
1/ ‚
...
1/ ‚
...
1/ ‚
...
1/ ‚
...
n=b 2 / f
...
n=b 2 /
a




a


a2 f
...
n=b 2 / f
...
n=b 2 / f
...
n=b 2 /…f
...
nlogb a /


...
1/ ‚
...
nlogb a / C

aj f
...
7 The recursion tree generated by T
...
n=b/ C f
...
The tree is a complete a-ary
tree with nlogb a leaves and height logb n
...
21)
...
) Each of these children has a children, making a2 nodes at depth 2,
and each of the a children has cost f
...
In general, there are aj nodes at
depth j , and each has cost f
...
The cost of each leaf is T
...
1/, and
each leaf is at depth logb n, since n=b logb n D 1
...

We can obtain equation (4
...
The cost for all internal nodes at depth j is
aj f
...
n=b j / :

j D0

In the underlying divide-and-conquer algorithm, this sum represents the costs of
dividing problems into subproblems and then recombining the subproblems
...
nlogb a /
...

The summation in equation (4
...
The next lemma provides asymptotic bounds on the summation’s growth
...
3
Let a 1 and b > 1 be constants, and let f
...
A function g
...
n/ D

aj f
...
22)

j D0

has the following asymptotic bounds for exact powers of b:
1
...
n/ D O
...
n/ D O
...


2
...
n/ D ‚
...
n/ D ‚
...

3
...
n=b/ Ä cf
...
n/ D ‚
...
n//
...
n/ D O
...
n=b j / D
O
...
Substituting into equation (4
...
23)
a
g
...
b /j

j D0

Â
D n

logb a

b

logb n

b

1
1

Ã

4
...
n / D
O
...
Substituting this expression for the summation in equation (4
...
n/ D O
...

Because case 2 assumes that f
...
nlogb a /, we have that f
...
n=b j /logb a /
...
22) yields
!
logb n 1
X
n Álogb a
j
a
:
(4
...
n/ D ‚
bj
j D0
We bound the summation within the ‚-notation as in case 1, but this time we do not
obtain a geometric series
...
24) yields
g
...
nlogb a logb n/
D ‚
...

We prove case 3 similarly
...
n/ appears in the definition (4
...
n/
and all terms of g
...
n/ D
...
n// for
exact powers of b
...
n=b/ Ä cf
...
We rewrite this assumption
as f
...
c=a/f
...
n=b j / Ä
...
n/ or,
equivalently, aj f
...
n/, where we assume that the values we iterate
on are sufficiently large
...

Substituting into equation (4
...
We use an O
...
n/ D

aj f
...
n/ C O
...
n/

1
X

c j C O
...
n/

1

1 c
D O
...
n// ;

Ã
C O
...
Thus, we can conclude that g
...
f
...
With case 3 proved, the proof of the lemma is complete
...

Lemma 4
...
n/ be a nonnegative function defined
on exact powers of b
...
n/ on exact powers of b by the recurrence
(

...
n/ D
aT
...
n/ if n D b i ;
where i is a positive integer
...
n/ has the following asymptotic bounds for
exact powers of b:
1
...
n/ D O
...
n/ D ‚
...


2
...
n/ D ‚
...
n/ D ‚
...

3
...
n/ D
...
n=b/ Ä cf
...
n/ D ‚
...
n//
...
3 to evaluate the summation (4
...
2
...
n/ D ‚
...
nlogb a /
D ‚
...
6 Proof of the master theorem

103

and for case 2,
T
...
nlogb a / C ‚
...
nlogb a lg n/ :
For case 3,
T
...
nlogb a / C ‚
...
n//
D ‚
...
n// ;
because f
...
nlogb aC /
...
6
...
Obtaining
a lower bound on
T
...
dn=be/ C f
...
25)

and an upper bound on
T
...
bn=bc/ C f
...
26)

is routine, since we can push through the bound dn=be n=b in the first case to
yield the desired result, and we can push through the bound bn=bc Ä n=b in the
second case
...
26)
as to upper-bound the recurrence (4
...

We modify the recursion tree of Figure 4
...
8
...
27)

104

Chapter 4 Divide-and-Conquer

f
...
n/
a
f
...
n1 /

f
...
n1 /

a

blogb nc

a


f
...
n2 /
a


f
...
n2 / … f
...
1/ ‚
...
1/ ‚
...
1/ ‚
...
1/ ‚
...
1/ ‚
...
n2 /
a




a2 f
...
n2 / … f
...
n2 /


...
1/ ‚
...
1/


...
nlogb a / C

aj f
...
8 The recursion tree generated by T
...
dn=be/Cf
...
The recursive argument nj
is given by equation (4
...


Our first goal is to determine the depth k such that nk is a constant
...
6 Proof of the master theorem

Ä

X 1
n
C
bj
bi
i D0

<

X 1
n
C
bj
bi
i D0

D

105

n
b
:
C
j
b
b 1

j 1

nj

1

Letting j D blogb nc, we obtain
nblogb nc <

n
b blogb nc

C

b
b

n

C
b logb n 1
b
b
n
C
D
n=b
b 1
b
D bC
b 1
D O
...

From Figure 4
...
n/ D ‚
...
nj / ;

(4
...
21), except that n is an arbitrary integer and
not restricted to be an exact power of b
...
n/ D

aj f
...
29)

j D0

from equation (4
...
3
...
dn=be/ Ä cf
...
b 1/, where c < 1 is a constant,
then it follows that aj f
...
n/
...
29) just as in Lemma 4
...
For case 2, we have f
...
nlogb a /
...
nj / D O
...
n=b j /logb a /, then the proof for case 2
of Lemma 4
...
Observe that j Ä blogb nc implies b j =n Ä 1
...
n/ D O
...
nj / Ä
D
D
Ä
D

since c
...
b 1//logb a is a constant
...
The proof
of case 1 is almost identical
...
nj / D O
...

We have now proved the upper bounds in the master theorem for all integers n
...

Exercises
4
...
27) for the case in which b
is a positive integer instead of an arbitrary real number
...
6-2 ?
Show that if f
...
nlogb a lgk n/, where k 0, then the master recurrence has
solution T
...
nlogb a lgkC1 n/
...

4
...
n=b/ Ä cf
...
n/ D
...


Problems for Chapter 4

107

Problems
4-1 Recurrence examples
Give asymptotic upper and lower bounds for T
...
Assume that T
...
Make your bounds as tight as
possible, and justify your answers
...
T
...
n=2/ C n4
...
T
...
7n=10/ C n
...
T
...
n=4/ C n2
...
T
...
n=3/ C n2
...
T
...
n=2/ C n2
...
T
...
n=4/ C n
...
T
...
n

2/ C n2
...
This assumption
is valid in most systems because a pointer to the array is passed, not the array itself
...
An array is passed by pointer
...
1/
...
An array is passed by copying
...
N /, where N is the size of the array
...
An array is passed by copying only the subrange that might be accessed by the
called procedure
...
q p C 1/ if the subarray AŒp : : q is passed
...
Consider the recursive binary search algorithm for finding a number in a sorted
array (see Exercise 2
...
Give recurrences for the worst-case running times
of binary search when arrays are passed using each of the three methods above,
and give good upper bounds on the solutions of the recurrences
...

b
...
3
...


108

Chapter 4 Divide-and-Conquer

4-3 More recurrence examples
Give asymptotic upper and lower bounds for T
...
Assume that T
...
Make your bounds
as tight as possible, and justify your answers
...
T
...
n=3/ C n lg n
...
T
...
n=3/ C n= lg n
...
T
...
n=2/ C n2 n
...
T
...
n=3

2/ C n=2
...
T
...
n=2/ C n= lg n
...
T
...
n=2/ C T
...
n=8/ C n
...
T
...
n

1/ C 1=n
...
T
...
n

1/ C lg n
...
T
...
n 2/ C 1= lg n
...
T
...
n/ C n
...
22)
...
Define the generating function (or formal power series) F as
F
...

a
...
´/ D ´ C ´F
...


;

Problems for Chapter 4

109

b
...
´/ D
D
D

1

´
´ ´2
´


...
1 y´/
Ã
Â
1
1
1
;
p
´ 1 y´
5 1

where
p
1C 5
D 1:61803 : : :
D
2
and
yD

1

p
5
D
2

0:61803 : : : :

c
...

F
...
Use part (c) to proveˇthat Fi D
ˇ
(Hint: Observe that ˇ yˇ < 1
...


4-5 Chip testing
Professor Diogenes has n supposedly identical integrated-circuit chips that in principle are capable of testing each other
...
When the jig is loaded, each chip tests the other and reports whether
it is good or bad
...
Thus, the four
possible outcomes of a test are as follows:
Chip A says
B is good
B is good
B is bad
B is bad

Chip B says
A is good
A is bad
A is good
A is bad

Conclusion
both are good, or both are bad
at least one is bad
at least one is bad
at least one is bad

a
...
Assume that the bad chips can conspire to fool the professor
...
Consider the problem of finding a single good chip from among n chips, assuming that more than n=2 of the chips are good
...

c
...
n/ pairwise tests, assuming
that more than n=2 of the chips are good
...

4-6 Monge arrays
An m n array A of real numbers is a Monge array if for all i, j , k, and l such
that 1 Ä i < k Ä m and 1 Ä j < l Ä n, we have
AŒi; j  C AŒk; l Ä AŒi; l C AŒk; j  :
In other words, whenever we pick two rows and two columns of a Monge array and
consider the four elements at the intersections of the rows and the columns, the sum
of the upper-left and lower-right elements is less than or equal to the sum of the
lower-left and upper-right elements
...
Prove that an array is Monge if and only if for all i D 1; 2; :::; m
j D 1; 2; :::; n 1, we have

1 and

AŒi; j  C AŒi C 1; j C 1 Ä AŒi; j C 1 C AŒi C 1; j  :
(Hint: For the “if” part, use induction separately on rows and columns
...
The following array is not Monge
...
(Hint: Use part (a)
...
Let f
...
Prove that f
...
2/ Ä
Ä f
...

d
...

Recursively determine the leftmost minimum for each row of A0
...

Explain how to compute the leftmost minimum in the odd-numbered rows of A
(given that the leftmost minimum of the even-numbered rows is known) in
O
...

e
...
Show that its solution is O
...


Chapter notes
Divide-and-conquer as a technique for designing algorithms dates back to at least
1962 in an article by Karatsuba and Ofman [194]
...
F
...

The maximum-subarray problem in Section 4
...

Strassen’s algorithm [325] caused much excitement when it was published
in 1969
...
The asymptotic
upper bound for matrix multiplication has been improved since then
...
n2:376 /
...
n2 / bound (obvious because we must fill in n2
elements of the product matrix)
...
The constant factor hidden in the ‚
...
n3 /-time S QUARE -M ATRIX M ULTIPLY procedure
...
When the matrices are sparse, methods tailored for sparse matrices are faster
...
Strassen’s algorithm is not quite as numerically stable as S QUARE -M ATRIX M ULTIPLY
...

4
...

The latter two reasons were mitigated around 1990
...
Bailey, Lee, and Simon [32] discuss techniques for
reducing the memory requirements for Strassen’s algorithm
...
The exact value of the crossover point is highly system dependent
...
[186])
...
They found crossover points on various systems ranging from n D 400
to n D 2150, and they could not find a crossover point on a couple of systems
...
Fibonacci, for whom the Fibonacci numbers are named
...
De Moivre introduced the method of generating
functions (see Problem 4-4) for solving recurrences
...
6-2
...
Purdom and Brown [287] and Graham,
Knuth, and Patashnik [152] contain extended discussions of recurrence solving
...
We describe the result of Akra
and Bazzi here, as modified by Leighton [228]
...
1/
if 1 Ä x Ä x0 ;
(4
...
x/ D Pk
i D1 ai T
...
x/ if x > x0 ;
where
x

1 is a real number,

x0 is a constant such that x0

1=bi and x0

ai is a positive constant for i D 1; 2; : : : ; k,

1=
...
x/ is a nonnegative function that satisfies the polynomial-growth condi1, for
tion: there exist positive constants c1 and c2 such that for all x
i D 1; 2; : : : ; k, and for all u such that bi x Ä u Ä x, we have c1 f
...
u/ Ä c2 f
...
(If jf 0
...
x/ satisfies the polynomial-growth condition
...
x/ D x ˛ lgˇ x
satisfies this condition for any real constants ˛ and ˇ
...
n/ D
T
...
b2n=3c/ C O
...
To solve the rePk
currence (4
...

(Such a p always exists
...
u/
p
du
:
T
...
The master method is simpler to use, but it applies only when subproblem sizes are equal
...
If you
are unfamiliar with the basics of probability theory, you should read Appendix C,
which reviews this material
...


5
...
Your previous attempts at
hiring have been unsuccessful, and you decide to use an employment agency
...
You interview that person
and then decide either to hire that person or not
...
To actually hire an applicant is more
costly, however, since you must fire your current office assistant and pay a substantial hiring fee to the employment agency
...
Therefore, you decide that, after interviewing
each applicant, if that applicant is better qualified than the current office assistant,
you will fire the current office assistant and hire the new applicant
...

The procedure H IRE -A SSISTANT, given below, expresses this strategy for hiring
in pseudocode
...
The procedure assumes that you are able to, after interviewing
candidate i, determine whether candidate i is the best candidate you have seen so
far
...


5
...
n/
1 best D 0
/ candidate 0 is a least-qualified dummy candidate
/
2 for i D 1 to n
3
interview candidate i
4
if candidate i is better than candidate best
5
best D i
6
hire candidate i
The cost model for this problem differs from the model described in Chapter 2
...
On the surface, analyzing the cost of this algorithm may seem very different from analyzing the running time of, say, merge sort
...
In either case, we are counting the number of times certain
basic operations are executed
...
Letting m be the number of people hired, the total cost associated with this algorithm
is O
...
No matter how many people we hire, we always interview n
candidates and thus always incur the cost ci n associated with interviewing
...
This quantity varies with
each run of the algorithm
...
We often need to find the maximum or minimum value in a sequence by examining each
element of the sequence and maintaining a current “winner
...

Worst-case analysis
In the worst case, we actually hire every candidate that we interview
...
ch n/
...
In
fact, we have no idea about the order in which they arrive, nor do we have any
control over this order
...

Probabilistic analysis
Probabilistic analysis is the use of probability in the analysis of problems
...
Sometimes we use it to analyze other quantities, such as the hiring cost

116

Chapter 5 Probabilistic Analysis and Randomized Algorithms

in procedure H IRE -A SSISTANT
...

Then we analyze our algorithm, computing an average-case running time, where
we take the average over the distribution of the possible inputs
...
When reporting such a
running time, we will refer to it as the average-case running time
...
For some
problems, we may reasonably assume something about the set of all possible inputs, and then we can use probabilistic analysis as a technique for designing an
efficient algorithm and as a means for gaining insight into a problem
...

For the hiring problem, we can assume that the applicants come in a random
order
...
(See Appendix B for the definition of a total order
...
i/ to denote the rank of applicant i, and adopt the convention that a
higher rank corresponds to a better qualified applicant
...
1/;
rank
...
n/i is a permutation of the list h1; 2; : : : ; ni
...

Alternatively, we say that the ranks form a uniform random permutation; that is,
each of the possible nŠ permutations appears with equal probability
...
2 contains a probabilistic analysis of the hiring problem
...
In many cases, we know very little about the input distribution
...
Yet we often can use probability and randomness
as a tool for algorithm design and analysis, by making the behavior of part of the
algorithm random
...

Thus, in order to develop a randomized algorithm for the hiring problem, we must
have greater control over the order in which we interview the candidates
...
We say that the employment agency has n
candidates, and they send us a list of the candidates in advance
...
Although we know nothing about

5
...
Instead
of relying on a guess that the candidates come to us in a random order, we have
instead gained control of the process and enforced a random order
...
We shall assume that we have at our disposal a random-number generator
R ANDOM
...
a; b/ returns an integer between a and b, inclusive, with each such integer being equally likely
...
0; 1/
produces 0 with probability 1=2, and it produces 1 with probability 1=2
...
3; 7/ returns either 3, 4, 5, 6, or 7, each with probability 1=5
...

You may imagine R ANDOM as rolling a
...
(In practice, most programming environments offer a pseudorandom-number
generator: a deterministic algorithm returning numbers that “look” statistically
random
...
We distinguish these algorithms from those in which the input
is random by referring to the running time of a randomized algorithm as an expected running time
...

Exercises
5
...

5
...
a; b/ that only makes calls
to R ANDOM
...
What is the expected running time of your procedure, as a
function of a and b?
5
...

At your disposal is a procedure B IASED -R ANDOM , that outputs either 0 or 1
...
Give an algorithm that uses B IASED -R ANDOM
as a subroutine, and returns an unbiased answer, returning 0 with probability 1=2

118

Chapter 5 Probabilistic Analysis and Randomized Algorithms

and 1 with probability 1=2
...
2

Indicator random variables
In order to analyze many algorithms, including the hiring problem, we use indicator
random variables
...
Suppose we are given a sample
space S and an event A
...
1)
0 if A does not occur :
As a simple example, let us determine the expected number of heads that we
obtain when flipping a fair coin
...
We can then define an indicator random variable XH , associated
with the coin coming up heads, which is the event H
...
We write
XH

D I fH g
(
1 if H occurs ;
D
0 if T occurs :

The expected number of heads obtained in one flip of the coin is simply the expected value of our indicator variable XH :
E ŒXH  D
D
D
D

E ŒI fH g
1 Pr fH g C 0 Pr fT g
1
...
1=2/
1=2 :

Thus the expected number of heads obtained by one flip of a fair coin is 1=2
...

Lemma 5
...

Then E ŒXA  D Pr fAg
...
2 Indicator random variables

119

Proof By the definition of an indicator random variable from equation (5
...


Although indicator random variables may seem cumbersome for an application
such as counting the expected number of heads on a flip of a single coin, they are
useful for analyzing situations in which we perform repeated random trials
...
37)
...
The simpler method proposed in equation (C
...
Making this argument more explicit, we let Xi be the indicator
random variable associated with the event in which the ith flip comes up heads:
Xi D I fthe ith flip results in the event H g
...
By Lemma 5
...
By equation (C
...
Linearity of expectation makes the use of indicator random variables a
powerful analytical technique; it applies even when there is dependence among the
random variables
...
37), indicator random variables
greatly simplify the calculation
...

Analysis of the hiring problem using indicator random variables
Returning to the hiring problem, we now wish to compute the expected number of
times that we hire a new office assistant
...
(We shall see in Section 5
...
) Let X be the
random variable whose value equals the number of times we hire a new office assistant
...
20)
to obtain
E ŒX  D

n
X

x Pr fX D xg ;

xD1

but this calculation would be cumbersome
...

To use indicator random variables, instead of computing E ŒX  by defining one
variable associated with the number of times we hire a new office assistant, we
define n variables related to whether or not each particular candidate is hired
...
Thus,
Xi

D I fcandidate i is hiredg
(
1 if candidate i is hired ;
D
0 if candidate i is not hired ;

and
X D X1 C X2 C

C Xn :

(5
...
2 Indicator random variables

121

By Lemma 5
...

Candidate i is hired, in line 6, exactly when candidate i is better than each of
candidates 1 through i 1
...
Any one of
these first i candidates is equally likely to be the best-qualified so far
...
By Lemma 5
...
3)

Now we can compute E ŒX :
#
" n
X
Xi
(by equation (5
...
4)

i D1

D

n
X

E ŒXi 

(by linearity of expectation)

1=i

(by equation (5
...
1/ (by equation (A
...


(5
...
We summarize this result in the following lemma
...
2
Assuming that the candidates are presented in a random order, algorithm H IRE A SSISTANT has an average-case total hiring cost of O
...

Proof The bound follows immediately from our definition of the hiring cost
and equation (5
...

The average-case hiring cost is a significant improvement over the worst-case
hiring cost of O
...


122

Chapter 5 Probabilistic Analysis and Randomized Algorithms

Exercises
5
...
2-2
In H IRE -A SSISTANT, assuming that the candidates are presented in a random order, what is the probability that you hire exactly twice?
5
...

5
...
Each of n customers gives a hat to a hat-check person at a
restaurant
...
What is the expected number of customers who get back their own hat?
5
...
If i < j and AŒi > AŒj , then
the pair
...
(See Problem 2-4 for more on inversions
...
Use indicator random variables to compute the expected number of
inversions
...
3

Randomized algorithms
In the previous section, we showed how knowing a distribution on the inputs can
help us to analyze the average-case behavior of an algorithm
...
As mentioned
in Section 5
...

For a problem such as the hiring problem, in which it is helpful to assume that
all permutations of the input are equally likely, a probabilistic analysis can guide
the development of a randomized algorithm
...
In particular, before running the algorithm,
we randomly permute the candidates in order to enforce the property that every
permutation is equally likely
...
But now we expect

5
...

Let us further explore the distinction between probabilistic analysis and randomized algorithms
...
2, we claimed that, assuming that the candidates arrive in a random order, the expected number of times we hire a new office assistant
is about ln n
...
Furthermore,
the number of times we hire a new office assistant differs for different inputs, and it
depends on the ranks of the various candidates
...
e
...
1/; rank
...
n/i
...
Given the list of ranks A2 D h10; 9; 8; 7; 6; 5; 4; 3; 2; 1i,
a new office assistant is hired only once, in the first iteration
...
Recalling that the cost
of our algorithm depends on how many times we hire a new office assistant, we
see that there are expensive inputs such as A1 , inexpensive inputs such as A2 , and
moderately expensive inputs such as A3
...
In this case, we randomize in
the algorithm, not in the input distribution
...
The first time we run the algorithm on A3 ,
it may produce the permutation A1 and perform 10 updates; but the second time
we run the algorithm, we may produce the permutation A2 and perform only one
update
...

Each time we run the algorithm, the execution depends on the random choices
made and is likely to differ from the previous execution of the algorithm
...
Even your worst enemy cannot produce a bad input array,
since the random permutation makes the input order irrelevant
...

For the hiring problem, the only change needed in the code is to randomly permute the array
...
n/
1 randomly permute the list of candidates
2 best D 0
/ candidate 0 is a least-qualified dummy candidate
/
3 for i D 1 to n
4
interview candidate i
5
if candidate i is better than candidate best
6
best D i
7
hire candidate i
With this simple change, we have created a randomized algorithm whose performance matches that obtained by assuming that the candidates were presented in a
random order
...
3
The expected hiring cost of the procedure R ANDOMIZED -H IRE -A SSISTANT is
O
...

Proof After permuting the input array, we have achieved a situation identical to
that of the probabilistic analysis of H IRE -A SSISTANT
...
2 and 5
...
In Lemma 5
...
In Lemma 5
...
To remain consistent with our terminology, we
couched Lemma 5
...
3 in
terms of the expected hiring cost
...

Randomly permuting arrays
Many randomized algorithms randomize the input by permuting the given input
array
...
) Here, we shall discuss two
methods for doing so
...
Our goal is to produce a random
permutation of the array
...
For example, if our initial array is A D h1; 2; 3; 4i and we choose random priorities
P D h36; 3; 62; 19i, we would produce an array B D h2; 4; 1; 3i, since the second
priority is the smallest, followed by the fourth, then the first, and finally the third
...
3 Randomized algorithms

125

P ERMUTE -B Y-S ORTING
...
1; n3 /
5 sort A, using P as sort keys
Line 4 chooses a random number between 1 and n3
...
(Exercise 5
...
3-6 asks how to implement the algorithm even if two or more priorities
are identical
...

The time-consuming step in this procedure is the sorting in line 5
...
n lg n/ time
...
n lg n/
time
...
n lg n/ time in Part II
...
3-4 asks you to solve the very similar problem of sorting numbers in the
range 0 to n3 1 in O
...
) After sorting, if P Œi is the j th smallest priority,
then AŒi lies in position j of the output
...
It
remains to prove that the procedure produces a uniform random permutation, that
is, that the procedure is equally likely to produce every permutation of the numbers
1 through n
...
4
Procedure P ERMUTE - BY-S ORTING produces a uniform random permutation of the
input, assuming that all priorities are distinct
...
We shall show that this permutation
occurs with probability exactly 1=nŠ
...
Then we wish to compute the
probability that for all i, event Ei occurs, which is
Pr fE1 \ E2 \ E3 \

\ En

1

\ En g :

Using Exercise C
...
Next, we observe

126

Chapter 5 Probabilistic Analysis and Randomized Algorithms

that Pr fE2 j E1 g D 1=
...
In general, for i D 2; 3; : : : ; n, we have that
Pr fEi j Ei 1 \ Ei 2 \ \ E1 g D 1=
...
i 1/ elements has an equal chance of having the ith smallest priority
...

We can extend this proof to work for any permutation of priorities
...
1/;
...
n/i of the set f1; 2; : : : ; ng
...
If we define Ei as the event in which
element AŒi receives the
...
i /, the same proof
still applies
...

You might think that to prove that a permutation is a uniform random permutation, it suffices to show that, for each element AŒi, the probability that the element
winds up in position j is 1=n
...
3-4 shows that this weaker condition is,
in fact, insufficient
...
The procedure R ANDOMIZE -I N -P LACE does so in O
...
In
its ith iteration, it chooses the element AŒi randomly from among elements AŒi
through AŒn
...

R ANDOMIZE -I N -P LACE
...
i; n/
We shall use a loop invariant to show that procedure R ANDOMIZE -I N -P LACE
produces a uniform random permutation
...
(See
Appendix C
...
n k/Š such possible k-permutations
...
3 Randomized algorithms

127

Lemma 5
...

Proof

We use the following loop invariant:

Just prior to the ith iteration of the for loop of lines 2–3, for each possible

...
i 1/-permutation with probability
...

We need to show that this invariant is true prior to the first loop iteration, that each
iteration of the loop maintains the invariant, and that the invariant provides a useful
property to show correctness when the loop terminates
...
The loop invariant says that for each possible 0-permutation, the subarray AŒ1 : : 0 contains this 0-permutation with probability
...
The subarray AŒ1 : : 0 is an empty subarray, and a 0-permutation
has no elements
...

Maintenance: We assume that just before the ith iteration, each possible

...
n i C 1/Š=nŠ, and we shall show that after the ith iteration, each possible
i-permutation appears in the subarray AŒ1 : : i with probability
...

Incrementing i for the next iteration then maintains the loop invariant
...
Consider a particular i-permutation, and denote the elements in it by hx1 ; x2 ; : : : ; xi i
...
i 1/-permutation hx1 ; : : : ; xi 1 i followed by the value xi that the algorithm
places in AŒi
...
i 1/-permutation hx1 ; : : : ; xi 1 i in AŒ1 : : i 1
...
n i C 1/Š=nŠ
...
The i-permutation hx1 ; : : : ; xi i appears in AŒ1 : : i precisely when both E1 and E2 occur, and so we wish to compute Pr fE2 \ E1 g
...
14), we have
Pr fE2 \ E1 g D Pr fE2 j E1 g Pr fE1 g :
The probability Pr fE2 j E1 g equals 1=
...
Thus, we
have

128

Chapter 5 Probabilistic Analysis and Randomized Algorithms

Pr fE2 \ E1 g D Pr fE2 j E1 g Pr fE1 g

...
n i/Š
:
D

Termination: At termination, i D n C 1, and we have that the subarray AŒ1 : : n
is a given n-permutation with probability
...
nC1/C1/=nŠ D 0Š=nŠ D 1=nŠ
...

A randomized algorithm is often the simplest and most efficient way to solve a
problem
...

Exercises
5
...
5
...
He reasons that we could just
as easily declare that an empty subarray contains no 0-permutations
...
Rewrite the procedure
R ANDOMIZE -I N -P LACE so that its associated loop invariant applies to a nonempty
subarray prior to the first iteration, and modify the proof of Lemma 5
...

5
...
He proposes the following procedure:
P ERMUTE -W ITHOUT-I DENTITY
...
i C 1; n/
Does this code do what Professor Kelp intends?
5
...
3 Randomized algorithms

129

P ERMUTE -W ITH -A LL
...
1; n/
Does this code produce a uniform random permutation? Why or why not?
5
...
A/
1 n D A:length
2 let BŒ1 : : n be a new array
3 offset D R ANDOM
...
Then show that Professor Armstrong is mistaken by showing that
the resulting permutation is not uniformly random
...
3-5 ?
Prove that in the array P in procedure P ERMUTE -B Y-S ORTING, the probability
that all elements are unique is at least 1 1=n
...
3-6
Explain how to implement the algorithm P ERMUTE -B Y-S ORTING to handle the
case in which two or more priorities are identical
...

5
...
One way would be to set AŒi D i for i D 1; 2; 3; : : : ; n,
call R ANDOMIZE -I N -P LACE
...

This method would make n calls to the R ANDOM procedure
...
Show that

130

Chapter 5 Probabilistic Analysis and Randomized Algorithms

the following recursive procedure returns a random m-subset S of f1; 2; 3; : : : ; ng,
in which each m-subset is equally likely, while making only m calls to R ANDOM:
R ANDOM -S AMPLE
...
m
4
i D R ANDOM
...
4 Probabilistic analysis and further uses of indicator random variables
This advanced section further illustrates probabilistic analysis by way of four examples
...
The second example examines what happens when
we randomly toss balls into bins
...
The final example analyzes a variant of the hiring problem in which you have to make decisions without actually interviewing all the
candidates
...
4
...
How many people must there be in a
room before there is a 50% chance that two of them were born on the same day of
the year? The answer is surprisingly few
...

To answer this question, we index the people in the room with the integers
1; 2; : : : ; k, where k is the number of people in the room
...
For i D 1; 2; : : : ; k,
let bi be the day of the year on which person i’s birthday falls, where 1 Ä bi Ä n
...

The probability that two given people, say i and j , have matching birthdays
depends on whether the random selection of birthdays is independent
...
4 Probabilistic analysis and further uses of indicator random variables

131

and j ’s birthday both fall on day r is
Pr fbi D r and bj D rg D Pr fbi D rg Pr fbj D rg
D 1=n2 :
Thus, the probability that they both fall on the same day is
Pr fbi D bj g D
D

n
X
rD1
n
X

Pr fbi D r and bj D rg

...
6)

More intuitively, once bi is chosen, the probability that bj is chosen to be the same
day is 1=n
...
Notice,
however, that this coincidence depends on the assumption that the birthdays are
independent
...
The probability that at least two
of the birthdays match is 1 minus the probability that all the birthdays are different
...
Since we can write Bk D Ak \ Bk 1 , we obtain from equation (C
...
7)

where we take Pr fB1 g D Pr fA1 g D 1 as an initial condition
...

If b1 ; b2 ; : : : ; bk 1 are distinct, the conditional probability that bk ¤ bi for
i D 1; 2; : : : ; k 1 is Pr fAk j Bk 1 g D
...
k 1/ days are not taken
...
7) to obtain

132

Chapter 5 Probabilistic Analysis and Randomized Algorithms

Pr fBk g D Pr fBk 1 g Pr fAk j Bk 1 g
D Pr fBk 2 g Pr fAk 1 j Bk 2 g Pr fAk j Bk 1 g
:
:
:
D Pr fB1 g Pr fA2 j B1 g Pr fA3 j B2 g Pr fAk j Bk 1 g
ÃÂ
à Â
Ã
Â
n 2
n kC1
n 1
D 1
n
n
n
ÃÂ
à Â
Ã
Â
2
k 1
1
1
1
:
D 1 1
n
n
n
Inequality (3
...
k 1/=n

1

D e i D1 i=n
D e k
...
k 1/=2n Ä ln
...
The probability that all k birthdays are distinct
is at most 1=2 when k
...
Thus, if at
k

...
8 ln 2/n/=2
...
On Mars, a year is 669 Martian days long; it therefore
takes 31 Martians to get the same effect
...
For each pair
...
6), the probability that two people have matching birthdays is 1=n,
and thus by Lemma 5
...
4 Probabilistic analysis and further uses of indicator random variables

XD

k
k
X X

133

Xij :

i D1 j Di C1

Taking expectations of both sides and applying linearity of expectation, we obtain
" k
#
k
X X
E ŒX  D E
Xij
i D1 j Di C1

D

k
X

k
X

E ŒXij 

i D1 j Di C1

D
D

!
k 1
2 n

k
...
k 1/ 2n, therefore, the expected number of pairs of people with the
p
same birthday is at least 1
...
For n D 365, if k D 28, the
expected number of pairs with the same birthday is
...
2 365/ 1:0356
...
On Mars, where a year is 669 Martian days long, we need at least 38 Martians
...
Although
the exact numbers of people differ for the two situations, they are the same asympp
totically: ‚
...

5
...
2 Balls and bins
Consider a process in which we randomly toss identical balls into b bins, numbered
1; 2; : : : ; b
...
The probability that a tossed ball lands in any given bin is 1=b
...
4)
with a probability 1=b of success, where success means that the ball falls in the
given bin
...

(Problem C-1 asks additional questions about balls and bins
...
kI n; 1=b/
...
37)
tells us that the expected number of balls that fall in the given bin is n=b
...
32), the expected number of
tosses until success is 1=
...

How many balls must we toss until every bin contains at least one ball? Let us
call a toss in which a ball falls into an empty bin a “hit
...

Using the hits, we can partition the n tosses into stages
...
i 1/st hit until the ith hit
...
For each toss
during the ith stage, i 1 bins contain balls and b i C 1 bins are empty
...
b i C 1/=b
...
Thus, the number of tosses
Pb
required to get b hits is n D i D1 ni
...
b i C 1/=b and thus, by equation (C
...
ln b C O
...
7))
...
This problem is also known as the coupon collector’s problem, which
says that a person trying to collect each of b different coupons expects to acquire
approximately b ln b randomly obtained coupons in order to succeed
...
4 Probabilistic analysis and further uses of indicator random variables

135

5
...
3 Streaks
Suppose you flip a fair coin n times
...
lg n/, as the following analysis
shows
...
lg n/
...
Let Ai k be the event that a
streak of heads of length at least k begins with the ith coin flip or, more precisely,
the event that the k consecutive coin flips i; i C 1; : : : ; i C k 1 yield only heads,
where 1 Ä k Ä n and 1 Ä i Ä n k C1
...
8)

For k D 2 dlg ne,
Pr fAi;2dlg ne g D 1=22dlg ne
Ä 1=22 lg n
D 1=n2 ;
and thus the probability that a streak of heads of length at least 2 dlg ne begins in
position i is quite small
...
The probability that a streak of heads of length at least 2 dlg ne
begins anywhere is therefore
)
(n 2dlg neC1
n 2dlg neC1
[
X
Ai;2dlg ne
1=n2
Ä
Pr
i D1

i D1

<

n
X

1=n2

i D1

D 1=n ;

(5
...
19), the probability of a union of events is at most
the sum of the probabilities of the individual events
...
)
We now use inequality (5
...
For
j D 0; 1; 2; : : : ; n, let Lj be the event that the longest streak of heads has length exactly j , and let L be the length of the longest streak
...
10)

136

Chapter 5 Probabilistic Analysis and Randomized Algorithms

We could try to evaluate this sum using upper bounds on each Pr fLj g similar to
those computed in inequality (5
...
Unfortunately, this method would yield weak
bounds
...
Informally, we observe that for no individual term in the summation in equation (5
...
Why? When
j 2 dlg ne, then Pr fLj g is very small, and when j < 2 dlg ne, then j is fairly
small
...
By inequality (5
...

P2dlg ne 1
Pn
Pr fLj g Ä 1
...
2 dlg ne/ Pr fLj g C

j D0

j D2dlg ne

X

2dlg ne 1

D 2 dlg ne

n Pr fLj g

Pr fLj g C n

j D0

n
X

Pr fLj g

j D2dlg ne

< 2 dlg ne 1 C n
...
lg n/ :
The probability that a streak of heads exceeds r dlg ne flips diminishes quickly
with r
...

As an example, for n D 1000 coin flips, the probability of having a streak of at
least 2 dlg ne D 20 heads is at most 1=n D 1=1000
...

We now prove a complementary lower bound: the expected length of the longest
streak of heads in n coin flips is
...
To prove this bound, we look for streaks

5
...
If we choose s D b
...
lg n/
...
lg n/
...
lg n/=2cc groups of b
...

By equation (5
...
lg n/=2c g D 1=2b
...
lg n/=2c does not begin
p
in position i is therefore at most 1 1= n
...
lg n/=2cc groups are
formed from mutually exclusive, independent coin flips, the probability that every
one of these groups fails to be a streak of length b
...
lg n/=2c 1
p bn=b
...
2n= lg n 1/=
D O
...
1=n/ :

p
n

For this argument, we used inequality (3
...
2n= lg n 1/= n lg n for sufficiently large n
...
lg n/=2c is
n
X

Pr fLj g

1

O
...
11)

j Db
...
10) and proceeding in a manner similar to our analysis
of the upper bound:

138

Chapter 5 Probabilistic Analysis and Randomized Algorithms

E ŒL D

n
X

j Pr fLj g

j D0

X

n
X

b
...
lg n/=2c

0 Pr fLj g C

j D0

n
X

X

Pr fLj g C b
...
lg n/=2c Pr fLj g

j Db
...
lg n/=2c

D 0

j Pr fLj g

j Db
...
lg n/=2c
...
lg n/ :

n
X

Pr fLj g

j Db
...
1=n//

(by inequality (5
...
We let Xi k D I fAi k g be the indicator random
variable associated with a streak of heads of length at least k beginning with the
ith coin flip
...
If this number is large (much greater than 1), then we expect
many streaks of length k to occur and the probability that one occurs is high
...
4 Probabilistic analysis and further uses of indicator random variables

139

this number is small (much less than 1), then we expect few streaks of length k to
occur and the probability that one occurs is low
...
c lg n 1/=n
D
nc 1
nc 1
c 1
D ‚
...
On the other hand, if c D 1=2, then we obtain
E ŒX  D ‚
...
n1=2 /, and we expect that there are a large number
of streaks of length
...
Therefore, one streak of such a length is likely to
occur
...
lg n/
...
4
...
Suppose now that
we do not wish to interview all the candidates in order to find the best one
...
Instead, we
are willing to settle for a candidate who is close to the best, in exchange for hiring
exactly once
...
What is the trade-off between minimizing the amount of interviewing
and maximizing the quality of the candidate hired?
We can model this problem in the following way
...
i/ denote the score we give to the ith
applicant, and assume that no two applicants receive the same score
...
We
decide to adopt the strategy of selecting a positive integer k < n, interviewing and
then rejecting the first k applicants, and hiring the first applicant thereafter who has
a higher score than all preceding applicants
...
We
formalize this strategy in the procedure O N -L INE -M AXIMUM
...


140

Chapter 5 Probabilistic Analysis and Randomized Algorithms

O N -L INE -M AXIMUM
...
i/ > bestscore
4
bestscore D score
...
i/ > bestscore
7
return i
8 return n
We wish to determine, for each possible value of k, the probability that we
hire the most qualified applicant
...
For the moment, assume that k is
fixed
...
j / D max1Äi Äj fscore
...
Let S be the event that we succeed in choosing the bestqualified applicant, and let Si be the event that we succeed when the best-qualified
applicant is the ith one interviewed
...
Noting that we never succeed when the best-qualified
applicant is one of the first k, we have that Pr fSi g D 0 for i D 1; 2; : : : ; k
...
12)

i DkC1

We now compute Pr fSi g
...
First, the best-qualified applicant must be
in position i, an event which we denote by Bi
...
j / < bestscore in line 6
...
j / D bestscore
...
k C 1/ through score
...
k/; if any are greater than M
...
We use Oi to denote the event that none of the applicants in
position k C 1 through i 1 are chosen
...
The event Oi depends only on the relative ordering of the values
in positions 1 through i 1, whereas Bi depends only on whether the value in
position i is greater than the values in all other positions
...
Thus we can apply equation (C
...
4 Probabilistic analysis and further uses of indicator random variables

141

Pr fSi g D Pr fBi \ Oi g D Pr fBi g Pr fOi g :
The probability Pr fBi g is clearly 1=n, since the maximum is equally likely to
be in any one of the n positions
...
Consequently, Pr fOi g D k=
...
n
...
Using equation (5
...
i

i DkC1
n
X

k
n

i DkC1

1/
1

i

1

kX1
:
n
i
n 1

D

i Dk

We approximate by integrals to bound this summation from above and below
...
12), we have
Z n
n 1
X1 Z n 11
1
dx Ä
Ä
dx :
i
k x
k 1 x
i Dk

Evaluating these definite integrals gives us the bounds
k

...
ln
...
k

1// ;

which provide a rather tight bound for Pr fSg
...
(Besides, the lower-bound expression is easier to maximize
than the upper-bound expression
...
k=n/
...
ln n ln k 1/ :
n
Setting this derivative equal to 0, we see that we maximize the lower bound on the
probability when ln k D ln n 1 D ln
...
Thus,
if we implement our strategy with k D n=e, we succeed in hiring our best-qualified
applicant with probability at least 1=e
...
4-1
How many people must there be in a room before the probability that someone
has the same birthday as you do is at least 1=2? How many people must there be
before the probability that at least two people have a birthday on July 4 is greater
than 1=2?
5
...
Each toss
is independent, and each ball is equally likely to end up in any bin
...
4-3 ?
For the analysis of the birthday paradox, is it important that the birthdays be mutually independent, or is pairwise independence sufficient? Justify your answer
...
4-4 ?
How many people should be invited to a party in order to make it likely that there
are three people with the same birthday?
5
...
4-6 ?
Suppose that n balls are tossed into n bins, where each toss is independent and the
ball is equally likely to end up in any bin
...
4-7 ?
Sharpen the lower bound on streak length by showing that in n flips of a fair coin,
the probability is less than 1=n that no streak longer than lg n 2 lg lg n consecutive
heads occurs
...
With R
...

We let a counter value of i represent a count of ni for i D 0; 1; : : : ; 2b 1, where
the ni form an increasing sequence of nonnegative values
...
The I NCREMENT
operation works on a counter containing the value i in a probabilistic manner
...
Otherwise, the I NCRE MENT operation increases the counter by 1 with probability 1=
...
ni C1 ni /
...
More
If we select ni D i for all i
interesting situations arise if we select, say, ni D 2i 1 for i > 0 or ni D Fi (the
ith Fibonacci number—see Section 3
...

For this problem, assume that n2b 1 is large enough that the probability of an
overflow error is negligible
...
Show that the expected value represented by the counter after n I NCREMENT
operations have been performed is exactly n
...
The analysis of the variance of the count represented by the counter depends
on the sequence of the ni
...
Estimate the variance in the value represented by the register after n
I NCREMENT operations have been performed
...

Consider the following randomized strategy: pick a random index i into A
...
We continue picking random indices into A until we find an
index j such that AŒj  D x or until we have checked every element of A
...

a
...
Be sure that your algorithm terminates when all indices into A have
been picked
...
Suppose that there is exactly one index i such that AŒi D x
...
Generalizing your solution to part (b), suppose that there are k
1 indices i
such that AŒi D x
...

d
...
What is the expected
number of indices into A that we must pick before we have checked all elements
of A and R ANDOM -S EARCH terminates?
Now consider a deterministic linear search algorithm, which we refer to as
D ETERMINISTIC -S EARCH
...
Assume that all possible permutations of the input array are
equally likely
...
Suppose that there is exactly one index i such that AŒi D x
...
Generalizing your solution to part (e), suppose that there are k
1 indices i
such that AŒi D x
...

g
...
What is the average-case
running time of D ETERMINISTIC -S EARCH? What is the worst-case running
time of D ETERMINISTIC -S EARCH?
Finally, consider a randomized algorithm S CRAMBLE -S EARCH that works by
first randomly permuting the input array and then running the deterministic linear search given above on the resulting permuted array
...
Letting k be the number of indices i such that AŒi D x, give the worst-case and
expected running times of S CRAMBLE -S EARCH for the cases in which k D 0
and k D 1
...

i
...


Notes for Chapter 5

145

Chapter notes
Bollob´ s [53], Hofri [174], and Spencer [321] contain a wealth of advanced proba
abilistic techniques
...
The textbook by Motwani and Raghavan
[262] gives an extensive treatment of randomized algorithms
...
These problems
are more commonly referred to as “secretary problems
...


II

Sorting and Order Statistics

Introduction
This part presents several algorithms that solve the following sorting problem:
Input: A sequence of n numbers ha1 ; a2 ; : : : ; an i
...

that a1 Ä a2 Ä

The input sequence is usually an n-element array, although it may be represented
in some other fashion, such as a linked list
...
Each is usually part
of a collection of data called a record
...
The remainder of the record consists of satellite data, which are
usually carried around with the key
...
If each record includes a large
amount of satellite data, we often permute an array of pointers to the records rather
than the records themselves in order to minimize data movement
...
A sorting algorithm describes the method by which we
determine the sorted order, regardless of whether we are sorting individual numbers
or large records containing many bytes of satellite data
...

Translating an algorithm for sorting numbers into a program for sorting records

148

Part II Sorting and Order Statistics

is conceptually straightforward, although in a given engineering situation other
subtleties may make the actual programming task a challenge
...
There are several reasons:
Sometimes an application inherently needs to sort information
...

Algorithms often use sorting as a key subroutine
...
We shall see numerous algorithms in this text that
use sorting as a subroutine
...
In fact, many important techniques used throughout algorithm design appear in the body of sorting algorithms that have been
developed over the years
...

We can prove a nontrivial lower bound for sorting (as we shall do in Chapter 8)
...
Moreover, we can use
the lower bound for sorting to prove lower bounds for certain other problems
...
The fastest sorting program for a particular situation may depend on
many factors, such as prior knowledge about the keys and satellite data, the
memory hierarchy (caches and virtual memory) of the host computer, and the
software environment
...

Sorting algorithms
We introduced two algorithms that sort n real numbers in Chapter 2
...
n2 / time in the worst case
...
(Recall that a sorting
algorithm sorts in place if only a constant number of elements of the input array are ever stored outside the array
...
n lg n/, but the M ERGE procedure it uses does not operate in place
...
Heapsort, presented in Chapter 6, sorts n numbers in place in O
...

It uses an important data structure, called a heap, with which we can also implement a priority queue
...
n2 /
...
n lg n/, however, and it generally
outperforms heapsort in practice
...
It is a popular algorithm
for sorting large input arrays
...
Chapter 8 begins by introducing the decision-tree model in order to study the performance limitations of comparison sorts
...
n lg n/
on the worst-case running time of any comparison sort on n inputs, thus showing
that heapsort and merge sort are asymptotically optimal comparison sorts
...
n lg n/
if we can gather information about the sorted order of the input by means other
than comparing elements
...
By using array indexing as a tool
for determining relative order, counting sort can sort n numbers in ‚
...

Thus, when k D O
...
A related algorithm, radix sort, can be used to extend the range of
counting sort
...
d
...
When d is a constant and k is O
...
A third algorithm, bucket sort, requires knowledge of the probabilistic
distribution of numbers in the input array
...
n/ time
...
As usual, n denotes the number of items to sort
...
For radix sort, each item
is a d -digit number, where each digit takes on k possible values
...
The rightmost column gives the average-case or expected running
time, indicating which it gives when it differs from the worst-case running time
...


150

Part II Sorting and Order Statistics

Algorithm

Worst-case
running time

Average-case/expected
running time

Insertion sort
Merge sort
Heapsort
Quicksort
Counting sort
Radix sort
Bucket sort


...
n lg n/
O
...
n2 /

...
d
...
n2 /


...
n lg n/


...
k C n/

...
n C k//

...

We can, of course, select the ith order statistic by sorting the input and indexing
the ith element of the output
...
n lg n/ time, as the lower bound proved in Chapter 8 shows
...
n/ time,
even when the elements are arbitrary real numbers
...
n2 / time in the worst case, but whose
expected running time is O
...
We also give a more complicated algorithm that
runs in O
...

Background
Although most of this part does not rely on difficult mathematics, some sections
do require mathematical sophistication
...
The analysis of the worst-case linear-time algorithm for order statistics involves somewhat more sophisticated mathematics than the other worst-case
analyses in this part
...
Like merge sort,
but unlike insertion sort, heapsort’s running time is O
...
Like insertion sort,
but unlike merge sort, heapsort sorts in place: only a constant number of array
elements are stored outside the input array at any time
...

Heapsort also introduces another algorithm design technique: using a data structure, in this case one we call a “heap,” to manage information
...
The
heap data structure will reappear in algorithms in later chapters
...
Our heap data structure is not garbage-collected storage,
and whenever we refer to heaps in this book, we shall mean a data structure rather
than an aspect of garbage collection
...
1 Heaps
The (binary) heap data structure is an array object that we can view as a
nearly complete binary tree (see Section B
...
3), as shown in Figure 6
...
Each
node of the tree corresponds to an element of the array
...
An array A that represents a heap is an object with two attributes: A:length, which (as usual) gives the number of elements in the array, and
A:heap-size, which represents how many elements in the heap are stored within
array A
...
The root of the tree is AŒ1, and given the index i of a node, we
can easily compute the indices of its parent, left child, and right child:

152

Chapter 6 Heapsort

1

16
2

3

14

10

4

5

9

4

9

3

1

2

3

4

5

6

7

8

9

10

16 14 10 8

7

9

3

2

4

1

10

2

7

7

8
8

6

1
(a)

(b)

Figure 6
...
The number within the circle
at each node in the tree is the value stored at that node
...
Above and below the array are lines showing parent-child relationships; parents
are always to the left of their children
...


PARENT
...
i/
1 return 2i
R IGHT
...
Similarly, the
R IGHT procedure can quickly compute 2i C 1 by shifting the binary representation
of i left by one bit position and then adding in a 1 as the low-order bit
...
Good
implementations of heapsort often implement these procedures as “macros” or “inline” procedures
...
In both kinds,
the values in the nodes satisfy a heap property, the specifics of which depend on
the kind of heap
...
i/

AŒi ;

that is, the value of a node is at most the value of its parent
...
1 Heaps

153

values no larger than that contained at the node itself
...
i/ Ä AŒi :
The smallest element in a min-heap is at the root
...
Min-heaps commonly implement priority queues, which we discuss in Section 6
...
We shall be precise in
specifying whether we need a max-heap or a min-heap for any particular application, and when properties apply to either max-heaps or min-heaps, we just use the
term “heap
...
Since a heap of n elements is based on a complete binary tree, its height is ‚
...
1-2)
...
lg n/ time
...

The M AX -H EAPIFY procedure, which runs in O
...

The B UILD -M AX -H EAP procedure, which runs in linear time, produces a maxheap from an unordered input array
...
n lg n/ time, sorts an array in
place
...
lg n/ time, allow the heap
data structure to implement a priority queue
...
1-1
What are the minimum and maximum numbers of elements in a heap of height h?
6
...

6
...


154

Chapter 6 Heapsort

6
...
1-5
Is an array that is in sorted order a min-heap?
6
...
1-7
Show that, with the array representation for storing an n-element heap, the leaves
are the nodes indexed by bn=2c C 1; bn=2c C 2; : : : ; n
...
2

Maintaining the heap property
In order to maintain the max-heap property, we call the procedure M AX -H EAPIFY
...
When it is called, M AX H EAPIFY assumes that the binary trees rooted at L EFT
...
i/ are maxheaps, but that AŒi might be smaller than its children, thus violating the max-heap
property
...

M AX -H EAPIFY
...
i/
2 r D R IGHT
...
A; largest/
Figure 6
...
At each step, the largest of
the elements AŒi, AŒL EFT
...
i/ is determined, and its index is
stored in largest
...
Otherwise, one of the two children has the
largest element, and AŒi is swapped with AŒlargest, which causes node i and its

6
...
2 The action of M AX -H EAPIFY
...
(a) The initial configuration, with AŒ2 at node i D 2 violating the max-heap property since it is not larger than
both children
...
The recursive call M AX -H EAPIFY
...
After swapping AŒ4 with AŒ9, as shown in (c), node 4 is fixed up, and the recursive call
M AX -H EAPIFY
...


children to satisfy the max-heap property
...
Consequently, we call M AX -H EAPIFY recursively on that
subtree
...
1/ time to fix up the relationships among the elements AŒi,
AŒL EFT
...
i/, plus the time to run M AX -H EAPIFY on a subtree
rooted at one of the children of node i (assuming that the recursive call occurs)
...
n/ Ä T
...
1/ :

156

Chapter 6 Heapsort

The solution to this recurrence, by case 2 of the master theorem (Theorem 4
...
n/ D O
...
Alternatively, we can characterize the running time of M AX H EAPIFY on a node of height h as O
...

Exercises
6
...
2 as a model, illustrate the operation of M AX -H EAPIFY
...

6
...
A; i/, which performs the corresponding manipulation on a minheap
...
2-3
What is the effect of calling M AX -H EAPIFY
...
2-4
What is the effect of calling M AX -H EAPIFY
...
2-5
The code for M AX -H EAPIFY is quite efficient in terms of constant factors, except
possibly for the recursive call in line 10, which might cause some compilers to
produce inefficient code
...

6
...
lg n/
...
)

6
...
By Exercise 6
...
bn=2cC1/ : : n are all leaves of the tree, and so each is

6
...
The procedure B UILD -M AX -H EAP goes through
the remaining nodes of the tree and runs M AX -H EAPIFY on each one
...
A/
1 A:heap-size D A:length
2 for i D bA:length=2c downto 1
3
M AX -H EAPIFY
...
3 shows an example of the action of B UILD -M AX -H EAP
...

We need to show that this invariant is true prior to the first loop iteration, that each
iteration of the loop maintains the invariant, and that the invariant provides a useful
property to show correctness when the loop terminates
...
Each node
bn=2c C 1; bn=2c C 2; : : : ; n is a leaf and is thus the root of a trivial max-heap
...
By the loop invariant, therefore, they are both roots of max-heaps
...
A; i/ to make node i a max-heap root
...
Decrementing i in the for loop update reestablishes
the loop invariant for the next iteration
...
By the loop invariant, each node 1; 2; : : : ; n
is the root of a max-heap
...

We can compute a simple upper bound on the running time of B UILD -M AX H EAP as follows
...
lg n/ time, and B UILD M AX -H EAP makes O
...
Thus, the running time is O
...
This
upper bound, though correct, is not asymptotically tight
...
Our tighter analysis relies on the properties that an n-element heap
˙
has height blg nc (see Exercise 6
...
3-3)
...
h/,
and so we can express the total cost of B UILD -M AX -H EAP as being bounded from
above by

158

Chapter 6 Heapsort

A 4

1

3

2 16 9 10 14 8

7

1

1

4

4

2

3

1

3

4

5

2

2

1

6

7

9

i 16

3

10

3

4

i

5

9

10

8

9

8

7

14

8

9

10

10

14

7

16

2

8

6

7

(a)

(b)

1

1

4

4

2

3

1

3

2

i

i

3

1

10

4

5

6

7

4

5

6

7

14

16

9

10

14

16

9

3

8

9

10

2

8

8

7

9

10

2

8

7

(c)
1

i

(d)
1

4

16

2

3

2

3

16

10

14

10

4

5

9

8

9

3

10

2

7

7

14
8

6

1

4

5

4

9

3

10

2
(e)

9

7

7

8
8

6

1
(f)

Figure 6
...
(a) A 10-element input array A and the binary tree it represents
...
A; i/
...
The loop index i for the next iteration
refers to node 4
...
Observe that
whenever M AX -H EAPIFY is called on a node, the two subtrees of that node are both max-heaps
...


6
...
h/ D O n
2h
blg nc

!
:

hD0

We evalaute the last summation by substituting x D 1=2 in the formula (A
...
1

1=2
1=2/2

D 2:
Thus, we can bound the running time of B UILD -M AX -H EAP as
!
!
blg nc
1
X h
X h
D O n
O n
2h
2h
hD0

hD0

D O
...

We can build a min-heap by the procedure B UILD -M IN -H EAP, which is the
same as B UILD -M AX -H EAP but with the call to M AX -H EAPIFY in line 3 replaced
by a call to M IN -H EAPIFY (see Exercise 6
...
B UILD -M IN -H EAP produces a
min-heap from an unordered linear array in linear time
...
3-1
Using Figure 6
...

6
...
3-3
˙
Show that there are at most n=2hC1 nodes of height h in any n-element heap
...
4 The heapsort algorithm
The heapsort algorithm starts by using B UILD -M AX -H EAP to build a max-heap
on the input array AŒ1 : : n, where n D A:length
...
If we now discard node n from the heap—and we
can do so by simply decrementing A:heap-size—we observe that the children of
the root remain max-heaps, but the new root element might violate the max-heap
property
...
A; 1/, which leaves a max-heap in AŒ1 : : n 1
...
(See Exercise 6
...
)
H EAPSORT
...
A/
2 for i D A:length downto 2
3
exchange AŒ1 with AŒi
4
A:heap-size D A:heap-size
5
M AX -H EAPIFY
...
4 shows an example of the operation of H EAPSORT after line 1 has built
the initial max-heap
...

The H EAPSORT procedure takes time O
...
n/ and each of the n 1 calls to M AX -H EAPIFY takes
time O
...

Exercises
6
...
4 as a model, illustrate the operation of H EAPSORT on the array
A D h5; 13; 2; 25; 7; 17; 20; 8; 4i
...
4-2
Argue the correctness of H EAPSORT using the following loop invariant:
At the start of each iteration of the for loop of lines 2–5, the subarray
AŒ1 : : i is a max-heap containing the i smallest elements of AŒ1 : : n, and
the subarray AŒi C 1 : : n contains the n i largest elements of AŒ1 : : n,
sorted
...
4-3
What is the running time of H EAPSORT on an array A of length n that is already
sorted in increasing order? What about decreasing order?
6
...
n lg n/
...
4 The heapsort algorithm

161

16

14

14

10

8
2

7
4

9

10

8
3

10

4

1

2

7
1

9

8
3

16 i

9

4

7

1

2

(a)

(b)

(c)

9

8

7

8

3

4
i 10

7
14

1

7
2

3

4

16

10

2
14

1

4
i 9

16

3

1
10

2
14

8 i

(e)

(f)

4

3

2

2

10

3
i 7

14

9

16

(d)

1

3

i
14 16

8

2
9

1

i 4

16

10

7
14

(g)

8

1
9

16

4
10

(h)

3 i
7

14

8

9

16
(i)

1
i 2

3
A

4
10

7
14

8

1

2 3

4 7

8 9 10 14 16

9

16
(j)

(k)

Figure 6
...
(a) The max-heap data structure just after B UILD -M AX H EAP has built it in line 1
...
Only lightly shaded nodes remain in the heap
...


162

Chapter 6 Heapsort

6
...
n lg n/
...
5

Priority queues
Heapsort is an excellent algorithm, but a good implementation of quicksort, presented in Chapter 7, usually beats it in practice
...
In this section, we present one of the most popular applications of a heap: as an efficient priority queue
...
We will focus
here on how to implement max-priority queues, which are in turn based on maxheaps; Exercise 6
...

A priority queue is a data structure for maintaining a set S of elements, each
with an associated value called a key
...
S; x/ inserts the element x into the set S, which is equivalent to the operation S D S [ fxg
...
S/ returns the element of S with the largest key
...
S/ removes and returns the element of S with the largest key
...
S; x; k/ increases the value of element x’s key to the new value k,
which is assumed to be at least as large as x’s current key value
...
The max-priority queue keeps track of the jobs to
be performed and their relative priorities
...
The scheduler can add a new job to the queue at any time by
calling I NSERT
...
A min-priority queue can be used in an
event-driven simulator
...
The events must be
simulated in order of their time of occurrence, because the simulation of an event
can cause other events to be simulated in the future
...
As new events are
produced, the simulator inserts them into the min-priority queue by calling I NSERT
...
5 Priority queues

163

We shall see other uses for min-priority queues, highlighting the D ECREASE -K EY
operation, in Chapters 23 and 24
...
In a given application, such as job scheduling or event-driven simulation, elements of a priority
queue correspond to objects in the application
...

When we use a heap to implement a priority queue, therefore, we often need to
store a handle to the corresponding application object in each heap element
...
Similarly, we need to store a handle to the corresponding heap element
in each application object
...

Because heap elements change locations within the array during heap operations,
an actual implementation, upon relocating a heap element, would also have to update the array index in the corresponding application object
...

Now we discuss how to implement the operations of a max-priority queue
...
1/ time
...
A/
1 return AŒ1
The procedure H EAP -E XTRACT-M AX implements the E XTRACT-M AX operation
...

H EAP -E XTRACT-M AX
...
A; 1/
7 return max

1

The running time of H EAP -E XTRACT-M AX is O
...
lg n/ time for M AX -H EAPIFY
...
An index i into the array identifies the priority-queue element whose key we
wish to increase
...
Because increasing the key of AŒi might violate the max-heap property,

164

Chapter 6 Heapsort

the procedure then, in a manner reminiscent of the insertion loop (lines 5–7) of
I NSERTION -S ORT from Section 2
...
As H EAP -I NCREASE K EY traverses this path, it repeatedly compares an element to its parent, exchanging their keys and continuing if the element’s key is larger, and terminating if the element’s key is smaller, since the max-heap property now holds
...
5-5
for a precise loop invariant
...
A; i; key/
1 if key < AŒi
2
error “new key is smaller than current key”
3 AŒi D key
4 while i > 1 and AŒPARENT
...
i/
6
i D PARENT
...
5 shows an example of a H EAP -I NCREASE -K EY operation
...
lg n/, since the path
traced from the node updated in line 3 to the root has length O
...

The procedure M AX -H EAP -I NSERT implements the I NSERT operation
...
The procedure first expands the max-heap by adding to the tree a new leaf whose key is 1
...

M AX -H EAP -I NSERT
...
A; A:heap-size; key/
The running time of M AX -H EAP -I NSERT on an n-element heap is O
...

In summary, a heap can support any priority-queue operation on a set of size n
in O
...

Exercises
6
...


6
...
5 The operation of H EAP -I NCREASE -K EY
...
4(a) with a
node whose index is i heavily shaded
...
(c) After one
iteration of the while loop of lines 4–6, the node and its parent have exchanged keys, and the index i
moves up to the parent
...
At this point,
AŒPARENT
...
The max-heap property now holds and the procedure terminates
...
5-2
Illustrate the operation of M AX -H EAP -I NSERT
...

6
...

6
...
5-5
Argue the correctness of H EAP -I NCREASE -K EY using the following loop invariant:
At the start of each iteration of the while loop of lines 4–6, the subarray
AŒ1 : : A:heap-size satisfies the max-heap property, except that there may
be one violation: AŒi may be larger than AŒPARENT
...

You may assume that the subarray AŒ1 : : A:heap-size satisfies the max-heap property at the time H EAP -I NCREASE -K EY is called
...
5-6
Each exchange operation on line 5 of H EAP -I NCREASE -K EY typically requires
three assignments
...

6
...
Show
how to implement a stack with a priority queue
...
1
...
5-8
The operation H EAP -D ELETE
...
Give
an implementation of H EAP -D ELETE that runs in O
...

6
...
n lg k/-time algorithm to merge k sorted lists into one sorted list,
where n is the total number of elements in all the input lists
...
)

Problems
6-1 Building a heap using insertion
We can build a heap by repeatedly calling M AX -H EAP -I NSERT to insert the elements into the heap
...
A/
1 A:heap-size D 1
2 for i D 2 to A:length
3
M AX -H EAP -I NSERT
...
Do the procedures B UILD -M AX -H EAP and B UILD -M AX -H EAP 0 always create
the same heap when run on the same input array? Prove that they do, or provide
a counterexample
...
Show that in the worst case, B UILD -M AX -H EAP 0 requires ‚
...

6-2 Analysis of d -ary heaps
A d -ary heap is like a binary heap, but (with one possible exception) non-leaf
nodes have d children instead of 2 children
...
How would you represent a d -ary heap in an array?
b
...
Give an efficient implementation of E XTRACT-M AX in a d -ary max-heap
...

d
...
Analyze its
running time in terms of d and n
...
Give an efficient implementation of I NCREASE -K EY
...
Analyze its running time in terms of d and n
...
Some of the entries of a Young tableau may be 1, which we
treat as nonexistent elements
...

a
...

b
...
Argue that Y
is full (contains mn elements) if Y Œm; n < 1
...
Give an algorithm to implement E XTRACT-M IN on a nonempty m n Young
tableau that runs in O
...
Your algorithm should use a recursive subroutine that solves an m n problem by recursively solving either
an
...
n 1/ subproblem
...
) Define T
...
Give and solve a recurrence
for T
...
m C n/ time bound
...
Show how to insert a new element into a nonfull m
O
...


n Young tableau in

e
...
n3 / time
...
Give an O
...


Chapter notes
The heapsort algorithm was invented by Williams [357], who also described how
to implement a priority queue with a heap
...

We use min-heaps to implement min-priority queues in Chapters 16, 23, and 24
...

If the data are b-bit integers, and the computer memory consists of addressable
b-bit words, Fredman and Willard [115] showed how to implement M INIMUM in
p
O
...
lg n/ time
...
lg n/ bound to O
...
This bound uses an amount of
space unbounded in n, but it can be implemented in linear space by using randomized hashing
...
This case
arises in several important applications, such as Dijkstra’s single-source shortestpaths algorithm, which we discuss in Chapter 24, and in discrete-event simulation
...
For the monotone case, if the data are integers in the range 1; 2; : : : ; C , Ahuja, Mehlhorn, Orlin, and Tarjan [8] describe

Notes for Chapter 6

169

how to implement E XTRACT-M IN and I NSERT in O
...
1/ time,
using pdata structure called a radix heap
...
lg C / bound can be improved
a
to O
...
Cherkassky, Goldberg, and Silverstein [65] further improved the bound to
O
...
Raman [291]
further improved these results to obtain a bound of O
...
lg1=4C C; lg1=3C n//,
for any fixed > 0
...
n2 / on an input array
of n numbers
...
n lg n/, and the constant factors hidden in the ‚
...
It also has the advantage of sorting in place (see page 17),
and it works well even in virtual-memory environments
...
1 describes the algorithm and an important subroutine used by quicksort for partitioning
...
2 and postpone its precise
analysis to the end of the chapter
...
3 presents a version of quicksort that
uses random sampling
...
Section 7
...
n2 / time in the worst case and, assuming
distinct elements, in expected O
...


7
...
3
...
Here is the three-step divide-and-conquer process for sorting a
typical subarray AŒp : : r:
Divide: Partition (rearrange) the array AŒp : : r into two (possibly empty) subarrays AŒp : : q 1 and AŒq C 1 : : r such that each element of AŒp : : q 1 is
less than or equal to AŒq, which is, in turn, less than or equal to each element
of AŒq C 1 : : r
...

Conquer: Sort the two subarrays AŒp : : q
to quicksort
...
1 Description of quicksort

171

Combine: Because the subarrays are already sorted, no work is needed to combine
them: the entire array AŒp : : r is now sorted
...
A; p; r/
1 if p < r
2
q D PARTITION
...
A; p; q 1/
4
Q UICKSORT
...
A; 1; A:length/
...

PARTITION
...
1 shows how PARTITION works on an 8-element array
...
As the procedure runs, it partitions the array into four (possibly
empty) regions
...
2
...
If p Ä k Ä i, then AŒk Ä x
...
If i C 1 Ä k Ä j 1, then AŒk > x
...
If k D r, then AŒk D x
...
1 The operation of PARTITION on a sample array
...
Lightly shaded array elements are all in the first partition with values no greater than x
...
The unshaded elements have not yet been put in one of the first two partitions, and the final white element is the
pivot x
...
None of the elements have been placed in either
of the first two partitions
...
(c)–(d) The values 8 and 7 are added to the partition of larger values
...
(f) The values 3 and 7 are swapped, and the smaller
partition grows
...
(i) In
lines 7–8, the pivot element is swapped so that it lies between the two partitions
...

We need to show that this loop invariant is true prior to the first iteration, that
each iteration of the loop maintains the invariant, and that the invariant provides a
useful property to show correctness when the loop terminates
...
1 Description of quicksort

p

i

≤x

173

j

>x

r
x
unrestricted

Figure 7
...
The
values in AŒp : : i are all less than or equal to x, the values in AŒi C 1 : : j 1 are all greater than x,
and AŒr D x
...


Initialization: Prior to the first iteration of the loop, i D p 1 and j D p
...
The assignment in line 1 satisfies the third condition
...
3 shows, we consider two cases, depending on the
outcome of the test in line 4
...
3(a) shows what happens when AŒj  > x;
the only action in the loop is to increment j
...
Figure 7
...
Because of the swap, we now have that AŒi Ä x, and
condition 1 is satisfied
...

Termination: At termination, j D r
...

The final two lines of PARTITION finish up by swapping the pivot element with
the leftmost element greater than x, thereby moving the pivot into its correct place
in the partitioned array, and then returning the pivot’s new index
...
In fact, it
satisfies a slightly stronger condition: after line 2 of Q UICKSORT, AŒq is strictly
less than every element of AŒq C 1 : : r
...
n/, where
n D r p C 1 (see Exercise 7
...

Exercises
7
...
1 as a model, illustrate the operation of PARTITION on the array
A D h13; 19; 9; 5; 12; 8; 7; 4; 21; 2; 6; 11i
...
3 The two cases for one iteration of procedure PARTITION
...
(b) If AŒj  Ä x, index i is incremented,
AŒi and AŒj  are swapped, and then j is incremented
...


7
...
p C r/=2c when all
elements in the array AŒp : : r have the same value
...
1-3
Give a brief argument that the running time of PARTITION on a subarray of size n
is ‚
...

7
...
2

Performance of quicksort
The running time of quicksort depends on whether the partitioning is balanced or
unbalanced, which in turn depends on which elements are used for partitioning
...
2 Performance of quicksort

175

sort
...
In this section, we shall informally investigate how quicksort
performs under the assumptions of balanced versus unbalanced partitioning
...
(We prove
this claim in Section 7
...
1
...
The partitioning costs ‚
...
Since the recursive call
on an array of size 0 just returns, T
...
1/, and the recurrence for the running
time is
T
...
n
D T
...
0/ C ‚
...
n/ :

Intuitively, if we sum the costs incurred at each level of the recursion, we get
an arithmetic series (equation (A
...
n2 /
...
n/ D
T
...
n/ has the solution T
...
n2 /
...
2-1
...
n2 /
...
Moreover, the ‚
...
n/ time
...
In this
case, quicksort runs much faster
...
n/ D 2T
...
n/ ;
where we tolerate the sloppiness from ignoring the floor and ceiling and from subtracting 1
...
1), this recurrence has the
solution T
...
n lg n/
...

Balanced partitioning
The average-case running time of quicksort is much closer to the best case than to
the worst case, as the analyses in Section 7
...
The key to understand-

176

Chapter 7 Quicksort

n

1
10

log10 n

1
100

n

cn

9
10

n

9
100

n

9
100

n

81
100

n

log10=9 n
1

cn

81
1000

n

n

729
1000

cn

n

cn

Ä cn

1

Ä cn
O
...
4 A recursion tree for Q UICKSORT in which PARTITION always produces a 9-to-1 split,
yielding a running time of O
...
Nodes show subproblem sizes, with per-level costs on the right
...
n/ term
...

Suppose, for example, that the partitioning algorithm always produces a 9-to-1
proportional split, which at first blush seems quite unbalanced
...
n/ D T
...
n=10/ C cn ;
on the running time of quicksort, where we have explicitly included the constant c
hidden in the ‚
...
Figure 7
...

Notice that every level of the tree has cost cn, until the recursion reaches a boundary condition at depth log10 n D ‚
...

The recursion terminates at depth log10=9 n D ‚
...
The total cost of quicksort is therefore O
...
Thus, with a 9-to-1 proportional split at every level of
recursion, which intuitively seems quite unbalanced, quicksort runs in O
...
Indeed,
even a 99-to-1 split yields an O
...
In fact, any split of constant
proportionality yields a recursion tree of depth ‚
...
n/
...
n lg n/ whenever the split has constant
proportionality
...
2 Performance of quicksort

177

n
Θ(n)
0

Θ(n)

n

n–1
(n–1)/2
(n–1)/2 – 1

(n–1)/2

(n–1)/2

(a)

(b)

Figure 7
...
The partitioning at the root costs n
and produces a “bad” split: two subarrays of sizes 0 and n 1
...
n 1/=2 1 and
...

(b) A single level of a recursion tree that is very well balanced
...
n/
...


Intuition for the average case
To develop a clear notion of the randomized behavior of quicksort, we must make
an assumption about how frequently we expect to encounter the various inputs
...
As
in our probabilistic analysis of the hiring problem in Section 5
...

When we run quicksort on a random input array, the partitioning is highly unlikely to happen in the same way at every level, as our informal analysis has assumed
...
For example, Exercise 7
...

In the average case, PARTITION produces a mix of “good” and “bad” splits
...
Suppose, for the sake of intuition,
that the good and bad splits alternate levels in the tree, and that the good splits
are best-case splits and the bad splits are worst-case splits
...
5(a) shows
the splits at two consecutive levels in the recursion tree
...
At the next level, the subarray of size n 1 undergoes best-case
partitioning into subarrays of size
...
n 1/=2
...


178

Chapter 7 Quicksort

The combination of the bad split followed by the good split produces three subarrays of sizes 0,
...
n 1/=2 at a combined partitioning cost
of ‚
...
n 1/ D ‚
...
Certainly, this situation is no worse than that in
Figure 7
...
n 1/=2, at a cost of ‚
...
Yet this latter situation is balanced! Intuitively,
the ‚
...
n/ cost of the good
split, and the resulting split is good
...
n lg n/, but with a slightly larger constant hidden by the O-notation
...
4
...

Exercises
7
...
n/ D T
...
n/
has the solution T
...
n2 /, as claimed at the beginning of Section 7
...

7
...
2-3
Show that the running time of Q UICKSORT is ‚
...

7
...
People usually write checks in order by check number, and
merchants usually cash them with reasonable dispatch
...
Argue that the procedure I NSERTION -S ORT would
tend to beat the procedure Q UICKSORT on this problem
...
2-5
Suppose that the splits at every level of quicksort are in the proportion 1 ˛ to ˛,
where 0 < ˛ Ä 1=2 is a constant
...
1 ˛/
...
)

7
...
2-6 ?
Argue that for any constant 0 < ˛ Ä 1=2, the probability is approximately 1 2˛
that on a random input array, PARTITION produces a split more balanced than 1 ˛
to ˛
...
3 A randomized version of quicksort
In exploring the average-case behavior of quicksort, we have made an assumption
that all permutations of the input numbers are equally likely
...
(See Exercise 7
...
) As we saw in Section 5
...
Many people regard the resulting randomized version of quicksort as the sorting algorithm
of choice for large enough inputs
...
3, we randomized our algorithm by explicitly permuting the input
...
Instead of always using AŒr
as the pivot, we will select a randomly chosen element from the subarray AŒp : : r
...
By randomly sampling the range p; : : : ; r, we ensure that the pivot
element x D AŒr is equally likely to be any of the r p C 1 elements in the
subarray
...

The changes to PARTITION and Q UICKSORT are small
...
A; p; r/
1 i D R ANDOM
...
A; p; r/
The new quicksort calls R ANDOMIZED -PARTITION in place of PARTITION:
R ANDOMIZED -Q UICKSORT
...
A; p; r/
3
R ANDOMIZED -Q UICKSORT
...
A; q C 1; r/
We analyze this algorithm in the next section
...
3-1
Why do we analyze the expected running time of a randomized algorithm and not
its worst-case running time?
7
...


7
...
2 gave some intuition for the worst-case behavior of quicksort and for
why we expect it to run quickly
...
We begin with a worst-case analysis, which applies to either
Q UICKSORT or R ANDOMIZED -Q UICKSORT, and conclude with an analysis of the
expected running time of R ANDOMIZED -Q UICKSORT
...
4
...
2 that a worst-case split at every level of recursion in quicksort
produces a ‚
...
We now prove this assertion
...
3), we can show that the running
time of quicksort is O
...
Let T
...
We have the recurrence
T
...
T
...
n
0ÄqÄn 1

q

1// C ‚
...
1)

where the parameter q ranges from 0 to n 1 because the procedure PARTITION
produces two subproblems with total size n 1
...
n/ Ä cn2 for
some constant c
...
1), we obtain
T
...
cq 2 C c
...
q 2 C
...
n/
1/2 / C ‚
...
n q 1/2 achieves a maximum over the parameter’s
range 0 Ä q Ä n 1 at either endpoint
...
4-3)
...
4 Analysis of quicksort

181

observation gives us the bound max0ÄqÄn 1
...
n q 1/2 / Ä
...
Continuing with our bounding of T
...
n/ Ä cn2 c
...
n/

since we can pick the constant c large enough so that the c
...
n/ term
...
n/ D O
...
We saw in Section 7
...
n2 / time: when partitioning is unbalanced
...
4-1 asks you to show that recurrence (7
...
n/ D
...
Thus, the (worst-case) running time of quicksort is ‚
...

7
...
2 Expected running time
We have already seen the intuition behind why the expected running time of
R ANDOMIZED -Q UICKSORT is O
...
lg n/, and O
...
Even if we add a few new levels with the most unbalanced split possible between these levels, the total time remains O
...
We
can analyze the expected running time of R ANDOMIZED -Q UICKSORT precisely
by first understanding how the partitioning procedure operates and then using this
understanding to derive an O
...
This
upper bound on the expected running time, combined with the ‚
...
2, yields a ‚
...
We assume
throughout that the values of the elements being sorted are distinct
...
We can therefore
couch our analysis of R ANDOMIZED -Q UICKSORT by discussing the Q UICKSORT
and PARTITION procedures, but with the assumption that pivot elements are selected randomly from the subarray passed to R ANDOMIZED -PARTITION
...
Each time the PARTITION procedure is called, it selects a pivot
element, and this element is never included in any future recursive calls to Q UICK SORT and PARTITION
...
One call to PARTITION takes O
...
Each iteration of this for loop performs a comparison in
line 4, comparing the pivot element to another element of the array A
...

Lemma 7
...
Then the running time of
Q UICKSORT is O
...

Proof By the discussion above, the algorithm makes at most n calls to PARTI TION , each of which does a constant amount of work and then executes the for
loop some number of times
...

Our goal, therefore, is to compute X , the total number of comparisons performed
in all calls to PARTITION
...
Rather, we will derive an overall bound on the
total number of comparisons
...
For ease of analysis, we
rename the elements of the array A as ´1 ; ´2 ; : : : ; ´n , with ´i being the ith smallest
element
...

When does the algorithm compare ´i and ´j ? To answer this question, we first
observe that each pair of elements is compared at most once
...

Our analysis uses indicator random variables (see Section 5
...
We define
Xij D I f´i is compared to ´j g ;
where we are considering whether the comparison takes place at any time during
the execution of the algorithm, not just during one iteration or one call of PARTI TION
...
1, we obtain
#
"n 1 n
X X
Xij
E ŒX  D E
i D1 j Di C1

7
...
2)

i D1 j Di C1

It remains to compute Pr f´i is compared to ´j g
...

Let us think about when two items are not compared
...
Then the first call to PARTITION separates the numbers into two
sets: f1; 2; 3; 4; 5; 6g and f8; 9; 10g
...
g
...
g
...

In general, because we assume that element values are distinct, once a pivot x
is chosen with ´i < x < ´j , we know that ´i and ´j cannot be compared at any
subsequent time
...
Similarly,
if ´j is chosen as a pivot before any other item in Zij , then ´j will be compared to
each item in Zij , except for itself
...
In contrast, 2 and 9 will
never be compared because the first pivot element chosen from Z2;9 is 7
...

We now compute the probability that this event occurs
...
Therefore, any element of Zij is equally likely to be the first
one chosen as a pivot
...
j i C 1/
...
3)

184

Chapter 7 Quicksort

The second line follows because the two events are mutually exclusive
...
2) and (7
...
7):
E ŒX  D

n 1
n
X X
i D1 j Di C1

D

n 1 n i
XX
i D1 kD1

j

i) and the bound

2
i C1

2
kC1

n 1 n
XX 2
<
k
i D1
kD1

D

n 1
X

O
...
n lg n/ :

(7
...
n lg n/ when element values are distinct
...
4-1
Show that in the recurrence
T
...
T
...
n
0ÄqÄn 1

T
...
n/ ;


...


7
...
4-3
Show that the expression q 2 C
...



...


1/2 achieves a maximum over q D

7
...
n lg n/
...
4-5
We can improve the running time of quicksort in practice by taking advantage of the
fast running time of insertion sort when its input is “nearly” sorted
...
After the top-level call to quicksort returns, run insertion sort
on the entire array to finish the sorting process
...
nk C n lg
...
How should we pick k, both in theory
and in practice?
7
...
Approximate the probability of getting at worst an ˛-to-
...


Problems
7-1 Hoare partition correctness
The version of PARTITION given in this chapter is not the original partitioning
algorithm
...
A
...
Hoare:
H OARE -PARTITION
...
Demonstrate the operation of H OARE -PARTITION on the array A D h13; 19; 9;
5; 12; 8; 7; 4; 11; 2; 6; 21i, showing the values of the array and auxiliary values
after each iteration of the while loop in lines 4–13
...
Assuming that the subarray AŒp : : r contains at
least two elements, prove the following:
b
...

c
...

d
...

The PARTITION procedure in Section 7
...
The H OARE -PARTITION procedure, on
the other hand, always places the pivot value (originally in AŒp) into one of the
two partitions AŒp : : j  and AŒj C 1 : : r
...

e
...

7-2 Quicksort with equal element values
The analysis of the expected running time of randomized quicksort in Section 7
...
2
assumes that all element values are distinct
...

a
...
What would be randomized quicksort’s running time in this case?
b
...
Modify the PARTITION procedure to produce a procedure
PARTITION 0
...

Like PARTITION, your PARTITION 0 procedure should take ‚
...


c
...
Then modify the
Q UICKSORT procedure to produce a procedure Q UICKSORT 0
...

d
...
4
...

a
...
Use this to define indicator random variables
Xi D I fith smallest element is chosen as the pivotg
...
Let T
...
Argue that
#
" n
X
Xq
...
q 1/ C T
...
n// :
(7
...
n/ D E
qD1

c
...
5) as
2X
E ŒT
...
n/ :
n qD2
n 1

E ŒT
...
6)

d
...
)

(7
...
Using the bound from equation (7
...
6)
has the solution E ŒT
...
n lg n/
...
n/ Ä an lg n for sufficiently large n and for some positive constant a
...
1 contains two recursive calls to itself
...
The second recursive call in Q UICKSORT
is not really necessary; we can avoid it by using an iterative control structure
...

Consider the following version of quicksort, which simulates tail recursion:
TAIL -R ECURSIVE -Q UICKSORT
...

/
3
q D PARTITION
...
A; p; q
5
p D qC1

1/

a
...
A; 1; A:length/ correctly sorts the
array A
...
The
information for the most recent call is at the top of the stack, and the information
for the initial call is at the bottom
...
Since we
assume that array parameters are represented by pointers, the information for each
procedure call on the stack requires O
...
The stack depth is the maximum amount of stack space used at any time during a computation
...
Describe a scenario in which TAIL -R ECURSIVE -Q UICKSORT’s stack depth is

...

c
...
lg n/
...
n lg n/ expected running time of the
algorithm
...
One common approach is the median-of-3 method: choose
the pivot as the median (middle element) of a set of 3 elements randomly selected
from the subarray
...
4-6
...
We denote the

Problems for Chapter 7

189

sorted output array by A0 Œ1 : : n
...

a
...
)

1
...
By what amount have we increased the likelihood of choosing the pivot as
x D A0 Œb
...

c
...
)
d
...
n lg n/ running time of quicksort, the median-of-3 method
affects only the constant factor
...
Instead, for each number, we know an interval on the real line to which it belongs
...
We
wish to fuzzy-sort these intervals, i
...
, to produce a permutation hi1 ; i2 ; : : : ; in i of
the intervals such that for j D 1; 2; : : : ; n, there exist cj 2 Œaij ; bij  satisfying
c1 Ä c2 Ä
Ä cn
...
Design a randomized algorithm for fuzzy-sorting n intervals
...
(As the intervals overlap more and more, the problem of fuzzy-sorting the intervals becomes progressively easier
...
)
b
...
n lg n/ in general, but runs
in expected time ‚
...
e
...
Your algorithm should not be checking
for this case explicitly; rather, its performance should naturally improve as the
amount of overlap increases
...
The PARTITION procedure given in Section 7
...
Lomuto
...
4 is due to Avrim Blum
...

McIlroy [248] showed how to engineer a “killer adversary” that produces an
array on which virtually any implementation of quicksort takes ‚
...
If the
implementation is randomized, the adversary produces the array after seeing the
random choices of the quicksort algorithm
...
n lg n/
time
...
Moreover, for each of these algorithms, we can produce a
sequence of n input numbers that causes the algorithm to run in
...

These algorithms share an interesting property: the sorted order they determine
is based only on comparisons between the input elements
...
All the sorting algorithms introduced thus far are
comparison sorts
...
1, we shall prove that any comparison sort must make
...
Thus, merge sort and heapsort
are asymptotically optimal, and no comparison sort exists that is faster by more
than a constant factor
...
2, 8
...
4 examine three sorting algorithms—counting sort, radix
sort, and bucket sort—that run in linear time
...
Consequently,
the
...


8
...
That is, given two elements
aj , or
ai and aj , we perform one of the tests ai < aj , ai Ä aj , ai D aj , ai
ai > aj to determine their relative order
...

In this section, we assume without loss of generality that all the input elements
are distinct
...
We also note that
aj , ai > aj , and ai < aj are all equivalent in that
the comparisons ai Ä aj , ai

192

Chapter 8 Sorting in Linear Time

1:2


>

>



2:3

1:3


〈1,2,3〉

〈2,1,3〉

1:3

〈1,3,2〉

>
〈3,1,2〉

>
2:3

〈2,3,1〉

>
〈3,2,1〉

Figure 8
...
An internal node annotated by i:j indicates a comparison between ai and aj
...
n/
...
1/;
...
n/i indicates the ordering a
...
2/ Ä
indicates the decisions made when sorting the input sequence ha1 D 6; a2 D 8; a3 D 5i; the
permutation h3; 1; 2i at the leaf indicates that the sorted ordering is a3 D 5 Ä a1 D 6 Ä a2 D 8
...


they yield identical information about the relative order of ai and aj
...

The decision-tree model
We can view comparison sorts abstractly in terms of decision trees
...
Control, data movement, and all other aspects of the algorithm are ignored
...
1 shows the decision tree corresponding to the insertion sort algorithm
from Section 2
...

In a decision tree, we annotate each internal node by i:j for some i and j in the
range 1 Ä i; j Ä n, where n is the number of elements in the input sequence
...
1/;
...
n/i
...
1
for background on permutations
...

Each internal node indicates a comparison ai Ä aj
...
When we come to a leaf, the sortÄ a
...
Because
ing algorithm has established the ordering a
...
2/ Ä
any correct sorting algorithm must be able to produce each permutation of its input,
each of the nŠ permutations on n elements must appear as one of the leaves of the
decision tree for a comparison sort to be correct
...
1 Lower bounds for sorting

193

execution of the comparison sort
...
”)
Thus, we shall consider only decision trees in which each permutation appears as
a reachable leaf
...
Consequently, the worst-case number of
comparisons for a given comparison sort algorithm equals the height of its decision
tree
...
The following theorem establishes such a lower bound
...
1
Any comparison sort algorithm requires


...


Proof From the preceding discussion, it suffices to determine the height of a
decision tree in which each permutation appears as a reachable leaf
...
Because each of the nŠ permutations of the input appears as
some leaf, we have nŠ Ä l
...
nŠ/
(since the lg function is monotonically increasing)
D

...
19))
...
2
Heapsort and merge sort are asymptotically optimal comparison sorts
...
n lg n/ upper bounds on the running times for heapsort and merge
sort match the
...
1
...
1-1
What is the smallest possible depth of a leaf in a decision tree for a comparison
sort?

194

Chapter 8 Sorting in Linear Time

8
...
nŠ/ without using Stirling’s approxiPn
mation
...
2
...
1-3
Show that there is no comparison sort whose running time is linear for at least half
of the nŠ inputs of length n
...
1-4
Suppose that you are given a sequence of n elements to sort
...
The elements in a given
subsequence are all smaller than the elements in the succeeding subsequence and
larger than the elements in the preceding subsequence
...
Show an
...
(Hint: It is not rigorous to
simply combine the lower bounds for the individual subsequences
...
2

Counting sort
Counting sort assumes that each of the n input elements is an integer in the range
0 to k, for some integer k
...
n/, the sort runs in ‚
...

Counting sort determines, for each input element x, the number of elements less
than x
...
For example, if 17 elements are less than x, then x belongs in output
position 18
...

In the code for counting sort, we assume that the input is an array AŒ1 : : n, and
thus A:length D n
...


8
...
2 The operation of C OUNTING -S ORT on an input array AŒ1 : : 8, where each element
of A is a nonnegative integer no larger than k D 5
...
(b) The array C after line 8
...
Only the lightly shaded elements of
array B have been filled in
...


C OUNTING -S ORT
...

/
7 for i D 1 to k
8
C Œi D C Œi C C Œi 1
9 / C Œi now contains the number of elements less than or equal to i
...
2 illustrates counting sort
...
If the
value of an input element is i, we increment C Œi
...
Lines 7–8
determine for each i D 0; 1; : : : ; k how many input elements are less than or equal
to i by keeping a running sum of the array C
...
If all n elements are distinct, then when we
first enter line 10, for each AŒj , the value C ŒAŒj  is the correct final position
of AŒj  in the output array, since there are C ŒAŒj  elements less than or equal
to AŒj 
...
Decrementing C ŒAŒj  causes the
next input element with a value equal to AŒj , if one exists, to go to the position
immediately before AŒj  in the output array
...
k/, the for loop of lines 4–5 takes time ‚
...
k/, and the for loop of lines 10–12 takes time ‚
...
Thus, the overall time
is ‚
...
In practice, we usually use counting sort when we have k D O
...
n/
...
n lg n/ proved in Section 8
...
In fact, no comparisons between input elements occur
anywhere in the code
...
The
...

An important property of counting sort is that it is stable: numbers with the same
value appear in the output array in the same order as they do in the input array
...
Normally, the property of
stability is important only when satellite data are carried around with the element
being sorted
...
As we shall see in the next section,
in order for radix sort to work correctly, counting sort must be stable
...
2-1
Using Figure 8
...

8
...

8
...
Is the modified algorithm stable?

8
...
2-4
Describe an algorithm that, given n integers in the range 0 to k, preprocesses its
input and then answers any query about how many of the n integers fall into a
range Œa : : b in O
...
Your algorithm should use ‚
...


8
...
The cards have 80 columns, and in each column a machine can
punch a hole in one of 12 places
...
An operator can then
gather the cards bin by bin, so that cards with the first place punched are on top of
cards with the second place punched, and so on
...
(The other two places
are reserved for encoding nonnumeric characters
...
Since the card sorter can look at only one column
at a time, the problem of sorting n cards on a d -digit number requires a sorting
algorithm
...
Unfortunately,
since the cards in 9 of the 10 bins must be put aside to sort each of the bins, this
procedure generates many intermediate piles of cards that you would have to keep
track of
...
3-5
...
The algorithm then combines the cards into a single
deck, with the cards in the 0 bin preceding the cards in the 1 bin preceding the
cards in the 2 bin, and so on
...
The process continues
until the cards have been sorted on all d digits
...
Thus, only d passes through the deck are
required to sort
...
3 shows how radix sort operates on a “deck” of seven
3-digit numbers
...
The sort
performed by a card sorter is stable, but the operator has to be wary about not
changing the order of the cards as they come out of a bin, even though all the cards
in a bin have the same digit in the chosen column
...
3 The operation of radix sort on a list of seven 3-digit numbers
...
The remaining columns show the list after successive sorts on increasingly significant digit
positions
...


In a typical computer, which is a sequential random-access machine, we sometimes use radix sort to sort records of information that are keyed by multiple fields
...
We
could run a sorting algorithm with a comparison function that, given two dates,
compares years, and if there is a tie, compares months, and if another tie occurs,
compares days
...

The code for radix sort is straightforward
...

R ADIX -S ORT
...
3
Given n d -digit numbers in which each digit can take on up to k possible values,
R ADIX -S ORT correctly sorts these numbers in ‚
...
n C k// time if the stable sort
it uses takes ‚
...

Proof The correctness of radix sort follows by induction on the column being
sorted (see Exercise 8
...
The analysis of the running time depends on the stable
sort used as the intermediate sorting algorithm
...
Each pass over n d -digit numbers then takes time ‚
...

There are d passes, and so the total time for radix sort is ‚
...
n C k//
...
n/, we can make radix sort run in linear time
...


8
...
4
Given n b-bit numbers and any positive integer r Ä b, R ADIX -S ORT correctly sorts
these numbers in ‚
...
n C 2r // time if the stable sort it uses takes ‚
...

Proof For a value r Ä b, we view each key as having d D db=re digits of r bits
each
...
(For example, we can view a 32-bit word as having four 8-bit
digits, so that b D 32, r D 8, k D 2r 1 D 255, and d D b=r D 4
...
n C k/ D ‚
...
d
...
b=r/
...

For given values of n and b, we wish to choose the value of r, with r Ä b,
that minimizes the expression
...
n C 2r /
...
n C 2r / D ‚
...
Thus, choosing r D b yields a running
time of
...
n C 2b / D ‚
...
If b
blg nc,
then choosing r D blg nc gives the best time to within a constant factor, which
we can see as follows
...
bn= lg n/
...
bn= lg n/
...
n/
...
lg n/, as is often the case, and we choose r lg n, then radix sort’s
running time is ‚
...
n lg n/
...

Although radix sort may make fewer passes than quicksort over the n keys, each
pass of radix sort may take significantly longer
...
g
...
Moreover, the version of radix sort that uses counting sort as the
intermediate stable sort does not sort in place, which many of the ‚
...
Thus, when primary memory storage is at a premium, we
might prefer an in-place algorithm such as quicksort
...
3-1
Using Figure 8
...


200

Chapter 8 Sorting in Linear Time

8
...
How much additional time and space does your scheme entail?
8
...
Where does your proof need the
assumption that the intermediate sort is stable?
8
...
n/ time
...
3-5 ?
In the first card-sorting algorithm in this section, exactly how many sorting passes
are needed to sort d -digit decimal numbers in the worst case? How many piles of
cards would an operator need to keep track of in the worst case?

8
...
n/
...
Whereas counting sort assumes that the input
consists of integers in a small range, bucket sort assumes that the input is generated
by a random process that distributes elements uniformly and independently over
the interval Œ0; 1/
...
2 for a definition of uniform distribution
...
Since the inputs are uniformly and independently distributed over Œ0; 1/, we do not expect many numbers
to fall into each bucket
...

Our code for bucket sort assumes that the input is an n-element array A and
that each element AŒi in the array satisfies 0 Ä AŒi < 1
...
(Section 10
...
)

8
...
78

...
39

...
72

...
21

...
23

...
12

...
39


...
23


...
68

...
78

8
9


...
4 The operation of B UCKET-S ORT for n D 10
...
(b) The
array BŒ0 : : 9 of sorted lists (buckets) after line 8 of the algorithm
...
i C 1/=10/
...


B UCKET-S ORT
...
4 shows the operation of bucket sort on an input array of 10 numbers
...
Assume
without loss of generality that AŒi Ä AŒj 
...
If AŒi and AŒj  go into the same bucket, then the for loop of lines 7–8 puts
them into the proper order
...
Therefore, bucket sort works correctly
...
n/ time
in the worst case
...


202

Chapter 8 Sorting in Linear Time

To analyze the cost of the calls to insertion sort, let ni be the random variable
denoting the number of elements placed in bucket BŒi
...
2), the running time of bucket sort is
T
...
n/ C

n 1
X

O
...
Taking expectations of both sides and using linearity of expectation,
we have
#
"
n 1
X
O
...
n/ D E ‚
...
n/ C

n 1
X

E O
...
22))
...
n/ C

n 1
X

(8
...
2)

for i D 0; 1; : : : ; n 1
...

i
To prove equation (8
...
Thus,

Xij :

j D1

To compute E Œn2 , we expand the square and regroup terms:
i

8
...
3)

1Äj Än 1ÄkÄn
k¤j

where the last line follows by linearity of expectation
...
Indicator random variable Xij is 1 with probability 1=n and 0
otherwise, and therefore
Ã
Â
1
2
2 1
2
C0
1
E Xij D 1
n
n
1
:
D
n
When k ¤ j , the variables Xij and Xi k are independent, and hence
E ŒXij Xi k  D E ŒXij  E ŒXi k 
1 1
D
n n
1
:
D
n2
Substituting these two expected values in equation (8
...
n 1/ 2
n
n
n 1
D 1C
n
1
;
D 2
n
which proves equation (8
...

D n

204

Chapter 8 Sorting in Linear Time

Using this expected value in equation (8
...
n/ C n O
...
n/
...
As long as the input has the property that the sum of the squares
of the bucket sizes is linear in the total number of elements, equation (8
...

Exercises
8
...
4 as a model, illustrate the operation of B UCKET-S ORT on the array
A D h:79; :13; :16; :64; :39; :20; :89; :53; :71; :42i
...
4-2
Explain why the worst-case running time for bucket sort is ‚
...
What simple
change to the algorithm preserves its linear average-case running time and makes
its worst-case running time O
...
4-3
Let X be a random variable that is equal to the number of heads in two flips of a
fair coin
...
4-4 ?
We are given n points in the unit circle, pi D
...
Suppose that the points are uniformly distributed; that is, the
probability of finding a point in any region of the circle is proportional to the area
of that region
...
n/ to
an
sort the n points by their distances di D xi2 C yi2 from the origin
...
)
8
...
x/ for a random variable X is defined
by P
...
Suppose that we draw a list of n random variables
X1 ; X2 ; : : : ; Xn from a continuous probability distribution function P that is computable in O
...
Give an algorithm that sorts these numbers in linear averagecase time
...
n lg n/ lower bound on the running time
of any deterministic or randomized comparison sort on n distinct input elements
...

We assume that every permutation of A’s inputs is equally likely
...
Suppose that each leaf of TA is labeled with the probability that it is reached
given a random input
...

b
...
T / denote the external path length of a decision tree T ; that is, D
...
Let T be a decision tree with
k > 1 leaves, and let LT and RT be the left and right subtrees of T
...
T / D D
...
RT/ C k
...
Let d
...
T / over all decision trees T with k > 1
leaves
...
k/ D min1Äi Äk 1 fd
...
k i/ C kg
...
Let i0 be the number
of leaves in LT and k i0 the number of leaves in RT
...
Prove that for a given value of k > 1 and i in the range 1 Ä i Ä k 1, the
function i lg i C
...
k i/ is minimized at i D k=2
...
k/ D
...

e
...
TA / D
...
nŠ//, and conclude that the average-case time to
sort n elements is
...

Now, consider a randomized comparison sort B
...
A randomization node models a
random choice of the form R ANDOM
...

f
...


206

Chapter 8 Sorting in Linear Time

8-2 Sorting in place in linear time
Suppose that we have an array of n data records to sort and that the key of each
record has the value 0 or 1
...
The algorithm runs in O
...

2
...

3
...

a
...

b
...

c
...

d
...
bn/ time? Explain how or why not
...
Suppose that the n records have keys in the range from 1 to k
...
n C k/ time
...
k/ storage outside the input array
...
You are given an array of integers, where different integers may have different
numbers of digits, but the total number of digits over all the integers in the array
is n
...
n/ time
...
You are given an array of strings, where different strings may have different
numbers of characters, but the total number of characters over all the strings
is n
...
n/ time
...
)
8-4 Water jugs
Suppose that you are given n red and n blue water jugs, all of different shapes and
sizes
...
Moreover,
for every red jug, there is a blue jug that holds the same amount of water, and vice
versa
...
To do so, you may perform the following operation: pick
a pair of jugs in which one is red and one is blue, fill the red jug with water, and
then pour the water into the blue jug
...
Assume
that such a comparison takes one time unit
...
Remember
that you may not directly compare two red jugs or two blue jugs
...
Describe a deterministic algorithm that uses ‚
...

b
...
n lg n/ for the number of comparisons that an algorithm solving this problem must make
...
Give a randomized algorithm whose expected number of comparisons is
O
...
What is the worst-case number of comparisons for your algorithm?
8-5 Average sorting
Suppose that, instead of sorting an array, we just require that the elements increase
on average
...
What does it mean for an array to be 1-sorted?
b
...

c
...

d
...
n lg
...

We can also show a lower bound on the time to produce a k-sorted array, when k
is a constant
...
Show that we can sort a k-sorted array of length n in O
...
(Hint:
Use the solution to Exercise 6
...
)
f
...
n lg n/
time
...
)

208

Chapter 8 Sorting in Linear Time

8-6 Lower bound on merging sorted lists
The problem of merging two sorted lists arises frequently
...
3
...
In this problem, we will
prove a lower bound of 2n 1 on the worst-case number of comparisons required
to merge two sorted lists, each containing n items
...
n/ comparisons by using a decision
tree
...
Given 2n numbers, compute the number of possible ways to divide them into
two sorted lists, each with n numbers
...
Using a decision tree and your answer to part (a), show that any algorithm that
correctly merges two sorted lists must perform at least 2n o
...

Now we will show a slightly tighter 2n

1 bound
...
Show that if two elements are consecutive in the sorted order and from different
lists, then they must be compared
...
Use your answer to the previous part to show a lower bound of 2n
isons for merging two sorted lists
...
A; i; j /
1 if AŒi > AŒj 
2
exchange AŒi with AŒj 
After the compare-exchange operation, we know that AŒi Ä AŒj 
...
The indices of the positions compared
in the sequence must be determined in advance, and although they can depend
on the number of elements being sorted, they cannot depend on the values being
sorted, nor can they depend on the result of any prior compare-exchange operation
...
A/
1 for j D 2 to A:length
2
for i D j 1 downto 1
3
C OMPARE -E XCHANGE
...
It states that if an oblivious compare-exchange algorithm correctly sorts all input sequences consisting of
only 0s and 1s, then it correctly sorts all inputs containing arbitrary values
...
Assume that an oblivious compare-exchange algorithm X fails to correctly sort the array AŒ1 : : n
...
Define an
array BŒ1 : : n of 0s and 1s as follows:
(
0 if AŒi Ä AŒp ;
BŒi D
1 if AŒi > AŒp :
a
...

b
...

Now you will use the 0-1 sorting lemma to prove that a particular sorting algorithm works correctly
...
The array has r rows and s columns (so that n D rs), subject to
three restrictions:
r must be even,
s must be a divisor of r, and
r

2s 2
...

Columnsort operates in eight steps, regardless of the value of n
...
Each even step is a fixed permutation
...
Sort each column
...
Transpose the array, but reshape it back to r rows and s columns
...

3
...

4
...


210

Chapter 8 Sorting in Linear Time
10
8
12
16
4
18

14
7
1
9
15
3
(a)

5
17
6
11
2
13

1
2
3
5
6
7

4
8
9
10
13
15
(f)

11
12
14
16
17
18

4
8
10
12
16
18

1
2
3

1
3
7
9
14
15
(b)

5
6
7
4
8
9

10
13
15
11
12
14
(g)

2
5
6
11
13
17

4
12
1
9
2
11

16
17
18
1
2
3

8
16
3
14
5
13
(c)

4
5
6
7
8
9

10
11
12
13
14
15
(h)

10
18
7
15
6
17

16
17
18

1
2
4
9
11
12

3
5
8
13
14
16
(d)

6
7
10
15
17
18

1
2
3
4
5
6

7
8
9
10
11
12
(i)

13
14
15
16
17
18

1
3
6
2
5
7

4
8
10
9
13
15
(e)

11
14
17
12
16
18

Figure 8
...
(a) The input array with 6 rows and 3 columns
...
(c) After transposing and reshaping in step 2
...
(e) After performing step 4, which inverts the permutation from step 2
...
(g) After shifting by half a column in step 6
...
(i) After performing step 8, which inverts the permutation from step 6
...


5
...

6
...
Leave the top half of the leftmost column empty
...

7
...

8
...

Figure 8
...

(Even though this example violates the requirement that r
2s 2 , it happens to
work
...
Argue that we can treat columnsort as an oblivious compare-exchange algorithm, even if we do not know what sorting method the odd steps use
...
The 0-1 sorting lemma applies
because we can treat columnsort as an oblivious compare-exchange algorithm
...
We say that an area
of an array is clean if we know that it contains either all 0s or all 1s
...
From here on, assume that
the input array contains only 0s and 1s, and that we can treat it as an array with r
rows and s columns
...
Prove that after steps 1–3, the array consists of some clean rows of 0s at the top,
some clean rows of 1s at the bottom, and at most s dirty rows between them
...
Prove that after step 4, the array, read in column-major order, starts with a clean
area of 0s, ends with a clean area of 1s, and has a dirty area of at most s 2
elements in the middle
...
Prove that steps 5–8 produce a fully sorted 0-1 output
...

g
...
Prove that after steps 1–3, the array
consists of some clean rows of 0s at the top, some clean rows of 1s at the
bottom, and at most 2s 1 dirty rows between them
...
Suggest a simple change to step 1 that allows us to maintain the requirement
that r
2s 2 even when s does not divide r, and prove that with your change,
columnsort correctly sorts
...
Knuth’s comprehensive treatise on sorting [211] covers many
variations on the sorting problem, including the information-theoretic lower bound
on the complexity of sorting given here
...

Knuth credits H
...
Seward with inventing counting sort in 1954, as well as with
the idea of combining counting sort with radix sort
...
According to Knuth, the first published reference to the method is a 1929 document by L
...
Comrie describing punched-card
equipment
...
J
...
C
...

Munro and Raman [263] give a stable sorting algorithm that performs O
...
Although

212

Chapter 8 Sorting in Linear Time

any of the O
...
n/ times and operates in place
...
n lg n/ time has been considered by
many researchers
...
All the results assume that the computer memory is divided into
addressable b-bit words
...
n lg n= lg lg n/ time
...
n lg n/ time by Andersson [16]
...
Andersson, Hagerup,
Nilsson, and Raman [17] have shown how to sort n integers in O
...
Using multiplicative hashing, we can reduce the storage
needed to O
...
n lg lg n/ worst-case bound on the running time
becomes an expected-time bound
...
n
...
Combining
these techniques with some new ideas, Han [158] improved the bound for sorting
to O
...
Although these algorithms are important theoretical
breakthroughs, they are all fairly complicated and at the present time seem unlikely
to compete with existing sorting algorithms in practice
...


9

Medians and Order Statistics

The ith order statistic of a set of n elements is the ith smallest element
...
A median, informally, is
the “halfway point” of the set
...
n C 1/=2
...
Thus, regardless of the parity of n, medians occur at i D b
...
n C 1/=2e (the upper median)
...

This chapter addresses the problem of selecting the ith order statistic from a
set of n distinct numbers
...
We formally specify the selection problem
as follows:
Input: A set A of n (distinct) numbers and an integer i, with 1 Ä i Ä n
...


We can solve the selection problem in O
...
This chapter presents faster algorithms
...
1, we examine the problem of selecting the minimum and maximum of a set of elements
...
Section 9
...
n/ expected running time, assuming distinct elements
...
3 contains an algorithm of more theoretical interest that
achieves the O
...


214

9
...
In the following procedure, we assume that the set resides in array A, where
A:length D n
...
A/
1 min D AŒ1
2 for i D 2 to A:length
3
if min > AŒi
4
min D AŒi
5 return min
We can, of course, find the maximum with n 1 comparisons as well
...
Think of any algorithm
that determines the minimum as a tournament among the elements
...

Observing that every element except the winner must lose at least one match, we
conclude that n 1 comparisons are necessary to determine the minimum
...

Simultaneous minimum and maximum
In some applications, we must find both the minimum and the maximum of a set
of n elements
...
x; y/
data to fit onto a rectangular display screen or other graphical output device
...

At this point, it should be obvious how to determine both the minimum and the
maximum of n elements using ‚
...

In fact, we can find both the minimum and the maximum using at most 3 bn=2c
comparisons
...
Rather than processing each element of the input by comparing it
against the current minimum and maximum, at a cost of 2 comparisons per element,

9
...
We compare pairs of elements from the input first
with each other, and then we compare the smaller with the current minimum and
the larger to the current maximum, at a cost of 3 comparisons for every 2 elements
...
If n is odd, we set both the minimum and maximum
to the value of the first element, and then we process the rest of the elements in
pairs
...

Let us analyze the total number of comparisons
...
If n is even, we perform 1 initial comparison followed by
3
...
Thus, in either case, the total
number of comparisons is at most 3 bn=2c
...
1-1
Show that the second smallest of n elements can be found with n C dlg ne
comparisons in the worst case
...
)

2

9
...
(Hint: Consider how many numbers
are potentially either the maximum or minimum, and investigate how a comparison
affects these counts
...
2 Selection in expected linear time
The general selection problem appears more difficult than the simple problem of
finding a minimum
...
n/
...
The algorithm R ANDOMIZED -S ELECT is modeled after
the quicksort algorithm of Chapter 7
...
But unlike quicksort, which recursively processes both sides of the
partition, R ANDOMIZED -S ELECT works on only one side of the partition
...
n lg n/, the expected running time of R ANDOMIZED -S ELECT is ‚
...


216

Chapter 9 Medians and Order Statistics

R ANDOMIZED -S ELECT uses the procedure R ANDOMIZED -PARTITION introduced in Section 7
...
Thus, like R ANDOMIZED -Q UICKSORT, it is a randomized algorithm, since its behavior is determined in part by the output of a random-number
generator
...

R ANDOMIZED -S ELECT
...
A; p; r/
4 k D q pC1
/ the pivot value is the answer
/
5 if i == k
6
return AŒq
7 elseif i < k
8
return R ANDOMIZED -S ELECT
...
A; q C 1; r; i k/
The R ANDOMIZED -S ELECT procedure works as follows
...
In this case, i must equal 1, and we simply return AŒp in line 2 as the
ith smallest element
...
As in quicksort,
we will refer to AŒq as the pivot element
...
Line 5 then checks whether AŒq is
the ith smallest element
...
Otherwise, the algorithm
determines in which of the two subarrays AŒp : : q 1 and AŒq C 1 : : r the ith
smallest element lies
...
If i > k, however,
then the desired element lies on the high side of the partition
...
i k/th smallest element
of AŒq C 1 : : r, which line 9 finds recursively
...
2-1 asks you to show that this
situation cannot happen
...
n2 /, even to find
the minimum, because we could be extremely unlucky and always partition around
the largest remaining element, and partitioning takes ‚
...
We will see that

9
...

To analyze the expected running time of R ANDOMIZED -S ELECT, we let the running time on an input array AŒp : : r of n elements be a random variable that we
denote by T
...
n/ as follows
...
Therefore, for each k such that 1 Ä k Ä n, the subarray AŒp : : q has k elements (all less than or equal to the pivot) with probability 1=n
...
1)

When we call R ANDOMIZED -S ELECT and choose AŒq as the pivot element, we
do not know, a priori, if we will terminate immediately with the correct answer,
recurse on the subarray AŒp : : q 1, or recurse on the subarray AŒq C 1 : : r
...

Assuming that T
...
In other words, to obtain an upper bound, we assume that the ith
element is always on the side of the partition with the greater number of elements
...
When Xk D 1, the
two subarrays on which we might recurse have sizes k 1 and n k
...
n/ Ä

n
X

Xk
...
max
...
n//

kD1

D

n
X
kD1

Xk T
...
k

1; n

k// C O
...
n/
" n
X
Xk T
...
k
Ä E

#
1; n

k// C O
...
max
...
max
...
max
...
n/

1; n

1; n

(by linearity of expectation)

k// C O
...
24))

k// C O
...
1))
...
24), we rely on Xk and T
...
k 1; n k// being
independent random variables
...
2-2 asks you to justify this assertion
...
k 1; n k/
...
k 1; n k/ D
n k if k Ä dn=2e :
If n is even, each term from T
...
n 1/ appears exactly twice in
the summation, and if n is odd, all these terms appear twice and T
...
Thus, we have
n 1
2 X
E ŒT
...
n/ :
E ŒT
...
n/ D O
...
Assume that E ŒT
...
We assume
that T
...
1/ for n less than some constant; we shall pick this constant later
...
n/ term above
(which describes the non-recursive component of the running time of the algorithm) is bounded from above by an for all n > 0
...
n/ Ä

n 1
2 X
ck C an
n
kDbn=2c

D

2c
n

n 1
X
kD1

X

bn=2c 1

k

kD1

!
k C an

9
...
n 1/n
...
n 1/n
...
n=2 1/
C an
Ä
n
2
2
Â
Ã
2c n2 n n2 =4 3n=2 C 2
C an
D
n
2
2
 2
Ã
n
c 3n
C
2 C an
D
n
4
2
Ã
Â
1 2
3n
C
C an
D c
4
2 n
3cn c
C C an
Ä
4
2
Á
cn c
an :
D cn
4
2
In order to complete the proof, we need to show that for sufficiently large n, this
last expression is at most cn or, equivalently, that cn=4 c=2 an
0
...
c=4 a/ c=2
...
e
...
n/ D O
...
c 4a/, then E ŒT
...
n/
...

Exercises
9
...

9
...
max
...

9
...


1; n

k//

220

Chapter 9 Medians and Order Statistics

9
...
Describe a sequence of partitions that results
in a worst-case performance of R ANDOMIZED -S ELECT
...
3

Selection in worst-case linear time
We now examine a selection algorithm whose running time is O
...
Like R ANDOMIZED -S ELECT, the algorithm S ELECT finds the desired element by recursively partitioning the input array
...
S ELECT uses the deterministic partitioning
algorithm PARTITION from quicksort (see Section 7
...

The S ELECT algorithm determines the ith smallest of an input array of n > 1
distinct elements by executing the following steps
...
)
1
...

2
...

3
...
(If there are an even number of medians, then by our convention, x is
the lower median
...
Partition the input array around the median-of-medians x using the modified
version of PARTITION
...

5
...
Otherwise, use S ELECT recursively to find the ith
smallest element on the low side if i < k, or the
...

To analyze the running time of S ELECT, we first determine a lower bound on the
number of elements that are greater than the partitioning element x
...
1
helps us to visualize this bookkeeping
...
3 Selection in worst-case linear time

221

x

Figure 9
...
The n elements are represented by small circles,
and each group of 5 elements occupies a column
...
(When finding the median of an even number of elements, we use
the lower median
...
The elements known to be greater than x appear on a shaded
background
...
1 Thus, at least half
of the dn=5e groups contribute at least 3 elements that are greater than x, except
for the one group that has fewer than 5 elements if 5 does not divide n exactly, and
the one group containing x itself
...
Thus, in the worst case,
step 5 calls S ELECT recursively on at most 7n=10 C 6 elements
...
n/ of the
algorithm S ELECT
...
n/ time
...
n/
calls of insertion sort on sets of size O
...
) Step 3 takes time T
...
7n=10 C 6/, assuming that T is monotonically increasing
...
1/ time; the origin of the magic constant 140 will be
clear shortly
...


222

Chapter 9 Medians and Order Statistics

(
T
...
1/
if n < 140 ;
T
...
7n=10 C 6/ C O
...
More specifically, we will
show that T
...
We begin by
assuming that T
...
We also pick a constant a such that the function described by the O
...

Substituting this inductive hypothesis into the right-hand side of the recurrence
yields
T
...
7n=10 C 6/ C an
cn=5 C c C 7cn=10 C 6c C an
9cn=10 C 7c C an
cn C
...
2)

Inequality (9
...
n=
...

Because we assume that n
140, we have n=
...
2)
...
) The worst-case running time of S ELECT is therefore
linear
...
1), S ELECT and R ANDOMIZED -S ELECT
determine information about the relative order of elements only by comparing elements
...
n lg n/ time in the comparison model, even on average (see Problem 8-1)
...
In contrast, the linear-time selection algorithms in this chapter do not require any assumptions about the input
...
n lg n/ lower bound because they manage to solve
the selection problem without sorting
...


9
...
3-1
In the algorithm S ELECT, the input elements are divided into groups of 5
...

9
...

9
...
n lg n/ time in the worst case, assuming that all elements are distinct
...
3-4 ?
Suppose that an algorithm uses only comparisons to find the ith smallest element
in a set of n elements
...

9
...

Give a simple, linear-time algorithm that solves the selection problem for an arbitrary order statistic
...
3-6
The kth quantiles of an n-element set are the k 1 order statistics that divide the
sorted set into k equal-sized sets (to within 1)
...
n lg k/-time algorithm
to list the kth quantiles of a set
...
3-7
Describe an O
...

9
...
Give an O
...

9
...
The company wants to connect

224

Chapter 9 Medians and Order Statistics

Figure 9
...


a spur pipeline from each well directly to the main pipeline along a shortest route
(either north or south), as shown in Figure 9
...
Given the x- and y-coordinates of
the wells, how should the professor pick the optimal location of the main pipeline,
which would be the one that minimizes the total length of the spurs? Show how to
determine the optimal location in linear time
...
Find the algorithm that implements each of the following methods with the best asymptotic worst-case running time, and analyze the
running times of the algorithms in terms of n and i
...
Sort the numbers, and list the i largest
...
Build a max-priority queue from the numbers, and call E XTRACT-M AX i times
...
Use an order-statistic algorithm to find the ith largest number, partition around
that number, and sort the i largest numbers
...

a
...

b
...
n lg n/ worstcase time using sorting
...
Show how to compute the weighted median in ‚
...
3
...
We are given n points
p1 ; p2 ; : : : ; pn with associated weights w1 ; w2 ; : : : ; wn
...
p; pi /,
where d
...

d
...
a; b/ D ja bj
...
Find the best solution for the 2-dimensional post-office location problem, in
which the points are
...
x1 ; y1 / and b D
...
a; b/ D
jx1 x2 j C jy1 y2 j
...
n/ of comparisons used by S ELECT
to select the ith order statistic from n numbers satisfies T
...
n/, but the
constant hidden by the ‚-notation is rather large
...


226

Chapter 9 Medians and Order Statistics

a
...
n/ comparisons to find the ith smallest of n
elements, where
(
T
...
n/ D
bn=2c C Ui
...
2i/ otherwise :
(Hint: Begin with bn=2c disjoint pairwise comparisons, and recurse on the set
containing the smaller element from each pair
...
Show that, if i < n=2, then Ui
...
T
...
n=i//
...
Show that if i is a constant less than n=2, then Ui
...
lg n/
...
Show that if i D n=k for k

2, then Ui
...
T
...


9-4 Alternative analysis of randomized selection
In this problem, we use indicator random variables to analyze the R ANDOMIZED S ELECT procedure in a manner akin to our analysis of R ANDOMIZED -Q UICKSORT
in Section 7
...
2
...
Thus, the call R ANDOMIZED -S ELECT
...

For 1 Ä i < j Ä n, let
Xijk D I f ´i is compared with ´j sometime during the execution of the algorithm
to find ´k g :
a
...
(Hint: Your expression may have different values, depending on the values of i, j , and k
...
Let Xk denote the total number of comparisons between elements of array A
when finding ´k
...
Show that E ŒXk  Ä 4n
...
Conclude that, assuming all elements of array A are distinct, R ANDOMIZED S ELECT runs in expected time O
...


Notes for Chapter 9

227

Chapter notes
The worst-case linear-time median-finding algorithm was devised by Blum, Floyd,
Pratt, Rivest, and Tarjan [50]
...

Floyd and Rivest [108] have developed an improved randomized version that partitions around an element recursively selected from a small sample of the elements
...
Bent and John [41] gave a lower bound of 2n comparisons for median
finding, and Sch¨ nhage, Paterson, and Pippenger [302] gave an upper bound of 3n
...
Their upper bound [93]
is slightly less than 2:95n, and their lower bound [94] is
...
[92]
...


III

Data Structures

Introduction
Sets are as fundamental to computer science as they are to mathematics
...
We call such sets dynamic
...

Algorithms may require several different types of operations to be performed on
sets
...
We call a dynamic set that
supports these operations a dictionary
...
For example, min-priority queues, which Chapter 6 introduced in the
context of the heap data structure, support the operations of inserting an element
into and extracting the smallest element from a set
...

Elements of a dynamic set
In a typical implementation of a dynamic set, each element is represented by an
object whose attributes can be examined and manipulated if we have a pointer to
the object
...
3 discusses the implementation of objects and pointers in
programming environments that do not contain them as basic data types
...
If the keys are all different, we can think of the dynamic set as being a set
of key values
...
It may

230

Part III Data Structures

also have attributes that are manipulated by the set operations; these attributes may
contain data or pointers to other objects in the set
...
A total ordering allows us to define the minimum element of the set, for
example, or to speak of the next element larger than a given element in a set
...
Here is a list of typical operations
...

S EARCH
...

I NSERT
...
We usually assume that any attributes in element x needed by the set
implementation have already been initialized
...
S; x/
A modifying operation that, given a pointer x to an element in the set S, removes x from S
...
)
M INIMUM
...

M AXIMUM
...

S UCCESSOR
...

P REDECESSOR
...


Part III

Data Structures

231

In some situations, we can extend the queries S UCCESSOR and P REDECESSOR
so that they apply to sets with nondistinct keys
...

We usually measure the time taken to execute a set operation in terms of the size
of the set
...
lg n/
...
We already saw another important data structure—the
heap—in Chapter 6
...
It also shows how to implement
objects and pointers in programming environments that do not support them as
primitives
...

Chapter 11 introduces hash tables, which support the dictionary operations I N SERT, D ELETE, and S EARCH
...
n/ time to perform a S EARCH operation, but the expected time for hash-table operations is O
...

The analysis of hashing relies on probability, but most of the chapter requires no
background in the subject
...
In the worst case, each operation takes ‚
...
lg n/
...

Chapter 13 introduces red-black trees, which are a variant of binary search trees
...
lg n/ time in the worst case
...
Although the mechanics of red-black trees are somewhat intricate, you can
glean most of their properties from the chapter without studying the mechanics in
detail
...

In Chapter 14, we show how to augment red-black trees to support operations
other than the basic ones listed above
...
Then, we augment them in
a different way to maintain intervals of real numbers
...
Although we can construct many complex data structures
using pointers, we present only the rudimentary ones: stacks, queues, linked lists,
and rooted trees
...


10
...
In a stack, the element deleted from
the set is the one most recently inserted: the stack implements a last-in, first-out,
or LIFO, policy
...
There are several efficient ways to implement stacks and queues
on a computer
...

Stacks
The I NSERT operation on a stack is often called P USH, and the D ELETE operation, which does not take an element argument, is often called P OP
...
The order in which plates are popped from the stack is the reverse
of the order in which they were pushed onto the stack, since only the top plate is
accessible
...
1 shows, we can implement a stack of at most n elements with
an array SŒ1 : : n
...
1 Stacks and queues

1

2

3

4

S 15 6

2

5

6

9

233

7

1

2

3

4

S 15 6

2

9 17 3

S:top D 4
(a)

5

6

7

S:top D 6
(b)

1

2

3

4

5

6

S 15 6

2

7

9 17 3

S:top D 5
(c)

Figure 10
...
Stack elements appear only in the lightly shaded
positions
...
The top element is 9
...
S; 17/
and P USH
...
(c) Stack S after the call P OP
...
Although element 3 still appears in the array, it is no longer in the stack; the top is
element 17
...
The stack consists of elements SŒ1 : : S:top, where SŒ1 is the
element at the bottom of the stack and SŒS:top is the element at the top
...
We can test to
see whether the stack is empty by query operation S TACK -E MPTY
...

If S:top exceeds n, the stack overflows
...
)
We can implement each of the stack operations with just a few lines of code:
S TACK -E MPTY
...
S; x/
1 S:top D S:top C 1
2 SŒS:top D x
P OP
...
S/
2
error “underflow”
3 else S:top D S:top 1
4
return SŒS:top C 1
Figure 10
...
Each of
the three stack operations takes O
...


234

Chapter 10 Elementary Data Structures

1

(a)

2

3

4

5

6

8

9

10 11 12

15 6

Q

7

9

8

Q:head D 7
1

(b)

2

Q 3

3

4

5

5

(c)

2

Q 3

3

4

5

Q:tail D 3

7

Q:tail D 12

8

9

10 11 12

15 6

9

8

8

9

10 11 12

15 6

9

8

4 17

Q:head D 7

Q:tail D 3
1

6

4

5

6

7

4 17

Q:head D 8

Figure 10
...
Queue elements appear only in the
lightly shaded positions
...
(b) The configuration
of the queue after the calls E NQUEUE
...
Q; 3/, and E NQUEUE
...
(c) The
configuration of the queue after the call D EQUEUE
...
The new head has key 6
...
The FIFO property of a queue causes it to operate like a line of customers
waiting to pay a cashier
...
When an element is enqueued, it takes its place at the tail of the queue, just as a newly arriving customer
takes a place at the end of the line
...

Figure 10
...
The queue has an attribute Q:head that indexes, or points
to, its head
...
The elements in the queue reside in
locations Q:head; Q:head C 1; : : : ; Q:tail 1, where we “wrap around” in the
sense that location 1 immediately follows location n in a circular order
...
Initially, we have Q:head D Q:tail D 1
...


10
...

In our procedures E NQUEUE and D EQUEUE, we have omitted the error checking
for underflow and overflow
...
1-4 asks you to supply code that checks
for these two error conditions
...

E NQUEUE
...
Q/
1 x D QŒQ:head
2 if Q:head == Q:length
3
Q:head D 1
4 else Q:head D Q:head C 1
5 return x
Figure 10
...
Each
operation takes O
...

Exercises
10
...
1 as a model, illustrate the result of each operation in the sequence
P USH
...
S; 1/, P USH
...
S/, P USH
...
S/ on an
initially empty stack S stored in array SŒ1 : : 6
...
1-2
Explain how to implement two stacks in one array AŒ1 : : n in such a way that
neither stack overflows unless the total number of elements in both stacks together
is n
...
1/ time
...
1-3
Using Figure 10
...
Q; 4/, E NQUEUE
...
Q; 3/, D EQUEUE
...
Q; 8/, and D EQUEUE
...

10
...


236

Chapter 10 Elementary Data Structures

10
...
Write four O
...

10
...
Analyze the running time of the
queue operations
...
1-7
Show how to implement a stack using two queues
...


10
...

Unlike an array, however, in which the linear order is determined by the array
indices, the order in a linked list is determined by a pointer in each object
...

As shown in Figure 10
...
The object may
also contain other satellite data
...
If x:pre D NIL,
the element x has no predecessor and is therefore the first element, or head, of
the list
...
An attribute L:head points to the first element of the
list
...

A list may have one of several forms
...
If a list is singly
linked, we omit the pre pointer in each element
...
If the list is unsorted, the elements can appear in any order
...
We can think of a circular list as a ring of

10
...
3 (a) A doubly linked list L representing the dynamic set f1; 4; 9; 16g
...
The next attribute of the tail and the pre attribute of the head are NIL , indicated by a diagonal
slash
...
(b) Following the execution of L IST-I NSERT
...
This new object
points to the old head with key 9
...
L; x/, where x
points to the object with key 4
...
In the remainder of this section, we assume that the lists with which we
are working are unsorted and doubly linked
...
L; k/ finds the first element with key k in list L
by a simple linear search, returning a pointer to this element
...
For the linked list in
Figure 10
...
L; 4/ returns a pointer to the third element,
and the call L IST-S EARCH
...

L IST-S EARCH
...
n/ time in the
worst case, since it may have to search the entire list
...
3(b)
...
L; x/
1 x:next D L:head
2 if L:head ¤ NIL
3
L:head:pre D x
4 L:head D x
5 x:pre D NIL
(Recall that our attribute notation can cascade, so that L:head:pre denotes the
pre attribute of the object that L:head points to
...
1/
...
It must
be given a pointer to x, and it then “splices” x out of the list by updating pointers
...

L IST-D ELETE
...
3(c) shows how an element is deleted from a linked list
...
1/ time, but if we wish to delete an element with a given key, ‚
...

Sentinels
The code for L IST-D ELETE would be simpler if we could ignore the boundary
conditions at the head and tail of the list:
L IST-D ELETE0
...
For
example, suppose that we provide with list L an object L:nil that represents NIL

10
...
4 A circular, doubly linked list with a sentinel
...
The attribute L: head is no longer needed, since we can access the head of the list
by L: nil: next
...
(b) The linked list from Figure 10
...
(c) The list after executing L IST-I NSERT0
...
The new object
becomes the head of the list
...
The new tail is the
object with key 4
...
Wherever we have a reference to NIL in list code, we replace it by a reference to the sentinel L:nil
...
4, this change turns a regular doubly linked list into a circular, doubly linked list with a sentinel, in which the sentinel L:nil lies between the
head and tail
...
Similarly, both the next attribute of the tail and the pre attribute of the head point to L:nil
...
Figure 10
...

The code for L IST-S EARCH remains the same as before, but with the references
to NIL and L:head changed as specified above:
L IST-S EARCH0
...
The following procedure inserts an element into the list:

240

Chapter 10 Elementary Data Structures

L IST-I NSERT0
...
4 shows the effects of L IST-I NSERT 0 and L IST-D ELETE 0 on a sample list
...
The gain from using sentinels within loops
is usually a matter of clarity of code rather than speed; the linked list code, for
example, becomes simpler when we use sentinels, but we save only O
...
In other situations, however, the
use of sentinels helps to tighten the code in a loop, thus reducing the coefficient of,
say, n or n2 in the running time
...
When there are many small lists, the extra
storage used by their sentinels can represent significant wasted memory
...

Exercises
10
...
1/ time? How about D ELETE?
10
...
The operations P USH and P OP
should still take O
...

10
...
The operations E NQUEUE and D E QUEUE should still take O
...

10
...
Show how to eliminate the test for
x ¤ L:nil in each iteration
...
2-5
Implement the dictionary operations I NSERT, D ELETE, and S EARCH using singly
linked, circular lists
...
3 Implementing pointers and objects

241

10
...
The
sets S1 and S2 are usually destroyed by the operation
...
1/ time using a suitable list data structure
...
2-7
Give a ‚
...
The procedure should use no more than constant storage beyond that
needed for the list itself
...
2-8 ?
Explain how to implement doubly linked lists using only one pointer value x:np per
item instead of the usual two (next and pre )
...
(The value NIL is represented by 0
...
Show
how to implement the S EARCH, I NSERT, and D ELETE operations on such a list
...
1/ time
...
3 Implementing pointers and objects
How do we implement pointers and objects in languages that do not provide them?
In this section, we shall see two ways of implementing linked data structures without an explicit pointer data type
...

A multiple-array representation of objects
We can represent a collection of objects that have the same attributes by using an
array for each attribute
...
5 shows how we can implement
the linked list of Figure 10
...
The array key holds the values
of the keys currently in the dynamic set, and the pointers reside in the arrays next
and pre
...
Under this interpretation, a pointer x is simply
a common index into the key, next, and pre arrays
...
3(a), the object with key 4 follows the object with key 16 in the
linked list
...
5, key 4 appears in keyŒ2, and key 16 appears in keyŒ5,
and so nextŒ5 D 2 and pre Œ2 D 5
...
5 The linked list of Figure 10
...
Each
vertical slice of the arrays represents a single object
...
Lightly shaded object positions contain list
elements
...


attribute of the tail and the pre attribute of the head, we usually use an integer
(such as 0 or 1) that cannot possibly represent an actual index into the arrays
...

A single-array representation of objects
The words in a computer memory are typically addressed by integers from 0
to M 1, where M is a suitably large integer
...
A pointer
is simply the address of the first memory location of the object, and we can address
other memory locations within the object by adding an offset to the pointer
...
For example, Figure 10
...
3(a)
and 10
...
An object occupies a contiguous subarray AŒj : : k
...
In Figure 10
...
To read the value of i:pre , given a pointer i, we
add the value i of the pointer to the offset 2, thus reading AŒi C 2
...
The problem of managing such a heterogeneous collection of objects is more difficult than the problem of managing a homogeneous collection, where all objects have the same attributes
...


10
...
6 The linked list of Figures 10
...
5 represented in a single array A
...
The three
attributes key, next, and pre correspond to the offsets 0, 1, and 2, respectively, within each object
...
Objects containing list elements
are lightly shaded, and arrows show the list ordering
...
Thus,
it is useful to manage the storage of objects not currently used in the linked-list
representation so that one can be allocated
...
Many applications,
however, are simple enough that they can bear responsibility for returning an unused object to a storage manager
...

Suppose that the arrays in the multiple-array representation have length m and
that at some moment the dynamic set contains n Ä m elements
...

We keep the free objects in a singly linked list, which we call the free list
...

The head of the free list is held in the global variable free
...
7
...

The free list acts like a stack: the next object allocated is the last one freed
...
We assume that the
global variable free used in the following procedures points to the first element of
the free list
...
7 The effect of the A LLOCATE -O BJECT and F REE -O BJECT procedures
...
5 (lightly shaded) and a free list (heavily shaded)
...

(b) The result of calling A LLOCATE -O BJECT
...
L; 4/
...
(c) After executing L IST-D ELETE
...
5/
...


A LLOCATE -O BJECT
...
x/
1 x:next D free
2 free D x
The free list initially contains all n unallocated objects
...
We can
even service several linked lists with just a single free list
...
8 shows two
linked lists and a free list intertwined through key, next, and pre arrays
...
1/ time, which makes them quite practical
...


10
...
8 Two linked lists, L1 (lightly shaded) and L2 (heavily shaded), and a free list (darkened) intertwined
...
3-1
Draw a picture of the sequence h13; 4; 8; 19; 5; 11i stored as a doubly linked list
using the multiple-array representation
...

10
...

10
...
3-4
It is often desirable to keep all elements of a doubly linked list compact in storage,
using, for example, the first m index locations in the multiple-array representation
...
) Explain
how to implement the procedures A LLOCATE -O BJECT and F REE -O BJECT so that
the representation is compact
...
(Hint: Use the array implementation of a stack
...
3-5
Let L be a doubly linked list of length n stored in arrays key, pre , and next of
length m
...
Suppose further
that of the m items, exactly n are on list L and m n are on the free list
...
L; F / that, given the list L and the free list F ,
moves the items in L so that they occupy array positions 1; 2; : : : ; n and adjusts the
free list F so that it remains correct, occupying array positions n C1; n C2; : : : ; m
...
n/, and it should use only a
constant amount of extra space
...


246

Chapter 10 Elementary Data Structures

10
...
In this section, we look specifically at the problem of
representing rooted trees by linked data structures
...

We represent each node of a tree by an object
...
The remaining attributes of interest are
pointers to other nodes, and they vary according to the type of tree
...
9 shows how we use the attributes p, left, and right to store pointers to
the parent, left child, and right child of each node in a binary tree T
...
If node x has no left child, then x:left D NIL , and similarly for
the right child
...
If
T:root D NIL, then the tree is empty
...
This scheme no longer
works when the number of children of a node is unbounded, since we do not know
how many attributes (arrays in the multiple-array representation) to allocate in advance
...

Fortunately, there is a clever scheme to represent trees with arbitrary numbers of
children
...
n/ space for any n-node rooted tree
...
10
...

Instead of having a pointer to each of its children, however, each node x has only
two pointers:
1
...
x:right-sibling points to the sibling of x immediately to its right
...


10
...
9 The representation of a binary tree T
...
The key attributes are not shown
...
10 The left-child, right-sibling representation of a tree T
...
The key attributes are not shown
...
In Chapter 6, for example,
we represented a heap, which is based on a complete binary tree, by a single array
plus the index of the last node in the heap
...
Many other schemes are possible
...

Exercises
10
...
4-2
Write an O
...

10
...
n/-time nonrecursive procedure that, given an n-node binary tree,
prints out the key of each node in the tree
...

10
...
n/-time procedure that prints all the keys of an arbitrary rooted tree
with n nodes, where the tree is stored using the left-child, right-sibling representation
...
4-5 ?
Write an O
...
Use no more than constant extra space outside

Problems for Chapter 10

249

of the tree itself and do not modify the tree, even temporarily, during the procedure
...
4-6 ?
The left-child, right-sibling representation of an arbitrary rooted tree uses three
pointers in each node: left-child, right-sibling, and parent
...
Show how to use
only two pointers and one boolean value in each node so that the parent of a node
or all of its children can be reached and identified in time linear in the number of
children
...
L; k/
I NSERT
...
L; x/
S UCCESSOR
...
L; x/
M INIMUM
...
L/

sorted,
singly
linked

unsorted,
doubly
linked

sorted,
doubly
linked

250

Chapter 10 Elementary Data Structures

10-2 Mergeable heaps using linked lists
A mergeable heap supports the following operations: M AKE -H EAP (which creates
an empty mergeable heap), I NSERT, M INIMUM, E XTRACT-M IN, and U NION
...
Try to make each operation as efficient as possible
...

a
...

b
...

c
...

10-3 Searching a sorted compact list
Exercise 10
...
We shall assume that all keys are distinct and that the
compact list is also sorted, that is, keyŒi < keyŒnextŒi for all i D 1; 2; : : : ; n such
that nextŒi ¤ NIL
...
Under these assumptions, you will show
p
that we can use the following randomized algorithm to search the list in O
...

C OMPACT-L IST-S EARCH
...
1; n/
4
if keyŒi < keyŒj  and keyŒj  Ä k
5
i Dj
6
if keyŒi == k
7
return i
8
i D nextŒi
9 if i == NIL or keyŒi > k
10
return NIL
11 else return i
If we ignore lines 3–7 of the procedure, we have an ordinary algorithm for
searching a sorted linked list, in which index i points to each position of the list in

1 Because we have defined a mergeable heap to support M INIMUM and E XTRACT-M IN , we can also
refer to it as a mergeable min-heap
...


Problems for Chapter 10

251

turn
...
In the latter case, if keyŒi D k, clearly we have found a key with the
value k
...

Lines 3–7 attempt to skip ahead to a randomly chosen position j
...

Because the list is compact, we know that any choice of j between 1 and n indexes
some object in the list rather than a slot on the free list
...
This algorithm takes an additional parameter t which determines
an upper bound on the number of iterations of the first loop
...
L; n; k; t/
1 i DL
2 for q D 1 to t
3
j D R ANDOM
...
L; n; k/
and C OMPACT-L IST-S EARCH 0
...
1; n/ is the same for both algorithms
...
Suppose that C OMPACT-L IST-S EARCH
...
Argue that C OMPACT-L IST-S EARCH 0
...

In the call C OMPACT-L IST-S EARCH 0
...


252

Chapter 10 Elementary Data Structures

b
...
L; n; k; t/
is O
...

Pn
c
...
1 r=n/t
...
25)
...
Show that

Pn

1
rD0

r t Ä nt C1 =
...


e
...
t C 1/
...
Show that C OMPACT-L IST-S EARCH 0
...
t C n=t/ expected
time
...
Conclude that C OMPACT-L IST-S EARCH runs in O
...

h
...


Chapter notes
Aho, Hopcroft, and Ullman [6] and Knuth [209] are excellent references for elementary data structures
...
Examples of these types of
textbooks include Goodrich and Tamassia [147], Main [241], Shaffer [311], and
Weiss [352, 353, 354]
...

The origin of stacks and queues as data structures in computer science is unclear, since corresponding notions already existed in mathematics and paper-based
business practices before the introduction of digital computers
...
M
...

Pointer-based data structures also seem to be a folk invention
...
The
A-1 language developed by G
...
Hopper in 1951 represented algebraic formulas
as binary trees
...
Newell,
J
...
Shaw, and H
...
Simon, for recognizing the importance and promoting the
use of pointers
...


11

Hash Tables

Many applications require a dynamic set that supports only the dictionary operations I NSERT, S EARCH, and D ELETE
...
A hash
table is an effective data structure for implementing dictionaries
...
n/ time in the worst case—in practice, hashing performs extremely
well
...
1/
...
Directly addressing into an ordinary array makes effective use of our ability to examine an
arbitrary position in an array in O
...
Section 11
...
We can take advantage of direct addressing when we can afford
to allocate an array that has one position for every possible key
...
Instead of using the key as an array index directly, the array
index is computed from the key
...
2 presents the main ideas, focusing on
“chaining” as a way to handle “collisions,” in which more than one key maps to the
same array index
...
3 describes how we can compute array indices from
keys using hash functions
...
Section 11
...
The bottom line is that hashing is an extremely effective and practical
technique: the basic dictionary operations require only O
...

Section 11
...
1/ worstcase time, when the set of keys being stored is static (that is, when the set of keys
never changes once stored)
...
1 Direct-address tables
Direct addressing is a simple technique that works well when the universe U of
keys is reasonably small
...
We shall assume that no two elements have the same key
...
Figure 11
...
If the set contains no element with key k, then T Œk D NIL
...
T; k/
1 return T Œk
D IRECT-A DDRESS -I NSERT
...
T; x/
1 T Œx:key D NIL
Each of these operations takes only O
...

T
0

9

U
(universe of keys)
0
6
7
4

1
K
(actual
keys)

2
3

3

key

satellite data

2
3

4
5

2
5

1

5

6

8

7
8

8

9

Figure 11
...
Each key in the universe
U D f0; 1; : : : ; 9g corresponds to an index in the table
...
The other slots, heavily shaded,
contain NIL
...
1 Direct-address tables

255

For some applications, the direct-address table itself can hold the elements in the
dynamic set
...
We would
use a special key within an object to indicate an empty slot
...
If keys are not stored, however, we must have some
way to tell whether the slot is empty
...
1-1
Suppose that a dynamic set S is represented by a direct-address table T of length m
...
What is the worst-case
performance of your procedure?
11
...
A bit vector of length m takes
much less space than an array of m pointers
...
Dictionary
operations should run in O
...

11
...
All
three dictionary operations (I NSERT, D ELETE, and S EARCH) should run in O
...
(Don’t forget that D ELETE takes as an argument a pointer to an object to be
deleted, not a key
...
1-4 ?
We wish to implement a dictionary by using direct addressing on a huge array
...
Describe a scheme for implementing a directaddress dictionary on a huge array
...
1/ space;
the operations S EARCH, I NSERT, and D ELETE should take O
...
1/ time
...
)

256

Chapter 11 Hash Tables

11
...
Furthermore, the set K of keys actually stored
may be so small relative to U that most of the space allocated for T would be
wasted
...
Specifically, we can reduce the storage requirement to ‚
...
1/ time
...

With direct addressing, an element with key k is stored in slot k
...
k/; that is, we use a hash function h to compute the
slot from the key k
...
We say that an
element with key k hashes to slot h
...
k/ is the hash value of
key k
...
2 illustrates the basic idea
...
Instead of a size of jU j, the array
can have size m
...
2 Using a hash function h to map keys to hash-table slots
...


11
...
3 Collision resolution by chaining
...
For example, h
...
k4 / and h
...
k7 / D h
...

The linked list can be either singly or doubly linked; we show it as doubly linked because deletion is
faster that way
...
We call this situation
a collision
...

Of course, the ideal solution would be to avoid collisions altogether
...
One idea is to
make h appear to be “random,” thus avoiding collisions or at least minimizing
their number
...
(Of course, a hash function h must be
deterministic in that a given input k should always produce the same output h
...
)
Because jU j > m, however, there must be at least two keys that have the same hash
value; avoiding collisions altogether is therefore impossible
...

The remainder of this section presents the simplest collision resolution technique, called chaining
...
4 introduces an alternative method for resolving
collisions, called open addressing
...
3 shows
...


258

Chapter 11 Hash Tables

The dictionary operations on a hash table T are easy to implement when collisions are resolved by chaining:
C HAINED -H ASH -I NSERT
...
x:key/
C HAINED -H ASH -S EARCH
...
k/
C HAINED -H ASH -D ELETE
...
x:key/
The worst-case running time for insertion is O
...
The insertion procedure is fast
in part because it assumes that the element x being inserted is not already present in
the table; if necessary, we can check this assumption (at additional cost) by searching for an element whose key is x:key before we insert
...
We can delete an element in O
...
3 depicts
...
If the hash table supports deletion, then its linked lists should be doubly linked
so that we can delete an item quickly
...
x:key/ so that we
could update the next attribute of x’s predecessor
...
)
Analysis of hashing with chaining
How well does hashing with chaining perform? In particular, how long does it take
to search for an element with a given key?
Given a hash table T with m slots that stores n elements, we define the load
factor ˛ for T as n=m, that is, the average number of elements stored in a chain
...

The worst-case behavior of hashing with chaining is terrible: all n keys hash
to the same slot, creating a list of length n
...
n/ plus the time to compute the hash function—no better than if we used
one linked list for all the elements
...
(Perfect hashing, described in Section 11
...
)
The average-case performance of hashing depends on how well the hash function h distributes the set of keys to be stored among the m slots, on the average
...
2 Hash tables

259

Section 11
...
We call this the assumption of simple uniform
hashing
...
1)

and the expected value of nj is E Œnj  D ˛ D n=m
...
1/ time suffices to compute the hash value h
...
k/ of the list T Œh
...
Setting aside the O
...
k/, let us consider the expected number of
elements examined by the search algorithm, that is, the number of elements in the
list T Œh
...
We
shall consider two cases
...
In the second, the search successfully finds an element with key k
...
1
In a hash table in which collisions are resolved by chaining, an unsuccessful search
takes average-case time ‚
...


Proof Under the assumption of simple uniform hashing, any key k not already
stored in the table is equally likely to hash to any of the m slots
...
k/, which has expected length E Œnh
...
Thus, the expected number
of elements examined in an unsuccessful search is ˛, and the total time required
(including the time for computing h
...
1 C ˛/
...
Instead, the probability that a list is searched is proportional to the number of elements it contains
...
1 C ˛/
...
2
In a hash table in which collisions are resolved by chaining, a successful search
takes average-case time ‚
...


Proof We assume that the element being searched for is equally likely to be any
of the n elements stored in the table
...
Because new elements are placed at the front of the
list, elements before x in the list were all inserted after x was inserted
...
Let xi denote the ith element inserted into the table, for i D 1; 2; : : : ; n, and let ki D xi :key
...
ki / D h
...
Under the assumption of simple uniform hashing, we have Pr fh
...
kj /g D 1=m, and so by Lemma 5
...
Thus, the expected number of elements examined in a successful
search is
!#
" n
n
X
1X
Xij
1C
E
n i D1
j Di C1
!
n
n
X
1X
E ŒXij 
(by linearity of expectation)
1C
D
n i D1
j Di C1
!
n
n
X 1
1X
1C
D
n i D1
m
j Di C1
1 X

...
n C 1/
1
2
n
(by equation (A
...
2 C ˛=2 ˛=2n/ D ‚
...

What does this analysis mean? If the number of hash-table slots is at least proportional to the number of elements in the table, we have n D O
...
m/=m D O
...
Thus, searching takes constant time
on average
...
1/ worst-case time and deletion takes O
...
1/ time on average
...
2 Hash tables

261

Exercises
11
...
Assuming simple uniform hashing, what is the expected number of
collisions? More precisely, what is the expected cardinality of ffk; lg W k ¤ l and
h
...
l/g?
11
...
Let the table have 9 slots,
and let the hash function be h
...

11
...
How does the professor’s modification affect the running time for successful searches, unsuccessful
searches, insertions, and deletions?
11
...
Assume that one slot can store
a flag and either one element plus a pointer or two pointers
...
1/ expected time
...
2-5
Suppose that we are storing a set of n keys into a hash table of size m
...
n/
...
2-6
Suppose we have stored n keys in a hash table of size m, with collisions resolved by
chaining, and that we know the length of each chain, including the length L of the
longest chain
...
L
...


262

Chapter 11 Hash Tables

11
...
Two of the schemes, hashing by
division and hashing by multiplication, are heuristic in nature, whereas the third
scheme, universal hashing, uses randomization to provide provably good performance
...
Unfortunately, we typically have no way to
check this condition, since we rarely know the probability distribution from which
the keys are drawn
...

Occasionally we do know the distribution
...
k/ D bkmc
satisfies the condition of simple uniform hashing
...
Qualitative information about the distribution of keys may be
useful in this design process
...
Closely
related symbols, such as pt and pts, often occur in the same program
...

A good approach derives the hash value in a way that we expect to be independent of any patterns that might exist in the data
...
3
...
This method frequently gives good
results, assuming that we choose a prime number that is unrelated to any patterns
in the distribution of keys
...
For example, we might
want keys that are “close” in some sense to yield hash values that are far apart
...
4
...
3
...


11
...
Thus, if the keys are not natural numbers, we find a way to
interpret them as natural numbers
...
Thus, we might interpret the
identifier pt as the pair of decimal integers
...
112 128/ C 116 D 14452
...
In what follows, we assume that the keys are natural numbers
...
3
...
That is, the hash function is
h
...
k/ D 4
...

When using the division method, we usually avoid certain values of m
...
k/ is just the p
lowest-order bits of k
...
As Exercise 11
...

A prime not too close to an exact power of 2 is often a good choice for m
...

We don’t mind examining an average of 3 elements in an unsuccessful search, and
so we allocate a hash table of size m D 701
...
Treating each key k as an
integer, our hash function would be
h
...
3
...
First,
we multiply the key k by a constant A in the range 0 < A < 1 and extract the

264

Chapter 11 Hash Tables

w bits
k
×

s D A 2w

r1

r0
extract p bits
h
...
4 The multiplication method of hashing
...
The p highest-order bits of the lower w-bit half of the product
form the desired hash value h
...


fractional part of kA
...
In short, the hash function is
h
...
kA mod 1/c ;
where “kA mod 1” means the fractional part of kA, that is, kA bkAc
...

We typically choose it to be a power of 2 (m D 2p for some integer p), since we
can then easily implement the function on most computers as follows
...
We
restrict A to be a fraction of the form s=2w , where s is an integer in the range
0 < s < 2w
...
4, we first multiply k by the w-bit integer
s D A 2w
...
The desired p-bit hash
value consists of the p most significant bits of r0
...
The optimal choice depends on the characteristics of the data being hashed
...
2)
A
...

As an example, suppose we have k D 123456, p D 14, m D 214 D 16384,
and w D 32
...
5 1/=2, so that A D 2654435769=232
...
76300 232 / C 17612864, and so r1 D 76300
and r0 D 17612864
...
k/ D 67
...
3 Hash functions

?

11
...
3

265

Universal hashing

If a malicious adversary chooses the keys to be hashed by some fixed hash function,
then the adversary can choose n keys that all hash to the same slot, yielding an average retrieval time of ‚
...
Any fixed hash function is vulnerable to such terrible
worst-case behavior; the only effective way to improve the situation is to choose
the hash function randomly in a way that is independent of the keys that are actually
going to be stored
...

In universal hashing, at the beginning of execution we select the hash function
at random from a carefully designed class of functions
...
Because we randomly select the hash function, the algorithm can behave differently on each execution, even for the same input, guaranteeing good
average-case performance for any input
...
Poor performance occurs only when the
compiler chooses a random hash function that causes the set of identifiers to hash
poorly, but the probability of this situation occurring is small and is the same for
any set of identifiers of the same size
...
Such a collection is said to be universal
if for each pair of distinct keys k; l 2 U , the number of hash functions h 2 H
for which h
...
l/ is at most jH j =m
...
k/ and h
...

The following theorem shows that a universal class of hash functions gives good
average-case behavior
...

Theorem 11
...
If key k is not in the table, then the expected
length E Œnh
...

If key k is in the table, then the expected length E Œnh
...

Proof We note that the expectations here are over the choice of the hash function and do not depend on any assumptions about the distribution of the keys
...
k/ D h
...
Since by the definition of a universal collection of hash
functions, a single pair of keys collides with probability at most 1=m, we have
Pr fh
...
l/g Ä 1=m
...
1, therefore, we have E ŒXkl  Ä 1=m
...

If k 62 T , then nh
...
Thus E Œnh
...

If k 2 T , then because key k appears in list T Œh
...
k/ D Yk C 1 and jfl W l 2 T and l ¤ kgj D n 1
...
k/  D E ŒYk  C 1 Ä
...

The following corollary says universal hashing provides the desired payoff: it
has now become impossible for an adversary to pick a sequence of operations that
forces the worst-case running time
...

Corollary 11
...
n/ to handle any sequence of n I NSERT,
S EARCH, and D ELETE operations containing O
...

Proof Since the number of insertions is O
...
m/ and so
˛ D O
...
The I NSERT and D ELETE operations take constant time and, by Theorem 11
...
1/
...
3 Hash functions

267

expectation, therefore, the expected time for the entire sequence of n operations
is O
...
Since each operation takes
...
n/ bound follows
...
You may wish to consult Chapter 31 first if you are
unfamiliar with number theory
...
Let Zp denote the set f0; 1; : : : ; p 1g,
and let Zp denote the set f1; 2; : : : ; p 1g
...
Because we assume that the
size of the universe of keys is greater than the number of slots in the hash table, we
have p > m
...
k/ D
...
3)

For example, with p D 17 and m D 6, we have h3;4
...
The family of all
such hash functions is
˚
«
(11
...
This class of hash functions has the nice
property that the size m of the output range is arbitrary—not necessarily prime—a
feature which we shall use in Section 11
...
Since we have p 1 choices for a
and p choices for b, the collection Hpm contains p
...

Theorem 11
...
3) and (11
...

Proof Consider two distinct keys k and l from Zp , so that k ¤ l
...
ak C b/ mod p ;
s D
...
Why? Observe that
r

s Á a
...
mod p/ :

It follows that r ¤ s because p is prime and both a and
...
6
...
” Moreover,
each of the possible p
...
a; b/ with a ¤ 0 yields a different
resulting pair
...
r s/
...
r ak/ mod p ;
where
...
Since there are only p
...
r; s/ with r ¤ s, there
is a one-to-one correspondence between pairs
...
r; s/
with r ¤ s
...
a; b/ uniformly
at random from Zp Zp , the resulting pair
...

Therefore, the probability that distinct keys k and l collide is equal to the probability that r Á s
...
For a given value of r, of the p 1 possible remaining values for s, the
number of values s such that s ¤ r and s Á r
...
p C m 1/=m/
D
...
6))

The probability that s collides with r when reduced modulo m is at most

...
p 1/ D 1=m
...
k/ D hab
...


Exercises
11
...
k/
...
How
might we take advantage of the hash values when searching the list for an element
with a given key?
11
...
We can easily represent
the number m as a 32-bit computer word, but the string of r characters, treated as
a radix-128 number, takes many words
...
4 Open addressing

269

11
...
k/ D k mod m, where
m D 2p 1 and k is a character string interpreted in radix 2p
...
Give an example of an application in which this property would be
undesirable in a hash function
...
3-4
Consider a hash table of size m D 1000 and a corresponding hash function h
...
kA mod 1/c for A D
...
Compute the locations to which the keys
61, 62, 63, 64, and 65 are mapped
...
3-5 ?
Define a family H of hash functions from a finite set U to a finite set B to be
-universal if for all pairs of distinct elements k and l in U ,
Pr fh
...
l/g Ä

;

where the probability is over the choice of the hash function h drawn at random
from the family H
...
3-6 ?
Let U be the set of n-tuples of values drawn from Zp , and let B D Zp , where p
is prime
...
ha0 ; a1 ; : : : ; an 1 i/ D
j D0

and let H D fhb W b 2 Zp g
...
n 1/=p/-universal according to
the definition of -universal in Exercise 11
...
(Hint: See Exercise 31
...
)

11
...
That is, each table
entry contains either an element of the dynamic set or NIL
...
No lists and

270

Chapter 11 Hash Tables

no elements are stored outside the table, unlike in chaining
...

Of course, we could store the linked lists for chaining inside the hash table, in
the otherwise unused hash-table slots (see Exercise 11
...
Instead of following pointers,
we compute the sequence of slots to be examined
...

To perform insertion using open addressing, we successively examine, or probe,
the hash table until we find an empty slot in which to put the key
...
n/ search time), the sequence
of positions probed depends upon the key being inserted
...
Thus, the hash function becomes
hWU

f0; 1; : : : ; m

1g ! f0; 1; : : : ; m

1g :

With open addressing, we require that for every key k, the probe sequence
hh
...
k; 1/; : : : ; h
...
In the following pseudocode,
we assume that the elements in the hash table T are keys with no satellite information; the key k is identical to the element containing key k
...
The H ASH -I NSERT procedure takes as
input a hash table T and a key k
...

H ASH -I NSERT
...
k; i/
4
if T Œj  == NIL
5
T Œj  D k
6
return j
7
else i D i C 1
8 until i == m
9 error “hash table overflow”
The algorithm for searching for key k probes the same sequence of slots that the
insertion algorithm examined when key k was inserted
...
4 Open addressing

271

terminate (unsuccessfully) when it finds an empty slot, since k would have been
inserted there and not later in its probe sequence
...
) The procedure H ASH -S EARCH takes as input
a hash table T and a key k, returning j if it finds that slot j contains key k, or NIL
if key k is not present in table T
...
T; k/
1 i D0
2 repeat
3
j D h
...
When we delete a key
from slot i, we cannot simply mark that slot as empty by storing NIL in it
...
We can solve this problem by marking the
slot, storing in it the special value DELETED instead of NIL
...
We do not need to modify H ASH -S EARCH, since it will pass
over DELETED values while searching
...

In our analysis, we assume uniform hashing: the probe sequence of each key
is equally likely to be any of the mŠ permutations of h0; 1; : : : ; m 1i
...

True uniform hashing is difficult to implement, however, and in practice suitable
approximations (such as double hashing, defined below) are used
...
These techniques all guarantee that hh
...
k; 1/; : : : ; h
...
None of these techniques fulfills the assumption of uniform hashing, however, since none of them is capable of
generating more than m2 different probe sequences (instead of the mŠ that uniform
hashing requires)
...


272

Chapter 11 Hash Tables

Linear probing
Given an ordinary hash function h0 W U ! f0; 1; : : : ; m 1g, which we refer to as
an auxiliary hash function, the method of linear probing uses the hash function
h
...
h0
...
Given key k, we first probe T Œh0
...
e
...
We next probe slot T Œh0
...
Then we wrap around to slots T Œ0; T Œ1; : : : until we finally probe
slot T Œh0
...
Because the initial probe determines the entire probe sequence,
there are only m distinct probe sequences
...
Long runs of occupied slots build up, increasing the average
search time
...
i C 1/=m
...

Quadratic probing
Quadratic probing uses a hash function of the form
h
...
h0
...
5)

where h0 is an auxiliary hash function, c1 and c2 are positive auxiliary constants,
and i D 0; 1; : : : ; m 1
...
k/; later positions
probed are offset by amounts that depend in a quadratic manner on the probe number i
...
Problem 11-3 shows
one way to select these parameters
...
k1 ; 0/ D h
...
k1 ; i/ D h
...
This property leads to a milder form of clustering, called
secondary clustering
...

Double hashing
Double hashing offers one of the best methods available for open addressing because the permutations produced have many of the characteristics of randomly
chosen permutations
...
k; i/ D
...
k/ C ih2
...
The initial probe goes to position T Œh1
...
4 Open addressing

0
1
2
3
4
5
6
7
8
9
10
11
12

273

79

69
98
72
14
50

Figure 11
...
Here we have a hash table of size 13 with h1
...
k/ D 1 C
...
Since 14 Á 1
...
mod 11/, we insert
the key 14 into empty slot 9, after examining slots 1 and 5 and finding them to be occupied
...
k/, modulo m
...
Figure 11
...

The value h2
...
(See Exercise 11
...
) A convenient way to ensure this
condition is to let m be a power of 2 and to design h2 so that it always produces an
odd number
...
For example, we could choose m prime and
let
h1
...
k/ D 1 C
...
For example, if
k D 123456, m D 701, and m0 D 700, we have h1
...
k/ D 257, so
that we first probe position 80, and then we examine every 257th slot (modulo m)
until we find the key or have examined every slot
...
m2 / probe sequences are used, rather than ‚
...
h1
...
k// pair yields a distinct probe sequence
...

Although values of m other than primes or powers of 2 could in principle be
used with double hashing, in practice it becomes more difficult to efficiently generate h2
...
m/=m of such numbers may be small (see equation (31
...

Analysis of open-address hashing
As in our analysis of chaining, we express our analysis of open addressing in terms
of the load factor ˛ D n=m of the hash table
...

We assume that we are using uniform hashing
...
k; 0/; h
...
k; m 1/i used to insert or search for
each key k is equally likely to be any permutation of h0; 1; : : : ; m 1i
...

We now analyze the expected number of probes for hashing with open addressing under the assumption of uniform hashing, beginning with an analysis of the
number of probes made in an unsuccessful search
...
6
Given an open-address hash table with load factor ˛ D n=m < 1, the expected
number of probes in an unsuccessful search is at most 1=
...

Proof In an unsuccessful search, every probe but the last accesses an occupied
slot that does not contain the desired key, and the last slot probed is empty
...
Then the event fX ig is the
intersection of events A1 \ A2 \ \ Ai 1
...
By Exercise C
...
For j > 1, the probability
that there is a j th probe and it is to an occupied slot, given that the first j 1
probes were to occupied slots, is
...
m j C 1/
...
4 Open addressing

275

because we would be finding one of the remaining
...
j 1// elements in one
of the
...
j 1// unexamined slots, and by the assumption of uniform hashing,
the probability is the ratio of these quantities
...
n j /=
...
25) to bound the expected number of probes:
E ŒX  D
Ä
D

1
X
i D1
1
X
i D1
1
X

Pr fX
˛i

ig

1

˛i

i D0

D

1
1

˛

:

This bound of 1=
...

We always make the first probe
...
With probability
approximately ˛ 2 , the first two slots are occupied so that we make a third probe,
and so on
...
6 predicts that an unsuccessful search runs in O
...
For example, if the hash table is half full, the average number of probes in an
unsuccessful search is at most 1=
...
If it is 90 percent full, the average
number of probes is at most 1=
...

Theorem 11
...

Corollary 11
...
1 ˛/ probes on average, assuming uniform hashing
...

Inserting a key requires an unsuccessful search followed by placing the key into the
first empty slot found
...
1 ˛/
...

Theorem 11
...

Proof A search for a key k reproduces the same probe sequence as when the
element with key k was inserted
...
7, if k was the
...
1 i=m/ D m=
...
Averaging over all n keys in the hash table
gives us the expected number of probes in a successful search:
1X m
n i D0 m i

D

mX 1
n i D0 m i

D

1
˛

n 1

n 1

Ä
D
D

m
X
kDm nC1

1
k

Z
1 m

...
12))
˛ m n
m
1
ln
˛ m n
1
1
ln
:
˛ 1 ˛

If the hash table is half full, the expected number of probes in a successful search
is less than 1:387
...


11
...
4-1
Consider inserting the keys 10; 22; 31; 4; 15; 28; 17; 88; 59 into a hash table of
length m D 11 using open addressing with the auxiliary hash function h0
...

Illustrate the result of inserting these keys using linear probing, using quadratic
probing with c1 D 1 and c2 D 3, and using double hashing with h1
...
k/ D 1 C
...
m 1//
...
4-2
Write pseudocode for H ASH -D ELETE as outlined in the text, and modify H ASH I NSERT to handle the special value DELETED
...
4-3
Consider an open-address hash table with uniform hashing
...

11
...
k; i/ D
...
k/ C ih2
...
Show that if m and h2
...
1=d /th of the hash table before returning to slot h1
...
Thus,
when d D 1, so that m and h2
...
(Hint: See Chapter 31
...
4-5 ?
Consider an open-address hash table with a load factor ˛
...
Use the upper bounds given
by Theorems 11
...
8 for these expected numbers of probes
...
5 Perfect hashing
Although hashing is often a good choice for its excellent average-case performance, hashing can also provide excellent worst-case performance when the set of
keys is static: once the keys are stored in the table, the set of keys never changes
...
We

278

Chapter 11 Hash Tables

T
0
1
2

S
m0 a0 b0 0
1 0 0 10
m2 a2 b2
9 10 18

S2

0

60 72

3

0

4

6
7
8

2

3

4

75

S5

5

1

5

6

7

8

m5 a5 b5
1 0 0 70
m7 a7 b7
16 23 88

S7

0

40 52 22
0

1

2

3

4

5

6

7

8

9

37
10

11

12

13

14

15

Figure 11
...
The
outer hash function is h
...
ak C b/ mod p/ mod m, where a D 3, b D 42, p D 101, and
m D 9
...
75/ D 2, and so key 75 hashes to slot 2 of table T
...
The size of hash table Sj is mj D nj , and the associated
hash function is hj
...
aj k C bj / mod p/ mod mj
...
75/ D 7, key 75 is stored in slot 7
of secondary hash table S2
...


call a hashing technique perfect hashing if O
...

To create a perfect hashing scheme, we use two levels of hashing, with universal
hashing at each level
...
6 illustrates the approach
...

Instead of making a linked list of the keys hashing to slot j , however, we use a
small secondary hash table Sj with an associated hash function hj
...

In order to guarantee that there are no collisions at the secondary level, however,
we will need to let the size mj of hash table Sj be the square of the number nj of
keys hashing to slot j
...
n/
...
3
...
The first-level hash function comes from the class Hpm , where as
in Section 11
...
3, p is a prime number greater than any key value
...
5 Perfect hashing

279

hashing to slot j are re-hashed into a secondary hash table Sj of size mj using a
hash function hj chosen from the class Hp;mj
...
First, we shall determine how to ensure that
the secondary tables have no collisions
...
n/
...
9
Suppose that we store n keys in a hash table of size m D n2 using a hash function h
randomly chosen from a universal class of hash functions
...

Proof There are n pairs of keys that may collide; each pair collides with prob2
ability 1=m if h is chosen at random from a universal family H of hash functions
...
When m D n2 ,
the expected number of collisions is
!
n
1
E ŒX  D
n2
2
D

n2

2
< 1=2 :

n

1
n2

(This analysis is similar to the analysis of the birthday paradox in Section 5
...
1
...
30), Pr fX tg Ä E ŒX  =t, with t D 1, completes the proof
...
9, where m D n2 , it follows that a hash
function h chosen at random from H is more likely than not to have no collisions
...

When n is large, however, a hash table of size m D n2 is excessive
...
9
only to hash the entries within each slot
...
Then, if nj keys hash to slot j , we
2
use a secondary hash table Sj of size mj D nj to provide collision-free constanttime lookup
...
k/ D
...


280

Chapter 11 Hash Tables

We now turn to the issue of ensuring that the overall memory used is O
...

Since the size mj of the j th secondary hash table grows quadratically with the
number nj of keys stored, we run the risk that the overall amount of storage could
be excessive
...
n/
for the primary hash table, for the storage of the sizes mj of the secondary hash
tables, and for the storage of the parameters aj and bj defining the secondary hash
functions hj drawn from the class Hp;mj of Section 11
...
3 (except when nj D 1
and we use a D b D 0)
...
A second corollary
bounds the probability that the combined size of all the secondary hash tables is
superlinear (actually, that it equals or exceeds 4n)
...
10
Suppose that we store n keys in a hash table of size m D n using a hash function h
randomly chosen from a universal class of hash functions
...

Proof We start with the following identity, which holds for any nonnegative integer a:
!
a
:
(11
...
6))
nj C 2
D E
2
j D0
!#
"m 1
"m 1 #
X
X nj
(by linearity of expectation)
nj C 2 E
D E
2
j D0
j D0
!#
"m 1
X nj
(by equation (11
...
5 Perfect hashing

281

!#
"m 1
X nj
D n C 2E
2
j D0

(since n is not a random variable)
...
By the properties of universal hashing,
the expected value of this summation is at most
!
n
...
Thus,
"m 1 #
X
n
2
nj
Ä nC2
E
j D0

1
2

D 2n 1
< 2n :
Corollary 11
...
Then,
the expected amount of storage required for all secondary hash tables in a perfect
hashing scheme is less than 2n
...
10 gives

j D0

< 2n ;

(11
...

Corollary 11
...
Then, the
probability is less than 1=2 that the total storage used for secondary hash tables
equals or exceeds 4n
...
30), Pr fX
Pm 1
time to inequality (11
...
12, we see that if we test a few randomly chosen hash functions from the universal family, we will quickly find one that uses a reasonable
amount of storage
...
5-1 ?
Suppose that we insert n keys into a hash table of size m using open addressing
and uniform hashing
...
n; m/ be the probability that no collisions occur
...
n; m/ Ä e n
...
(Hint: See equation (3
...
) Argue that when n exp
ceeds m, the probability of avoiding collisions goes rapidly to zero
...

a
...

b
...
1=n2 / that the ith insertion
requires more than 2 lg n probes
...
You have shown in part (b) that Pr fXi > 2 lg ng D O
...
Let the random
variable X D max1Äi Än Xi denote the maximum number of probes required by
any of the n insertions
...
Show that Pr fX > 2 lg ng D O
...

d
...
lg n/
...
Each key is equally likely
to be hashed to each slot
...
Your mission is to prove an O
...

a
...
Let Pk be the probability that M D k, that is, the probability that the slot
containing the most keys contains k keys
...

c
...
18), to show that Qk < e k =k k
...
Show that there exists a constant c > 1 such that Qk0 < 1=n3 for k0 D
c lg n= lg lg n
...

e
...
lg n= lg lg n/
...
The search scheme is as follows:
1
...
k/, and set i D 0
...
Probe in position j for the desired key k
...

3
...
If i now equals m, the table is full, so terminate the search
...
i C j / mod m, and return to step 2
...

a
...
5)
...
Prove that this algorithm examines every table position in the worst case
...
We say that H is k-universal if, for every
fixed sequence of k distinct keys hx
...
2/ ; : : : ; x
...
x
...
x
...
x
...

a
...

b
...
Consider an element x D
hx0 ; x1 ; : : : ; xn 1 i 2 U
...
x/ D
j D0

Let H D fha g
...
(Hint: Find a key
for which all hash functions in H produce the same value
...
Suppose that we modify H slightly from part (b): for any a 2 U and for any
b 2 Zp , define
h0ab
...
Argue that H 0 is 2-universal
...
What happens to h0ab
...
y/ as ai and b range over Zp ?)
d
...
Each h 2 H maps from a universe of
keys U to Zp , where p is prime
...
She authenticates this message to Bob by also sending
an authentication tag t D h
...
m; t/ he receives
indeed satisfies t D h
...
Suppose that an adversary intercepts
...
m; t/ with a different pair
...

Argue that the probability that the adversary succeeds in fooling Bob into accepting
...


Notes for Chapter 11

285

Chapter notes
Knuth [211] and Gonnet [145] are excellent references for the analysis of hashing algorithms
...
P
...
At about the same time, G
...

Amdahl originated the idea of open addressing
...

Fredman, Koml´ s, and Szemer´ di [112] developed the perfect hashing scheme
o
e
for static sets presented in Section 11
...
An extension of their method to dynamic
sets, handling insertions and deletions in amortized expected time O
...
[86]
...
Thus, we can use a search tree both as a dictionary and as a priority
queue
...
For a complete binary tree with n nodes, such operations run in ‚
...
If the tree is a linear chain of n nodes, however, the same operations take ‚
...
We shall see in Section 12
...
lg n/, so that basic dynamic-set
operations on such a tree take ‚
...

In practice, we can’t always guarantee that binary search trees are built randomly, but we can design variations of binary search trees with good guaranteed
worst-case performance on basic operations
...
lg n/
...

After presenting the basic properties of binary search trees, the following sections show how to walk a binary search tree to print its values in sorted order, how
to search for a value in a binary search tree, how to find the minimum or maximum
element, how to find the predecessor or successor of an element, and how to insert
into or delete from a binary search tree
...


12
...
1
...
In addition to a key and satellite data, each node contains
attributes left, right, and p that point to the nodes corresponding to its left child,

12
...
1 Binary search trees
...
Different binary search trees can represent
the same set of values
...
(a) A binary search tree on 6 nodes with height 2
...


its right child, and its parent, respectively
...
The root node is the only node in the
tree whose parent is NIL
...
If y is a node in the left subtree
of x, then y:key Ä x:key
...

Thus, in Figure 12
...
The same property holds for every node in the tree
...

The binary-search-tree property allows us to print out all the keys in a binary
search tree in sorted order by a simple recursive algorithm, called an inorder tree
walk
...

(Similarly, a preorder tree walk prints the root before the values in either subtree,
and a postorder tree walk prints the root after the values in its subtrees
...
T:root/
...
x/
1 if x ¤ NIL
2
I NORDER -T REE -WALK
...
x:right/
As an example, the inorder tree walk prints the keys in each of the two binary
search trees from Figure 12
...
The correctness of the
algorithm follows by induction directly from the binary-search-tree property
...
n/ time to walk an n-node binary search tree, since after the initial call, the procedure calls itself recursively exactly twice for each node in the
tree—once for its left child and once for its right child
...

Theorem 12
...
x/
takes ‚
...

Proof Let T
...
Since I NORDER -T REE -WALK visits all n
nodes of the subtree, we have T
...
n/
...
n/ D O
...

Since I NORDER -T REE -WALK takes a small, constant amount of time on an
empty subtree (for the test x ¤ NIL ), we have T
...

For n > 0, suppose that I NORDER -T REE -WALK is called on a node x whose
left subtree has k nodes and whose right subtree has n k 1 nodes
...
x/ is bounded by T
...
k/CT
...
x/, exclusive of the time spent in recursive calls
...
n/ D O
...
n/ Ä
...
For n D 0, we have
...
0/
...
n/ Ä
D
D
D

T
...
n k 1/ C d

...
c C d /
...
c C d /n C c
...
c C d /n C c ;

which completes the proof
...
2 Querying a binary search tree

289

Exercises
12
...

12
...
n/ time? Show how, or explain why not
...
1-3
Give a nonrecursive algorithm that performs an inorder tree walk
...
A more complicated, but elegant, solution uses no stack but assumes that we can test two pointers for equality
...
1-4
Give recursive algorithms that perform preorder and postorder tree walks in ‚
...

12
...
n lg n/ time in the worst case in
the comparison model, any comparison-based algorithm for constructing a binary
search tree from an arbitrary list of n elements takes
...


12
...
Besides the
S EARCH operation, binary search trees can support such queries as M INIMUM,
M AXIMUM, S UCCESSOR, and P REDECESSOR
...
h/ on any binary
search tree of height h
...
Given a pointer to the root of the tree and a key k, T REE -S EARCH
returns a pointer to a node with key k if one exists; otherwise, it returns NIL
...
2 Queries on a binary search tree
...
The minimum key in the tree is 2, which is found by following
left pointers from the root
...

The successor of the node with key 15 is the node with key 17, since it is the minimum key in the
right subtree of 15
...
In this case, the node with key 15 is its successor
...
x; k/
1 if x == NIL or k == x:key
2
return x
3 if k < x:key
4
return T REE -S EARCH
...
x:right; k/
The procedure begins its search at the root and traces a simple path downward in
the tree, as shown in Figure 12
...
For each node x it encounters, it compares the
key k with x:key
...
If k is smaller
than x:key, the search continues in the left subtree of x, since the binary-searchtree property implies that k could not be stored in the right subtree
...
The nodes
encountered during the recursion form a simple path downward from the root of
the tree, and thus the running time of T REE -S EARCH is O
...

We can rewrite this procedure in an iterative fashion by “unrolling” the recursion
into a while loop
...


12
...
x; k/
1 while x ¤ NIL and k ¤ x:key
2
if k < x:key
3
x D x:left
4
else x D x:right
5 return x

Minimum and maximum
We can always find an element in a binary search tree whose key is a minimum by
following left child pointers from the root until we encounter a NIL, as shown in
Figure 12
...
The following procedure returns a pointer to the minimum element in
the subtree rooted at a given node x, which we assume to be non-NIL:
T REE -M INIMUM
...
If a
node x has no left subtree, then since every key in the right subtree of x is at least as
large as x:key, the minimum key in the subtree rooted at x is x:key
...

The pseudocode for T REE -M AXIMUM is symmetric:
T REE -M AXIMUM
...
h/ time on a tree of height h since, as in T REE S EARCH, the sequence of nodes encountered forms a simple path downward from
the root
...
If all keys are distinct, the

292

Chapter 12 Binary Search Trees

successor of a node x is the node with the smallest key greater than x:key
...
The following procedure returns the successor of a
node x in a binary search tree if it exists, and NIL if x has the largest key in the
tree:
T REE -S UCCESSOR
...
x:right/
3 y D x:p
4 while y ¤ NIL and x == y:right
5
x Dy
6
y D y:p
7 return y
We break the code for T REE -S UCCESSOR into two cases
...
x:right/
...
2 is the node with
key 17
...
2-6 asks you to show, if the right subtree of
node x is empty and x has a successor y, then y is the lowest ancestor of x whose
left child is also an ancestor of x
...
2, the successor of the node with
key 13 is the node with key 15
...

The running time of T REE -S UCCESSOR on a tree of height h is O
...
The
procedure T REE -P REDECESSOR, which is symmetric to T REE -S UCCESSOR, also
runs in time O
...

Even if keys are not distinct, we define the successor and predecessor of any
node x as the node returned by calls made to T REE -S UCCESSOR
...
x/, respectively
...

Theorem 12
...
h/ time on a binary
search tree of height h
...
2 Querying a binary search tree

293

Exercises
12
...
Which of the following sequences could not be
the sequence of nodes examined?
a
...

b
...

c
...

d
...

e
...

12
...

12
...

12
...
Suppose that the search for key k in a binary search tree ends up in a leaf
...
Professor Bunyan
claims that any three keys a 2 A, b 2 B, and c 2 C must satisfy a Ä b Ä c
...

12
...

12
...
Show that if the right
subtree of a node x in T is empty and x has a successor y, then y is the lowest
ancestor of x whose left child is also an ancestor of x
...
)
12
...
Prove that this algorithm runs
in ‚
...


294

Chapter 12 Binary Search Trees

12
...
k C h/ time
...
2-9
Let T be a binary search tree whose keys are distinct, let x be a leaf node, and let y
be its parent
...


12
...
The data structure must be modified to reflect this
change, but in such a way that the binary-search-tree property continues to hold
...

Insertion
To insert a new value into a binary search tree T , we use the procedure T REE I NSERT
...
It modifies T and some of the attributes of ´ in such a way that
it inserts ´ into an appropriate position in the tree
...
T; ´/
1 y D NIL
2 x D T:root
3 while x ¤ NIL
4
y Dx
5
if ´:key < x:key
6
x D x:left
7
else x D x:right
8 ´:p D y
9 if y == NIL
10
T:root D ´
/ tree T was empty
/
11 elseif ´:key < y:key
12
y:left D ´
13 else y:right D ´

12
...
3 Inserting an item with key 13 into a binary search tree
...
The dashed line
indicates the link in the tree that is added to insert the item
...
3 shows how T REE -I NSERT works
...
The procedure maintains the trailing pointer y as the parent
of x
...
This NIL occupies the position where we wish to
place the input item ´
...
Lines 8–13 set the pointers that cause ´ to be inserted
...
h/ time on a tree of height h
...

If ´ has no children, then we simply remove it by modifying its parent to replace ´ with NIL as its child
...

If ´ has two children, then we find ´’s successor y—which must be in ´’s right
subtree—and have y take ´’s position in the tree
...
This case is the tricky one because, as we shall see, it matters
whether y is ´’s right child
...
It organizes its cases a bit differently from the three
cases outlined previously by considering the four cases shown in Figure 12
...

If ´ has no left child (part (a) of the figure), then we replace ´ by its right child,
which may or may not be NIL
...
When ´’s right child is non-NIL, this
case handles the situation in which ´ has just one child, which is its right child
...

Otherwise, ´ has both a left and a right child
...
2-5)
...

If y is ´’s right child (part (c)), then we replace ´ by y, leaving y’s right
child alone
...

In this case, we first replace y by its own right child, and then we replace ´
by y
...
When T RANSPLANT replaces the subtree rooted at node u with
the subtree rooted at node , node u’s parent becomes node ’s parent, and u’s
parent ends up having as its appropriate child
...
T; u; /
1 if u:p == NIL
2
T:root D
3 elseif u == u:p:left
4
u:p:left D
5 else u:p:right D
6 if ¤ NIL
7
:p D u:p
Lines 1–2 handle the case in which u is the root of T
...
Lines 3–4 take care of updating u:p:left if u
is a left child, and line 5 updates u:p:right if u is a right child
...
Note that T RANSPLANT does not
attempt to update :left and :right; doing so, or not doing so, is the responsibility
of T RANSPLANT’s caller
...
3 Insertion and deletion

297

q

q

(a)

z

r
r

NIL

q

q

(b)

l

z
l

NIL

q

q

(c)

z
l

y
y

l
x

NIL

q

q

(d)

z
l

q
z

r

l

y
NIL

x

y
NIL

x

y
r

l

r

x

x

Figure 12
...
Node ´ may be the root, a left child of
node q, or a right child of q
...
We replace ´ by its right child r, which
may or may not be NIL
...
We replace ´ by l
...

We replace ´ by y, updating y’s left child to become l, but leaving x as y’s right child
...
We replace y by its own right child x, and we set y to be r’s parent
...


298

Chapter 12 Binary Search Trees

With the T RANSPLANT procedure in hand, here is the procedure that deletes
node ´ from binary search tree T :
T REE -D ELETE
...
T; ´; ´:right/
3 elseif ´:right == NIL
4
T RANSPLANT
...
´:right/
6
if y:p ¤ ´
7
T RANSPLANT
...
T; ´; y/
11
y:left D ´:left
12
y:left:p D y
The T REE -D ELETE procedure executes the four cases as follows
...
Lines 5–12 deal with the remaining two
cases, in which ´ has two children
...
Because ´ has a nonempty right subtree, its successor must be the node in
that subtree with the smallest key; hence the call to T REE -M INIMUM
...
As
we noted before, y has no left child
...
If y is ´’s right child, then lines 10–12 replace ´
as a child of its parent by y and replace y’s left child by ´’s left child
...

Each line of T REE -D ELETE, including the calls to T RANSPLANT, takes constant
time, except for the call to T REE -M INIMUM in line 5
...
h/ time on a tree of height h
...

Theorem 12
...
h/ time on a binary search tree of height h
...
4 Randomly built binary search trees

299

Exercises
12
...

12
...
Argue that the number of nodes examined in searching for a
value in the tree is one plus the number of nodes examined when the value was
first inserted into the tree
...
3-3
We can sort a given set of n numbers by first building a binary search tree containing these numbers (using T REE -I NSERT repeatedly to insert the numbers one by
one) and then printing the numbers by an inorder tree walk
...
3-4
Is the operation of deletion “commutative” in the sense that deleting x and then y
from a binary search tree leaves the same tree as deleting y and then x? Argue why
it is or give a counterexample
...
3-5
Suppose that instead of each node x keeping the attribute x:p, pointing to x’s
parent, it keeps x:succ, pointing to x’s successor
...
These
procedures should operate in time O
...
(Hint:
You may wish to implement a subroutine that returns the parent of a node
...
3-6
When node ´ in T REE -D ELETE has two children, we could choose node y as
its predecessor rather than its successor
...

How might T REE -D ELETE be changed to implement such a fair strategy?

? 12
...
h/ time, where h is the height of the tree
...
If, for example, the n items
are inserted in strictly increasing order, the tree will be a chain with height n 1
...
5-4 shows that h
blg nc
...

Unfortunately, little is known about the average height of a binary search tree
when both insertion and deletion are used to create it
...
Let us therefore define a
randomly built binary search tree on n keys as one that arises from inserting the
keys in random order into an initially empty tree, where each of the nŠ permutations
of the input keys is equally likely
...
4-3 asks you to show that this notion
is different from assuming that every binary search tree on n keys is equally likely
...

Theorem 12
...
lg n/
...
We denote the height of a randomly built
binary search on n keys by Xn , and we define the exponential height Yn D 2Xn
...
The value of Rn is equally likely to be any element of the
set f1; 2; : : : ; ng
...
Because the height of a binary tree is 1 more than the
larger of the heights of the two subtrees of the root, the exponential height of a
binary tree is twice the larger of the exponential heights of the two subtrees of the
root
...
Yi 1 ; Yn i / :
As base cases, we have that Y1 D 1, because the exponential height of a tree with 1
node is 20 D 1 and, for convenience, we define Y0 D 0
...
1, we have
E ŒZn;i  D 1=n ;

(12
...
4 Randomly built binary search trees

301

for i D 1; 2; : : : ; n
...
2 max
...
lg n/
...
Having chosen Rn D i, the left subtree (whose
exponential height is Yi 1 ) is randomly built on the i 1 keys whose ranks are
less than i
...
Other than the number of keys it contains, this subtree’s structure
is not affected at all by the choice of Rn D i, and hence the random variables
Yi 1 and Zn;i are independent
...

Its structure is independent of the value of Rn , and so the random variables Yn i
and Zn;i are independent
...
2 max
...
2 max
...
Yi 1 ; Yn i / (by independence)

i D1
n
X1
E Œ2 max
...
1))

D

2X
E Œmax
...
22))

Ä

2X

...
3-4)
...
2)

302

Chapter 12 Binary Search Trees

Using the substitution method, we shall show that for all positive integers n, the
recurrence (12
...
3)

(Exercise 12
...
)
For the base cases, we note that the bounds 0 D Y0 D E ŒY0  Ä
...
1=4/ 1C3 D 1 hold
...
3))

1
...
n 1/Š
1
...
As Exercise 12
...
x/ D 2x is convex (see page 1199)
...
26), which says that
2EŒXn 

Ä E 2Xn
D E ŒYn  ;

as follows:
2EŒXn 

Ä

1 nC3
4
3

!

Problems for Chapter 12

303

1
...
n C 2/
...
lg n/
...
4-1
Prove equation (12
...

12
...
lg n/ but the height of the tree is !
...
Give an asymptotic upper
bound on the height of an n-node binary search tree in which the average depth of
a node is ‚
...

12
...
(Hint: List
the possibilities when n D 3
...
4-4
Show that the function f
...

12
...
Prove that for any constant k > 0, all but O
...
n lg n/ running time
...

a
...


304

Chapter 12 Binary Search Trees

If equality holds, we implement one of the following strategies
...
(The strategies are described for line 5, in which
we compare the keys of ´ and x
...
)
b
...

c
...

d
...
(Give the worst-case performance
and informally derive the expected running time
...
there exists an integer j , where 0 Ä j Ä min
...
p < q and ai D bi for all i D 0; 1; : : : ; p
...
This ordering is similar to that used in
English-language dictionaries
...
5 stores the bit strings 1011,
10, 011, 100, and 0
...
Let S be a set of distinct bit strings
whose lengths sum to n
...
n/ time
...
5, the output of the sort should be the
sequence 0, 011, 10, 100, 1011
...
lg n/
...
4, the technique we shall use reveals a surprising similarity
between the building of a binary search tree and the execution of R ANDOMIZED Q UICKSORT from Section 7
...

We define the total path length P
...
x; T /
...
5 A radix tree storing the bit strings 1011, 10, 011, 100, and 0
...
There is no need, therefore, to
store the keys in the nodes; the keys appear here for illustrative purposes only
...


a
...
x; T / D P
...
T / is O
...

b
...
Argue
that if T has n nodes, then
P
...
TL / C P
...
Let P
...
Show that
1X

...
i/ C P
...
n/ D
n i D0
n 1

i

1/ C n

1/ :

d
...
n/ as
2X
P
...
n/ :
P
...
Recalling the alternative analysis of the randomized version of quicksort given
in Problem 7-3, conclude that P
...
n lg n/
...
Each node of a binary search tree partitions the set of elements that fall into the subtree rooted at that node
...
Describe an implementation of quicksort in which the comparisons to sort a set
of elements are exactly the same as the comparisons to insert the elements into
a binary search tree
...
)
12-4 Number of different binary trees
Let bn denote the number of different binary trees with n nodes
...

a
...
Referring to Problem 4-4 for the definition of a generating function, let B
...
x/ D

1
X

bn x n :

nD0

Show that B
...
x/2 C 1, and hence one way to express B
...
x/ D

p
1

1
1
2x

4x :

The Taylor expansion of f
...
x/ D

1
X f
...
a/
kD0




...
k/
...

c
...
(If you wish, instead of using the Taylor expansion, you may use
the generalization of the binomial expansion (C
...
n 1/
...
)
d
...
1 C O
...
Binary search trees seem to have been independently discovered
by a number of people in the late 1950s
...
Knuth [211] also discusses
them
...
Instead of replacing node ´ by its successor y, we delete node y but
copy its key and satellite data into node ´
...
If
other components of a program maintain pointers to nodes in the tree, they could
mistakenly end up with “stale” pointers to nodes that have been deleted
...

Section 15
...
That is, given the
frequencies of searching for each key and the frequencies of searching for values
that fall between keys in the tree, we construct a binary search tree for which a
set of searches that follows these frequencies examines the minimum number of
nodes
...
4 that bounds the expected height of a randomly built
binary search tree is due to Aslam [24]
...
Their definition of a
random binary search tree differs—only slightly—from that of a randomly built
binary search tree in this chapter, however
...
h/ time
...
If its height is large, however, the
set operations may run no faster than with a linked list
...
lg n/ time in the worst case
...
1 Properties of red-black trees
A red-black tree is a binary search tree with one extra bit of storage per node: its
color, which can be either RED or BLACK
...

Each node of the tree now contains the attributes color, key, left, right, and p
...
We shall regard these NILs as being pointers to
leaves (external nodes) of the binary search tree and the normal, key-bearing nodes
as being internal nodes of the tree
...
Every node is either red or black
...
The root is black
...
Every leaf (NIL) is black
...
If a node is red, then both its children are black
...
For each node, all simple paths from the node to descendant leaves contain the
same number of black nodes
...
1 Properties of red-black trees

309

Figure 13
...

As a matter of convenience in dealing with boundary conditions in red-black
tree code, we use a single sentinel to represent NIL (see page 238)
...
Its color attribute is BLACK, and its other attributes—p, left, right,
and key—can take on arbitrary values
...
1(b) shows, all pointers to NIL
are replaced by pointers to the sentinel T:nil
...
Although we instead could add a distinct sentinel node
for each NIL in the tree, so that the parent of each NIL is well defined, that approach would waste space
...
The values of the attributes p, left, right,
and key of the sentinel are immaterial, although we may set them during the course
of a procedure for our convenience
...
In the remainder of this chapter, we omit the leaves when
we draw red-black trees, as shown in Figure 13
...

We call the number of black nodes on any simple path from, but not including, a
node x down to a leaf the black-height of the node, denoted bh
...
By property 5,
the notion of black-height is well defined, since all descending simple paths from
the node have the same number of black nodes
...

The following lemma shows why red-black trees make good search trees
...
1
A red-black tree with n internal nodes has height at most 2 lg
...

Proof We start by showing that the subtree rooted at any node x contains at least
2bh
...
We prove this claim by induction on the height of x
...
x/ 1 D 20 1 D 0 internal nodes
...

Each child has a black-height of either bh
...
x/ 1, depending on whether
its color is red or black, respectively
...
x/ 1 1 internal nodes
...
2bh
...
2bh
...
x/ 1 internal nodes, which proves
the claim
...
According
to property 4, at least half the nodes on any simple path from the root to a leaf, not

310

Chapter 13 Red-Black Trees

3
3
2
2
1
1

7

3

NIL

1
NIL

12

1

NIL

21

2
1

NIL

41

17

14

10

16

15

NIL

26

1

NIL

19

NIL

NIL

2
1

1

20

NIL

23

NIL

1

NIL

30

1

28

NIL

NIL

NIL

1
1

38

35

1

NIL

NIL

2

NIL

47

NIL

NIL

39

NIL

NIL

(a)

26
41

17
14

21
16

10
7

12

19

15

30
23

47

28

38

20

35

39

3

T:nil
(b)
26
17

41

14

21

10
7
3

16
12

15

19

30
23

47

28

20

38
35

39

(c)

Figure 13
...
Every node in a
red-black tree is either red or black, the children of a red node are both black, and every simple path
from a node to a descendant leaf contains the same number of black nodes
...
Each non-NIL node is marked with its black-height; NIL s have black-height 0
...
The root’s parent is also the sentinel
...
We shall use this drawing style in the
remainder of this chapter
...
1 Properties of red-black trees

311

including the root, must be black
...
n C 1/ h=2, or h Ä 2 lg
...

As an immediate consequence of this lemma, we can implement the dynamic-set
operations S EARCH, M INIMUM, M AXIMUM, S UCCESSOR, and P REDECESSOR
in O
...
h/ time on a binary
search tree of height h (as shown in Chapter 12) and any red-black tree on n nodes
is a binary search tree with height O
...
(Of course, references to NIL in the
algorithms of Chapter 12 would have to be replaced by T:nil
...
lg n/ time
when given a red-black tree as input, they do not directly support the dynamic-set
operations I NSERT and D ELETE, since they do not guarantee that the modified binary search tree will be a red-black tree
...
3 and 13
...
lg n/ time
...
1-1
In the style of Figure 13
...
Add the NIL leaves and color the nodes in three different
ways such that the black-heights of the resulting red-black trees are 2, 3, and 4
...
1-2
Draw the red-black tree that results after T REE -I NSERT is called on the tree in
Figure 13
...
If the inserted node is colored red, is the resulting tree a
red-black tree? What if it is colored black?
13
...
In other words, the root may be either red or black
...
If we color the root of T
black but make no other changes to T , is the resulting tree a red-black tree?
13
...
(Ignore
what happens to the keys
...
1-5
Show that the longest simple path from a node x in a red-black tree to a descendant
leaf has length at most twice that of the shortest simple path from node x to a
descendant leaf
...
1-6
What is the largest possible number of internal nodes in a red-black tree with blackheight k? What is the smallest possible number?
13
...
What is this ratio? What tree has the smallest
possible ratio, and what is the ratio?

13
...
lg n/ time
...
1
...

We change the pointer structure through rotation, which is a local operation in
a search tree that preserves the binary-search-tree property
...
2 shows the
two kinds of rotations: left rotations and right rotations
...
The left rotation “pivots” around the link
from x to y
...

The pseudocode for L EFT-ROTATE assumes that x:right ¤ T:nil and that the
root’s parent is T:nil
...
2 Rotations

313

LEFT-ROTATE(T, x)
y

γ

x

α

x

β

RIGHT-ROTATE(T, y)

α

y

β

γ

Figure 13
...
The operation L EFT-ROTATE
...
The inverse operation R IGHT-ROTATE
...
The letters ˛, ˇ, and represent arbitrary
subtrees
...


L EFT-ROTATE
...
3 shows an example of how L EFT-ROTATE modifies a binary search
tree
...
Both L EFT-ROTATE and R IGHTROTATE run in O
...
Only pointers are changed by a rotation; all other
attributes in a node remain the same
...
2-1
Write pseudocode for R IGHT-ROTATE
...
2-2
Argue that in every n-node binary search tree, there are exactly n
rotations
...
3 An example of how the procedure L EFT-ROTATE
...

Inorder tree walks of the input tree and the modified tree produce the same listing of key values
...
2-3
Let a, b, and c be arbitrary nodes in subtrees ˛, ˇ, and , respectively, in the left
tree of Figure 13
...
How do the depths of a, b, and c change when a left rotation
is performed on node x in the figure?
13
...
n/ rotations
...
)
13
...
Give
an example of two trees T1 and T2 such that T1 cannot be right-converted to T2
...
n2 / calls to R IGHT-ROTATE
...
3 Insertion

315

13
...
lg n/ time
...
3) to
insert node ´ into the tree T as if it were an ordinary binary search tree, and then we
color ´ red
...
3-1 asks you to explain why we choose to make node ´
red rather than black
...
The call RB-I NSERT
...

RB-I NSERT
...
T; ´/
The procedures T REE -I NSERT and RB-I NSERT differ in four ways
...
Second, we set ´:left
and ´:right to T:nil in lines 14–15 of RB-I NSERT, in order to maintain the
proper tree structure
...
Fourth, because coloring ´ red may cause a violation of one of the red-black properties, we call
RB-I NSERT-F IXUP
...


316

Chapter 13 Red-Black Trees

RB-I NSERT-F IXUP
...
T; ´/
12
´:p:color D BLACK
13
´:p:p:color D RED
14
R IGHT-ROTATE
...
First, we shall determine what violations of
the red-black properties are introduced in RB-I NSERT when node ´ is inserted
and colored red
...
Finally, we shall explore each of the three cases1 within the while
loop’s body and see how they accomplish the goal
...
4 shows how RBI NSERT-F IXUP operates on a sample red-black tree
...
Property 5,
which says that the number of black nodes is the same on every simple path from
a given node, is satisfied as well, because node ´ replaces the (black) sentinel, and
node ´ is red with sentinel children
...
Both possible violations are due to ´
being colored red
...
Figure 13
...

1 Case

2 falls through into case 3, and so these two cases are not mutually exclusive
...
3 Insertion

317

11
2
(a)

14

1

7

15

5
z

8 y

4

Case 1

11
2
(b)

14 y

1

7
5

z

15
8

4

Case 2

11
7
(c)

z

14 y

2

8

1

15

5
Case 3
4
7
z

(d)

2

11

1

5
4

8

14
15

Figure 13
...
(a) A node ´ after insertion
...
Since ´’s uncle y is red, case 1 in the
code applies
...

Once again, ´ and its parent are both red, but ´’s uncle y is black
...
We perform a left rotation, and the tree that results is shown in (c)
...
Recoloring and right rotation yield the tree in (d), which is a
legal red-black tree
...
Node ´ is red
...
If ´:p is the root, then ´:p is black
...
If the tree violates any of the red-black properties, then it violates at most
one of them, and the violation is of either property 2 or property 4
...
If the tree
violates property 4, it is because both ´ and ´:p are red
...
Because
we’ll be focusing on node ´ and nodes near it in the tree, it helps to know from
part (a) that ´ is red
...

Recall that we need to show that a loop invariant is true prior to the first iteration of the loop, that each iteration maintains the loop invariant, and that the loop
invariant gives us a useful property at loop termination
...
Then, as we examine how the body of the loop works in more detail, we shall argue that the loop
maintains the invariant upon each iteration
...

Initialization: Prior to the first iteration of the loop, we started with a red-black
tree with no violations, and we added a red node ´
...
When RB-I NSERT-F IXUP is called, ´ is the red node that was added
...
If ´:p is the root, then ´:p started out black and did not change prior to the
call of RB-I NSERT-F IXUP
...
We have already seen that properties 1, 3, and 5 hold when RB-I NSERTF IXUP is called
...
Because the parent and
both children of ´ are the sentinel, which is black, the tree does not also
violate property 4
...

If the tree violates property 4, then, because the children of node ´ are black
sentinels and the tree had no other violations prior to ´ being added, the

13
...
Moreover, the tree violates
no other red-black properties
...
(If ´ is
the root, then ´:p is the sentinel T:nil, which is black
...
By the loop invariant, the only property
that might fail to hold is property 2
...

Maintenance: We actually need to consider six cases in the while loop, but three
of them are symmetric to the other three, depending on whether line 2 determines ´’s parent ´:p to be a left child or a right child of ´’s grandparent ´:p:p
...
The
node ´:p:p exists, since by part (b) of the loop invariant, if ´:p is the root,
then ´:p is black
...
Hence, ´:p:p exists
...
” Line 3 makes y point to ´’s uncle ´:p:p:right, and line 4 tests y’s
color
...
Otherwise, control passes to cases 2
and 3
...

Case 1: ´’s uncle y is red
Figure 13
...
Because ´:p:p is black, we can color both ´:p and y
black, thereby fixing the problem of ´ and ´:p both being red, and we can
color ´:p:p red, thereby maintaining property 5
...
The pointer ´ moves up two levels in the tree
...
We use ´ to denote node ´ in the current iteration, and ´0 D ´:p:p
to denote the node that will be called node ´ at the test in line 1 upon the next
iteration
...
Because this iteration colors ´:p:p red, node ´0 is red at the start of the next
iteration
...
The node ´0 :p is ´:p:p:p in this iteration, and the color of this node does not
change
...

c
...


320

Chapter 13 Red-Black Trees

new z

C
(a)

A

D y

α

δ

B z

β

A

ε

D

α

γ

β

α

γ

A

β

δ

C

B

D y

ε
α

D

γ

A

ε

γ

new z

B
z

δ

B

C
(b)

C

δ

ε

β

Figure 13
...
Property 4 is violated, since ´ and its
parent ´: p are both red
...
Each of the subtrees ˛, ˇ, , ı, and " has a black root, and each has the same black-height
...
The while loop continues with node ´’s
grandparent ´: p: p as the new ´
...


If node ´0 is the root at the start of the next iteration, then case 1 corrected
the lone violation of property 4 in this iteration
...

If node ´0 is not the root at the start of the next iteration, then case 1 has
not created a violation of property 2
...
It then made ´0 red
and left ´0 :p alone
...

If ´0 :p was red, coloring ´0 red created one violation of property 4 between ´0
and ´0 :p
...
We distinguish the two cases
according to whether ´ is a right or left child of ´:p
...
6 together with case 3
...
We immediately use a left rotation to transform
the situation into case 3 (lines 12–14), in which node ´ is a left child
...
3 Insertion

321

C

C

δ y

A

α

z

γ

α

β
Case 2

δ y

B

B z

γ

A

B
z

α

A

C

β

γ

δ

β
Case 3

Figure 13
...
As in case 1, property 4 is violated
in either case 2 or case 3 because ´ and its parent ´: p are both red
...
We transform case 2 into case 3 by a left rotation, which preserves
property 5: all downward simple paths from a node to a leaf have the same number of blacks
...
The while loop then
terminates, because property 4 is satisfied: there are no longer two red nodes in a row
...
Whether we enter case 3 directly or through case 2, ´’s uncle y
is black, since otherwise we would have executed case 1
...
In case 3,
we execute some color changes and a right rotation, which preserve property 5,
and then, since we no longer have two red nodes in a row, we are done
...

We now show that cases 2 and 3 maintain the loop invariant
...
)
a
...
No further change to ´ or its color
occurs in cases 2 and 3
...
Case 3 makes ´:p black, so that if ´:p is the root at the start of the next
iteration, it is black
...
As in case 1, properties 1, 3, and 5 are maintained in cases 2 and 3
...
Cases 2 and 3 do not introduce a violation of property 2,
since the only node that is made red becomes a child of a black node by the
rotation in case 3
...


322

Chapter 13 Red-Black Trees

Having shown that each iteration of the loop maintains the invariant, we have
shown that RB-I NSERT-F IXUP correctly restores the red-black properties
...
lg n/, lines 1–16 of RB-I NSERT take O
...
In RB-I NSERTF IXUP, the while loop repeats only if case 1 occurs, and then the pointer ´ moves
two levels up the tree
...
lg n/
...
lg n/ time
...

Exercises
13
...

Observe that if we had chosen to set ´’s color to black, then property 4 of a redblack tree would not be violated
...
3-2
Show the red-black trees that result after successively inserting the keys 41; 38; 31;
12; 19; 8 into an initially empty red-black tree
...
3-3
Suppose that the black-height of each of the subtrees ˛; ˇ; ; ı; " in Figures 13
...
6 is k
...

13
...
Show that the professor’s concern is unfounded by arguing that RBI NSERT-F IXUP never sets T:nil:color to RED
...
3-5
Consider a red-black tree formed by inserting n nodes with RB-I NSERT
...

13
...


13
...
4 Deletion
Like the other basic operations on an n-node red-black tree, deletion of a node takes
time O
...
Deleting a node from a red-black tree is a bit more complicated than
inserting a node
...
3)
...
T; u; /
1 if u:p == T:nil
2
T:root D
3 elseif u == u:p:left
4
u:p:left D
5 else u:p:right D
6
:p D u:p
The procedure RB-T RANSPLANT differs from T RANSPLANT in two ways
...
Second, the assignment to :p in
line 6 occurs unconditionally: we can assign to :p even if points to the sentinel
...

The procedure RB-D ELETE is like the T REE -D ELETE procedure, but with additional lines of pseudocode
...
When we want to delete
node ´ and ´ has fewer than two children, then ´ is removed from the tree, and we
want y to be ´
...
We also remember y’s color before it is removed from or moved within the tree, and we keep track of the node x that moves
into y’s original position in the tree, because node x might also cause violations
of the red-black properties
...


324

Chapter 13 Red-Black Trees

RB-D ELETE
...
T; ´; ´:right/
6 elseif ´:right == T:nil
7
x D ´:left
8
RB-T RANSPLANT
...
´:right/
10
y-original-color D y:color
11
x D y:right
12
if y:p == ´
13
x:p D y
14
else RB-T RANSPLANT
...
T; ´; y/
18
y:left D ´:left
19
y:left:p D y
20
y:color D ´:color
21 if y-original-color == BLACK
22
RB-D ELETE -F IXUP
...
You can find
each line of T REE -D ELETE within RB-D ELETE (with the changes of replacing
NIL by T:nil and replacing calls to T RANSPLANT by calls to RB-T RANSPLANT),
executed under the same conditions
...
Line 1 sets y to point to node ´ when ´ has fewer than two children
and is therefore removed
...

Because node y’s color might change, the variable y-original-color stores y’s
color before any changes occur
...
When ´ has two children, then y ¤ ´ and node y
moves into node ´’s original position in the red-black tree; line 20 gives y the
same color as ´
...
4 Deletion

325

end of RB-D ELETE; if it was black, then removing or moving y could cause
violations of the red-black properties
...
The assignments in lines 4, 7, and 11 set x to point to either y’s only
child or, if y has no children, the sentinel T:nil
...
3
that y has no left child
...
Unless ´ is y’s original parent (which occurs only when ´ has
two children and its successor y is ´’s right child), the assignment to x:p takes
place in line 6 of RB-T RANSPLANT
...
)
When y’s original parent is ´, however, we do not want x:p to point to y’s original parent, since we are removing that node from the tree
...

Finally, if node y was black, we might have introduced one or more violations
of the red-black properties, and so we call RB-D ELETE -F IXUP in line 22 to
restore the red-black properties
...
No black-heights in the tree have changed
...
No red nodes have been made adjacent
...
In addition, if y was not ´’s right child, then y’s original
right child x replaces y in the tree
...

3
...

If node y was black, three problems may arise, which the call of RB-D ELETE F IXUP will remedy
...
Second, if both x and x:p are red, then
we have violated property 4
...
Thus, property 5
is now violated by any ancestor of y in the tree
...
That is, if we add 1 to the count of black nodes on any simple path
that contains x, then under this interpretation, property 5 holds
...
The problem is
that now node x is neither red nor black, thereby violating property 1
...
The color
attribute of x will still be either RED (if x is red-and-black) or BLACK (if x is
doubly black)
...

We can now see the procedure RB-D ELETE -F IXUP and examine how it restores
the red-black properties to the search tree
...
T; x/
1 while x ¤ T:root and x:color == BLACK
2
if x == x:p:left
3
w D x:p:right
4
if w:color == RED
5
w:color D BLACK
6
x:p:color D RED
7
L EFT-ROTATE
...
T; w/
16
w D x:p:right
17
w:color D x:p:color
18
x:p:color D BLACK
19
w:right:color D BLACK
20
L EFT-ROTATE
...
Exercises
13
...
4-2 ask you to show that the procedure restores properties 2 and 4,
and so in the remainder of this section, we shall focus on property 1
...
x points to a red-and-black node, in which case we color x (singly) black in
line 23;
2
...
having performed suitable rotations and recolorings, we exit the loop
...
4 Deletion

327

Within the while loop, x always points to a nonroot doubly black node
...
(We
have given the code for the situation in which x is a left child; the situation in
which x is a right child—line 22—is symmetric
...
Since node x is doubly black, node w cannot be T:nil, because
otherwise, the number of blacks on the simple path from x:p to the (singly black)
leaf w would be smaller than the number on the simple path from x:p to x
...
7
...
The key idea is that in each case, the
transformation applied preserves the number of black nodes (including x’s extra
black) from (and including) the root of the subtree shown to each of the subtrees
˛; ˇ; : : : ;
...
For example, in Figure 13
...
(Again, remember that node x adds an extra black
...
In Figure 13
...
If we define count
...
BLACK / D 1, then the
number of black nodes from the root to ˛ is 2 C count
...
In this case, after the transformation, the new node x has color
attribute c, but this node is really either red-and-black (if c D RED ) or doubly black
(if c D BLACK )
...
4-5)
...
7(a)) occurs when node w,
the sibling of node x, is red
...
The new sibling of x, which is one of w’s children
prior to the rotation, is now black, and thus we have converted case 1 into case 2,
3, or 4
...

2 As

in RB-I NSERT-F IXUP, the cases in RB-D ELETE -F IXUP are not mutually exclusive
...
7(b)), both of w’s
children are black
...
To compensate for removing
one black from x and w, we would like to add an extra black to x:p, which was
originally either red or black
...
Observe that if we enter case 2 through case 1, the new node x
is red-and-black, since the original x:p was red
...
We then color the new node x (singly) black in line 23
...
7(c)) occurs when w is black, its left child
is red, and its right child is black
...
The new sibling w of x is now a black node with a red right
child, and thus we have transformed case 3 into case 4
...
7(d)) occurs when node x’s sibling w is black
and w’s right child is red
...
Setting x to be the root causes the while
loop to terminate when it tests the loop condition
...
lg n/, the total cost of the procedure without the call to RB-D ELETE F IXUP takes O
...
Within RB-D ELETE -F IXUP, each of cases 1, 3, and 4
lead to termination after performing a constant number of color changes and at
most three rotations
...
lg n/ times, performing
no rotations
...
lg n/ time and performs at most three rotations, and the overall time for RB-D ELETE is therefore
also O
...


13
...
7 The cases in the while loop of the procedure RB-D ELETE -F IXUP
...
The letters
˛; ˇ; : : : ; represent arbitrary subtrees
...
Any node pointed
to by x has an extra black and is either doubly black or red-and-black
...
(a) Case 1 is transformed to case 2, 3, or 4 by exchanging the colors of nodes B and D and
performing a left rotation
...
If we enter case 2 through case 1, the
while loop terminates because the new node x is red-and-black, and therefore the value c of its color
attribute is RED
...
(d) Case 4 removes the extra black represented by x by changing some
colors and performing a left rotation (without violating the red-black properties), and then the loop
terminates
...
4-1
Argue that after executing RB-D ELETE -F IXUP, the root of the tree must be black
...
4-2
Argue that if in RB-D ELETE both x and x:p are red, then property 4 is restored by
the call to RB-D ELETE -F IXUP
...

13
...
3-2, you found the red-black tree that results from successively
inserting the keys 41; 38; 31; 12; 19; 8 into an initially empty tree
...

13
...
4-5
In each of the cases of Figure 13
...
When a node has a color attribute c
or c 0 , use the notation count
...
c 0 / symbolically in your count
...
4-6
Professors Skelton and Baron are concerned that at the start of case 1 of RBD ELETE -F IXUP, the node x:p might not be black
...
Show that x:p must be black at the start of case 1, so that
the professors have nothing to worry about
...
4-7
Suppose that a node x is inserted into a red-black tree with RB-I NSERT and then
is immediately deleted with RB-D ELETE
...


Problems for Chapter 13

331

Problems
13-1 Persistent dynamic sets
During the course of an algorithm, we sometimes find that we need to maintain past
versions of a dynamic set as it is updated
...
One way to
implement a persistent set is to copy the entire set whenever it is modified, but this
approach can slow down a program and also consume much space
...

Consider a persistent set S with the operations I NSERT, D ELETE, and S EARCH,
which we implement using binary search trees as shown in Figure 13
...
We
maintain a separate root for every version of the set
...
This node becomes the left child
of a new node with key 7, since we cannot modify the existing node with key 7
...
The new node with key 8
becomes, in turn, the right child of a new root r 0 with key 4 whose left child is the
existing node with key 3
...
8(b)
...

(See also Exercise 13
...
)

4

r

3

r

8

2

7

4

4

3

10

r′

8

2

7

7

8

10

5
(a)

(b)

Figure 13
...
(b) The persistent binary search
tree that results from the insertion of key 5
...
Heavily
shaded nodes are added when key 5 is inserted
...
For a general persistent binary search tree, identify the nodes that we need to
change to insert a key k or delete a node y
...
Write a procedure P ERSISTENT-T REE -I NSERT that, given a persistent tree T
and a key k to insert, returns a new persistent tree T 0 that is the result of inserting k into T
...
If the height of the persistent binary search tree T is h, what are the time and
space requirements of your implementation of P ERSISTENT-T REE -I NSERT?
(The space requirement is proportional to the number of new nodes allocated
...
Suppose that we had included the parent attribute in each node
...
Prove
that P ERSISTENT-T REE -I NSERT would then require
...

e
...
lg n/ per insertion or deletion
...
It returns a set
S D S1 [ fxg [ S2
...

a
...

Argue that RB-I NSERT and RB-D ELETE can maintain the bh attribute without requiring extra storage in the nodes of the tree and without increasing the
asymptotic running times
...
1/ time per node visited
...
T1 ; x; T2 /, which destroys T1 and T2
and returns a red-black tree T D T1 [ fxg [ T2
...

b
...
Describe an O
...

c
...
Describe how Ty [ fxg [ T2 can replace Ty
in O
...

d
...
lg n/ time
...
Argue that no generality is lost by making the assumption in part (b)
...

f
...
lg n/
...
To implement an AVL
tree, we maintain an extra attribute in each node: x:h is the height of node x
...

a
...
lg n/
...
)
b
...
Afterward, the tree might no longer be height balanced
...
Describe a procedure BALANCE
...
e
...
(Hint: Use rotations
...
Using part (b), describe a recursive procedure AVL-I NSERT
...
As in T REE -I NSERT from Section 12
...
Thus, to insert the node ´ into
the AVL tree T , we call AVL-I NSERT
...

d
...
lg n/ time and
performs O
...

13-4 Treaps
If we insert a set of n items into a binary search tree, the resulting tree may be
horribly unbalanced, leading to long search times
...
4,
however, randomly built binary search trees tend to be balanced
...

What if we do not have all the items at once? If we receive the items one at a
time, can we still randomly build a binary search tree out of them?

334

Chapter 13 Red-Black Trees

G: 4
B: 7
A: 10

H: 5
E: 23

K: 65
I: 73

Figure 13
...
Each node x is labeled with x: key : x: priority
...


We will examine a data structure that answers this question in the affirmative
...
Figure 13
...
As usual, each node x in the tree has a key value x:key
...
We assume that all priorities are distinct and also that all keys are
distinct
...


If

is a right child of u, then :key > u:key
...


(This combination of properties is why the tree is called a “treap”: it has features
of both a binary search tree and a heap
...
Suppose that we insert nodes
x1 ; x2 ; : : : ; xn , with associated keys, into a treap
...
e
...

a
...

b
...
lg n/, and hence the expected time
to search for a value in the treap is ‚
...

Let us see how to insert a new node into an existing treap
...
Then we call the insertion algorithm,
which we call T REAP -I NSERT, whose operation is illustrated in Figure 13
...


Problems for Chapter 13

335

G: 4
B: 7
A: 10

G: 4
H: 5

E: 23

C: 25
K: 65

B: 7
A: 10

I: 73

H: 5
E: 23

C: 25

I: 73

(a)

(b)

G: 4
D: 9

G: 4

B: 7
A: 10

H: 5
E: 23

C: 25

B: 7
K: 65

A: 10

I: 73

H: 5
E: 23

D: 9

I: 73

(c)

(d)

G: 4

F: 2

B: 7

H: 5
D: 9

C: 25

K: 65

C: 25

D: 9

A: 10

K: 65

F: 2
K: 65

E: 23

I: 73


A: 10

B: 7

G: 4
D: 9

C: 25

H: 5

E: 23

K: 65
I: 73

(e)

(f)

Figure 13
...
(a) The original treap, prior to insertion
...
(c)–(d) Intermediate stages when inserting a
node with key D and priority 9
...
(f) The
treap after inserting a node with key F and priority 2
...
11 Spines of a binary search tree
...


c
...
Explain the idea in English and give pseudocode
...
)
d
...
lg n/
...
Although
these two operations have the same expected running time, they have different
costs in practice
...

In contrast, a rotation changes parent and child pointers within the treap
...
Thus we would
like T REAP -I NSERT to perform few rotations
...

In order to do so, we will need some definitions, which Figure 13
...

The left spine of a binary search tree T is the simple path from the root to the node
with the smallest key
...
Symmetrically, the right spine of T is the
simple path from the root consisting of only right edges
...

e
...

Let C be the length of the right spine of the left subtree of x
...
Prove that the total number of
rotations that were performed during the insertion of x is equal to C C D
...
Without loss of generality,
we assume that the keys are 1; 2; : : : ; n, since we are comparing them only to one
another
...
We
define indicator random variables
Xi k D I fy is in the right spine of the left subtree of xg :
f
...

g
...
k

...
k

i 1/Š
i C 1/Š
1
i C 1/
...
Show that
E ŒC  D

k 1
X
j D1

1
j
...
Use a symmetry argument to show that
E ŒD D 1

n

1
:
kC1

j
...


Chapter notes
The idea of balancing a search tree is due to Adel’son-Vel’ski˘ and Landis [2], who
ı
introduced a class of balanced search trees called “AVL trees” in 1962, described in
Problem 13-3
...
E
...
A 2-3 tree maintains balance by manipulating
the degrees of nodes in the tree
...

Red-black trees were invented by Bayer [34] under the name “symmetric binary
B-trees
...
Andersson [15] gives a simpler-to-code

338

Chapter 13 Red-Black Trees

variant of red-black trees
...
An AA-tree is
similar to a red-black tree except that left children may never be red
...

They are the default implementation of a dictionary in LEDA [253], which is a
well-implemented collection of data structures and algorithms
...
Perhaps
the most intriguing are the “splay trees” introduced by Sleator and Tarjan [320],
which are “self-adjusting
...
)
Splay trees maintain balance without any explicit balance condition such as color
...
The amortized cost (see Chapter 17) of each operation on an n-node tree is O
...

Skip lists [286] provide an alternative to balanced binary trees
...
Each dictionary
operation runs in expected time O
...


14

Augmenting Data Structures

Some engineering situations require no more than a “textbook” data structure—such as a doubly linked list, a hash table, or a binary search tree—but many
others require a dash of creativity
...
More often, it will suffice to
augment a textbook data structure by storing additional information in it
...
Augmenting a data structure is not always straightforward, however, since the
added information must be updated and maintained by the ordinary operations on
the data structure
...
Section 14
...
We can then quickly find the ith smallest
number in a set or the rank of a given element in the total ordering of the set
...
2 abstracts the process of augmenting a data structure and provides a
theorem that can simplify the process of augmenting red-black trees
...
3
uses this theorem to help design a data structure for maintaining a dynamic set of
intervals, such as time intervals
...


14
...
Specifically, the ith order
statistic of a set of n elements, where i 2 f1; 2; : : : ; ng, is simply the element in the
set with the ith smallest key
...
n/
time from an unordered set
...
lg n/ time
...
lg n/ time
...
1 An order-statistic tree, which is an augmented red-black tree
...
In addition to its usual attributes, each node x has an attribute x: size,
which is the number of nodes, other than the sentinel, in the subtree rooted at x
...
1 shows a data structure that can support fast order-statistic operations
...
Besides the usual red-black tree attributes x:key, x:color, x:p,
x:left, and x:right in a node x, we have another attribute, x:size
...
If we define the sentinel’s size to be 0—that
is, we set T:nil:size to be 0—then we have the identity
x:size D x:left:size C x:right:size C 1 :
We do not require keys to be distinct in an order-statistic tree
...
1 has two keys with value 14 and two keys with value 21
...
We remove
this ambiguity for an order-statistic tree by defining the rank of an element as the
position at which it would be printed in an inorder walk of the tree
...
1,
for example, the key 14 stored in a black node has rank 5, and the key 14 stored in
a red node has rank 6
...
We begin with an operation that retrieves an element with
a given rank
...
x; i/ returns a pointer to the node containing the ith smallest key in the subtree rooted at x
...
T:root; i/
...
1 Dynamic order statistics

OS-S ELECT
...
x:left; i/
6 else return OS-S ELECT
...
The value of x:left:size is the number of nodes that come before x
in an inorder tree walk of the subtree rooted at x
...
If i D r, then node x is the ith smallest
element, and so we return x in line 3
...
If i > r, then
the ith smallest element resides in x’s right subtree
...
i r/th smallest element in
the subtree rooted at x:right
...

To see how OS-S ELECT operates, consider a search for the 17th smallest element in the order-statistic tree of Figure 14
...
We begin with x as the root, whose
key is 26, and with i D 17
...

Thus, we know that the node with rank 17 is the 17 13 D 4th smallest element
in 26’s right subtree
...

Since the size of 41’s left subtree is 5, its rank within its subtree is 6
...
After the recursive call, x is the node with key 30, and its rank within its subtree is 2
...
We now find that its left subtree has size 1, which
means it is the second smallest element
...

Because each recursive call goes down one level in the order-statistic tree, the
total time for OS-S ELECT is at worst proportional to the height of the tree
...
lg n/, where n is the number of nodes
...
lg n/ for a dynamic set of n elements
...


342

Chapter 14 Augmenting Data Structures

OS-R ANK
...
We can think of node x’s rank as the number of
nodes preceding x in an inorder tree walk, plus 1 for x itself
...

We use this loop invariant to show that OS-R ANK works correctly as follows:
Initialization: Prior to the first iteration, line 1 sets r to be the rank of x:key within
the subtree rooted at x
...

Maintenance: At the end of each iteration of the while loop, we set y D y:p
...
In each iteration of the while loop, we consider
the subtree rooted at y:p
...
If y is a left child, then neither y:p nor any
node in y:p’s right subtree precedes x, and so we leave r alone
...

Thus, in line 5, we add y:p:left:size C 1 to the current value of r
...
Thus, the value of r is the rank of x:key in the entire tree
...
1
to find the rank of the node with key 38, we get the following sequence of values
of y:key and r at the top of the while loop:
iteration
1
2
3
4

y:key
38
30
41
26

r
2
4
4
17

14
...

Since each iteration of the while loop takes O
...
lg n/ on an n-node order-statistic tree
...
But unless we can efficiently maintain these
attributes within the basic modifying operations on red-black trees, our work will
have been for naught
...

We noted in Section 13
...
The first phase goes down the tree from the root, inserting the new node
as a child of an existing node
...

To maintain the subtree sizes in the first phase, we simply increment x:size for
each node x on the simple path traversed from the root down toward the leaves
...
Since there are O
...
lg n/
...
Moreover, a rotation is
a local operation: only two nodes have their size attributes invalidated
...
Referring
to the code for L EFT-ROTATE
...
2, we add the following lines:
13
14

y:size D x:size
x:size D x:left:size C x:right:size C 1

Figure 14
...
The change to R IGHTROTATE is symmetric
...
1/ additional time updating size attributes in the second phase
...
lg n/,
which is asymptotically the same as for an ordinary red-black tree
...
(See Section 13
...
) The first phase
either removes one node y from the tree or moves upward it within the tree
...
2 Updating subtree sizes during rotations
...
The updates are local, requiring only the
size information stored in x, y, and the roots of the subtrees shown as triangles
...
Since this path has length O
...
lg n/
...
1/ rotations in the second phase of deletion
in the same manner as for insertion
...
lg n/ time for an n-node order-statistic tree
...
1-1
Show how OS-S ELECT
...
1
...
1-2
Show how OS-R ANK
...
1 and
the node x with x:key D 35
...
1-3
Write a nonrecursive version of OS-S ELECT
...
1-4
Write a recursive procedure OS-K EY-R ANK
...
Assume that the keys of T are distinct
...
1-5
Given an element x in an n-node order-statistic tree and a natural number i, how
can we determine the ith successor of x in the linear order of the tree in O
...
2 How to augment a data structure

345

14
...
Accordingly, suppose
we store in each node its rank in the subtree of which it is the root
...
(Remember that these two
operations can cause rotations
...
1-7
Show how to use an order-statistic tree to count the number of inversions (see
Problem 2-4) in an array of size n in time O
...

14
...
Describe an O
...
(For example, if the n chords are all diameters that meet at the center, then
the correct answer is n
...

2

14
...
We shall use it again in the next section
to design a data structure that supports operations on intervals
...
We shall also prove a theorem
that allows us to augment red-black trees easily in many cases
...
Choose an underlying data structure
...
Determine additional information to maintain in the underlying data structure
...
Verify that we can maintain the additional information for the basic modifying
operations on the underlying data structure
...
Develop new operations
...
Most design work contains an element of trial and error, and
progress on all steps usually proceeds in parallel
...
Nevertheless, this four-step method provides a good focus for your efforts in augmenting
a data structure, and it is also a good way to organize the documentation of an
augmented data structure
...
1 to design our order-statistic trees
...
A clue to the
suitability of red-black trees comes from their efficient support of other dynamicset operations on a total order, such as M INIMUM, M AXIMUM, S UCCESSOR, and
P REDECESSOR
...
Generally, the additional information makes operations more
efficient
...
lg n/ time
...
2-1
...
lg n/ time
...

For example, if we simply stored in each node its rank in the tree, the OS-S ELECT
and OS-R ANK procedures would run quickly, but inserting a new minimum element would cause a change to this information in every node of the tree
...
lg n/ nodes
...
After all,
the need for new operations is why we bother to augment a data structure in the first
place
...
2-1
...
The proof of the following theorem is
similar to the argument from Section 14
...

Theorem 14
...
Then, we can maintain the
values of f in all nodes of T during insertion and deletion without asymptotically
affecting the O
...

Proof The main idea of the proof is that a change to an f attribute in a node x
propagates only to ancestors of x in the tree
...
2 How to augment a data structure

347

quire x:p:f to be updated, but nothing else; updating x:p:f may require x:p:p:f
to be updated, but nothing else; and so on up the tree
...
Since the height of a red-black tree is O
...
lg n/ time in updating all nodes that depend on the change
...
(See Section 13
...
) The
first phase inserts x as a child of an existing node x:p
...
1/ time since, by supposition, it depends only on information in the
other attributes of x itself and the information in x’s children, but x’s children are
both the sentinel T:nil
...
Thus, the total time for the first phase of insertion is O
...
During the
second phase, the only structural changes to the tree come from rotations
...
lg n/ per rotation
...
lg n/
...
(See Section 13
...
) In the first phase,
changes to the tree occur when the deleted node is removed from the tree
...
Propagating the updates to f caused by these changes costs
at most O
...
Fixing up the red-black
tree during the second phase requires at most three rotations, and each rotation
requires at most O
...
Thus, like insertion,
the total time for deletion is O
...

In many cases, such as maintaining the size attributes in order-statistic trees, the
cost of updating after a rotation is O
...
lg n/ derived in the proof
of Theorem 14
...
Exercise 14
...

Exercises
14
...
1/ worstcase time on an augmented order-statistic tree
...

14
...
How about maintaining the
depths of nodes?

348

Chapter 14 Augmenting Data Structures

14
...
Suppose that we want to include in each node x an additional attribute f such that x:f D x1 :a ˝ x2 :a ˝ ˝ xm :a, where x1 ; x2 ; : : : ; xm
is the inorder listing of nodes in the subtree rooted at x
...
1/ time after a rotation
...

14
...
x; a; b/
that outputs all the keys k such that a Ä k Ä b in a red-black tree rooted at x
...
m C lg n/ time, where m is the
number of keys that are output and n is the number of internal nodes in the tree
...
)

14
...
A closed interval is an ordered pair of real numbers Œt1 ; t2 , with
t1 Ä t2
...
Open and
half-open intervals omit both or one of the endpoints from the set, respectively
...

Intervals are convenient for representing events that each occupy a continuous
period of time
...
The data structure in this
section provides an efficient means for maintaining such an interval database
...
We say that intervals i
and i 0 overlap if i \ i 0 ¤ ;, that is, if i:low Ä i 0 :high and i 0 :low Ä i:high
...
3 shows, any two intervals i and i 0 satisfy the interval trichotomy; that
is, exactly one of the following three properties holds:
a
...
i is to the left of i 0 (i
...
, i:high < i 0 :low),
c
...
e
...

An interval tree is a red-black tree that maintains a dynamic set of elements, with
each element x containing an interval x:int
...
3 Interval trees

349

i
i′

i
i′

i
i′

i
i′

(a)
i

i′

(b)

i′

i
(c)

Figure 14
...
(a) If i and i 0 overlap, there
are four situations; in each, i: low Ä i 0 : high and i 0 : low Ä i: high
...
(c) The intervals do not overlap, and i 0 : high < i: low
...
T; x/ adds the element x, whose int attribute is assumed to
contain an interval, to the interval tree T
...
T; x/ removes the element x from the interval tree T
...
T; i/ returns a pointer to an element x in the interval tree T
such that x:int overlaps interval i, or a pointer to the sentinel T:nil if no such
element is in the set
...
4 shows how an interval tree represents a set of intervals
...
2 as we review the design of an interval tree
and the operations that run on it
...
Thus, an inorder tree walk
of the data structure lists the intervals in sorted order by low endpoint
...

Step 3: Maintaining the information
We must verify that insertion and deletion take O
...
We can determine x:max given interval x:int and the max values of
node x’s children:

350

Chapter 14 Augmenting Data Structures

26 26
25
19
17

(a)

19

16

21

15
8

23

9

6

10

5
0

30

20

8

3
0

5

10

15

20

25

30

[16,21]
30

[8,9]

[25,30]

23

30

(b)

int
max

[5,8]

[15,23]

[17,19]

[26,26]

10

23

20

26

[0,3]

[6,10]

[19,20]

3

10

20

Figure 14
...
(a) A set of 10 intervals, shown sorted bottom to top by left endpoint
...
Each node x contains an interval, shown above the dashed
line, and the maximum value of any interval endpoint in the subtree rooted at x, shown below the
dashed line
...


x:max D max
...
1, insertion and deletion run in O
...
In fact, we
can update the max attributes after a rotation in O
...
2-3
and 14
...

Step 4: Developing new operations
The only new operation we need is I NTERVAL -S EARCH
...
If there is no interval that overlaps i in
the tree, the procedure returns a pointer to the sentinel T:nil
...
3 Interval trees

351

I NTERVAL -S EARCH
...
It terminates when either it finds an overlapping interval or x
points to the sentinel T:nil
...
1/ time,
and since the height of an n-node red-black tree is O
...
lg n/ time
...
4
...
We begin with x as the root, which contains Œ16; 21 and
does not overlap i
...
This time, x:left:max D 10 is less than i:low D 22, and so the
loop continues with the right child of x as the new x
...

As an example of an unsuccessful search, suppose we wish to find an interval
that overlaps i D Œ11; 14 in the interval tree of Figure 14
...
We once again begin with x as the root
...
Interval Œ8; 9 does not overlap i, and x:left:max D 10 is less than
i:low D 11, and so we go right
...
) Interval Œ15; 23 does not overlap i, and its left child is T:nil, so again we
go right, the loop terminates, and we return the sentinel T:nil
...
The basic idea is that at any node x,
if x:int does not overlap i, the search always proceeds in a safe direction: the
search will definitely find an overlapping interval if the tree contains one
...

Theorem 14
...
T; i/ either returns a node whose interval
overlaps i, or it returns T:nil and the tree T contains no node whose interval overlaps i
...
5 Intervals in the proof of Theorem 14
...
The value of x: left: max is shown in each case
as a dashed line
...
No interval i 0 in x’s left subtree can overlap i
...
The left subtree of x contains an interval that overlaps i (situation not shown),
or x’s left subtree contains an interval i 0 such that i 0 : high D x: left: max
...


Proof The while loop of lines 2–5 terminates either when x D T:nil or i overlaps x:int
...
Therefore, we focus
on the former case, in which the while loop terminates because x D T:nil
...

We use this loop invariant as follows:
Initialization: Prior to the first iteration, line 1 sets x to be the root of T , so that
the invariant holds
...
We
shall show that both cases maintain the loop invariant
...
If x:left D T:nil, the subtree
rooted at x:left clearly contains no interval that overlaps i, and so setting x
to x:right maintains the invariant
...
As Figure 14
...
Thus, the left
subtree of x contains no intervals that overlap i, so that setting x to x:right
maintains the invariant
...
3 Interval trees

353

If, on the other hand, line 4 is executed, then we will show that the contrapositive of the loop invariant holds
...

Since line 4 is executed, then because of the branch condition in line 3, we
have x:left:max i:low
...
5(b) illustrates the situation
...
Interval trees are keyed on the low endpoints of intervals,
and thus the search-tree property implies that for any interval i 00 in x’s right
subtree,
i:high < i 0 :low
Ä i 00 :low :
By the interval trichotomy, i and i 00 do not overlap
...

Termination: If the loop terminates when x D T:nil, then the subtree rooted at x
contains no interval overlapping i
...
Hence it is correct to return
x D T:nil
...

Exercises
14
...
1/ time
...
3-2
Rewrite the code for I NTERVAL -S EARCH so that it works properly when all intervals are open
...
3-3
Describe an efficient algorithm that, given an interval i, returns an interval overlapping i that has the minimum low endpoint, or T:nil if no such interval exists
...
3-4
Given an interval tree T and an interval i, describe how to list all intervals in T
that overlap i in O
...
n; k lg n// time, where k is the number of intervals in the
output list
...
A slightly more complicated method does not modify the tree
...
3-5
Suggest modifications to the interval-tree procedures to support the new operation I NTERVAL -S EARCH -E XACTLY
...
The operation should return a pointer to a node x in T such that
x:int:low D i:low and x:int:high D i:high, or T:nil if T contains no such node
...
lg n/
time on an n-node interval tree
...
3-6
Show how to maintain a dynamic set Q of numbers that supports the operation
M IN -G AP, which gives the magnitude of the difference of the two closest numbers in Q
...
Q/ returns
18 15 D 3, since 15 and 18 are the two closest numbers in Q
...

14
...
Assume that each rectangle is rectilinearly oriented (sides parallel to the
x- and y-axes), so that we represent a rectangle by its minimum and maximum xand y-coordinates
...
n lg n/-time algorithm to decide whether or not a set
of n rectangles so represented contains two rectangles that overlap
...
(Hint:
Move a “sweep” line across the set of rectangles
...

a
...


Notes for Chapter 14

355

b
...
(Hint: Keep a red-black tree of all the endpoints
...
Augment each node of the tree with some extra information to
maintain the point of maximum overlap
...
Suppose that n people form a circle
and that we are given a positive integer m Ä n
...
After each
person is removed, counting continues around the circle that remains
...
The order in which the people are
removed from the circle defines the
...
For example, the
...

a
...
Describe an O
...
n; m/-Josephus permutation
...
Suppose that m is not a constant
...
n lg n/-time algorithm that,
given integers n and m, outputs the
...


Chapter notes
In their book, Preparata and Shamos [282] describe several of the interval trees
that appear in the literature, citing work by H
...
M
...
The book details an interval tree that, given a static database
of n intervals, allows us to enumerate all k intervals that overlap a given query
interval in O
...


IV

Advanced Design and Analysis Techniques

Introduction
This part covers three important techniques used in designing and analyzing efficient algorithms: dynamic programming (Chapter 15), greedy algorithms (Chapter 16), and amortized analysis (Chapter 17)
...
The techniques in this part are somewhat more sophisticated,
but they help us to attack many computational problems
...

Dynamic programming typically applies to optimization problems in which we
make a set of choices in order to arrive at an optimal solution
...
Dynamic programming
is effective when a given subproblem may arise from more than one partial set of
choices; the key technique is to store the solution to each such subproblem in case it
should reappear
...

Like dynamic-programming algorithms, greedy algorithms typically apply to
optimization problems in which we make a set of choices in order to arrive at an
optimal solution
...
A simple example is coin-changing: to minimize the number of
U
...
coins needed to make change for a given amount, we can repeatedly select
the largest-denomination coin that is not larger than the amount that remains
...
We cannot always easily
tell whether a greedy approach will be effective, however
...

We use amortized analysis to analyze certain algorithms that perform a sequence
of similar operations
...
One advantage of this
approach is that although some operations might be expensive, many others might
be cheap
...
Amortized analysis is not just an analysis tool, however; it is also a way
of thinking about the design of algorithms, since the design of an algorithm and the
analysis of its running time are often closely intertwined
...


15

Dynamic Programming

Dynamic programming, like the divide-and-conquer method, solves problems by
combining the solutions to subproblems
...
) As we saw in Chapters 2
and 4, divide-and-conquer algorithms partition the problem into disjoint subproblems, solve the subproblems recursively, and then combine their solutions to solve
the original problem
...
In this context,
a divide-and-conquer algorithm does more work than necessary, repeatedly solving the common subsubproblems
...

We typically apply dynamic programming to optimization problems
...
Each solution has a value, and we wish to
find a solution with the optimal (minimum or maximum) value
...

When developing a dynamic-programming algorithm, we follow a sequence of
four steps:
1
...

2
...

3
...

4
...

Steps 1–3 form the basis of a dynamic-programming solution to a problem
...
When we do perform step 4, we sometimes maintain additional
information during step 3 so that we can easily construct an optimal solution
...
Section 15
...
Section 15
...
Given these examples of dynamic programming, Section 15
...
Section 15
...
Finally, Section 15
...


15
...
Serling Enterprises buys long steel rods and cuts them
into shorter rods, which it then sells
...
The management of Serling
Enterprises wants to know the best way to cut up the rods
...
Rod lengths are always an integral
number of inches
...
1 gives a sample price table
...
Given a rod of length n inches and a
table of prices pi for i D 1; 2; : : : ; n, determine the maximum revenue rn obtainable by cutting up the rod and selling the pieces
...

Consider the case when n D 4
...
2 shows all the ways to cut up a rod
of 4 inches in length, including the way with no cuts at all
...

We can cut up a rod of length n in 2n 1 different ways, since we have an independent option of cutting, or not cutting, at distance i inches from the left end,

length i
price pi

1
1

2
5

3
8

4
9

5
10

6
17

7
17

8
20

9
24

10
30

Figure 15
...
Each rod of length i inches earns the company pi
dollars of revenue
...
1 Rod cutting

9

361

1

(a)
1

1

8

5

(b)
5

1

(e)

5

5

8

(c)
1

(f)

5

1

(g)

1

(d)
1

1

1

1

1

(h)

Figure 15
...
Above each piece is the
value of that piece, according to the sample price chart of Figure 15
...
The optimal strategy is
part (c)—cutting the rod into two pieces of length 2—which has total value 10
...
1 We denote a decomposition into pieces using ordinary
additive notation, so that 7 D 2 C 2 C 3 indicates that a rod of length 7 is cut into
three pieces—two of length 2 and one of length 3
...
, ik provides maximum corresponding
revenue
rn D pi 1 C pi 2 C

C pi k :

For our sample problem, we can determine the optimal revenue figures ri , for
i D 1; 2; : : : ; 10, by inspection, with the corresponding optimal decompositions

1 If

we required the pieces to be cut in order of nondecreasing size, there would be fewer ways
to consider
...
2
...
This quantity is less than 2n 1 , but still much greater than any polynomial in n
...


362

Chapter 15 Dynamic Programming

r1
r2
r3
r4
r5
r6
r7
r8
r9
r10

D
D
D
D
D
D
D
D
D
D

1
5
8
10
13
17
18
22
25
30

from solution 1 D 1 (no cuts) ;
from solution 2 D 2 (no cuts) ;
from solution 3 D 3 (no cuts) ;
from solution 4 D 2 C 2 ;
from solution 5 D 2 C 3 ;
from solution 6 D 6 (no cuts) ;
from solution 7 D 1 C 6 or 7 D 2 C 2 C 3 ;
from solution 8 D 2 C 6 ;
from solution 9 D 3 C 6 ;
from solution 10 D 10 (no cuts) :

More generally, we can frame the values rn for n
enues from shorter rods:
rn D max
...
1)

The first argument, pn , corresponds to making no cuts at all and selling the rod of
length n as is
...
Since we don’t know ahead
of time which value of i optimizes revenue, we have to consider all possible values
for i and pick the one that maximizes revenue
...

Note that to solve the original problem of size n, we solve smaller problems of
the same type, but of smaller sizes
...
The overall
optimal solution incorporates optimal solutions to the two related subproblems,
maximizing revenue from each of those two pieces
...

In a related, but slightly simpler, way to arrange a recursive structure for the rodcutting problem, we view a decomposition as consisting of a first piece of length i
cut off the left-hand end, and then a right-hand remainder of length n i
...
We may view every
decomposition of a length-n rod in this way: as a first piece followed by some
decomposition of the remainder
...
We thus obtain the
following simpler version of equation (15
...
pi C rn i / :
1Äi Än

(15
...
1 Rod cutting

363

In this formulation, an optimal solution embodies the solution to only one related
subproblem—the remainder—rather than two
...
2)
in a straightforward, top-down, recursive manner
...
p; n/
1 if n == 0
2
return 0
3 q D 1
4 for i D 1 to n
5
q D max
...
p; n
6 return q

i//

Procedure C UT-ROD takes as input an array pŒ1 : : n of prices and an integer n,
and it returns the maximum revenue possible for a rod of length n
...
Line 3 initializes the
maximum revenue q to 1, so that the for loop in lines 4–5 correctly computes
q D max1Äi Än
...
p; n i//; line 6 then returns this value
...
2)
...
For n D 40, you would find that
your program takes at least several minutes, and most likely more than an hour
...

Why is C UT-ROD so inefficient? The problem is that C UT-ROD calls itself
recursively over and over again with the same parameter values; it solves the
same subproblems repeatedly
...
3 illustrates what happens for n D 4:
C UT-ROD
...
p; n i/ for i D 1; 2; : : : ; n
...
p; n/ calls C UT-ROD
...
When this
process unfolds recursively, the amount of work done, as a function of n, grows
explosively
...
n/ denote the total number of
calls made to C UT-ROD when called with its second parameter equal to n
...
The count includes the initial call at its root
...
0/ D 1 and

364

Chapter 15 Dynamic Programming

4
2

3
2
1

1
0

0

0

1

1
0

0

0

0

0

Figure 15
...
p; n/ for
n D 4
...
A path from the root to a leaf corresponds to one of
the 2n 1 ways of cutting up a rod of length n
...


T
...
j / :

(15
...
j / counts the number of calls
(including recursive calls) due to the call C UT-ROD
...

As Exercise 15
...
n/ D 2n ;

(15
...

In retrospect, this exponential running time is not so surprising
...
The
tree of recursive calls has 2n 1 leaves, one for each possible way of cutting up the
rod
...
That is, the labels give the
corresponding cut points, measured from the right-hand end of the rod
...

The dynamic-programming method works as follows
...
If we need to refer to this subproblem’s solution again later, we can just look it

15
...
Dynamic programming thus uses additional memory
to save computation time; it serves an example of a time-memory trade-off
...
A dynamic-programming approach runs in polynomial
time when the number of distinct subproblems involved is polynomial in the input
size and we can solve each such subproblem in polynomial time
...
We shall illustrate both of them with our rod-cutting example
...
2 In this approach, we write
the procedure recursively in a natural manner, but modified to save the result of
each subproblem (usually in an array or hash table)
...
If so, it returns the saved
value, saving further computation at this level; if not, the procedure computes the
value in the usual manner
...

The second approach is the bottom-up method
...
We sort the
subproblems by size and solve them in size order, smallest first
...
We solve each subproblem only once, and when we first see it, we have already solved all of its
prerequisite subproblems
...
The bottom-up approach often has
much better constant factors, since it has less overhead for procedure calls
...
p; n/
1 let rŒ0 : : n be a new array
2 for i D 0 to n
3
rŒi D 1
4 return M EMOIZED -C UT-ROD -AUX
...
The word really is memoization, not memorization
...


366

Chapter 15 Dynamic Programming

M EMOIZED -C UT-ROD -AUX
...
q; pŒi C M EMOIZED -C UT-ROD -AUX
...
” (Known revenue values are always nonnegative
...

The procedure M EMOIZED -C UT-ROD -AUX is just the memoized version of our
previous procedure, C UT-ROD
...
Otherwise, lines 3–7
compute the desired value q in the usual manner, line 8 saves it in rŒn, and line 9
returns it
...
p; n/
1 let rŒ0 : : n be a new array
2 rŒ0 D 0
3 for j D 1 to n
4
q D 1
5
for i D 1 to j
6
q D max
...
Thus, the procedure solves subproblems of
sizes j D 0; 1; : : : ; n, in that order
...
Lines 3–6 solve each subproblem of size j , for
j D 1; 2; : : : ; n, in order of increasing size
...
1 Rod cutting

367

4
3
2
1
0

Figure 15
...
The vertex labels
give the sizes of the corresponding subproblems
...
x; y/ indicates that we need a
solution to subproblem y when solving subproblem x
...
3, in which all nodes with the same label are collapsed into a single vertex and all edges
go from parent to child
...
Line 7 saves in rŒj  the solution to the subproblem
of size j
...

The bottom-up and top-down versions have the same asymptotic running time
...
n2 /, due to its
doubly-nested loop structure
...
The running time of its top-down counterpart,
M EMOIZED -C UT-ROD, is also ‚
...
Because a recursive call to solve a previously solved subproblem
returns immediately, M EMOIZED -C UT-ROD solves each subproblem just once
...
To solve a subproblem of size n, the for
loop of lines 6–7 iterates n times
...
n2 / iterations, just like the inner for loop of B OTTOM -U P C UT-ROD
...
We shall see
aggregate analysis in detail in Section 17
...
)
Subproblem graphs
When we think about a dynamic-programming problem, we should understand the
set of subproblems involved and how subproblems depend on one another
...
Figure 15
...
It
is a directed graph, containing one vertex for each distinct subproblem
...
For example, the subproblem graph contains an edge from x to y if a top-down recursive procedure for
solving x directly calls itself to solve y
...

The bottom-up method for dynamic programming considers the vertices of the
subproblem graph in such an order that we solve the subproblems y adjacent to
a given subproblem x before we solve subproblem x
...
4
that the adjacency relation is not necessarily symmetric
...
4) of the subproblem graph
...
Similarly, using notions from the same chapter, we can
view the top-down method (with memoization) for dynamic programming as a
“depth-first search” of the subproblem graph (see Section 22
...

The size of the subproblem graph G D
...
Since we solve each subproblem just
once, the running time is the sum of the times needed to solve each subproblem
...
In this common case, the running time of dynamic programming
is linear in the number of vertices and edges
...

We can extend the dynamic-programming approach to record not only the optimal
value computed for each subproblem, but also a choice that led to the optimal
value
...

Here is an extended version of B OTTOM -U P -C UT-ROD that computes, for each
rod size j , not only the maximum revenue rj , but also sj , the optimal size of the
first piece to cut off:

15
...
p; n/
1 let rŒ0 : : n and sŒ0 : : n be new arrays
2 rŒ0 D 0
3 for j D 1 to n
4
q D 1
5
for i D 1 to j
6
if q < pŒi C rŒj i
7
q D pŒi C rŒj i
8
sŒj  D i
9
rŒj  D q
10 return r and s
This procedure is similar to B OTTOM -U P -C UT-ROD, except that it creates the array s in line 1, and it updates sŒj  in line 8 to hold the optimal size i of the first
piece to cut off when solving a subproblem of size j
...
p; n/
1
...
p; n/
2 while n > 0
3
print sŒn
4
n D n sŒn
In our rod-cutting example, the call E XTENDED -B OTTOM -U P -C UT-ROD
...
p; 10/ would print just 10, but a call with
n D 7 would print the cuts 1 and 6, corresponding to the first optimal decomposition for r7 given earlier
...
1-1
Show that equation (15
...
3) and the initial condition
T
...


370

Chapter 15 Dynamic Programming

15
...
Define the density of a rod of
length i to be pi =i, that is, its value per inch
...
It then continues by applying the greedy strategy to the remaining piece of
length n i
...
1-3
Consider a modification of the rod-cutting problem in which, in addition to a
price pi for each rod, each cut incurs a fixed cost of c
...
Give a dynamic-programming algorithm to solve this modified problem
...
1-4
Modify M EMOIZED -C UT-ROD to return not only the value but the actual solution,
too
...
1-5
The Fibonacci numbers are defined by recurrence (3
...
Give an O
...
Draw the
subproblem graph
...
2 Matrix-chain multiplication
Our next example of dynamic programming is an algorithm that solves the problem
of matrix-chain multiplication
...
5)

We can evaluate the expression (15
...
Matrix multiplication is
associative, and so all parenthesizations yield the same product
...
For example, if the
chain of matrices is hA1 ; A2 ; A3 ; A4 i, then we can fully parenthesize the product
A1 A2 A3 A4 in five distinct ways:

15
...
A1
...
A3 A4 /// ;

...
A2 A3 /A4 // ;

...
A3 A4 // ;

...
A2 A3 //A4 / ;

...
Consider first the cost of multiplying two matrices
...
2
...

M ATRIX -M ULTIPLY
...
If A is a p q matrix and B is
a q r matrix, the resulting matrix C is a p r matrix
...
In what
follows, we shall express costs in terms of the number of scalar multiplications
...
Suppose
that the dimensions of the matrices are 10 100, 100 5, and 5 50, respectively
...
A1 A2 /A3 /, we perform
10 100 5 D 5000 scalar multiplications to compute the 10 5 matrix product A1 A2 , plus another 10 5 50 D 2500 scalar multiplications to multiply this
matrix by A3 , for a total of 7500 scalar multiplications
...
A1
...
Thus, computing the product according to
the first parenthesization is 10 times faster
...

Note that in the matrix-chain multiplication problem, we are not actually multiplying matrices
...
Typically, the time invested in determining this optimal
order is more than paid for by the time saved later on when actually performing the
matrix multiplications (such as performing only 7500 scalar multiplications instead
of 75,000)
...
Denote the number of alternative parenthesizations of a sequence of n matrices by P
...
When n D 1, we have just one
matrix and therefore only one way to fully parenthesize the matrix product
...
k C 1/st matrices for any k D 1; 2; : : : ; n 1
...
n/ D

n 1
X

if n D 1 ;
P
...
n

k/ if n

2:

(15
...
4n =n3=2 /
...
2-3) is to show that the solution to the recurrence (15
...
2n /
...

Applying dynamic programming
We shall use the dynamic-programming method to determine how to optimally
parenthesize a matrix chain
...
Characterize the structure of an optimal solution
...
Recursively define the value of an optimal solution
...
Compute the value of an optimal solution
...
2 Matrix-chain multiplication

373

4
...

We shall go through these steps in order, demonstrating clearly how we apply each
step to the problem
...
In the matrix-chain multiplication problem, we can
perform this step as follows
...
Observe that if the problem is nontrivial, i
...
, i < j , then to parenthesize the product
Ai Ai C1 Aj , we must split the product between Ak and AkC1 for some integer k
in the range i Ä k < j
...

The cost of parenthesizing this way is the cost of computing the matrix Ai ::k , plus
the cost of computing AkC1::j , plus the cost of multiplying them together
...
Suppose that to optimally parenthesize Ai Ai C1 Aj , we split the product between Ak and AkC1
...
Why? If there were a less costly way to parenthesize Ai Ai C1 Ak ,
then we could substitute that parenthesization in the optimal parenthesization
of Ai Ai C1 Aj to produce another way to parenthesize Ai Ai C1 Aj whose cost
was lower than the optimum: a contradiction
...

Now we use our optimal substructure to show that we can construct an optimal
solution to the problem from optimal solutions to subproblems
...
Thus, we can build an optimal solution to
an instance of the matrix-chain multiplication problem by splitting the problem into
two subproblems (optimally parenthesizing Ai Ai C1 Ak and AkC1 AkC2 Aj ),
finding optimal solutions to subproblem instances, and then combining these optimal subproblem solutions
...


374

Chapter 15 Dynamic Programming

Step 2: A recursive solution
Next, we define the cost of an optimal solution recursively in terms of the optimal
solutions to subproblems
...
Let mŒi; j  be the minimum number of scalar
multiplications needed to compute the matrix Ai ::j ; for the full problem, the lowestcost way to compute A1::n would thus be mŒ1; n
...
If i D j , the problem is trivial;
the chain consists of just one matrix Ai ::i D Ai , so that no scalar multiplications
are necessary to compute the product
...
To
compute mŒi; j  when i < j , we take advantage of the structure of an optimal
solution from step 1
...
Then, mŒi; j 
equals the minimum cost for computing the subproducts Ai ::k and AkC1::j , plus the
cost of multiplying these two matrices together
...
Thus, we obtain
mŒi; j  D mŒi; k C mŒk C 1; j  C pi 1 pk pj :
This recursive equation assumes that we know the value of k, which we do not
...

Since the optimal parenthesization must use one of these values for k, we need only
check them all to find the best
...
7)
min fmŒi; k C mŒk C 1; j  C pi 1 pk pj g if i < j :
i Äk
The mŒi; j  values give the costs of optimal solutions to subproblems, but they
do not provide all the information we need to construct an optimal solution
...
That is, sŒi; j  equals a value k such
that mŒi; j  D mŒi; k C mŒk C 1; j  C pi 1 pk pj
...
7)
to compute the minimum cost mŒ1; n for multiplying A1 A2 An
...
3, this recursive algorithm takes exponential time, which is no better than the brute-force method of
checking each way of parenthesizing the product
...
2 Matrix-chain multiplication

375

Observe that we have relatively few distinct subproblems: one subproblem for
each choice of i and j satisfying 1 Ä i Ä j Ä n, or n C n D ‚
...

2
A recursive algorithm may encounter each subproblem many times in different
branches of its recursion tree
...

Instead of computing the solution to recurrence (15
...
(We present the corresponding top-down approach using memoization in Section 15
...
)
We shall implement the tabular, bottom-up method in the procedure M ATRIX C HAIN -O RDER, which appears below
...
Its input is a sequence p D
hp0 ; p1 ; : : : ; pn i, where p:length D n C 1
...
We shall use the table s to construct an optimal solution
...
Equation (15
...

That is, for k D i; i C 1; : : : ; j 1, the matrix Ai ::k is a product of k i C 1 <
j i C 1 matrices and the matrix AkC1::j is a product of j k < j i C 1
matrices
...
For
the subproblem of optimally parenthesizing the chain Ai Ai C1 Aj , we consider
the subproblem size to be the length j i C 1 of the chain
...
p/
1 n D p:length 1
2 let mŒ1 : : n; 1 : : n and sŒ1 : : n 1; 2 : : n be new tables
3 for i D 1 to n
4
mŒi; i D 0
5 for l D 2 to n
/ l is the chain length
/
6
for i D 1 to n l C 1
7
j D i Cl 1
8
mŒi; j  D 1
9
for k D i to j 1
10
q D mŒi; k C mŒk C 1; j  C pi 1 pk pj
11
if q < mŒi; j 
12
mŒi; j  D q
13
sŒi; j  D k
14 return m and s

376

Chapter 15 Dynamic Programming

m

s

6

1

6

15,125

j

5
2
11,875 10,500

4

9,375

3

7,875

1

2
15,750

7,125

4,375

2,625

3

5,375

2,500
750

3

4

4

3

3

3,500

1,000

5

j

i

5

5,000

1

6

0

0

0

0

0

A2

A3

A4

A5

2
3

3
3

2

3
3

3

i
3
4
5

4

5
5

0

A1

1

2

1
3

A6

Figure 15
...
The m table uses only the main
diagonal and upper triangle, and the s table uses only the upper triangle
...
Of the darker entries, the pairs
that have the same shading are taken together in line 10 when computing
8
D 13,000 ;
ˆmŒ2; 2 C mŒ3; 5 C p1 p2 p5 D 0 C 2500 C 35 15 20
<
mŒ2; 5 D min mŒ2; 3 C mŒ4; 5 C p1 p3 p5 D 2625 C 1000 C 35 5 20 D 7125 ;
ˆ
:
D 11,375
mŒ2; 4 C mŒ5; 5 C p1 p4 p5 D 4375 C 0 C 35 10 20
D 7125 :

The algorithm first computes mŒi; i D 0 for i D 1; 2; : : : ; n (the minimum
costs for chains of length 1) in lines 3–4
...
7) to compute
mŒi; i C 1 for i D 1; 2; : : : ; n 1 (the minimum costs for chains of length l D 2)
during the first execution of the for loop in lines 5–13
...
At each step, the mŒi; j  cost computed in lines 10–13
depends only on table entries mŒi; k and mŒk C 1; j  already computed
...
5 illustrates this procedure on a chain of n D 6 matrices
...
The figure shows the table rotated to make the
main diagonal run horizontally
...
Using this layout, we can find the minimum cost mŒi; j  for multiplying a subchain
Ai Ai C1 Aj of matrices at the intersection of lines running northeast from Ai and

15
...
Each horizontal row in the table contains the entries for matrix
chains of the same length
...
It computes each entry mŒi; j 
using the products pi 1 pk pj for k D i; i C 1; : : : ; j 1 and all entries southwest
and southeast from mŒi; j 
...
n3 / for the algorithm
...
Exercise 15
...
n3 /
...
n2 / space to store the m and s tables
...

Step 4: Constructing an optimal solution
Although M ATRIX -C HAIN -O RDER determines the optimal number of scalar multiplications needed to compute a matrix-chain product, it does not directly show
how to multiply the matrices
...
Each entry sŒi; j  records a value of k such that an optimal parenthesization of Ai Ai C1 Aj splits the product between Ak and AkC1
...
We can determine the earlier matrix multiplications recursively, since sŒ1; sŒ1; n determines the last matrix multiplication when computing
A1::sŒ1;n and sŒsŒ1; n C 1; n determines the last matrix multiplication when computing AsŒ1;nC1::n
...
The initial call P RINT-O PTIMAL -PARENS
...

P RINT-O PTIMAL -PARENS
...
s; i; sŒi; j /
5
P RINT-O PTIMAL -PARENS
...
5, the call P RINT-O PTIMAL -PARENS
...
A1
...
A4 A5 /A6 //
...
2-1
Find an optimal parenthesization of a matrix-chain product whose sequence of
dimensions is h5; 10; 3; 12; 5; 50; 6i
...
2-2
Give a recursive algorithm M ATRIX -C HAIN -M ULTIPLY
...
(The initial call would be M ATRIX -C HAIN -M ULTIPLY
...
)
15
...
6)
is
...

15
...
How many vertices does it have? How many edges does it have, and
which edges are they?
15
...
i; j / be the number of times that table entry mŒi; j  is referenced while
computing other table entries in a call of M ATRIX -C HAIN -O RDER
...
i; j / D

n3

n
3

:

(Hint: You may find equation (A
...
)
15
...


15
...
From an engineering perspective, when should we look for a dynamic-programming solution
to a problem? In this section, we examine the two key ingredients that an opti-

15
...
We also revisit and discuss more fully
how memoization might help us take advantage of the overlapping-subproblems
property in a top-down recursive approach
...
Recall that a problem exhibits
optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems
...
(As Chapter 16 discusses, it also might mean that a greedy strategy applies, however
...
Consequently, we must take care to ensure that the range of subproblems we consider includes those used in an optimal solution
...
In Section 15
...
In Section 15
...

You will find yourself following a common pattern in discovering optimal substructure:
1
...
Making this choice leaves one or more subproblems to be solved
...
You suppose that for a given problem, you are given the choice that leads to an
optimal solution
...
You just assume that it has been given to you
...
Given this choice, you determine which subproblems ensue and how to best
characterize the resulting space of subproblems
...
You show that the solutions to the subproblems used within an optimal solution
to the problem must themselves be optimal by using a “cut-and-paste” technique
...
In particular, by “cutting out” the
nonoptimal solution to each subproblem and “pasting in” the optimal one, you
show that you can get a better solution to the original problem, thus contradicting your supposition that you already had an optimal solution
...

To characterize the space of subproblems, a good rule of thumb says to try to
keep the space as simple as possible and then expand it as necessary
...
This subproblem space worked well, and we had no need to try a more general space of
subproblems
...
As before,
an optimal parenthesization must split this product between Ak and AkC1 for some
1 Ä k < j
...
For this problem, we needed
to allow our subproblems to vary at “both ends,” that is, to allow both i and j to
vary in the subproblem Ai Ai C1 Aj
...
how many subproblems an optimal solution to the original problem uses, and
2
...

In the rod-cutting problem, an optimal solution for cutting up a rod of size n
uses just one subproblem (of size n i), but we must consider n choices for i
in order to determine which one yields an optimal solution
...
For a given matrix Ak at which we split the product, we have two subproblems—parenthesizing Ai Ai C1 Ak and parenthesizing
AkC1 AkC2 Aj —and we must solve both of them optimally
...

Informally, the running time of a dynamic-programming algorithm depends on
the product of two factors: the number of subproblems overall and how many
choices we look at for each subproblem
...
n/ subproblems
overall, and at most n choices to examine for each, yielding an O
...

Matrix-chain multiplication had ‚
...
n3 / running time (actually, a ‚
...
2-5)
...
Each vertex corresponds to a subproblem, and the choices for a sub-

15
...
Recall that in rod cutting,
the subproblem graph had n vertices and at most n edges per vertex, yielding an
O
...
For matrix-chain multiplication, if we were to draw the subproblem graph, it would have ‚
...
n3 / vertices and edges
...

That is, we first find optimal solutions to subproblems and, having solved the subproblems, we find an optimal solution to the problem
...
The cost of the problem solution is usually the
subproblem costs plus a cost that is directly attributable to the choice itself
...
2)
...
2)
...
The cost attributable to the choice itself is the term pi 1 pk pj
...
In particular, problems to which greedy algorithms
apply have optimal substructure
...

Surprisingly, in some cases this strategy works!
Subtleties
You should be careful not to assume that optimal substructure applies when it does
not
...
V; E/ and vertices u; 2 V
...
Such a path must be simple, since removing a cycle from a path produces a path with fewer edges
...
We can use the breadth-first search
technique of Chapter 22 to solve the unweighted problem
...
6 A directed graph showing that the problem of finding a longest simple path in an
unweighted directed graph does not have optimal substructure
...


Unweighted longest simple path: Find a simple path from u to consisting of
the most edges
...

The unweighted shortest-path problem exhibits optimal substructure, as follows
...
Then, any path p from u
to must contain an intermediate vertex, say w
...
)
p
p1
p2
Thus, we can decompose the path u ; into subpaths u ; w ;
...
We claim that if p is an optimal (i
...
, shortest) path from u to , then p1
must be a shortest path from u to w
...
Symmetrically, p2 must be a shortest
path from w to
...
In Section 25
...

You might be tempted to assume that the problem of finding an unweighted
longest simple path exhibits optimal substructure as well
...
6 supplies an example
...
Is q ! r a longest
simple path from q to r? No, for the path q ! s ! t ! r is a simple path
that is longer
...


15
...
If we combine the longest simple
paths q ! s ! t ! r and r ! q ! s ! t, we get the path q ! s ! t ! r !
q ! s ! t, which is not simple
...
No
efficient dynamic-programming algorithm for this problem has ever been found
...

Why is the substructure of a longest simple path so different from that of a shortest path? Although a solution to a problem for both longest and shortest paths uses
two subproblems, the subproblems in finding the longest simple path are not independent, whereas for shortest paths they are
...
For the example of Figure 15
...
For the first
of these subproblems, we choose the path q ! s ! t ! r, and so we have also
used the vertices s and t
...
If we cannot use vertex t in the second problem, then we
cannot solve it at all, since t is required to be on the path that we find, and it is
not the vertex at which we are “splicing” together the subproblem solutions (that
vertex being r)
...
We must use at least one of them
to solve the other subproblem, however, and we must use both of them to solve it
optimally
...
Looked at
another way, using resources in solving one subproblem (those resources being
vertices) renders them unavailable for the other subproblem
...
We claim that
if a vertex w is on a shortest path p from u to , then we can splice together any
p1
p2
shortest path u ; w and any shortest path w ; to produce a shortest path from u
to
...
Why? Suppose that some vertex x ¤ w appears in both p1 and p2 , so that
pux
px
we can decompose p1 as u ; x ; w and p2 as w ; x ;
...
Now let us construct a path p 0 D u ; x ; from u to
...
Thus, we are assured that the subproblems
for the shortest-path problem are independent
...
1 and 15
...
In matrix-chain multiplication, the subproblems are multiplying subchains
Ai Ai C1 Ak and AkC1 AkC2 Aj
...
In rod cutting, to determine the
best way to cut up a rod of length n, we look at the best ways of cutting up rods
of length i for i D 0; 1; : : : ; n 1
...

Overlapping subproblems
The second ingredient that an optimization problem must have for dynamic programming to apply is that the space of subproblems must be “small” in the sense
that a recursive algorithm for the problem solves the same subproblems over and
over, rather than always generating new subproblems
...
When a recursive algorithm revisits the same problem repeatedly, we say that the optimization problem
has overlapping subproblems
...
Dynamic-programming algorithms typically take advantage of
overlapping subproblems by solving each subproblem once and then storing the
solution in a table where it can be looked up when needed, using constant time per
lookup
...
1, we briefly examined how a recursive solution to rod cutting makes exponentially many calls to find solutions of smaller subproblems
...

To illustrate the overlapping-subproblems property in greater detail, let us reexamine the matrix-chain multiplication problem
...
5,
observe that M ATRIX -C HAIN -O RDER repeatedly looks up the solution to subproblems in lower rows when solving subproblems in higher rows
...
Although these requirements may sound contradictory, they describe two different
notions, rather than two points on the same axis
...
Two subproblems are overlapping if they are really the same
subproblem that occurs as a subproblem of different problems
...
3 Elements of dynamic programming

385

1
...
1

2
...
2

2
...
4

2
...
4

3
...
4

2
...
3

1
...
4

2
...
3

1
...
4

4
...
1

2
...
2

3
...
2

3
...
1

2
...
7 The recursion tree for the computation of R ECURSIVE -M ATRIX -C HAIN
...

Each node contains the parameters i and j
...


mŒ3; 5, and mŒ3; 6
...
To see how, consider
the following (inefficient) recursive procedure that determines mŒi; j , the minimum number of scalar multiplications needed to compute the matrix-chain product
Ai ::j D Ai Ai C1 Aj
...
7)
...
p; i; j /
1 if i == j
2
return 0
3 mŒi; j  D 1
4 for k D i to j 1
5
q D R ECURSIVE -M ATRIX -C HAIN
...
p; k C 1; j /
C pi 1 pk pj
6
if q < mŒi; j 
7
mŒi; j  D q
8 return mŒi; j 
Figure 15
...
p; 1; 4/
...

Observe that some pairs of values occur many times
...
Let T
...

Because the execution of lines 1–2 and of lines 6–7 each take at least unit time, as

386

Chapter 15 Dynamic Programming

does the multiplication in line 5, inspection of the procedure yields the recurrence
T
...
n/

n 1
X

...
k/ C T
...
i/ appears once as T
...
n k/, and collecting the n 1 1s in the summation together with the 1 out
front, we can rewrite the recurrence as
T
...
i/ C n :

(15
...
n/ D
...
Specifi1
...
n/
2n 1 for all n
0
T
...
Inductively, for n 2 we have
T
...
5))
D 2
...
Thus, the total amount of work performed by the call
R ECURSIVE -M ATRIX -C HAIN
...

Compare this top-down, recursive algorithm (without memoization) with the
bottom-up dynamic-programming algorithm
...
Matrix-chain multiplication has only ‚
...
The recursive algorithm, on the other hand,
must again solve each subproblem every time it reappears in the recursion tree
...


15
...

For matrix-chain multiplication, the table sŒi; j  saves us a significant amount of
work when reconstructing an optimal solution
...
We choose from among j i possibilities when we determine which
subproblems to use in an optimal solution to parenthesizing Ai Ai C1 Aj , and
j i is not a constant
...
j i/ D !
...
By storing
in sŒi; j  the index of the matrix at which we split the product Ai Ai C1 Aj , we
can reconstruct each choice in O
...

Memoization
As we saw for the rod-cutting problem, there is an alternative approach to dynamic programming that often offers the efficiency of the bottom-up dynamicprogramming approach while maintaining a top-down strategy
...
As in the bottom-up approach, we maintain a table with subproblem solutions, but the control structure
for filling in the table is more like the recursive algorithm
...
Each table entry initially contains a special value to indicate that
the entry has yet to be filled in
...

Each subsequent time that we encounter this subproblem, we simply look up the
value stored in the table and return it
...
Note where it
resembles the memoized top-down method for the rod-cutting problem
...
Another, more general,
approach is to memoize by using hashing with the subproblem parameters as keys
...
p/
1 n D p:length 1
2 let mŒ1 : : n; 1 : : n be a new table
3 for i D 1 to n
4
for j D i to n
5
mŒi; j  D 1
6 return L OOKUP -C HAIN
...
m; p; i; j /
1 if mŒi; j  < 1
2
return mŒi; j 
3 if i == j
4
mŒi; j  D 0
5 else for k D i to j 1
6
q D L OOKUP -C HAIN
...
m; p; k C 1; j / C pi 1 pk pj
7
if q < mŒi; j 
8
mŒi; j  D q
9 return mŒi; j 
The M EMOIZED -M ATRIX -C HAIN procedure, like M ATRIX -C HAIN -O RDER,
maintains a table mŒ1 : : n; 1 : : n of computed values of mŒi; j , the minimum number of scalar multiplications needed to compute the matrix Ai ::j
...
Upon
calling L OOKUP -C HAIN
...
Otherwise,
the cost is computed as in R ECURSIVE -M ATRIX -C HAIN, stored in mŒi; j , and
returned
...
m; p; i; j / always returns the value of mŒi; j ,
but it computes it only upon the first call of L OOKUP -C HAIN with these specific
values of i and j
...
7 illustrates how M EMOIZED -M ATRIX -C HAIN saves time compared
with R ECURSIVE -M ATRIX -C HAIN
...

Like the bottom-up dynamic-programming algorithm M ATRIX -C HAIN -O RDER,
the procedure M EMOIZED -M ATRIX -C HAIN runs in O
...
Line 5 of
M EMOIZED -M ATRIX -C HAIN executes ‚
...
We can categorize the calls
of L OOKUP -C HAIN into two types:
1
...
calls in which mŒi; j  < 1, so that L OOKUP -C HAIN simply returns in line 2
...
3 Elements of dynamic programming

389

There are ‚
...
All calls of the second type are made as recursive calls by calls of the first type
...
n/ of them
...
n3 / calls of the second type in all
...
1/ time, and each call of the first type takes O
...
The total time, therefore, is O
...
Memoization thus turns
an
...
n3 /-time algorithm
...
n3 / time
...
There are only ‚
...
Without memoization, the natural recursive algorithm runs in exponential
time, since solved subproblems are repeatedly solved
...
Moreover, for
some problems we can exploit the regular pattern of table accesses in the dynamicprogramming algorithm to reduce time or space requirements even further
...

Exercises
15
...

15
...
3
...
Explain why memoization fails to speed up a good divideand-conquer algorithm such as M ERGE -S ORT
...
3-3
Consider a variant of the matrix-chain multiplication problem in which the goal is
to parenthesize the sequence of matrices so as to maximize, rather than minimize,

390

Chapter 15 Dynamic Programming

the number of scalar multiplications
...
3-4
As stated, in dynamic programming we first solve the subproblems and then choose
which of them to use in an optimal solution to the problem
...
She suggests that we can find an optimal solution to the matrixchain multiplication problem by always choosing the matrix Ak at which to split
the subproduct Ai Ai C1 Aj (by selecting k to minimize the quantity pi 1 pk pj )
before solving the subproblems
...

15
...
1, we also had limit li on the
number of pieces of length i that we are allowed to produce, for i D 1; 2; : : : ; n
...
1 no longer
holds
...
3-6
Imagine that you wish to exchange one currency for another
...
Suppose that you can trade n different currencies, numbered 1; 2; : : : ; n,
where you start with currency 1 and wish to wind up with currency n
...

A sequence of trades may entail a commission, which depends on the number of
trades you make
...
Show that, if ck D 0 for all k D 1; 2; : : : ; n, then the problem of finding the
best sequence of exchanges from currency 1 to currency n exhibits optimal substructure
...


15
...
A strand of DNA consists of a string of molecules called

15
...

Representing each of these bases by its initial letter, we can express a strand
of DNA as a string over the finite set fA; C; G; Tg
...
) For example, the DNA of one organism may be
S1 D ACCGGTCGAGTGCGCGGAAGCCGGCCGAA, and the DNA of another organism may be S2 D GTCGTTCGGAATGCCGTTGCTCTGTAAA
...
We can, and do, define similarity in many different ways
...
(Chapter 32 explores algorithms to solve
this problem
...
Alternatively, we could say that two strands are similar if the number of changes needed
to turn one into the other is small
...
) Yet another
way to measure the similarity of strands S1 and S2 is by finding a third strand S3
in which the bases in S3 appear in each of S1 and S2 ; these bases must appear
in the same order, but not necessarily consecutively
...
In our example, the longest strand S3 is
GTCGTCGGAAGCCGGCCGAA
...
A subsequence of a given sequence is just the given sequence with zero or
more elements left out
...
For example, Z D hB; C; D; Bi is a subsequence of X D
hA; B; C; B; D; A; Bi with corresponding index sequence h2; 3; 5; 7i
...
For example, if
X D hA; B; C; B; D; A; Bi and Y D hB; D; C; A; B; Ai, the sequence hB; C; Ai is
a common subsequence of both X and Y
...
The
sequence hB; C; B; Ai is an LCS of X and Y , as is the sequence hB; D; A; Bi,
since X and Y have no common subsequence of length 5 or greater
...
This section shows how to efficiently
solve the LCS problem using dynamic programming
...
Each subsequence
of X corresponds to a subset of the indices f1; 2; : : : ; mg of X
...

The LCS problem has an optimal-substructure property, however, as the following theorem shows
...
To be precise, given a
sequence X D hx1 ; x2 ; : : : ; xm i, we define the ith prefix of X , for i D 0; 1; : : : ; m,
as Xi D hx1 ; x2 ; : : : ; xi i
...

Theorem 15
...

1
...
If xm ¤ yn , then ´k ¤ xm implies that Z is an LCS of Xm

1
1

and Yn 1
...


3
...

Proof (1) If ´k ¤ xm , then we could append xm D yn to Z to obtain a common
subsequence of X and Y of length k C 1, contradicting the supposition that Z is
a longest common subsequence of X and Y
...

Now, the prefix Zk 1 is a length-
...

We wish to show that it is an LCS
...
Then, appending xm D yn to W produces a common subsequence of
X and Y whose length is greater than k, which is a contradiction
...
If there were a
common subsequence W of Xm 1 and Y with length greater than k, then W would
also be a common subsequence of Xm and Y , contradicting the assumption that Z
is an LCS of X and Y
...

The way that Theorem 15
...
Thus, the LCS problem has an optimal-substructure property
...
4 Longest common subsequence

393

sive solution also has the overlapping-subproblems property, as we shall see in a
moment
...
1 implies that we should examine either one or two subproblems when
finding an LCS of X D hx1 ; x2 ; : : : ; xm i and Y D hy1 ; y2 ; : : : ; yn i
...
Appending xm D yn to this LCS yields
an LCS of X and Y
...
Whichever of these two
LCSs is longer is an LCS of X and Y
...

We can readily see the overlapping-subproblems property in the LCS problem
...
But each of these subproblems has the subsubproblem of finding
an LCS of Xm 1 and Yn 1
...

As in the matrix-chain multiplication problem, our recursive solution to the LCS
problem involves establishing a recurrence for the value of an optimal solution
...
If
either i D 0 or j D 0, one of the sequences has length 0, and so the LCS has
length 0
...
cŒi; j 1; cŒi

if i D 0 or j D 0 ;
if i; j > 0 and xi D yj ;
1; j / if i; j > 0 and xi ¤ yj :

(15
...
When xi D yj , we can and should consider
the subproblem of finding an LCS of Xi 1 and Yj 1
...
In
the previous dynamic-programming algorithms we have examined—for rod cutting
and matrix-chain multiplication—we ruled out no subproblems due to conditions
in the problem
...
For example, the
edit-distance problem (see Problem 15-5) has this characteristic
...
9), we could easily write an exponential-time recursive algorithm to compute the length of an LCS of two sequences
...
mn/ distinct subproblems, however, we can use dynamic programming
to compute the solutions bottom up
...
It stores the cŒi; j  values in a table cŒ0 : : m; 0 : : n,
and it computes the entries in row-major order
...
) The procedure also
maintains the table bŒ1 : : m; 1 : : n to help us construct an optimal solution
...
The procedure returns the b and c tables;
cŒm; n contains the length of an LCS of X and Y
...
X; Y /
1 m D X:length
2 n D Y:length
3 let bŒ1 : : m; 1 : : n and cŒ0 : : m; 0 : : n be new tables
4 for i D 1 to m
5
cŒi; 0 D 0
6 for j D 0 to n
7
cŒ0; j  D 0
8 for i D 1 to m
9
for j D 1 to n
10
if xi == yj
11
cŒi; j  D cŒi 1; j 1 C 1
12
bŒi; j  D “-”
13
elseif cŒi 1; j  cŒi; j 1
14
cŒi; j  D cŒi 1; j 
15
bŒi; j  D “"”
16
else cŒi; j  D cŒi; j 1
17
bŒi; j  D “ ”
18 return c and b
Figure 15
...
The running time of the
procedure is ‚
...
1/ time to compute
...
We simply begin at bŒm; n and
trace through the table by following the arrows
...
4 Longest common subsequence

j

395

1

2

3

4

5

6

yj

i

0

B

D

C

A

B

A

0

xi

0

0

0

0

0

0

0

1

A

0

0

0

0

1

1

1

2

B

0

1

1

1

1

2

2

3

C

0

1

1

2

2

2

2

4

B

0

1

1

2

2

3

3

5

D

0

1

2

2

2

3

3

6

A

0

1

2

2

3

3

4

7

B

0

1

2

2

3

4

4

Figure 15
...
The square in row i and column j contains the value of cŒi; j 
and the appropriate arrow for the value of bŒi; j 
...
For i; j > 0, entry cŒi; j  depends
only on whether xi D yj and the values in entries cŒi 1; j , cŒi; j 1, and cŒi 1; j 1, which
are computed before cŒi; j 
...
Each “-” on the shaded sequence corresponds
to an entry (highlighted) for which xi D yj is a member of an LCS
...
With this method, we encounter the elements of this LCS in reverse order
...
The initial call is P RINT-LCS
...

P RINT-LCS
...
b; X; i 1; j 1/
5
print xi
6 elseif bŒi; j  == “"”
7
P RINT-LCS
...
b; X; i; j 1/
For the b table in Figure 15
...
The procedure takes
time O
...


396

Chapter 15 Dynamic Programming

Improving the code
Once you have developed an algorithm, you will often find that you can improve
on the time or space it uses
...

Others can yield substantial asymptotic savings in time and space
...
Each
cŒi; j  entry depends on only three other c table entries: cŒi 1; j 1, cŒi 1; j ,
and cŒi; j 1
...
1/ time which of
these three values was used to compute cŒi; j , without inspecting table b
...
mCn/ time using a procedure similar to P RINT-LCS
...
4-2 asks you to give the pseudocode
...
mn/ space
by this method, the auxiliary space requirement for computing an LCS does not
asymptotically decrease, since we need ‚
...

We can, however, reduce the asymptotic space requirements for LCS-L ENGTH,
since it needs only two rows of table c at a time: the row being computed and the
previous row
...
4-4 asks you to show, we can use only slightly
more than the space for one row of c to compute the length of an LCS
...
m C n/ time
...
4-1
Determine an LCS of h1; 0; 0; 1; 0; 1; 0; 1i and h0; 1; 0; 1; 1; 0; 1; 1; 0i
...
4-2
Give pseudocode to reconstruct an LCS from the completed c table and the original
sequences X D hx1 ; x2 ; : : : ; xm i and Y D hy1 ; y2 ; : : : ; yn i in O
...

15
...
mn/ time
...
4-4
Show how to compute the length of an LCS using only 2 min
...
1/ additional space
...
m; n/ entries plus O
...


15
...
4-5
Give an O
...

15
...
n lg n/-time algorithm to find the longest monotonically increasing subsequence of a sequence of n numbers
...
Maintain candidate subsequences by linking
them through the input sequence
...
5 Optimal binary search trees
Suppose that we are designing a program to translate text from English to French
...
We could perform these lookup operations by building a binary search
tree with n English words as keys and their French equivalents as satellite data
...
We could ensure an O
...
Words appear with different frequencies, however, and a frequently
used word such as the may appear far from the root while a rarely used word such
as machicolation appears near the root
...
We want
words that occur frequently in the text to be placed nearer the root
...
How do we organize a binary search tree
so as to minimize the number of nodes visited in all searches, given that we know
how often each word occurs?
What we need is known as an optimal binary search tree
...

k1 < k2 <
For each key ki , we have a probability pi that a search will be for ki
...


machicolation has a French counterpart: mˆ chicoulis
...
05

d4
d3

(a)

Figure 15
...
15
0
...
10
0
...
05
0
...
10
0
...
20
0
...
80
...
75
...


d0 ; d1 ; d2 ; : : : ; dn representing values not in K
...
For each dummy
key di , we have a probability qi that a search will correspond to di
...
9
shows two binary search trees for a set of n D 5 keys
...
Every search is either successful (finding
some key ki ) or unsuccessful (finding some dummy key di ), and so we have
n
X
i D1

pi C

n
X

qi D 1 :

(15
...
Let
us assume that the actual cost of a search equals the number of nodes examined,
i
...
, the depth of the node found by the search in T , plus 1
...
depthT
...
depthT
...
ki / pi C

n
X
i D0

depthT
...
11)

15
...
The last equality follows from
equation (15
...
In Figure 15
...
15
0
...
05
0
...
20
0
...
10
0
...
05
0
...
10

contribution
0
...
10
0
...
20
0
...
15
0
...
20
0
...
20
0
...
80

For a given set of probabilities, we wish to construct a binary search tree whose
expected search cost is smallest
...

Figure 15
...
75
...
Nor
can we necessarily construct an optimal binary search tree by always putting the
key with the greatest probability at the root
...

(The lowest expected cost of any binary search tree with k5 at the root is 2
...
)
As with matrix-chain multiplication, exhaustive checking of all possibilities fails
to yield an efficient algorithm
...
In Problem 12-4, we saw that the number of binary trees
with n nodes is
...
Not surprisingly, we shall
solve this problem with dynamic programming
...
Consider any subtree of a binary search tree
...

In addition, a subtree that contains keys ki ; : : : ; kj must also have as its leaves the
dummy keys di 1 ; : : : ; dj
...
The
usual cut-and-paste argument applies
...

We need to use the optimal substructure to show that we can construct an optimal solution to the problem from optimal solutions to subproblems
...
The left subtree of the root kr contains the keys
ki ; : : : ; kr 1 (and dummy keys di 1 ; : : : ; dr 1 ), and the right subtree contains the
keys krC1 ; : : : ; kj (and dummy keys dr ; : : : ; dj )
...

There is one detail worth noting about “empty” subtrees
...
By the above argument, ki ’s
left subtree contains the keys ki ; : : : ; ki 1
...
Bear in mind, however, that subtrees also contain dummy keys
...
Symmetrically, if we select kj as the root,
then kj ’s right subtree contains the keys kj C1 ; : : : ; kj ; this right subtree contains
no actual keys, but it does contain the dummy key dj
...
We pick our
subproblem domain as finding an optimal binary search tree containing the keys
1, j Ä n, and j
i 1
...
) Let us define eŒi; j  as
the expected cost of searching an optimal binary search tree containing the keys
ki ; : : : ; kj
...

The easy case occurs when j D i 1
...

The expected search cost is eŒi; i 1 D qi 1
...
What happens to the
expected search cost of a subtree when it becomes a subtree of a node? The depth
of each node in the subtree increases by 1
...
11), the expected search
cost of this subtree increases by the sum of all the probabilities in the subtree
...
5 Optimal binary search trees

w
...
12)

lDi 1

Thus, if kr is the root of an optimal subtree containing keys ki ; : : : ; kj , we have
eŒi; j  D pr C
...
i; r

1// C
...
r C 1; j // :

Noting that
w
...
i; r

1/ C pr C w
...
i; j / :

(15
...
13) assumes that we know which node kr to use as
the root
...
14)
eŒi; j  D
min feŒi; r 1 C eŒr C 1; j  C w
...

To help us keep track of the structure of optimal binary search trees, we define
rootŒi; j , for 1 Ä i Ä j Ä n, to be the index r for which kr is the root of an
optimal binary search tree containing keys ki ; : : : ; kj
...
5-1
...
For both problem
domains, our subproblems consist of contiguous index subranges
...
14) would be as inefficient as a direct, recursive matrix-chain multiplication algorithm
...
The first index needs to run to n C 1 rather than n because
in order to have a subtree containing only the dummy key dn , we need to compute
and store eŒn C 1; n
...
We use only the entries eŒi; j  for which j
i 1
...
This
table uses only the entries for which 1 Ä i Ä j Ä n
...
Rather than compute the value
of w
...
j i/ additions—we store these values in a table wŒ1 : : n C 1; 0 : : n
...
For j
compute
wŒi; j  D wŒi; j

1 C pj C qj :

(15
...
n2 / values of wŒi; j  in ‚
...

The pseudocode that follows takes as inputs the probabilities p1 ; : : : ; pn and
q0 ; : : : ; qn and the size n, and it returns the tables e and root
...
p; q; n/
1 let eŒ1 : : n C 1; 0 : : n, wŒ1 : : n C 1; 0 : : n,
and rootŒ1 : : n; 1 : : n be new tables
2 for i D 1 to n C 1
3
eŒi; i 1 D qi 1
4
wŒi; i 1 D qi 1
5 for l D 1 to n
6
for i D 1 to n l C 1
7
j D i Cl 1
8
eŒi; j  D 1
9
wŒi; j  D wŒi; j 1 C pj C qj
10
for r D i to j
11
t D eŒi; r 1 C eŒr C 1; j  C wŒi; j 
12
if t < eŒi; j 
13
eŒi; j  D t
14
rootŒi; j  D r
15 return e and root
From the description above and the similarity to the M ATRIX -C HAIN -O RDER procedure in Section 15
...
The for loop of lines 2–4 initializes the values of eŒi; i

and wŒi; i

...
14)
and (15
...
In the first iteration, when l D 1, the loop computes eŒi; i and wŒi; i for i D 1; 2; : : : ; n
...
The innermost for loop, in lines 10–14, tries each candidate index r
to determine which key kr to use as the root of an optimal binary search tree containing keys ki ; : : : ; kj
...

Figure 15
...
9
...
5, the tables are rotated to make

15
...
75
2
1
...
00
3
3
1
...
20 1
...
00
2
0
...
80
3
3
0
...
50 0
...
90 0
...
60 0
...
45 0
...
25 0
...
50
0
6
0
...
10 0
...
05 0
...
10

4

2

i

4
0
...
35 0
...
50
5
0
...
25 0
...
20 0
...
05 0
...
05 0
...
05 0
...
10 The tables eŒi; j , wŒi; j , and rootŒi; j  computed by O PTIMAL -BST on the key
distribution shown in Figure 15
...
The tables are rotated so that the diagonals run horizontally
...
O PTIMAL -BST computes the rows from bottom to
top and from left to right within each row
...
n3 / time, just like M ATRIX -C HAIN O RDER
...
n3 /, since its for loops are
nested three deep and each loop index takes on at most n values
...
Thus, like M ATRIX -C HAIN O RDER, the O PTIMAL -BST procedure takes
...

Exercises
15
...
root/ which,
given the table root, outputs the structure of an optimal binary search tree
...
10, your procedure should print out the structure

404

Chapter 15 Dynamic Programming

k2 is the root
k1 is the left child of k2
d0 is the left child of k1
d1 is the right child of k1
k5 is the right child of k2
k4 is the left child of k5
k3 is the left child of k4
d2 is the left child of k3
d3 is the right child of k3
d4 is the right child of k4
d5 is the right child of k5
corresponding to the optimal binary search tree shown in Figure 15
...

15
...
06

1
0
...
06

2
0
...
06

3
0
...
06

4
0
...
05

5
0
...
05

6
0
...
05

7
0
...
05

15
...
i; j / directly from equation (15
...
How would this change affect the asymptotic running
time of O PTIMAL -BST?
15
...
Use this fact to
modify the O PTIMAL -BST procedure to run in ‚
...


Problems
15-1 Longest simple path in a directed acyclic graph
Suppose that we are given a directed acyclic graph G D
...
Describe a dynamicprogramming approach for finding a longest weighted simple path from s to t
...
11 Seven points in the plane, shown on a unit grid
...
This tour is not bitonic
...
Its length is approximately 25:58
...
Examples of palindromes are all strings of length 1, civic,
racecar, and aibohphobia (fear of palindromes)
...
For example, given the input character, your algorithm
should return carac
...

Figure 15
...
The general problem is
NP-hard, and its solution is therefore believed to require more than polynomial
time (see Chapter 34)
...
L
...
Figure 15
...
In this
case, a polynomial-time algorithm is possible
...
n2 /-time algorithm for determining an optimal bitonic tour
...
(Hint: Scan left to right, maintaining optimal possibilities for the two parts of the tour
...
The input text is a sequence of n

406

Chapter 15 Dynamic Programming

words of lengths l1 ; l2 ; : : : ; ln , measured in characters
...
Our
criterion of “neatness” is as follows
...
We wish to minimize the sum, over
all lines except the last, of the cubes of the numbers of extra space characters at the
ends of lines
...
Analyze the running time and space requirements of
your algorithm
...
Our goal is, given x and y,
to produce a series of transformations that change x to y
...
Initially, ´ is empty, and at termination, we should have
´Œj  D yŒj  for j D 1; 2; : : : ; n
...
Initially, i D j D 1
...

We may choose from among six transformation operations:
Copy a character from x to ´ by setting ´Œj  D xŒi and then incrementing both i
and j
...

Replace a character from x by another character c, by setting ´Œj  D c, and then
incrementing both i and j
...

Delete a character from x by incrementing i but leaving j alone
...

Insert the character c into ´ by setting ´Œj  D c and then incrementing j , but
leaving i alone
...

Twiddle (i
...
, exchange) the next two characters by copying them from x to ´ but
in the opposite order; we do so by setting ´Œj  D xŒi C 1 and ´Œj C 1 D xŒi
and then setting i D i C 2 and j D j C 2
...

Kill the remainder of x by setting i D m C 1
...
This operation, if performed, must
be the final operation
...

Each of the transformation operations has an associated cost
...
We also assume that the individual costs of
the copy and replace operations are less than the combined costs of the delete and
insert operations; otherwise, the copy and replace operations would not be used
...
For the sequence above, the cost of
transforming algorithm to altruistic is

...
copy// C cost
...
delete/ C
...
insert//
C cost
...
kill/ :
a
...
Describe a dynamic-programming algorithm
that finds the edit distance from xŒ1 : : m to yŒ1 : : n and prints an optimal operation sequence
...

The edit-distance problem generalizes the problem of aligning two DNA sequences
(see, for example, Setubal and Meidanis [310, Section 3
...
There are several
methods for measuring the similarity of two DNA sequences by aligning them
...
e
...
Then we assign a
“score” to each position
...

The score for the alignment is the sum of the scores of the individual positions
...

b
...

15-6 Planning a company party
Professor Stewart is consulting for the president of a corporation that is planning
a company party
...
The personnel office has ranked each
employee with a conviviality rating, which is a real number
...

Professor Stewart is given the tree that describes the structure of the corporation,
using the left-child, right-sibling representation described in Section 10
...
Each
node of the tree holds, in addition to the pointers, the name of an employee and
that employee’s conviviality ranking
...
Analyze the
running time of your algorithm
...
V; E/ for speech
recognition
...
u; / 2 E is labeled with a sound
...
The labeled graph is a formal model of a person speaking

Problems for Chapter 15

409

a restricted language
...

We define the label of a directed path to be the concatenation of the labels of the
edges on that path
...
Describe an efficient algorithm that, given an edge-labeled graph G with distinguished vertex 0 and a sequence s D h 1 ; 2 ; : : : ; k i of sounds from †,
returns a path in G that begins at 0 and has s as its label, if any such path exists
...
Analyze the running
time of your algorithm
...
)
Now, suppose that every edge
...
u; / of traversing the edge
...
The sum of the probabilities of the edges leaving any vertex
equals 1
...
We can view the probability of a path beginning at 0 as the
probability that a “random walk” beginning at 0 will follow the specified path,
where we randomly choose which edge to take leaving a vertex u according to the
probabilities of the available edges leaving u
...
Extend your answer to part (a) so that if a path is returned, it is a most probable path starting at 0 and having label s
...

15-8 Image compression by seam carving
We are given a color picture consisting of an m n array AŒ1 : : m; 1 : : n of pixels,
where each pixel specifies a triple of red, green, and blue (RGB) intensities
...
Specifically, we wish to remove
one pixel from each of the m rows, so that the whole picture becomes one pixel
narrower
...

a
...

b
...
Intuitively, the lower a pixel’s disruption measure, the
more similar the pixel is to its neighbors
...


410

Chapter 15 Dynamic Programming

Give an algorithm to find a seam with the lowest disruption measure
...
Because this operation copies the string, it costs n time units to break
a string of n characters into two pieces
...
The order in which the breaks occur can affect the
total amount of time used
...
If she programs the
breaks to occur in left-to-right order, then the first break costs 20 time units, the
second break costs 18 time units (breaking the string from characters 3 to 20 at
character 8), and the third break costs 12 time units, totaling 50 time units
...
In yet another order, she could break first at 8
(costing 20), then break the left piece at 2 (costing 8), and finally the right piece
at 10 (costing 12), for a total cost of 40
...
More formally, given a
string S with n characters and an array LŒ1 : : m containing the break points, compute the lowest cost for a sequence of breaks, along with a sequence of breaks that
achieves this cost
...
You decide to invest
this money with the goal of maximizing your return at the end of 10 years
...

Amalgamated Investments requires you to observe the following rules
...
In each year j , investment i provides
a return rate of rij
...
The return rates are guaranteed,
that is, you are given all the return rates for the next 10 years for each investment
...
At the end of each year, you
can leave the money made in the previous year in the same investments, or you
can shift money to other investments, by either shifting money between existing
investments or moving money to a new investement
...


Problems for Chapter 15

411

a
...
Prove that there exists an optimal investment strategy that, in
each year, puts all the money into a single investment
...
)
b
...

c
...
What is the
running time of your algorithm?
d
...
Show
that the problem of maximizing your income at the end of 10 years no longer
exhibits optimal substructure
...
The demand
for such products varies from month to month, and so the company needs to develop a strategy to plan its manufacturing given the fluctuating, but predictable,
demand
...
For each
month i, the company P
knows the demand di , that is, the number of machines that
it will sell
...
The
i
company keeps a full-time staff who provide labor to manufacture up to m machines per month
...
Furthermore, if, at the end of a month, the company is holding any
unsold machines, it must pay inventory costs
...
j / for j D 1; 2; : : : ; D, where h
...
j / Ä h
...

Give an algorithm that calculates a plan for the company that minimizes its costs
while fulfilling all the demand
...

15-12 Signing free-agent baseball players
Suppose that you are the general manager for a major-league baseball team
...
The team
owner has given you a budget of $X to spend on free agents
...


412

Chapter 15 Dynamic Programming

You are considering N different positions, and for each position, P free-agent
players who play that position are available
...
(If you do not sign any players at a
particular position, then you plan to stick with the players you already have at that
position
...
” A player with
a higher VORP is more valuable than a player with a lower VORP
...

For each available free-agent player, you have three pieces of information:
the player’s position,
the amount of money it will cost to sign the player, and
the player’s VORP
...
You may assume that each player signs for a
multiple of $100,000
...
Analyze the running time and space requirement of your algorithm
...
Bellman began the systematic study of dynamic programming in 1955
...
Although optimization techniques incorporating elements of
dynamic programming were known earlier, Bellman provided the area with a solid
mathematical basis [37]
...
For example, a general manager
might consider right-handed pitchers and left-handed pitchers to be separate “positions,” as well as
starting pitchers, long relief pitchers (relief pitchers who can pitch several innings), and short relief
pitchers (relief pitchers who normally pitch at most only one inning)
...
It provides several ways
to compare the relative values of individual players
...
They
call a dynamic-programming algorithm tD=eD if its table size is O
...
ne / other entries
...
2 would be 2D=1D, and the longest-common-subsequence
algorithm in Section 15
...

Hu and Shing [182, 183] give an O
...

The O
...
Knuth [70] posed the question of whether subquadratic
algorithms for the LCS problem exist
...
mn= lg n/ time,
where n Ä m and the sequences are drawn from a set of bounded size
...
n C m/ lg
...

Many of these results extend to the problem of computing string edit distances
(Problem 15-5)
...
n3 /-time algorithm
...
5
...
5-4 is due to
Knuth [212]
...
n2 / time and O
...
n lg n/
...


16

Greedy Algorithms

Algorithms for optimization problems typically go through a sequence of steps,
with a set of choices at each step
...
A greedy algorithm always makes the choice that looks best at
the moment
...
This chapter explores optimization problems for which greedy algorithms provide optimal solutions
...
3
...
We shall first examine, in Section 16
...
We shall arrive at the greedy algorithm by first considering a dynamic-programming approach and then showing that we can always make
greedy choices to arrive at an optimal solution
...
2 reviews the basic
elements of the greedy approach, giving a direct approach for proving greedy algorithms correct
...
3 presents an important application of greedy techniques: designing data-compression (Huffman) codes
...
4, we investigate some of the theory underlying combinatorial structures called “matroids,”
for which a greedy algorithm always produces an optimal solution
...
5 applies matroids to solve a problem of scheduling unit-time tasks with
deadlines and penalties
...
Later chapters will present many algorithms that we can view as applications of the greedy method, including minimum-spanning-tree algorithms (Chapter 23), Dijkstra’s algorithm for shortest paths from a single source (Chapter 24),
and Chv´ tal’s greedy set-covering heuristic (Chapter 35)
...
Although you can read

16
...


16
...
Suppose we have a set S D fa1 ; a2 ; : : : ; an g
of n proposed activities that wish to use a resource, such as a lecture hall, which
can serve only one activity at a time
...
If selected, activity ai takes place during the
half-open time interval Œsi ; fi /
...
That is, ai and aj are compatible if si
or sj
fi
...
We assume that the activities are sorted
in monotonically increasing order of finish time:
f1 Ä f2 Ä f3 Ä

Ä fn

1

Ä fn :

(16
...
) For example,
consider the following set S of activities:
i
si
fi

1
1
4

2
3
5

3
0
6

4
5
7

5
3
9

6
5
9

7
6
10

8
8
11

9
8
12

10
2
14

11
12
16

For this example, the subset fa3 ; a9 ; a11 g consists of mutually compatible activities
...
In
fact, fa1 ; a4 ; a8 ; a11 g is a largest subset of mutually compatible activities; another
largest subset is fa2 ; a4 ; a9 ; a11 g
...
We start by thinking about a
dynamic-programming solution, in which we consider several choices when determining which subproblems to use in an optimal solution
...
Based on these observations, we
shall develop a recursive greedy algorithm to solve the activity-scheduling problem
...
Although the steps we shall go through
in this section are slightly more involved than is typical when developing a greedy
algorithm, they illustrate the relationship between greedy algorithms and dynamic
programming
...
Let us denote by Sij the set of activities that start after activity ai finishes and
that finish before activity aj starts
...
By including ak in an optimal solution, we
are left with two subproblems: finding mutually compatible activities in the set Si k
(activities that start after activity ai finishes and that finish before activity ak starts)
and finding mutually compatible activities in the set Skj (activities that start after
activity ak finishes and that finish before activity aj starts)
...
Thus, we
have Aij D Ai k [ fak g [ Akj , and so the maximum-size set Aij of mutually compatible activities in Sij consists of jAij j D jAi k j C jAkj j C 1 activities
...
If we could
find a set A0kj of mutually compatible activities in Skj where jA0kj j > jAkj j, then
we could use A0kj , rather than Akj , in a solution to the subproblem for Sij
...
A symmetric argument applies to the activities in Si k
...
If we denote the size of
an optimal solution for the set Sij by cŒi; j , then we would have the recurrence
cŒi; j  D cŒi; k C cŒk; j  C 1 :
Of course, if we did not know that an optimal solution for the set Sij includes
activity ak , we would have to examine all activities in Sij to find which one to
choose, so that
(
0
if Sij D ; ;
cŒi; j  D max fcŒi; k C cŒk; j  C 1g if S ¤ ; :
(16
...
But we would be overlooking
another important characteristic of the activity-selection problem that we can use
to great advantage
...
1 An activity-selection problem

417

Making the greedy choice
What if we could choose an activity to add to our optimal solution without having
to first solve all the subproblems? That could save us from having to consider all
the choices inherent in recurrence (16
...
In fact, for the activity-selection problem,
we need consider only one choice: the greedy choice
...
Now, of the activities we end up choosing, one of them must be the first one to finish
...
(If more
than one activity in S has the earliest finish time, then we can choose any such
activity
...
Choosing the first activity
to finish is not the only way to think of making a greedy choice for this problem;
Exercise 16
...

If we make the greedy choice, we have only one remaining subproblem to solve:
finding activities that start after a1 finishes
...
Thus, all activities that are compatible with activity a1 must start
after a1 finishes
...
Let Sk D fai 2 S W si fk g be the set of activities that
start after activity ak finishes
...
1 Optimal substructure tells us that if a1
is in the optimal solution, then an optimal solution to the original problem consists
of activity a1 and all the activities in an optimal solution to the subproblem S1
...

1 We sometimes refer to the sets S

k as subproblems rather than as just sets of activities
...


418

Chapter 16 Greedy Algorithms

Theorem 16
...
Then am is included in some maximum-size subset of mutually
compatible activities of Sk
...
If aj D am , we are
done, since we have shown that am is in some maximum-size subset of mutually
compatible activities of Sk
...
The activities in A0k are disjoint, which follows because
the activities in Ak are disjoint, aj is the first activity in Ak to finish, and fm Ä fj
...

Thus, we see that although we might be able to solve the activity-selection problem with dynamic programming, we don’t need to
...
) Instead, we can repeatedly choose the activity that finishes first, keep only
the activities compatible with this activity, and repeat until no activities remain
...
We can consider
each activity just once overall, in monotonically increasing order of finish times
...
Instead, it can
work top-down, choosing an activity to put into the optimal solution and then solving the subproblem of choosing activities from those that are compatible with those
already chosen
...

A recursive greedy algorithm
Now that we have seen how to bypass the dynamic-programming approach and instead use a top-down, greedy algorithm, we can write a straightforward, recursive
procedure to solve the activity-selection problem
...


16
...
It returns a maximum-size set of mutually compatible activities in Sk
...
1)
...
n lg n/ time, breaking ties arbitrarily
...
The initial call, which solves the entire problem, is
R ECURSIVE -ACTIVITY-S ELECTOR
...

R ECURSIVE -ACTIVITY-S ELECTOR
...
s; f; m; n/
6 else return ;
Figure 16
...
In a given recursive call
R ECURSIVE -ACTIVITY-S ELECTOR
...
The loop examines akC1 ; akC2 ; : : : ; an , until it finds the first activity am that is compatible with ak ; such an activity has
fk
...
s; f; m; n/
...
In this case, Sk D ;, and so the
procedure returns ; in line 6
...
s; f; 0; n/ is ‚
...
Over all recursive calls, each activity is examined exactly once
in the while loop test of line 2
...

An iterative greedy algorithm
We easily can convert our recursive procedure to an iterative one
...
It is usually a
straightforward task to transform a tail-recursive procedure to an iterative form; in
fact, some compilers for certain programming languages perform this task automatically
...
e
...


420

Chapter 16 Greedy Algorithms

k

sk

fk

0



0

1

1

4

2

3

5

3

0

6

4

5

7

a0
a1
a0

RECURSIVE -ACTIVITY-SELECTOR(s, f, 0, 11)

m=1
a2

RECURSIVE -ACTIVITY-SELECTOR(s, f, 1, 11)

a1
a3
a1
a4
a1

m=4
RECURSIVE -ACTIVITY-SELECTOR(s, f, 4, 11)

5

3

9

6

5

9

7

6

10

8

8

11

9

8

12

10

2

14

11

12

a5

16

a1

a4

a1

a4

a1

a4

a1

a4

a6

a7

a8
m=8
a9

RECURSIVE -ACTIVITY -SELECTOR (s, f, 8, 11)
a1
a4

a8
a10

a1

a4

a8

a1

a4

a8

m = 11

a8

a11

a11

RECURSIVE -ACTIVITY -SELECTOR (s, f, 11, 11)
a1
a4

time
0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

Figure 16
...
Activities considered in each recursive call appear between horizontal lines
...
s; f; 0; 11/, selects activity a1
...
If the starting time of an activity occurs before
the finish time of the most recently added activity (the arrow between them points left), it is rejected
...
The last recursive call,
R ECURSIVE -ACTIVITY-S ELECTOR
...
The resulting set of selected activities is
fa1 ; a4 ; a8 ; a11 g
...
1 An activity-selection problem

421

The procedure G REEDY-ACTIVITY-S ELECTOR is an iterative version of the procedure R ECURSIVE -ACTIVITY-S ELECTOR
...
It collects selected activities into a set A and returns this set when it is done
...
s; f /
1 n D s:length
2 A D fa1 g
3 k D1
4 for m D 2 to n
5
if sŒm f Œk
6
A D A [ fam g
7
k Dm
8 return A
The procedure works as follows
...
Since we consider
the activities in order of monotonically increasing finish time, fk is always the
maximum finish time of any activity in A
...
3)

Lines 2–3 select activity a1 , initialize A to contain just this activity, and initialize k
to index this activity
...
The loop considers each activity am in turn and adds am to A if it is compatible with all previously selected activities; such an activity is the earliest in Sk to
finish
...
3) to check (in line 5) that its start time sm is not earlier
than the finish time fk of the activity most recently added to A
...
The set A returned
by the call G REEDY-ACTIVITY-S ELECTOR
...
s; f; 0; n/
...
n/ time, assuming that the activities were already sorted initially by
their finish times
...
1-1
Give a dynamic-programming algorithm for the activity-selection problem, based
on recurrence (16
...
Have your algorithm compute the sizes cŒi; j  as defined
above and also produce the maximum-size subset of mutually compatible activities
...
1)
...

16
...
Describe how this approach is a greedy algorithm, and prove that it yields an optimal
solution
...
1-3
Not just any greedy approach to the activity-selection problem produces a maximum-size set of mutually compatible activities
...
Do the same for
the approaches of always selecting the compatible activity that overlaps the fewest
other remaining activities and always selecting the compatible remaining activity
with the earliest start time
...
1-4
Suppose that we have a set of activities to schedule among a large number of lecture
halls, where any activity can take place in any lecture hall
...
Give an efficient greedy
algorithm to determine which activity should use which lecture hall
...
We can
create an interval graph whose vertices are the given activities and whose edges
connect incompatible activities
...
)
16
...
The objective is no longer
to maximize the number of activities scheduled, but instead to maximize the total
value of the activities scheduled
...
Give a polynomial-time algorithm for
this problem
...
2 Elements of the greedy strategy

423

16
...
At each decision point, the algorithm makes choice that seems best at
the moment
...
This section
discusses some of the general properties of greedy methods
...
1 to develop a greedy algorithm was
a bit more involved than is typical
...
Determine the optimal substructure of the problem
...
Develop a recursive solution
...
2), but we bypassed developing a recursive algorithm based
on this recurrence
...
Show that if we make the greedy choice, then only one subproblem remains
...
Prove that it is always safe to make the greedy choice
...
)
5
...

6
...

In going through these steps, we saw in great detail the dynamic-programming underpinnings of a greedy algorithm
...
We then found
that if we always made the greedy choice, we could restrict the subproblems to be
of the form Sk
...
In the
activity-selection problem, we could have started by dropping the second subscript
and defining subproblems of the form Sk
...

More generally, we design greedy algorithms according to the following sequence
of steps:
1
...

2
...


424

Chapter 16 Greedy Algorithms

3
...

We shall use this more direct process in later sections of this chapter
...

How can we tell whether a greedy algorithm will solve a particular optimization
problem? No way works all the time, but the greedy-choice property and optimal
substructure are the two key ingredients
...

Greedy-choice property
The first key ingredient is the greedy-choice property: we can assemble a globally
optimal solution by making locally optimal (greedy) choices
...

Here is where greedy algorithms differ from dynamic programming
...
Consequently, we typically solve dynamic-programming
problems in a bottom-up manner, progressing from smaller subproblems to larger
subproblems
...
Of
course, even though the code works top down, we still must solve the subproblems before making a choice
...
The choice
made by a greedy algorithm may depend on choices so far, but it cannot depend on
any future choices or on the solutions to subproblems
...
A dynamicprogramming algorithm proceeds bottom up, whereas a greedy strategy usually
progresses in a top-down fashion, making one greedy choice after another, reducing each given problem instance to a smaller one
...
Typically, as in the case of Theorem 16
...
It then shows how to modify
the solution to substitute the greedy choice for some other choice, resulting in one
similar, but smaller, subproblem
...
For example, in the activity-selection problem, as-

16
...
By preprocessing the
input or by using an appropriate data structure (often a priority queue), we often
can make greedy choices quickly, thus yielding an efficient algorithm
...
This property is a key ingredient of assessing the applicability of dynamic programming as well as greedy
algorithms
...
1 that if an optimal solution to subproblem Sij includes an activity ak ,
then it must also contain optimal solutions to the subproblems Si k and Skj
...
Based on this observation of
optimal substructure, we were able to devise the recurrence (16
...

We usually use a more direct approach regarding optimal substructure when
applying it to greedy algorithms
...
All we really need to do is argue that an optimal solution to
the subproblem, combined with the greedy choice already made, yields an optimal
solution to the original problem
...

Greedy versus dynamic programming
Because both the greedy and dynamic-programming strategies exploit optimal substructure, you might be tempted to generate a dynamic-programming solution to a
problem when a greedy solution suffices or, conversely, you might mistakenly think
that a greedy solution works when in fact a dynamic-programming solution is required
...

The 0-1 knapsack problem is the following
...
The ith item is worth i dollars and weighs wi pounds, where i and wi are
integers
...
Which items should he take?
(We call this the 0-1 knapsack problem because for each item, the thief must either

426

Chapter 16 Greedy Algorithms

take it or leave it behind; he cannot take a fractional amount of an item or take an
item more than once
...

You can think of an item in the 0-1 knapsack problem as being like a gold ingot
and an item in the fractional knapsack problem as more like gold dust
...
For the 0-1
problem, consider the most valuable load that weighs at most W pounds
...
For the comparable fractional problem, consider that if we remove
a weight w of one item j from the optimal load, the remaining load must be the
most valuable load weighing at most W w that the thief can take from the n 1
original items plus wj w pounds of item j
...
To
solve the fractional problem, we first compute the value per pound i =wi for each
item
...
If the supply of that item is exhausted
and he can still carry more, he takes as much as possible of the item with the next
greatest value per pound, and so forth, until he reaches his weight limit W
...
n lg n/
time
...
2-1
...
2(a)
...
Item 1 weighs 10 pounds and
is worth 60 dollars
...
Item 3
weighs 30 pounds and is worth 120 dollars
...
The greedy strategy, therefore,
would take item 1 first
...
2(b),
however, the optimal solution takes items 2 and 3, leaving item 1 behind
...

For the comparable fractional problem, however, the greedy strategy, which
takes item 1 first, does yield an optimal solution, as shown in Figure 16
...
Taking item 1 doesn’t work in the 0-1 problem because the thief is unable to fill his
knapsack to capacity, and the empty space lowers the effective value per pound of
his load
...
2 Elements of the greedy strategy

item 1

+

30
20

20 $100

30 $120
20 $100
+
10

$100

$120 knapsack
(a)

$80
+

50

10
$60

20
30

30 $120

item 3
item 2

427

= $220

$60

= $160
(b)

+
10

20 $100
+

$60

10

= $180

$60

= $240
(c)

Figure 16
...
(a) The thief must select a subset of the three items shown whose weight must not exceed
50 pounds
...
Any solution with item 1 is suboptimal,
even though item 1 has the greatest value per pound
...


choice
...
2-2
asks you to show, we can use dynamic programming to solve the 0-1 problem
...
2-1
Prove that the fractional knapsack problem has the greedy-choice property
...
2-2
Give a dynamic-programming solution to the 0-1 knapsack problem that runs in
O
...

16
...
Give
an efficient algorithm to find an optimal solution to this variant of the knapsack
problem, and argue that your algorithm is correct
...
2-4
Professor Gekko has always dreamed of inline skating across North Dakota
...
S
...


428

Chapter 16 Greedy Algorithms

The professor can carry two liters of water, and he can skate m miles before running
out of water
...
) The professor will start in Grand Forks with two full liters of
water
...
S
...

The professor’s goal is to minimize the number of water stops along his route
across the state
...
Prove that your strategy yields an optimal solution, and give
its running time
...
2-5
Describe an efficient algorithm that, given a set fx1 ; x2 ; : : : ; xn g of points on the
real line, determines the smallest set of unit-length closed intervals that contains
all of the given points
...

16
...
n/ time
...
2-7
Suppose you are given two sets A and B, each containing n positive integers
...
After reordering, let ai be the ith
element of set A, and let bi be the ith element of set B
...
Give an algorithm that will maximize your payoff
...


16
...
We consider the
data to be a sequence of characters
...
e
...

Suppose we have a 100,000-character data file that we wish to store compactly
...
3
...

We have many options for how to represent such a file of information
...
3 Huffman codes

Frequency (in thousands)
Fixed-length codeword
Variable-length codeword

429

a
45
000
0

b
13
001
101

c
12
010
100

d
16
011
111

e
9
100
1101

f
5
101
1100

Figure 16
...
A data file of 100,000 characters contains only the characters a–f, with the frequencies indicated
...
Using the variable-length code shown, we can encode the file in only
224,000 bits
...
If we use a fixed-length code, we need 3 bits to represent 6 characters:
a = 000, b = 001,
...
This method requires 300,000 bits to code the
entire file
...
Figure 16
...
This code requires

...
In fact, this is an optimal
character code for this file, as we shall see
...
Such codes are called prefix codes
...

Encoding is always simple for any binary character code; we just concatenate the
codewords representing each character of the file
...
3, we code the 3-character file abc as 0 101 100 D
0101100, where “ ” denotes concatenation
...
Since no codeword
is a prefix of any other, the codeword that begins an encoded file is unambiguous
...


“prefix-free codes” would be a better name, but the term “prefix codes” is standard in the

430

Chapter 16 Greedy Algorithms

100

100

0

1

0

86
0

14
1

58
0
a:45

0
c:12

55

0
28

1
b:13

1

a:45
0

14
1
d:16

0
e:9

1

25
1
f:5

0
c:12

30
1
b:13
0
f:5

(a)

0
14

1
d:16
1
e:9

(b)

Figure 16
...
3
...
Each internal node is labeled with the sum of the frequencies of the leaves in its subtree
...
,
f = 101
...
, f = 1100
...
In our
example, the string 001011101 parses uniquely as 0 0 101 1101, which decodes
to aabe
...
A binary tree whose leaves are
the given characters provides one such representation
...
” Figure 16
...
Note that these are not binary search
trees, since the leaves need not appear in sorted order and internal nodes do not
contain character keys
...
3-2)
...
4(a), is not a full binary tree: it contains codewords beginning 10
...
Since
we can now restrict our attention to full binary trees, we can say that if C is the
alphabet from which the characters are drawn and all character frequencies are positive, then the tree for an optimal prefix code has exactly jC j leaves, one for each
letter of the alphabet, and exactly jC j 1 internal nodes (see Exercise B
...

Given a tree T corresponding to a prefix code, we can easily compute the number
of bits required to encode a file
...
c/ denote the depth

16
...
Note that dT
...
The number of bits required to encode a file is thus
X
c:freq dT
...
4)
B
...

Constructing a Huffman code
Huffman invented a greedy algorithm that constructs an optimal prefix code called
a Huffman code
...
2, its proof of correctness relies on the greedy-choice property and optimal substructure
...
Doing so will help clarify how the algorithm makes
greedy choices
...

The algorithm builds the tree T corresponding to the optimal code in a bottom-up
manner
...
The algorithm uses a min-priority
queue Q, keyed on the freq attribute, to identify the two least-frequent objects to
merge together
...

H UFFMAN
...
Q/
6
´:right D y D E XTRACT-M IN
...
Q; ´/
/ return the root of the tree
/
9 return E XTRACT-M IN
...
5
...
The final tree represents the optimal prefix code
...

Line 2 initializes the min-priority queue Q with the characters in C
...
5 The steps of Huffman’s algorithm for the frequencies given in Figure 16
...
Each part
shows the contents of the queue sorted into increasing order by frequency
...
Leaves are shown as rectangles containing a character
and its frequency
...
An edge connecting an internal node with its children is labeled 0 if it is an edge to a left
child and 1 if it is an edge to a right child
...
(a) The initial set of n D 6 nodes, one for each
letter
...
(f) The final tree
...
The frequency of ´ is computed as the sum of the frequencies of x and y
in line 7
...
(This order is
arbitrary; switching the left and right child of any node yields a different code of
the same cost
...

Although the algorithm would produce the same result if we were to excise the
variables x and y—assigning directly to ´:left and ´:right in lines 5 and 6, and
changing line 7 to ´:freq D ´:left:freq C ´:right:freq—we shall use the node

16
...
Therefore, we find it convenient to
leave them in
...
For a set C of n characters, we
can initialize Q in line 2 in O
...
3
...
lg n/, the loop contributes O
...
Thus, the total running time of H UFFMAN on a set of n characters is O
...
We can reduce the running time to O
...

Correctness of Huffman’s algorithm
To prove that the greedy algorithm H UFFMAN is correct, we show that the problem of determining an optimal prefix code exhibits the greedy-choice and optimalsubstructure properties
...

Lemma 16
...
Let
x and y be two characters in C having the lowest frequencies
...

Proof The idea of the proof is to take the tree T representing an arbitrary optimal
prefix code and modify it to make a tree representing another optimal prefix code
such that the characters x and y appear as sibling leaves of maximum depth in the
new tree
...

Let a and b be two characters that are sibling leaves of maximum depth in T
...

Since x:freq and y:freq are the two lowest leaf frequencies, in order, and a:freq
and b:freq are two arbitrary frequencies, in order, we have x:freq Ä a:freq and
y:freq Ä b:freq
...
However, if we had x:freq D b:freq, then we would also have
a:freq D b:freq D x:freq D y:freq (see Exercise 16
...
Thus, we will assume that x:freq ¤ b:freq, which means that
x ¤ b
...
6 shows, we exchange the positions in T of a and x to produce a
tree T 0 , and then we exchange the positions in T 0 of b and y to produce a tree T 00

434

Chapter 16 Greedy Algorithms

T′

T

T′′

x
y

a
y

a

b

a
b

x

b

x

y

Figure 16
...
2
...
Leaves x and y are the two characters with the
lowest frequencies; they appear in arbitrary positions in T
...
Since each swap does
not increase the cost, the resulting tree T 00 is also an optimal tree
...
(Note that if x D b but
y ¤ a, then tree T 00 does not have x and y as sibling leaves of maximum depth
...
) By equation (16
...
T /

B
...
c/
D
c2C

X

c:freq dT 0
...
x/ C a:freq dT
...
x/ a:freq dT 0
...
x/ C a:freq dT
...
a/ a:freq dT
...
a:freq x:freq/
...
a/ dT
...
a/ dT
...
More specifically, a:freq x:freq is nonnegative because x is a minimum-frequency leaf, and
dT
...
x/ is nonnegative because a is a leaf of maximum depth in T
...
T 0 / B
...
Therefore, B
...
T /, and since T is optimal, we have B
...
T 00 /,
which implies B
...
T /
...

Lemma 16
...
Why is this a greedy choice? We can
view the cost of a single merger as being the sum of the frequencies of the two items
being merged
...
3-4 shows that the total cost of the tree constructed
equals the sum of the costs of its mergers
...


16
...

Lemma 16
...

Let x and y be two characters in C with minimum frequency
...
Define f for C 0 as for C , except that
´:freq D x:freq C y:freq
...
Then the tree T , obtained from T 0 by replacing the leaf node
for ´ with an internal node having x and y as children, represents an optimal prefix
code for the alphabet C
...
T / of tree T in terms of the
cost B
...
4)
...
c/ D dT 0
...
c/ D c:freq dT 0
...
Since dT
...
y/ D dT 0
...
x/ C y:freq dT
...
x:freq C y:freq/
...
´/ C 1/
D ´:freq dT 0
...
x:freq C y:freq/ ;
from which we conclude that
B
...
T 0 / C x:freq C y:freq
or, equivalently,
B
...
T /

x:freq

y:freq :

We now prove the lemma by contradiction
...
Then there exists an optimal tree T 00 such that
B
...
T /
...
2), T 00 has x and y as
siblings
...
Then
B
...
T 00 / x:freq y:freq
< B
...
T 0 / ;
yielding a contradiction to the assumption that T 0 represents an optimal prefix code
for C 0
...

Theorem 16
...

Proof

Immediate from Lemmas 16
...
3
...
3-1
Explain why, in the proof of Lemma 16
...

16
...

16
...
3-4
Prove that we can also express the total cost of a tree for a code as the sum, over
all internal nodes, of the combined frequencies of the two children of the node
...
3-5
Prove that if we order the characters in an alphabet so that their frequencies
are monotonically decreasing, then there exists an optimal code whose codeword
lengths are monotonically increasing
...
3-6
Suppose we have an optimal prefix code on a set C D f0; 1; : : : ; n 1g of characters and we wish to transmit this code using as few bits as possible
...
(Hint:
Use 2n 1 bits to specify the structure of the tree, as discovered by a walk of the
tree
...
3-7
Generalize Huffman’s algorithm to ternary codewords (i
...
, codewords using the
symbols 0, 1, and 2), and prove that it yields optimal ternary codes
...
3-8
Suppose that a data file contains a sequence of 8-bit characters such that all 256
characters are about equally common: the maximum character frequency is less
than twice the minimum character frequency
...


16
...
3-9
Show that no compression scheme can expect to compress a file of randomly chosen 8-bit characters by even a single bit
...
)

? 16
...
This theory
describes many situations in which the greedy method yields optimal solutions
...
” Although this theory does
not cover all cases for which a greedy method applies (for example, it does not
cover the activity-selection problem of Section 16
...
3), it does cover many cases of practical interest
...

Matroids
A matroid is an ordered pair M D
...

1
...

2
...
We say that « is hereditary if it
satisfies this property
...

3
...
We say that M satisfies the exchange property
...
He was studying matric matroids, in which the elements of S are the rows of a given matrix and a set of rows is
independent if they are linearly independent in the usual sense
...
4-2
asks you to show, this structure defines a matroid
...
SG ; « G /
defined in terms of a given undirected graph G D
...

If A is a subset of E, then A 2 « G if and only if A is acyclic
...
V; A/ forms a forest
...


438

Chapter 16 Greedy Algorithms

Theorem 16
...
V; E/ is an undirected graph, then MG D
...

Proof Clearly, SG D E is a finite set
...
Putting it another way, removing edges from an
acyclic set of edges cannot create cycles
...
Suppose that
GA D
...
V; B/ are forests of G and that jBj > jAj
...

We claim that a forest F D
...
To
see why, suppose that F consists of t trees, where the ith tree contains i vertices
and ei edges
...


i

1/ (by Theorem B
...
Thus, forest GA contains jV j jAj trees, and
forest GB contains jV j jBj trees
...
Moreover,
since T is connected, it must contain an edge
...
Since the edge
...
u; / to forest GA without creating
a cycle
...

Given a matroid M D
...
As an example, consider a graphic matroid MG
...

If A is an independent subset in a matroid M , we say that A is maximal if it has
no extensions
...
The following property is often useful
...
4 Matroids and greedy methods

439

Theorem 16
...

Proof Suppose to the contrary that A is a maximal independent subset of M
and there exists another larger maximal independent subset B of M
...

As an illustration of this theorem, consider a graphic matroid MG for a connected, undirected graph G
...
Such a tree
is called a spanning tree of G
...
S; « / is weighted if it is associated with a weight
function w that assigns a strictly positive weight w
...
The
weight function w extends to subsets of S by summation:
X
w
...
A/ D
x2A

for any A Â S
...
e/ denote the weight of an edge e in a
graphic matroid MG , then w
...

Greedy algorithms on a weighted matroid
Many problems for which a greedy approach provides optimal solutions can be formulated in terms of finding a maximum-weight independent subset in a weighted
matroid
...
S; « /, and we wish to
find an independent set A 2 « such that w
...
We call such a subset that is independent and has maximum possible weight an optimal subset of the
matroid
...
x/ of any element x 2 S is positive, an optimal
subset is always a maximal independent subset—it always helps to make A as large
as possible
...
V; E/ and a length function w such that w
...
(We use the term “length” here to refer to the original edge
weights for the graph, reserving the term “weight” to refer to the weights in the
associated matroid
...
To view this as a problem of
finding an optimal subset of a matroid, consider the weighted matroid MG with
weight function w 0 , where w 0
...
e/ and w0 is larger than the maximum
length of any edge
...
More
specifically, each maximal independent subset A corresponds to a spanning tree

440

Chapter 16 Greedy Algorithms

1 edges, and since
X
w 0
...
A/ D
with jV j

e2A

X

...
e//

e2A

D
...
e/

e2A

D
...
A/

for any maximal independent subset A, an independent subset that maximizes the
quantity w 0
...
A/
...

Chapter 23 gives algorithms for the minimum-spanning-tree problem, but here
we give a greedy algorithm that works for any weighted matroid
...
S; « / with an associated positive weight
function w, and it returns an optimal subset A
...
The algorithm
is greedy because it considers in turn each element x 2 S, in order of monotonically decreasing weight, and immediately adds it to the set A being accumulated if
A [ fxg is independent
...
M; w/
1 AD;
2 sort M:S into monotonically decreasing order by weight w
3 for each x 2 M:S, taken in monotonically decreasing order by weight w
...
If A would remain independent, then line 5 adds x to A
...
Since the empty set is independent, and since each iteration of the for
loop maintains A’s independence, the subset A is always independent, by induction
...
We shall see in
a moment that A is a subset of maximum possible weight, so that A is an optimal
subset
...
Let n denote jSj
...
n lg n/
...
Each execution of line 4 requires a check on whether or not
the set A [ fxg is independent
...
f
...
n lg n C nf
...


16
...

Lemma 16
...
S; « / is a weighted matroid with weight function w and that S
is sorted into monotonically decreasing order by weight
...
If x exists, then there exists
an optimal subset A of S that contains x
...
Otherwise, let B be any nonempty optimal subset
...

No element of B has weight greater than w
...
To see why, observe that y 2 B
implies that fyg is independent, since B 2 « and « is hereditary
...
x/ w
...

Construct the set A as follows
...
By the choice of x, set A is
independent
...
At that
point, A and B are the same except that A has x and B has some other element y
...
A/ D w
...
y/ C w
...
B/ :
Because set B is optimal, set A, which contains x, must also be optimal
...

Lemma 16
...
S; « / be any matroid
...

Proof Since x is an extension of A, we have that A [ fxg is independent
...
Thus, x is an extension of ;
...
9
Let M D
...
If x is an element of S such that x is not an
extension of ;, then x is not an extension of any independent subset A of S
...
8
...
9 says that any element that cannot be used immediately can never
be used
...

Lemma 16
...
S; « /
...
S 0 ; « 0 /, where
S 0 D fy 2 S W fx; yg 2 « g ;
« 0 D fB Â S fxg W B [ fxg 2 « g ;
and the weight function for M 0 is the weight function for M , restricted to S 0
...
)
Proof If A is any maximum-weight independent subset of M containing x, then
A0 D A fxg is an independent subset of M 0
...
Since we have in
both cases that w
...
A0 / C w
...

Theorem 16
...
S; « / is a weighted matroid with weight function w, then G REEDY
...

Proof By Corollary 16
...
Once G REEDY selects the first element x, Lemma 16
...
Finally, Lemma 16
...

After the procedure G REEDY sets A to fxg, we can interpret all of its remaining
steps as acting in the matroid M 0 D
...
Thus, the subsequent
operation of G REEDY will find a maximum-weight independent subset for M 0 , and
the overall operation of G REEDY will find a maximum-weight independent subset
for M
...
5 A task-scheduling problem as a matroid

443

Exercises
16
...
S; « k / is a matroid, where S is any finite set and « k is the set of all
subsets of S of size at most k, where k Ä jSj
...
4-2 ?
Given an m n matrix T over some field (such as the reals), show that
...

16
...
S; « / is a matroid, then
...
S; « 0 / are just the complements of the
maximal independent sets of
...

16
...
Define the structure
...
Show that
...
That is, the set of all sets A
that contain at most one member of each subset in the partition determines the
independent sets of a matroid
...
4-5
Show how to transform the weight function of a weighted matroid problem, where
the desired optimal solution is a minimum-weight maximal independent subset, to
make it a standard weighted-matroid problem
...


? 16
...
The problem
looks complicated, but we can solve it in a surprisingly simple manner by casting
it as a matroid and using a greedy algorithm
...
Given a finite set S of unit-time tasks, a

444

Chapter 16 Greedy Algorithms

schedule for S is a permutation of S specifying the order in which to perform
these tasks
...

The problem of scheduling unit-time tasks with deadlines and penalties for a
single processor has the following inputs:
a set S D fa1 ; a2 ; : : : ; an g of n unit-time tasks;
a set of n integer deadlines d1 ; d2 ; : : : ; dn , such that each di satisfies 1 Ä di Ä n
and task ai is supposed to finish by time di ; and
a set of n nonnegative weights or penalties w1 ; w2 ; : : : ; wn , such that we incur
a penalty of wi if task ai is not finished by time di , and we incur no penalty if
a task finishes by its deadline
...

Consider a given schedule
...
Otherwise, the task is early in the schedule
...
To see why, note that if some early task ai follows some late task aj ,
then we can switch the positions of ai and aj , and ai will still be early and aj will
still be late
...
To do so, we put
the schedule into early-first form
...
Since aj is early before the swap, k C 1 Ä dj
...
Because task aj is
moved earlier in the schedule, it remains early after the swap
...
Having determined A, we can create
the actual schedule by listing the elements of A in order of monotonically increasing deadlines, then listing the late tasks (i
...
, S A) in any order, producing a
canonical ordering of the optimal schedule
...
Clearly, the set of early tasks for a schedule forms
an independent set of tasks
...

Consider the problem of determining whether a given set A of tasks is independent
...
A/ denote the number of tasks in A whose
deadline is t or earlier
...
A/ D 0 for any set A
...
5 A task-scheduling problem as a matroid

445

Lemma 16
...

1
...

2
...
A/ Ä t
...
If the tasks in A are scheduled in order of monotonically increasing deadlines,
then no task is late
...
A/ > t for
some t, then there is no way to make a schedule with no late tasks for set A, because
more than t tasks must finish before time t
...
If (2) holds,
then (3) must follow: there is no way to “get stuck” when scheduling the tasks in
order of monotonically increasing deadlines, since (2) implies that the ith largest
deadline is at least i
...

Using property 2 of Lemma 16
...
5-2)
...
The
following theorem thus ensures that we can use the greedy algorithm to find an
independent set A of tasks with the maximum total penalty
...
13
If S is a set of unit-time tasks with deadlines, and « is the set of all independent
sets of tasks, then the corresponding system
...

Proof Every subset of an independent set of tasks is certainly independent
...
Let k be the largest t such that N t
...
A/
...
A/ D N0
...
) Since Nn
...
A/ D jAj,
but jBj > jAj, we must have that k < n and that Nj
...
A/ for all j in
the range k C 1 Ä j Ä n
...
Let ai be a task in B A with deadline k C 1
...

We now show that A0 must be independent by using property 2 of Lemma 16
...

For 0 Ä t Ä k, we have N t
...
A/ Ä t, since A is independent
...
A0 / Ä N t
...
Therefore, A0
is independent, completing our proof that
...

By Theorem 16
...
We can then create an optimal schedule having the
tasks in A as its early tasks
...
7 An instance of the problem of scheduling unit-time tasks with deadlines and penalties
for a single processor
...
The running
time is O
...
n/ independence checks made
by that algorithm takes time O
...
5-2)
...

Figure 16
...
In this example, the
greedy algorithm selects, in order, tasks a1 , a2 , a3 , and a4 , then rejects a5 (because
N4
...
fa1 ; a2 ; a3 ; a4 ; a6 g/ D 5), and
finally accepts a7
...

Exercises
16
...
7, but with each
penalty wi replaced by 80 wi
...
5-2
Show how to use property 2 of Lemma 16
...
jAj/ whether
or not a given set A of tasks is independent
...
Assume that each coin’s value is an integer
...
Describe a greedy algorithm to make change consisting of quarters, dimes,
nickels, and pennies
...


Problems for Chapter 16

447

b
...

i
...
, the denominations are c 0 ; c 1 ; : : : ; c k for some integers c > 1 and k
Show that the greedy algorithm always yields an optimal solution
...
Give a set of coin denominations for which the greedy algorithm does not yield
an optimal solution
...

d
...
nk/-time algorithm that makes change for any set of k different coin
denominations, assuming that one of the coins is a penny
...
You have one
computer on which to run these tasks, and the computer can run only one task at a
time
...
P goal is to minimize the average completion time, that is,
Your
n
to minimize
...
For example, suppose there are two tasks, a1 and a2 ,
with p1 D 3 and p2 D 5, and consider the schedule in which a2 runs first, followed
by a1
...
5 C 8/=2 D 6:5
...
3 C 8/=2 D 5:5
...
Give an algorithm that schedules the tasks so as to minimize the average completion time
...
Prove that your algorithm minimizes
the average completion time, and state the running time of your algorithm
...
Suppose now that the tasks are not all available at once
...
Suppose also that we allow preemption, so
that a task can be suspended and restarted at a later time
...
It might then resume at time 10 but be
preempted at time 11, and it might finally resume at time 13 and complete at
time 15
...
In this scenario, ai ’s completion time is 15
...
Prove that your algorithm minimizes the average
completion time, and state the running time of your algorithm
...
The incidence matrix for an undirected graph G D
...
Argue that a set of columns of M is linearly independent over the field
of integers modulo 2 if and only if the corresponding set of edges is acyclic
...
4-2 to provide an alternate proof that
...

b
...
e/ with each edge in an
undirected graph G D
...
Give an efficient algorithm to find an acyclic
subset of E of maximum total weight
...
Let G
...
E; « / be defined so that
A 2 « if and only if A does not contain any directed cycles
...
E; « / is not a matroid
...

d
...
V; E/ with no self-loops is a
jV j jEj matrix M such that M e D 1 if edge e leaves vertex , M e D 1 if
edge e enters vertex , and M e D 0 otherwise
...

e
...
4-2 tells us that the set of linearly independent sets of columns of
any matrix M forms a matroid
...
How can there fail to be a perfect correspondence between the notion of a set of edges being acyclic and the notion of the
associated set of columns of the incidence matrix being linearly independent?
16-4 Scheduling variations
Consider the following algorithm for the problem from Section 16
...
Let all n time slots be initially empty,
where time slot i is the unit-length slot of time that finishes at time i
...
When considering task aj ,
if there exists a time slot at or before aj ’s deadline dj that is still empty, assign aj
to the latest such slot, filling it
...

a
...

b
...
3 to implement the algorithm efficiently
...
Analyze the running time of your
implementation
...

Even though a program may access large amounts of data, by storing a small subset
of the main memory in the cache—a small but faster memory—overall access time
can greatly decrease
...
For example, a program that accesses 4 distinct elements fa; b; c; d g
might make the sequence of requests hd; b; d; b; d; a; c; d; b; a; c; bi
...
When the cache contains k elements and the program requests the

...
More precisely, for each request ri , the
cache-management algorithm checks whether element ri is already in the cache
...
Upon a cache
miss, the system retrieves ri from the main memory, and the cache-management
algorithm must decide whether to keep ri in the cache
...
The cache-management algorithm evicts data with the goal of minimizing
the number of cache misses over the entire sequence of requests
...
That is, we have to make decisions
about which data to keep in the cache without knowing the future requests
...

We can solve this off-line problem by a greedy strategy called furthest-in-future,
which chooses to evict the item in the cache whose next access in the request
sequence comes furthest in the future
...
Write pseudocode for a cache manager that uses the furthest-in-future strategy
...
What is the running time of your algorithm?
b
...

c
...


450

Chapter 16 Greedy Algorithms

Chapter notes
Much more material on greedy algorithms and matroids can be found in Lawler
[224] and Papadimitriou and Steiglitz [271]
...

Our proof of the correctness of the greedy algorithm for the activity-selection
problem is based on that of Gavril [131]
...

Huffman codes were invented in 1952 [185]; Lelewer and Hirschberg [231] surveys data-compression techniques known as of 1987
...

a

17

Amortized Analysis

In an amortized analysis, we average the time required to perform a sequence of
data-structure operations over all the operations performed
...
Amortized analysis differs from average-case analysis in that probability is not involved; an amortized analysis guarantees the average performance
of each operation in the worst case
...
Section 17
...
n/ on the total cost of a sequence of n operations
...
n/=n
...

Section 17
...
When there is more than one type of operation, each type of
operation may have a different amortized cost
...
Later in the sequence, the credit pays for
operations that are charged less than they actually cost
...
3 discusses the potential method, which is like the accounting method
in that we determine the amortized cost of each operation and may overcharge operations early on to compensate for undercharges later
...

We shall use two examples to examine these three methods
...
The
other is a binary counter that counts up from 0 by means of the single operation
I NCREMENT
...
They need not—and should
not—appear in the code
...

When we perform an amortized analysis, we often gain insight into a particular
data structure, and this insight can help us optimize the design
...
4,
for example, we shall use the potential method to analyze a dynamically expanding
and contracting table
...
1 Aggregate analysis
In aggregate analysis, we show that for all n, a sequence of n operations takes
worst-case time T
...
In the worst case, the average cost, or amortized
cost, per operation is therefore T
...
Note that this amortized cost applies to
each operation, even when there are several types of operations in the sequence
...

Stack operations
In our first example of aggregate analysis, we analyze stacks that have been augmented with a new operation
...
1 presented the two fundamental stack
operations, each of which takes O
...
S; x/ pushes object x onto stack S
...
S/ pops the top of stack S and returns the popped object
...

Since each of these operations runs in O
...
The total cost of a sequence of n P USH and P OP operations is therefore n,
and the actual running time for n operations is therefore ‚
...

Now we add the stack operation M ULTIPOP
...

Of course, we assume that k is positive; otherwise the M ULTIPOP operation leaves
the stack unchanged
...


17
...
1 The action of M ULTIPOP on a stack S, shown initially in (a)
...
S; 4/, whose result is shown in (b)
...
S; 7/,
which empties the stack—shown in (c)—since there were fewer than 7 objects remaining
...
S; k/
1 while not S TACK -E MPTY
...
S/
3
k Dk 1
Figure 17
...

What is the running time of M ULTIPOP
...
The number of iterations of the while loop is the number min
...
Each iteration of the loop makes one call to P OP in
line 2
...
s; k/, and the actual running time
is a linear function of this cost
...
The worst-case cost of a M ULTIPOP operation in the sequence
is O
...
The worst-case time of any stack operation is therefore O
...
n2 /, since we
may have O
...
n/ each
...
n2 / result, which we obtained by considering the worst-case cost
of each operation individually, is not tight
...
In fact, although a single M ULTIPOP operation
can be expensive, any sequence of n P USH, P OP, and M ULTIPOP operations on an
initially empty stack can cost at most O
...
Why? We can pop each object from the
stack at most once for each time we have pushed it onto the stack
...
For any
value of n, any sequence of n P USH, P OP, and M ULTIPOP operations takes a total
of O
...
The average cost of an operation is O
...
1/
...
In
this example, therefore, all three stack operations have an amortized cost of O
...

We emphasize again that although we have just shown that the average cost, and
hence the running time, of a stack operation is O
...
We actually showed a worst-case bound of O
...
Dividing this total cost by n yielded the average cost per operation, or
the amortized cost
...
We use an array AŒ0 : : k 1 of
bits, where A:length D k, as the counter
...
Initially, x D 0, and thus AŒi D 0 for i D 0; 1; : : : ; k 1
...

I NCREMENT
...
2 shows what happens to a binary counter as we increment it 16 times,
starting with the initial value 0 and ending with the value 16
...

If AŒi D 1, then adding 1 flips the bit to 0 in position i and yields a carry of 1,
to be added into position i C 1 on the next iteration of the loop
...
The cost of each I NCREMENT operation is linear
in the number of bits flipped
...
A single execution of I NCREMENT takes time ‚
...
Thus, a sequence of n I NCREMENT operations on
an initially zero counter takes time O
...

We can tighten our analysis to yield a worst-case cost of O
...
As Figure 17
...

The next bit up, AŒ1, flips only every other time: a sequence of n I NCREMENT

Counter
value
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

455

A[
7
A[ ]
6
A[ ]
5
A[ ]
4
A[ ]
3]
A[
2
A[ ]
1]
A[
0]

17
...
2 An 8-bit binary counter as its value goes from 0 to 16 by a sequence of 16 I NCREMENT
operations
...
The running cost for flipping bits is
shown at the right
...


operations on an initially zero counter causes AŒ1 to flip bn=2c times
...
In general, for i D 0; 1; : : : ; k 1, bit AŒi flips bn=2i c times in a
k,
sequence of n I NCREMENT operations on an initially zero counter
...
The total number of flips in the
sequence is thus
k 1
Xj n k
i D0

2i

< n

1
X 1
2i
i D0

D 2n ;
by equation (A
...
The worst-case time for a sequence of n I NCREMENT operations
on an initially zero counter is therefore O
...
The average cost of each operation,
and therefore the amortized cost per operation, is O
...
1/
...
1-1
If the set of stack operations included a M ULTIPUSH operation, which pushes k
items onto the stack, would the O
...
1-2
Show that if a D ECREMENT operation were included in the k-bit counter example,
n operations could cost as much as ‚
...

17
...
Use aggregate analysis
to determine the amortized cost per operation
...
2 The accounting method
In the accounting method of amortized analysis, we assign differing charges to
different operations, with some operations charged more or less than they actually cost
...
When
an operation’s amortized cost exceeds its actual cost, we assign the difference to
specific objects in the data structure as credit
...
Thus, we can view the
amortized cost of an operation as being split between its actual cost and credit that
is either deposited or used up
...
This method differs from aggregate analysis, in which all operations have
the same amortized cost
...
If we want to show
that in the worst case the average cost per operation is small by analyzing with
amortized costs, we must ensure that the total amortized cost of a sequence of operations provides an upper bound on the total actual cost of the sequence
...
If we denote the actual cost of the ith operation by ci and the amortized cost
of the ith operation by ci , we require
y
n
n
X
X
ci
y
ci
(17
...
The total credit stored in the data structure
is the difference between the total amortized cost and the total actual cost, or

17
...
By inequality (17
...
If we ever were to allow the total credit
to become negative (the result of undercharging early operations with the promise
of repaying the account later on), then the total amortized costs incurred at that
time would be below the total actual costs incurred; for the sequence of operations
up to that time, the total amortized cost would not be an upper bound on the total
actual cost
...

Pn

i D1

Stack operations
To illustrate the accounting method of amortized analysis, let us return to the stack
example
...
k; s/ ,

where k is the argument supplied to M ULTIPOP and s is the stack size when it is
called
...


Note that the amortized cost of M ULTIPOP is a constant (0), whereas the actual cost
is variable
...
In general, the amortized
costs of the operations under consideration may differ from each other, and they
may even differ asymptotically
...
Suppose we use a dollar bill to represent each unit
of cost
...
Recall the analogy of Section 10
...
When we push a plate
on the stack, we use 1 dollar to pay the actual cost of the push and are left with a
credit of 1 dollar (out of the 2 dollars charged), which we leave on top of the plate
...

The dollar stored on the plate serves as prepayment for the cost of popping it
from the stack
...
To pop a plate, we take
the dollar of credit off the plate and use it to pay the actual cost of the operation
...


458

Chapter 17 Amortized Analysis

Moreover, we can also charge M ULTIPOP operations nothing
...
To pop a second plate, we again have a dollar of credit on the plate
to pay for the P OP operation, and so on
...
In other words, since each plate on the
stack has 1 dollar of credit on it, and the stack always has a nonnegative number of
plates, we have ensured that the amount of credit is always nonnegative
...
Since the total amortized cost is O
...

Incrementing a binary counter
As another illustration of the accounting method, we analyze the I NCREMENT operation on a binary counter that starts at zero
...
Let us once again use a dollar bill to represent
each unit of cost (the flipping of a bit in this example)
...
When a bit is set, we use 1 dollar (out of the 2 dollars charged) to pay
for the actual setting of the bit, and we place the other dollar on the bit as credit to
be used later when we flip the bit back to 0
...

Now we can determine the amortized cost of I NCREMENT
...
The
I NCREMENT procedure sets at most one bit, in line 6, and therefore the amortized
cost of an I NCREMENT operation is at most 2 dollars
...
Thus, for n I NCREMENT operations, the total amortized cost is O
...

Exercises
17
...
After every k operations, we make a copy of the entire stack for backup
purposes
...
n/ by assigning suitable amortized costs to the various stack operations
...
3 The potential method

459

17
...
1-3 using an accounting method of analysis
...
2-3
Suppose we wish not only to increment a counter but also to reset it to zero (i
...
,
make all bits in it 0)
...
1/,
show how to implement a counter as an array of bits so that any sequence of n
I NCREMENT and R ESET operations takes time O
...

(Hint: Keep a pointer to the high-order 1
...
3 The potential method
Instead of representing prepaid work as credit stored with specific objects in the
data structure, the potential method of amortized analysis represents the prepaid
work as “potential energy,” or just “potential,” which can be released to pay for
future operations
...

The potential method works as follows
...
For each i D 1; 2; : : : ; n, we let ci be the actual
cost of the ith operation and Di be the data structure that results after applying
the ith operation to data structure Di 1
...
Di /, which is the potential associated with data
y
structure Di
...
Di /
y

ˆ
...
2)

The amortized cost of each operation is therefore its actual cost plus the change in
potential due to the operation
...
2), the total amortized cost of the n
operations is
n
X

ci
y

n
X
D

...
Di /

i D1

D

i D1
n
X

ci C ˆ
...
Di 1 //
ˆ
...
3)

i D1

The second equality follows from equation (A
...
Di / terms telescope
...
Dn / ˆ
...


460

Chapter 17 Amortized Analysis

In practice, we do not always know how many operations might be performed
...
D0 / for all i, then we guarantee, as in
Therefore, if we require that ˆ
...
We usually just define ˆ
...
Di / 0 for all i
...
3-1 for an easy way
to handle cases in which ˆ
...
)
Intuitively, if the potential difference ˆ
...
Di 1 / of the ith operation is
positive, then the amortized cost ci represents an overcharge to the ith operation,
y
and the potential of the data structure increases
...

The amortized costs defined by equations (17
...
3) depend on the choice
of the potential function ˆ
...
We often find trade-offs
that we can make in choosing a potential function; the best potential function to
use depends on the desired time bounds
...
We define the potential function ˆ on a
stack to be the number of objects in the stack
...
D0 / D 0
...
Di /

0
D ˆ
...

Let us now compute the amortized costs of the various stack operations
...
Di /

ˆ
...
s C 1/
D 1:

s

By equation (17
...
Di /
D 1C1
D 2:

ˆ
...
3 The potential method

461

Suppose that the ith operation on the stack is M ULTIPOP
...
k; s/ objects to be popped off the stack
...
Di /

ˆ
...
Di /
D k0 k0
D 0:

ˆ
...

The amortized cost of each of the three operations is O
...
n/
...
Di / ˆ
...
The worst-case cost of n operations is therefore O
...

Incrementing a binary counter
As another example of the potential method, we again look at incrementing a binary
counter
...

Let us compute the amortized cost of an I NCREMENT operation
...
The actual cost of the operation is
therefore at most ti C 1, since in addition to resetting ti bits, it sets at most one
bit to 1
...

If bi > 0, then bi D bi 1 ti C 1
...
Di /

ˆ
...
bi 1 ti C 1/
D 1 ti :

bi

1

The amortized cost is therefore
ci
y

D ci C ˆ
...
Di 1 /
Ä
...
1 ti /
D 2:

If the counter starts at zero, then ˆ
...
Since ˆ
...
n/
...
The counter starts with b0 1s, and after n I NCREMENT

462

Chapter 17 Amortized Analysis

operations it has bn 1s, where 0 Ä b0 ; bn Ä k
...
) We can rewrite equation (17
...
Dn / C ˆ
...
4)

i D1

We have ci Ä 2 for all 1 Ä i Ä n
...
D0 / D b0 and ˆ
...
n/, the total actual cost
is O
...
In other words, if we execute at least n D
...
n/, no matter what initial value the counter contains
...
3-1
ˆ
...
Di /
ˆ
...
Show that there exists a potential function ˆ0 such that ˆ0
...
Di /
0 for all i
1, and the amortized costs using ˆ0 are the same as the
amortized costs using ˆ
...
3-2
Redo Exercise 17
...

17
...
lg n/ worst-case time
...
lg n/ and the
amortized cost of E XTRACT-M IN is O
...

17
...
3-5
Suppose that a counter begins at a number with b 1s in its binary representation, rather than at 0
...
n/ if n D
...
(Do not assume that b is constant
...
4 Dynamic tables

463

17
...
1-6) so that
the amortized cost of each E NQUEUE and each D EQUEUE operation is O
...

17
...
S; x/ inserts x into S
...
S/ deletes the largest djSj =2e elements from S
...
m/ time
...
jSj/ time
...
4 Dynamic tables
We do not always know in advance how many objects some applications will store
in a table
...
We must then reallocate the table with a larger size and copy all objects
stored in the original table over into the new, larger table
...
In this section, we study this problem of dynamically expanding and
contracting a table
...
1/, even though the actual cost of an operation
is large when it triggers an expansion or a contraction
...

We assume that the dynamic table supports the operations TABLE -I NSERT and
TABLE -D ELETE
...
Likewise, TABLE -D ELETE removes an item
from the table, thereby freeing a slot
...
1),
a heap (Chapter 6), or a hash table (Chapter 11)
...
3
...
We define the load factor ˛
...
We assign an empty table (one with no items) size 0, and we define its load
factor to be 1
...

We start by analyzing a dynamic table in which we only insert items
...

17
...
1

Table expansion

Let us assume that storage for a table is allocated as an array of slots
...
1 In some
software environments, upon attempting to insert an item into a full table, the only
alternative is to abort with an error
...
Thus, upon inserting an item
into a full table, we can expand the table by allocating a new table with more slots
than the old table had
...

A common heuristic allocates a new table with twice as many slots as the old
one
...

In the following pseudocode, we assume that T is an object representing the
table
...
Initially, the table is empty: T:num D T:size D 0
...
T; x/
1 if T:size == 0
2
allocate T:table with 1 slot
3
T:size D 1
4 if T:num == T:size
5
allocate new-table with 2 T:size slots
6
insert all items in T:table into new-table
7
free T:table
8
T:table D new-table
9
T:size D 2 T:size
10 insert x into T:table
11 T:num D T:num C 1

1 In

some situations, such as an open-address hash table, we may wish to consider a table to be full if
its load factor equals some constant strictly less than 1
...
4-1
...
4 Dynamic tables

465

Notice that we have two “insertion” procedures here: the TABLE -I NSERT procedure itself and the elementary insertion into a table in lines 6 and 10
...
We assume that
the actual running time of TABLE -I NSERT is linear in the time to insert individual
items, so that the overhead for allocating an initial table in line 2 is constant and
the overhead for allocating and freeing storage in lines 5 and 7 is dominated by
the cost of transferring items in line 6
...

Let us analyze a sequence of n TABLE -I NSERT operations on an initially empty
table
...
If the current table is full, however, and an
expansion occurs, then ci D i: the cost is 1 for the elementary insertion in line 10
plus i 1 for the items that we must copy from the old table to the new table in
line 6
...
n/,
which leads to an upper bound of O
...

This bound is not tight, because we rarely expand the table in the course of n
TABLE -I NSERT operations
...
The amortized cost of an operation is in
fact O
...
The cost of the ith operation
is
(
i if i 1 is an exact power of 2 ;
ci D
1 otherwise :
The total cost of n TABLE -I NSERT operations is therefore
n
X
i D1

X

blg nc

ci

Ä nC

2j

j D0

< n C 2n
D 3n ;
because at most n operations cost 1 and the costs of the remaining operations form
a geometric series
...

By using the accounting method, we can gain some feeling for why the amortized cost of a TABLE -I NSERT operation should be 3
...
For example, suppose that the size of the table is m
immediately after an expansion
...
We charge 3 dollars for each insertion
...
We place another dollar as credit on the item
inserted
...
The table will not fill again until we have inserted another m=2 1 items,
and thus, by the time the table contains m items and is full, we will have placed a
dollar on each item to pay to reinsert it during the expansion
...
4
...
1/ amortized cost as well
...

The function
ˆ
...
5)

is one possibility
...
T / D 0, as desired
...
T / D T:num, as desired
...
T / is always nonnegative
...

To analyze the amortized cost of the ith TABLE -I NSERT operation, we let numi
denote the number of items stored in the table after the ith operation, sizei denote
the total size of the table after the ith operation, and ˆi denote the potential after
the ith operation
...

If the ith TABLE -I NSERT operation does not trigger an expansion, then we have
sizei D sizei 1 and the amortized cost of the operation is
ci
y

D
D
D
D

ci C ˆi ˆi
1 C
...
2 numi
3:

1

sizei /
sizei /


...
2
...
numi 1/
...
2 numi sizei /
...
2 numi 2
...
2
...
numi 1/
3:


...
4 Dynamic tables

467

32

sizei

24

16

numi

Φi

8

i

0
0

8

16

24

32

Figure 17
...
The thin line shows numi , the dashed line shows sizei , and
the thick line shows ˆi
...

Afterwards, the potential drops to 0, but it is immediately increased by 2 upon inserting the item that
caused the expansion
...
3 plots the values of numi , sizei , and ˆi against i
...

17
...
2

Table expansion and contraction

To implement a TABLE -D ELETE operation, it is simple enough to remove the specified item from the table
...
Table
contraction is analogous to table expansion: when the number of items in the table
drops too low, we allocate a new, smaller table and then copy the items from the
old table into the new one
...
Ideally, we would like to preserve two
properties:
the load factor of the dynamic table is bounded below by a positive constant,
and
the amortized cost of a table operation is bounded above by a constant
...

You might think that we should double the table size upon inserting an item into
a full table and halve the size when a deleting an item would cause the table to
become less than half full
...
Consider the following scenario
...
The first n=2 operations are
insertions, which by our previous analysis cost a total of ‚
...
At the end of this
sequence of insertions, T:num D T:size D n=2
...

The first insertion causes the table to expand to size n
...
Two further insertions cause another
expansion, and so forth
...
n/, and
there are ‚
...
Thus, the total cost of the n operations is ‚
...
n/
...
Likewise, after contracting the table,
we do not insert enough items to pay for an expansion
...
Specifically, we continue to double the table size upon inserting
an item into a full table, but we halve the table size when deleting an item causes
the table to become less than 1=4 full, rather than 1=2 full as before
...

Intuitively, we would consider a load factor of 1=2 to be ideal, and the table’s
potential would then be 0
...

Thus, we will need a potential function that has grown to T:num by the time that
the load factor has either increased to 1 or decreased to 1=4
...

We omit the code for TABLE -D ELETE, since it is analogous to TABLE -I NSERT
...
That is, if T:num D 0, then T:size D 0
...
We start by defining a potential function ˆ that is 0 immediately after an expansion or contraction and builds
as the load factor increases to 1 or decreases to 1=4
...
4 Dynamic tables

469

32

24

sizei

16

numi

8
Φi
0

0

8

16

24

32

40

48

i

Figure 17
...
The thin line shows numi , the dashed line shows sizei , and
the thick line shows ˆi
...

Likewise, immediately before a contraction, the potential has built up to the number of items in the
table
...
T / D T:num=T:size
...
T / D 1, we always have T:num D ˛
...
We shall use as our potential function
(
2 T:num T:size if ˛
...
T / D
(17
...
T / < 1=2 :
Observe that the potential of an empty table is 0 and that the potential is never
negative
...

Before proceeding with a precise analysis, we pause to observe some properties
of the potential function, as illustrated in Figure 17
...
Notice that when the load
factor is 1=2, the potential is 0
...
T / D T:num, and thus the potential can pay for an expansion if
an item is inserted
...
T / D T:num, and thus the potential can pay for a contraction if an item
is deleted
...
Initially, num0 D 0, size0 D 0, ˛0 D 1, and ˆ0 D 0
...
The analysis is identical to that for table expansion in Section 17
...
1 if ˛i 1 1=2
...

y
If ˛i 1 < 1=2, the table cannot expand as a result of the operation, since the table expands only when ˛i 1 D 1
...
sizei =2 numi /
1 C
...
sizei 1 =2 numi 1 /

...
numi 1//

1=2, then

D ci C ˆi ˆi 1
D 1 C
...
sizei 1 =2 numi 1 /
D 1 C
...
numi 1 C 1/ sizei 1 /
...

We now turn to the case in which the ith operation is TABLE -D ELETE
...
If ˛i 1 < 1=2, then we must consider whether the
operation causes the table to contract
...
sizei =2 numi /
1 C
...
sizei 1 =2 numi 1 /

...
numi C 1//

17
...

We have sizei =2 D sizei 1 =4 D numi 1 D numi C 1, and the amortized cost of
the operation is
ci
y

D
D
D
D

ci C ˆi ˆi 1

...
sizei =2 numi /
...
numi C 1/ C
...
2 numi C 2/
...
We leave the analysis as Exercise 17
...

In summary, since the amortized cost of each operation is bounded above by
a constant, the actual time for any sequence of n operations on a dynamic table
is O
...

Exercises
17
...
Why
might we consider the table to be full when its load factor reaches some value ˛
that is strictly less than 1? Describe briefly how to make insertion into a dynamic,
open-address hash table run in such a way that the expected value of the amortized
cost per insertion is O
...
Why is the expected value of the actual cost per insertion
not necessarily O
...
4-2
1=2 and the ith operation on a dynamic table is TABLE Show that if ˛i 1
D ELETE, then the amortized cost of the operation with respect to the potential
function (17
...

17
...
Using the potential function
ˆ
...


472

Chapter 17 Amortized Analysis

Problems
17-1 Bit-reversed binary counter
Chapter 30 examines an important algorithm called the fast Fourier transform,
or FFT
...

This permutation swaps elements whose indices have binary representations that
are the reverse of each other
...
We define
revk
...
a/ D

k 1
X

ak

i
i 12

:

i D0

For example, if n D 16 (or, equivalently, k D 4), then revk
...

a
...
k/ time, write an algorithm to perform the
bit-reversal permutation on an array of length n D 2k in O
...

We can use an algorithm based on an amortized analysis to improve the running
time of the bit-reversal permutation
...
revk
...
If k D 4, for example, and the bit-reversed
counter starts at 0, then successive calls to B IT-R EVERSED -I NCREMENT produce
the sequence
0000; 1000; 0100; 1100; 0010; 1010; : : : D 0; 8; 4; 12; 2; 10; : : : :
b
...
Describe
an implementation of the B IT-R EVERSED -I NCREMENT procedure that allows
the bit-reversal permutation on an n-element array to be performed in a total
of O
...

c
...
Is it
still possible to implement an O
...
We can improve the time for
insertion by keeping several sorted arrays
...
Let k D dlg
...
We have k sorted arrays A0 ; A1 ; : : : ; Ak 1 , where for
i D 0; 1; : : : ; k 1, the length of array Ai is 2i
...
The total number of elePk 1
ments held in all k arrays is therefore i D0 ni 2i D n
...

a
...
Analyze
its worst-case running time
...
Describe how to perform the I NSERT operation
...

c
...

17-3 Amortized weight-balanced trees
Consider an ordinary binary search tree augmented by adding to each node x the
attribute x:size giving the number of keys stored in the subtree rooted at x
...
We say that a given node x is ˛-balanced
if x:left:size Ä ˛ x:size and x:right:size Ä ˛ x:size
...
The following amortized
approach to maintaining weight-balanced trees was suggested by G
...

a
...
Given a node x
in an arbitrary binary search tree, show how to rebuild the subtree rooted at x
so that it becomes 1=2-balanced
...
x:size/,
and it can use O
...

b
...
lg n/ worst-case time
...
Suppose that we implement I NSERT and D ELETE as usual for an n-node
binary search tree, except that after every such operation, if any node in the tree
is no longer ˛-balanced, then we “rebuild” the subtree rooted at the highest such
node in the tree so that it becomes 1=2-balanced
...
For a node x
in a binary search tree T , we define

...
x/ ;
ˆ
...
x/ 2

where c is a sufficiently large constant that depends on ˛
...
Argue that any binary search tree has nonnegative potential and that a 1=2balanced tree has potential 0
...
Suppose that m units of potential can pay for rebuilding an m-node subtree
...
1/ amortized time
to rebuild a subtree that is not ˛-balanced?
e
...
lg n/ amortized time
...
We have
seen that RB-I NSERT and RB-D ELETE use only O
...

a
...
n C 1/st node causes
...
Then describe a legal
red-black tree with n nodes for which calling RB-D ELETE on a particular node
causes
...

Although the worst-case number of color changes per operation can be logarithmic,
we shall prove that any sequence of m RB-I NSERT and RB-D ELETE operations on
an initially empty red-black tree causes O
...
Note that we count each color change as a structural modification
...
Some of the cases handled by the main loop of the code of both RB-I NSERTF IXUP and RB-D ELETE -F IXUP are terminating: once encountered, they cause
the loop to terminate after a constant number of additional operations
...
(Hint: Look at Figures 13
...
6, and 13
...
)

Problems for Chapter 17

475

We shall first analyze the structural modifications when only insertions are performed
...
T / to be the number of red nodes
in T
...

c
...
Argue that
ˆ
...
T / 1
...
When we insert a node into a red-black tree using RB-I NSERT, we can break
the operation into three parts
...

e
...
1/
...
m/ structural modifications when there are
both insertions and deletions
...
x/ D

if x is red ;
1 if x is black and has no red children ;
0 if x is black and has one red child ;
2 if x is black and has two red children :

Now we redefine the potential of a red-black tree T as
X
w
...
T / D
x2T

and let T 0 be the tree that results from applying any nonterminating case of RBI NSERT-F IXUP or RB-D ELETE -F IXUP to T
...
Show that ˆ
...
T / 1 for all nonterminating cases of RB-I NSERTF IXUP
...
1/
...
Show that ˆ
...
T / 1 for all nonterminating cases of RB-D ELETE F IXUP
...
1/
...
Complete the proof that in the worst case, any sequence of m RB-I NSERT and
RB-D ELETE operations performs O
...


476

Chapter 17 Amortized Analysis

17-5 Competitive analysis of self-organizing lists with move-to-front
A self-organizing list is a linked list of n elements, in which each element has a
unique key
...

A self-organizing list has two important properties:
1
...
If that element is
the kth element from the start of the list, then the cost to find the element is k
...
We may reorder the list elements after any operation, according to a given rule
with a given cost
...

Assume that we start with a given list of n elements, and we are given an access
sequence D h 1 ; 2 ; : : : ; m i of keys to find, in order
...

Out of the various possible ways to reorder the list after an operation, this problem focuses on transposing adjacent list elements—switching their positions in the
list—with a unit cost for each transpose operation
...

For a heuristic H and a given initial ordering of the list, denote the access cost of
sequence by CH
...

a
...
/ D
...

With the move-to-front heuristic, immediately after searching for an element x,
we move x to the first position on the list (i
...
, the front of the list)
...
x/ denote the rank of element x in list L, that is, the position of x in
list L
...
x/ D 4
...

b
...
x/ 1
...
Heuristic H may transpose

Problems for Chapter 17

477

elements in the list in any way it wants, and it might even know the entire access
sequence in advance
...
We denote the cost of access i by ci for move-tofront and by ci for heuristic H
...

c
...
x/
rankLi 1
...


1
...
Suppose that list Li has qi inversions
after processing the access sequence h 1 ; 2 ; : : : ; i i
...
Li / D 2qi
...
e; c/;
...
e; d /;
...
d; b/), and so ˆ
...
Observe that
ˆ
...
L0 / D 0
...
Argue that a transposition either increases the potential by 2 or decreases the
potential by 2
...
To understand how the potential
changes due to i , let us partition the elements other than x into four sets, depending on where they are in the lists just before the ith access:
Set A consists of elements that precede x in both Li
Set B consists of elements that precede x in Li
Set C consists of elements that follow x in Li

1
1

1

and Li 1
...


and precede x in Li 1
...


e
...
x/ D jAj C jBj C 1 and rankLi 1
...

f
...
Li /

ˆ
...
jAj

jBj C ti / ;

where, as before, heuristic H performs ti transpositions during access
Define the amortized cost ci of access
y

i

by ci D ci C ˆ
...
Show that the amortized cost ci of access
y

i

ˆ
...

1 /
...


h
...
/ of access sequence with move-to-front is at
most 4 times the cost CH
...


478

Chapter 17 Amortized Analysis

Chapter notes
Aho, Hopcroft, and Ullman [5] used aggregate analysis to determine the running
time of operations on a disjoint-set forest; we shall analyze this data structure using the potential method in Chapter 21
...
He attributes the accounting method to several authors, including M
...
Brown, R
...

Tarjan, S
...
Mehlhorn
...
D
...
The term “amortized” is due to D
...
Sleator and R
...
Tarjan
...
For each configuration of the problem, we define a potential function
that maps the configuration to a real number
...
The number of steps must
therefore be at least jˆfinal ˆinit j = jˆmax j
...
Krumme, Cybenko,
and Venkataraman [221] applied potential functions to prove lower bounds on gossiping: communicating a unique item from each vertex in a graph to every other
vertex
...

Moreover, if we recognize that when we find an element, we can splice it out of its
position in the list and relocate it to the front of the list in constant time, we can
show that the cost of move-to-front is at most twice the cost of any other heuristic
including, again, one that knows the entire access sequence in advance
...
Two of the chapters, for example,
make extensive use of the amortized analysis techniques we saw in Chapter 17
...
Because disks operate much more slowly than
random-access memory, we measure the performance of B-trees not only by how
much computing time the dynamic-set operations consume but also by how many
disk accesses they perform
...

Chapter 19 gives an implementation of a mergeable heap, which supports the
operations I NSERT, M INIMUM, E XTRACT-M IN, and U NION
...
Fibonacci heaps—the data structure in Chapter 19—also support the operations D ELETE and D ECREASE -K EY
...
The operations I NSERT, M INIMUM, and U NION take only O
...
lg n/
amortized time
...
1/ amortized time
...
Alternatively, if it supported M AXIMUM
and E XTRACT-M AX, it would be a mergeable max-heap
...


482

Part V Advanced Data Structures

K EY operation takes constant amortized time, Fibonacci heaps are key components
of some of the asymptotically fastest algorithms to date for graph problems
...
n lg n/ lower bound for sorting when the keys
are integers in a restricted range, Chapter 20 asks whether we can design a data
structure that supports the dynamic-set operations S EARCH, I NSERT, D ELETE,
M INIMUM, M AXIMUM, S UCCESSOR, and P REDECESSOR in o
...
The answer turns out to be that we can,
by using a recursive data structure known as a van Emde Boas tree
...
lg lg u/
time
...
We have a universe
of n elements that are partitioned into dynamic sets
...
The operation U NION unites two sets, and the query F IND S ET identifies the unique set that contains a given element at the moment
...
m ˛
...
n/ is an incredibly
slowly growing function—˛
...
The
amortized analysis that proves this time bound is as complex as the data structure
is simple
...
Other advanced data structures include the following:
Dynamic trees, introduced by Sleator and Tarjan [319] and discussed by Tarjan
[330], maintain a forest of disjoint rooted trees
...
Dynamic trees support queries to find parents, roots, edge
costs, and the minimum edge cost on a simple path from a node up to a root
...
One implementation of dynamic trees
gives an O
...
lg n/ worst-case time bounds
...

Splay trees, developed by Sleator and Tarjan [320] and, again, discussed by
Tarjan [330], are a form of binary search tree on which the standard searchtree operations run in O
...
One application of splay trees
simplifies dynamic trees
...
Driscoll, Sarnak, Sleator, and Tarjan [97] present
techniques for making linked data structures persistent with only a small time

Part V

Advanced Data Structures

483

and space cost
...

As in Chapter 20, several data structures allow a faster implementation of dictionary operations (I NSERT, D ELETE, and S EARCH) for a restricted universe
of keys
...

Fredman and Willard introduced fusion trees [115], which were the first data
structure to allow faster dictionary operations when the universe is restricted to
integers
...
lg n= lg lg n/
time
...

Dynamic graph data structures support various queries while allowing the
structure of a graph to change through operations that insert or delete vertices
or edges
...

Chapter notes throughout this book mention additional data structures
...
B-trees are similar to red-black trees (Chapter 13), but they are better at minimizing disk I/O operations
...

B-trees differ from red-black trees in that B-tree nodes may have many children,
from a few to thousands
...
B-trees
are similar to red-black trees in that every n-node B-tree has height O
...
The
exact height of a B-tree can be considerably less than that of a red-black tree,
however, because its branching factor, and hence the base of the logarithm that
expresses its height, can be much larger
...
lg n/
...
Figure 18
...
If an internal B-tree node x contains x:n keys, then x has x:n C 1
children
...
When
searching for a key in a B-tree, we make an
...
The structure of leaf nodes differs
from that of internal nodes; we will examine these differences in Section 18
...

Section 18
...
Section 18
...
3 discusses deletion
...

Data structures on secondary storage
Computer systems take advantage of various technologies that provide memory
capacity
...
1 A B-tree whose keys are the consonants of English
...
All leaves are at the same depth in the tree
...


platter

spindle

track

read/write
head

arms

Figure 18
...
It comprises one or more platters (two platters are shown here)
that rotate around a spindle
...
Arms
rotate around a common pivot axis
...


consists of silicon memory chips
...
Most computer systems also have secondary storage based on
magnetic disks; the amount of such secondary storage often exceeds the amount of
primary memory by at least two orders of magnitude
...
2 shows a typical disk drive
...
A magnetizable
material covers the surface of each platter
...
The arms can move their heads toward or away

486

Chapter 18 B-Trees

from the spindle
...
Multiple platters increase only the disk drive’s capacity
and not its performance
...
1 The mechanical
motion has two components: platter rotation and arm movement
...

We typically see 15,000 RPM speeds in server-grade drives, 7200 RPM speeds
in drives for desktops, and 5400 RPM speeds in drives for laptops
...
33 milliseconds, which is over 5
orders of magnitude longer than the 50 nanosecond access times (more or less)
commonly found for silicon memory
...
On average we have to wait
for only half a rotation, but still, the difference in access times for silicon memory
compared with disks is enormous
...
As of
this writing, average access times for commodity disks are in the range of 8 to 11
milliseconds
...
Information is divided into a number
of equal-sized pages of bits that appear consecutively within tracks, and each disk
read or write is of one or more entire pages
...
Once the read/write head is positioned correctly and the disk
has rotated to the beginning of the desired page, reading or writing a magnetic disk
is entirely electronic (aside from the rotation of the disk), and the disk can quickly
read or write large amounts of data
...
For this reason, in this chapter we shall
look separately at the two principal components of the running time:
the number of disk accesses, and
the CPU (computing) time
...
We note that disk-access
time is not constant—it depends on the distance between the current track and
the desired track and also on the initial rotational position of the disk
...
Although they
are faster than mechanical disk drives, they cost more per gigabyte and have lower capacities than
mechanical disk drives
...

In a typical B-tree application, the amount of data handled is so large that all
the data do not fit into main memory at once
...
B-tree algorithms keep only a constant number of pages in
main memory at any time; thus, the size of main memory does not limit the size of
B-trees that can be handled
...
Let x be a pointer to an
object
...
If the object referred to
by x resides on disk, however, then we must perform the operation D ISK -R EAD
...
(We assume that if x is already in main memory, then D ISK -R EAD
...
”) Similarly, the operation D ISK -W RITE
...
That is, the typical
pattern for working with an object is as follows:
x D a pointer to some object
D ISK -R EAD
...
x/
other operations that access but do not modify attributes of x
The system can keep only a limited number of pages in main memory at any one
time
...

Since in most systems the running time of a B-tree algorithm depends primarily on the number of D ISK -R EAD and D ISK -W RITE operations it performs, we
typically want each of these operations to read or write as much information as
possible
...

For a large B-tree stored on a disk, we often see branching factors between 50
and 2000, depending on the size of a key relative to the size of a page
...
Figure 18
...


488

Chapter 18 B-Trees

T:root
1 node,
1000 keys

1000
1001

1000

1000

1001



1001

1000

1000

1000

1001 nodes,
1,001,000 keys

1001



1000

1,002,001 nodes,
1,002,001,000 keys

Figure 18
...
Shown inside each node x
is x: n, the number of keys in x
...
This B-tree has
1001 nodes at depth 1 and over one million leaves at depth 2
...
1 Definition of B-trees
To keep things simple, we assume, as we have for binary search trees and red-black
trees, that any “satellite information” associated with a key resides in the same
node as the key
...
The pseudocode
in this chapter implicitly assumes that the satellite information associated with a
key, or the pointer to such satellite information, travels with the key whenever the
key is moved from node to node
...

A B-tree T is a rooted tree (whose root is T:root) having the following properties:
1
...
x:n, the number of keys currently stored in node x,
b
...
x:leaf , a boolean value that is TRUE if x is a leaf and FALSE if x is an internal
node
...
Each internal node x also contains x:n C 1 pointers x:c1 ; x:c2 ; : : : ; x:cx: nC1 to
its children
...


18
...
The keys x:keyi separate the ranges of keys stored in each subtree: if ki is any
key stored in the subtree with root x:ci , then
k1 Ä x:key1 Ä k2 Ä x:key2 Ä

Ä x:keyx: n Ä kx: nC1 :

4
...

5
...

We express these bounds in terms of a fixed integer t 2 called the minimum
degree of the B-tree:
a
...
Every internal
node other than the root thus has at least t children
...

b
...
Therefore, an internal node
may have at most 2t children
...
2
The simplest B-tree occurs when t D 2
...
In practice, however, much larger values
of t yield B-trees with smaller height
...
We now analyze the worst-case height of a B-tree
...
1
If n 1, then for any n-key B-tree T of height h and minimum degree t
h Ä log t

2,

nC1
:
2

Proof The root of a B-tree T contains at least one key, and all other nodes contain
at least t 1 keys
...
Figure 18
...
Thus, the

2 Another

common variant on a B-tree, known as a B -tree, requires each internal node to be at
least 2=3 full, rather than at least half full, as a B-tree requires
...
4 A B-tree of height 3 containing a minimum possible number of keys
...


number n of keys satisfies the inequality
n

1 C
...
t
D 2t h

th 1
1/
t 1

Ã

1:

By simple algebra, we get t h Ä
...
Taking base-t logarithms of both sides
proves the theorem
...
Although
the height of the tree grows as O
...
Thus, B-trees save a
factor of about lg t over red-black trees in the number of nodes examined for most
tree operations
...

Exercises
18
...
1-2
For what values of t is the tree of Figure 18
...
2 Basic operations on B-trees

491

18
...

18
...
1-5
Describe the data structure that would result if each black node in a red-black tree
were to absorb its red children, incorporating their children with its own
...
2 Basic operations on B-trees
In this section, we present the details of the operations B-T REE -S EARCH, BT REE -C REATE, and B-T REE -I NSERT
...

Any nodes that are passed as parameters must already have had a D ISK -R EAD
operation performed on them
...

Searching a B-tree
Searching a B-tree is much like searching a binary search tree, except that instead
of making a binary, or “two-way,” branching decision at each node, we make a
multiway branching decision according to the number of the node’s children
...
x:n C 1/-way branching decision
...
B-T REE -S EARCH takes as input a pointer
to the root node x of a subtree and a key k to be searched for in that subtree
...
T:root; k/
...
y; i/ consisting of a node y and an
index i such that y:keyi D k
...


492

Chapter 18 B-Trees

B-T REE -S EARCH
...
x; i/
6 elseif x:leaf
7
return NIL
8 else D ISK -R EAD
...
x:ci ; k/
Using a linear-search procedure, lines 1–3 find the smallest index i such that
k Ä x:keyi , or else they set i to x:n C 1
...
Otherwise, lines 6–9 either terminate the search unsuccessfully (if x is a leaf) or recurse to search the appropriate
subtree of x, after performing the necessary D ISK -R EAD on that child
...
1 illustrates the operation of B-T REE -S EARCH
...

As in the T REE -S EARCH procedure for binary search trees, the nodes encountered during the recursion form a simple path downward from the root of the
tree
...
h/ D O
...

Since x:n < 2t, the while loop of lines 2–3 takes O
...
th/ D O
...

Creating an empty B-tree
To build a B-tree T , we first use B-T REE -C REATE to create an empty root node
and then call B-T REE -I NSERT to add new keys
...
1/ time
...

B-T REE -C REATE
...
/
2 x:leaf D TRUE
3 x:n D 0
4 D ISK -W RITE
...
1/ disk operations and O
...


18
...
As with binary search trees, we search for the leaf position
at which to insert the new key
...

Instead, we insert the new key into an existing leaf node
...
The median key moves up into y’s parent to identify the dividing point
between the two new trees
...

As with a binary search tree, we can insert a key into a B-tree in a single pass
down the tree from the root to a leaf
...
Instead, as we
travel down the tree searching for the position where the new key belongs, we split
each full node we come to along the way (including the leaf itself)
...

Splitting a node in a B-tree
The procedure B-T REE -S PLIT-C HILD takes as input a nonfull internal node x (assumed to be in main memory) and an index i such that x:ci (also assumed to be in
main memory) is a full child of x
...
To split a full root, we will first make the
root a child of a new empty root node, so that we can use B-T REE -S PLIT-C HILD
...

Figure 18
...
We split the full node y D x:ci about its
median key S, which moves up into y’s parent node x
...


494

Chapter 18 B-Trees

1

1

x

y i y i y iC
ke ke ke
x: x: x:

yi yi
ke ke
x: x:

x
… N S W …

… N W …

y D x:ci

1

y D x:ci

´ D x:ci C1

P Q R S T U V

P Q R

T U V

T1 T2 T3 T4 T5 T6 T7 T8

T1 T2 T3 T4

T5 T6 T7 T8

Figure 18
...
Node y D x: ci splits into two nodes, y and ´, and the
median key S of y moves up into y’s parent
...
x; i/
1 ´ D A LLOCATE -N ODE
...
y/
19 D ISK -W RITE
...
x/
B-T REE -S PLIT-C HILD works by straightforward “cutting and pasting
...
Node y originally has 2t
children (2t 1 keys) but is reduced to t children (t 1 keys) by this operation
...
2 Basic operations on B-trees

495

of x, positioned just after y in x’s table of children
...

Lines 1–9 create node ´ and give it the largest t 1 keys and corresponding t
children of y
...
Finally, lines 11–17 insert ´ as
a child of x, move the median key from y up to x in order to separate y from ´,
and adjust x’s key count
...
The
CPU time used by B-T REE -S PLIT-C HILD is ‚
...
(The other loops run for O
...
) The procedure performs O
...

Inserting a key into a B-tree in a single pass down the tree
We insert a key k into a B-tree T of height h in a single pass down the tree, requiring O
...
The CPU time required is O
...
t log t n/
...

B-T REE -I NSERT
...
/
4
T:root D s
5
s:leaf D FALSE
6
s:n D 0
7
s:c1 D r
8
B-T REE -S PLIT-C HILD
...
s; k/
10 else B-T REE -I NSERT-N ONFULL
...
Splitting the root is the only
way to increase the height of a B-tree
...
6 illustrates this case
...

The procedure finishes by calling B-T REE -I NSERT-N ONFULL to insert key k into
the tree rooted at the nonfull root node
...

The auxiliary recursive procedure B-T REE -I NSERT-N ONFULL inserts key k into
node x, which is assumed to be nonfull when the procedure is called
...


496

Chapter 18 B-Trees

T:root
s
H

T:root
r

r

A D F H L N P

A D F

L N P

T1 T2 T3 T4 T5 T6 T7 T8

T1 T2 T3 T4

T5 T6 T7 T8

Figure 18
...
Root node r splits in two, and a new root node s is
created
...
The
B-tree grows in height by one when the root is split
...
x; k/
1 i D x:n
2 if x:leaf
3
while i 1 and k < x:keyi
4
x:keyi C1 D x:keyi
5
i Di 1
6
x:keyi C1 D k
7
x:n D x:n C 1
8
D ISK -W RITE
...
x:ci /
13
if x:ci :n == 2t 1
14
B-T REE -S PLIT-C HILD
...
x:ci ; k/
The B-T REE -I NSERT-N ONFULL procedure works as follows
...
If x is not a leaf
node, then we must insert k into the appropriate leaf node in the subtree rooted
at internal node x
...
Line 13 detects whether the recursion would descend to a full
child, in which case line 14 uses B-T REE -S PLIT-C HILD to split that child into two
nonfull children, and lines 15–16 determine which of the two children is now the

18
...
(Note that there is no need for a D ISK -R EAD
...
) The net effect of lines 13–16 is thus
to guarantee that the procedure never recurses to a full node
...
Figure 18
...

For a B-tree of height h, B-T REE -I NSERT performs O
...
1/ D ISK -R EAD and D ISK -W RITE operations occur between calls to
B-T REE -I NSERT-N ONFULL
...
th/ D O
...

Since B-T REE -I NSERT-N ONFULL is tail-recursive, we can alternatively implement it as a while loop, thereby demonstrating that the number of pages that need
to be in main memory at any time is O
...

Exercises
18
...
Draw only the configurations of the tree just before some node must split, and also draw the final configuration
...
2-2
Explain under what circumstances, if any, redundant D ISK -R EAD or D ISK -W RITE
operations occur during the course of executing a call to B-T REE -I NSERT
...

A redundant D ISK -W RITE writes to disk a page of information that is identical to
what is already stored there
...
2-3
Explain how to find the minimum key stored in a B-tree and how to find the predecessor of a given key stored in a B-tree
...
2-4 ?
Suppose that we insert the keys f1; 2; : : : ; ng into an empty B-tree with minimum
degree 2
...
2-5
Since leaf nodes require no pointers to children, they could conceivably use a different (larger) t value than internal nodes for the same disk page size
...


498

Chapter 18 B-Trees

(a) initial tree
A C D E

G M P X
J K

N O

(b) B inserted

R S T U V

Y Z

G M P X

A B C D E

J K

(c) Q inserted

N O

R S T U V

G M P T X

A B C D E

J K

N O

Q R S

(d) L inserted

U V

Y Z

P
G M

A B C D E

T X

J K L

(e) F inserted

N O

Q R S

U V

Y Z

P
C G M

A B

Y Z

D E F

J K L

T X
N O

Q R S

U V

Y Z

Figure 18
...
The minimum degree t for this B-tree is 3, so a node can
hold at most 5 keys
...
(a) The
initial tree for this example
...
(c) The result of inserting Q into the previous tree
...
(d) The result of inserting L into the previous tree
...
Then L is inserted into the
leaf containing JK
...
The node ABCDE splits
before F is inserted into the rightmost of the two halves (the DE node)
...
3 Deleting a key from a B-tree

499

18
...
Show that this change makes the CPU time
required O
...

18
...

Describe how to choose t so as to minimize (approximately) the B-tree search time
...


18
...
As in
insertion, we must guard against deletion producing a tree whose structure violates
the B-tree properties
...

Just as a simple insertion algorithm might have to back up if a node on the path
to where the key was to be inserted was full, a simple approach to deletion might
have to back up if a node (other than the root) along the path to where the key is to
be deleted has the minimum number of keys
...

We design this procedure to guarantee that whenever it calls itself recursively on a
node x, the number of keys in x is at least the minimum degree t
...
This strengthened condition allows us to delete a
key from the tree in one downward pass without having to “back up” (with one exception, which we’ll explain)
...


500

Chapter 18 B-Trees

(a) initial tree

P
C G M

A B

D E F

T X

J K L

N O

(b) F deleted: case 1

Q R S

D E

T X

J K L

N O

(c) M deleted: case 2a

Q R S

D E

Y Z

J K

T X
N O

(d) G deleted: case 2c

Q R S

D E J K

U V

Y Z

P

C L
A B

U V

P

C G L
A B

Y Z

P

C G M
A B

U V

T X
N O

Q R S

U V

Y Z

Figure 18
...
The minimum degree for this B-tree is t D 3, so a node
(other than the root) cannot have fewer than 2 keys
...

(a) The B-tree of Figure 18
...
(b) Deletion of F
...

(c) Deletion of M
...
(d) Deletion of G
...


We sketch how deletion works instead of presenting the pseudocode
...
8
illustrates the various cases of deleting keys from a B-tree
...
If the key k is in node x and x is a leaf, delete the key k from x
...
If the key k is in node x and x is an internal node, do the following:

18
...
8, continued (e) Deletion of D
...
(e0 ) After (e), we delete the root and the tree shrinks
in height by one
...
This is case 3a: C moves to fill B’s position and E moves to
fill C ’s position
...
If the child y that precedes k in node x has at least t keys, then find the
predecessor k 0 of k in the subtree rooted at y
...
(We can find k 0 and delete it in a single downward
pass
...
If y has fewer than t keys, then, symmetrically, examine the child ´ that
follows k in node x
...
Recursively delete k 0 , and replace k by k 0 in x
...
)
c
...

Then free ´ and recursively delete k from y
...
If the key k is not present in internal node x, determine the root x:ci of the
appropriate subtree that must contain k, if k is in the tree at all
...
Then finish by recursing on the appropriate
child of x
...
If x:ci has only t 1 keys but has an immediate sibling with at least t keys,
give x:ci an extra key by moving a key from x down into x:ci , moving a
key from x:ci ’s immediate left or right sibling up into x, and moving the
appropriate child pointer from the sibling into x:ci
...
If x:ci and both of x:ci ’s immediate siblings have t 1 keys, merge x:ci
with one sibling, which involves moving a key from x down into the new
merged node to become the median key for that node
...
The
B-T REE -D ELETE procedure then acts in one downward pass through the tree,
without having to back up
...

Although this procedure seems complicated, it involves only O
...
1/ calls to D ISK -R EAD and D ISK W RITE are made between recursive invocations of the procedure
...
th/ D O
...

Exercises
18
...
8(f)
...
3-2
Write pseudocode for B-T REE -D ELETE
...
The
operations P USH and P OP work on single-word values
...

A simple, but inefficient, stack implementation keeps the entire stack on disk
...
If the pointer has value p, the top element is the
...


Problems for Chapter 18

503

To implement the P USH operation, we increment the stack pointer, read the appropriate page into memory from disk, copy the element to be pushed to the appropriate word on the page, and write the page back to disk
...
We decrement the stack pointer, read in the appropriate page from disk,
and return the top of the stack
...

Because disk operations are relatively expensive, we count two costs for any
implementation: the total number of disk accesses and the total CPU time
...
m/ CPU
time
...
Asymptotically, what is the worst-case number of disk accesses for n stack
operations using this simple implementation? What is the CPU time for n stack
operations? (Express your answer in terms of m and n for this and subsequent
parts
...
(We also maintain a small amount of memory to keep track of which page
is currently in memory
...
If necessary, we can write the page currently in memory
to the disk and read in the new page from the disk to memory
...

b
...
What is the worst-case number of disk accesses required for n stack operations?
What is the CPU time?
Suppose that we now implement the stack by keeping two pages in memory (in
addition to a small number of words for bookkeeping)
...
Describe how to manage the stack pages so that the amortized number of disk
accesses for any stack operation is O
...
1/
...
It returns a set
S D S 0 [ fxg [ S 00
...
In this problem, we investigate

504

Chapter 18 B-Trees

how to implement these operations on 2-3-4 trees
...

a
...
Make sure that your implementation does
not affect the asymptotic running times of searching, insertion, and deletion
...
Show how to implement the join operation
...
1 C jh0 h00 j/ time, where h0
and h00 are the heights of T 0 and T 00 , respectively
...
Consider the simple path p from the root of a 2-3-4 tree T to a given key k,
the set S 0 of keys in T that are less than k, and the set S 00 of keys in T that are
0
greater than k
...
What is the relationship between the heights
of Ti0 1 and Ti0 ? Describe how p breaks S 00 into sets of trees and keys
...
Show how to implement the split operation on T
...
The running time of the split operation should be O
...
(Hint: The costs for joining should telescope
...
Comer [74] provides a comprehensive survey of B-trees
...

In 1970, J
...
Hopcroft invented 2-3 trees, a precursor to B-trees and 2-3-4
trees, in which every internal node has either two or three children
...

Bender, Demaine, and Farach-Colton [40] studied how to make B-trees perform
well in the presence of memory-hierarchy effects
...


19

Fibonacci Heaps

The Fibonacci heap data structure serves a dual purpose
...
” Second, several
Fibonacci-heap operations run in constant amortized time, which makes this data
structure well suited for applications that invoke these operations frequently
...
/ creates and returns a new heap containing no elements
...
H; x/ inserts element x, whose key has already been filled in, into heap H
...
H / returns a pointer to the element in heap H whose key is minimum
...
H / deletes the element from heap H whose key is minimum, returning a pointer to the element
...
H1 ; H2 / creates and returns a new heap that contains all the elements of
heaps H1 and H2
...

In addition to the mergeable-heap operations above, Fibonacci heaps also support
the following two operations:
D ECREASE -K EY
...
1
D ELETE
...


1 As

mentioned in the introduction to Part V, our default mergeable heaps are mergeable minheaps, and so the operations M INIMUM, E XTRACT-M IN, and D ECREASE -K EY apply
...


506

Chapter 19 Fibonacci Heaps

Procedure
M AKE -H EAP
I NSERT
M INIMUM
E XTRACT-M IN
U NION
D ECREASE -K EY
D ELETE

Binary heap
(worst-case)

Fibonacci heap
(amortized)


...
lg n/

...
lg n/

...
lg n/

...
1/

...
1/
O
...
1/

...
lg n/

Figure 19
...
The number of items in the heap(s) at the time of an operation is denoted by n
...
1 shows, if we don’t need the U NION operation, ordinary binary heaps, as used in heapsort (Chapter 6), work fairly well
...
lg n/ on a binary heap
...
By concatenating the two arrays that hold the binary heaps to be merged and then running
B UILD -M IN -H EAP (see Section 6
...
n/ time in the
worst case
...
Note, however, that the running times for Fibonacci heaps in Figure 19
...
The U NION operation takes
only constant amortized time in a Fibonacci heap, which is significantly better
than the linear worst-case time required in a binary heap (assuming, of course, that
an amortized time bound suffices)
...
This situation arises in many applications
...
For dense graphs, which have many edges, the ‚
...
lg n/ worst-case
time of binary heaps
...


19
...
Thus, Fibonacci heaps are predominantly of theoretical interest
...

Both binary heaps and Fibonacci heaps are inefficient in how they support the
operation S EARCH; it can take a while to find an element with a given key
...
As in our discussion of
priority queues in Section 6
...
The exact nature of these handles depends on the application
and its implementation
...
We represent each element by a node within a tree, and each
node has a key attribute
...
” We shall also ignore issues of allocating nodes prior
to insertion and freeing nodes following deletion, assuming instead that the code
calling the heap procedures deals with these details
...
1 defines Fibonacci heaps, discusses how we represent them, and
presents the potential function used for their amortized analysis
...
2
shows how to implement the mergeable-heap operations and achieve the amortized
time bounds shown in Figure 19
...
The remaining two operations, D ECREASE K EY and D ELETE, form the focus of Section 19
...
Finally, Section 19
...


19
...
That
is, each tree obeys the min-heap property: the key of a node is greater than or equal
to the key of its parent
...
2(a) shows an example of a Fibonacci heap
...
2(b) shows, each node x contains a pointer x:p to its parent and
a pointer x:child to any one of its children
...
Each child y in
a child list has pointers y:left and y:right that point to y’s left and right siblings,
respectively
...
Siblings may
appear in a child list in any order
...
2 (a) A Fibonacci heap consisting of five min-heap-ordered trees and 14 nodes
...
The minimum node of the heap is the node containing the key 3
...
The potential of this particular Fibonacci heap is 5 C 2 3 D 11
...
The remaining figures in this chapter omit these details, since all the information
shown here can be determined from what appears in part (a)
...
2) have two advantages for use in
Fibonacci heaps
...
1/ time
...
1/ time
...

Each node has two other attributes
...
The boolean-valued attribute x:mark indicates whether
node x has lost a child since the last time x was made the child of another node
...
Until we look at the D ECREASE -K EY operation
in Section 19
...

We access a given Fibonacci heap H by a pointer H:min to the root of a tree
containing the minimum key; we call this node the minimum node of the Fibonacci

19
...
If more than one root has a key with the minimum value, then any such root
may serve as the minimum node
...

The roots of all the trees in a Fibonacci heap are linked together using their
left and right pointers into a circular, doubly linked list called the root list of the
Fibonacci heap
...
Trees may appear in any order within a root list
...

Potential function
As mentioned, we shall use the potential method of Section 17
...
For a given Fibonacci heap H , we
indicate by t
...
H / the number
of marked nodes in H
...
H / of Fibonacci heap H
by
ˆ
...
H / C 2 m
...
1)

(We will gain some intuition for this potential function in Section 19
...
) For example, the potential of the Fibonacci heap shown in Figure 19
...
The
potential of a set of Fibonacci heaps is the sum of the potentials of its constituent
Fibonacci heaps
...

We assume that a Fibonacci heap application begins with no heaps
...
1), the potential is nonnegative at
all subsequent times
...
3), an upper bound on the total amortized
cost provides an upper bound on the total actual cost for the sequence of operations
...
n/ on the maximum degree of any node
in an n-node Fibonacci heap
...
n/ Ä blg nc
...
) In Sections 19
...
4, we shall show that when we support
D ECREASE -K EY and D ELETE as well, D
...
lg n/
...
2 Mergeable-heap operations
The mergeable-heap operations on Fibonacci heaps delay work as long as possible
...
For example, we insert a node
by adding it to the root list, which takes just constant time
...
The trade-off is that if we then perform
an E XTRACT-M IN operation on Fibonacci heap H , after removing the node that
H:min points to, we would have to look through each of the remaining k 1 nodes
in the root list to find the new minimum node
...
We shall see that, no
matter what the root list looks like before a E XTRACT-M IN operation, afterward
each node in the root list has a degree that is unique within the root list, which leads
to a root list of size at most D
...

Creating a new Fibonacci heap
To make an empty Fibonacci heap, the M AKE -F IB -H EAP procedure allocates and
returns the Fibonacci heap object H , where H:n D 0 and H:min D NIL; there
are no trees in H
...
H / D 0 and m
...
H / D 0
...
1/ actual cost
...

F IB -H EAP -I NSERT
...
2 Mergeable-heap operations

511

H:min
23

7

H:min

3
18
39

52

17
38
41

30

24
26

23

7

21

46

35

3
18

52

39

(a)

17
38

30

41

24
26

46

35

(b)

Figure 19
...
(a) A Fibonacci heap H
...
The node becomes its own min-heap-ordered tree and is then
added to the root list, becoming the left sibling of the root
...
Line 5 tests to see
whether Fibonacci heap H is empty
...
Otherwise, lines 8–10 insert x
into H ’s root list and update H:min if necessary
...
Figure 19
...
2
...
Then, t
...
H / C 1
and m
...
H /, and the increase in potential is

...
H / C 1/ C 2 m
...
t
...
H // D 1 :

Since the actual cost is O
...
1/ C 1 D O
...

Finding the minimum node
The minimum node of a Fibonacci heap H is given by the pointer H:min, so we
can find the minimum node in O
...
Because the potential of H does
not change, the amortized cost of this operation is equal to its O
...

Uniting two Fibonacci heaps
The following procedure unites Fibonacci heaps H1 and H2 , destroying H1 and H2
in the process
...
Afterward, the objects representing H1 and H2 will
never be used again
...
H1 ; H2 /
1 H D M AKE -F IB -H EAP
...
H1 :min == NIL / or
...
Lines
2, 4, and 5 set the minimum node of H , and line 6 sets H:n to the total number
of nodes
...
As in the F IB -H EAP I NSERT procedure, all roots remain roots
...
H /


...
H2 //
D
...
H / C 2 m
...
t
...
H1 // C
...
H2 / C 2 m
...
H / D t
...
H2 / and m
...
H1 / C m
...
The amortized
cost of F IB -H EAP -U NION is therefore equal to its O
...

Extracting the minimum node
The process of extracting the minimum node is the most complicated of the operations presented in this section
...
The following pseudocode extracts the minimum node
...
It also calls the auxiliary procedure C ONSOLIDATE ,
which we shall see shortly
...
2 Mergeable-heap operations

513

F IB -H EAP -E XTRACT-M IN
...
H /
11
H:n D H:n 1
12 return ´
As Figure 19
...
It then consolidates the root list by linking roots of equal degree until
at most one root remains of each degree
...
If ´ is NIL, then Fibonacci heap H is already empty
and we are done
...
If ´ is its own right sibling after line 6, then ´ was the
only node on the root list and it had no children, so all that remains is to make
the Fibonacci heap empty in line 8 before returning ´
...
Figure 19
...
4(a) after executing line 9
...
H / accomplishes
...
Find two roots x and y in the root list with the same degree
...

2
...
This procedure increments the attribute x:degree
and clears the mark on y
...
4 The action of F IB -H EAP -E XTRACT-M IN
...
(b) The situation after removing the minimum node ´ from the root list and adding its children to the root list
...
The procedure processes the root list by starting at the node pointed
to by H: min and following right pointers
...
(f)–(h) The next iteration of the for loop, with the values of w and x shown at the end of
each iteration of the while loop of lines 7–13
...
The node with key 23 has been linked to the node with key 7, which x now points to
...
In
part (h), the node with key 24 has been linked to the node with key 7
...


19
...
4, continued (i)–(l) The situation after each of the next four iterations of the for loop
...


The procedure C ONSOLIDATE uses an auxiliary array AŒ0 : : D
...
If AŒi D y, then y is currently a root
with y:degree D i
...
H:n/ on the maximum degree, but we will see how
to do so in Section 19
...


516

Chapter 19 Fibonacci Heaps

C ONSOLIDATE
...
H:n/ be a new array
2 for i D 0 to D
...
H; y; x/
12
AŒd  D NIL
13
d D d C1
14
AŒd  D x
15 H:min D NIL
16 for i D 0 to D
...
H; y; x/
1 remove y from the root list of H
2 make y a child of x, incrementing x:degree
3 y:mark D FALSE
In detail, the C ONSOLIDATE procedure works as follows
...
The for loop of lines 4–14
processes each root w in the root list
...
Nevertheless, w is always in a tree
rooted at some node x, which may or may not be w itself
...
If it does, then we link the roots x and y but
guaranteeing that x remains a root after linking
...
After
we link y to x, the degree of x has increased by 1, and so we continue this process,
linking x and another root whose degree equals x’s new degree, until no other root

19
...
We then set the appropriate entry
of A to point to x, so that as we process roots later on, we have recorded that x is
the unique root of its degree that we have already processed
...

The while loop of lines 7–13 repeatedly links the root x of the tree containing
node w to another tree whose root has the same degree as x, until no other root has
the same degree
...

We use this loop invariant as follows:
Initialization: Line 6 ensures that the loop invariant holds the first time we enter
the loop
...

Because d D x:degree D y:degree, we want to link x and y
...

Next, we link y to x by the call F IB -H EAP -L INK
...
This
call increments x:degree but leaves y:degree as d
...
Because the call of F IB H EAP -L INK increments the value of x:degree, line 13 restores the invariant
that d D x:degree
...

After the while loop terminates, we set AŒd  to x in line 14 and perform the next
iteration of the for loop
...
4(c)–(e) show the array A and the resulting trees after the first three
iterations of the for loop of lines 4–14
...
4(f)–(h)
...
4(i)–(l) show
the result of the next four iterations of the for loop
...
Once the for loop of lines 4–14 completes,
line 15 empties the root list, and lines 16–23 reconstruct it from the array A
...
4(m)
...

We are now ready to show that the amortized cost of extracting the minimum
node of an n-node Fibonacci heap is O
...
n//
...

We start by accounting for the actual cost of extracting the minimum node
...
D
...
n/ children of the minimum node and from the work in lines 2–3 and
16–23 of C ONSOLIDATE
...
The size
of the root list upon calling C ONSOLIDATE is at most D
...
H / 1, since it
consists of the original t
...
n/
...
But we know that every time through the while
loop, one of the roots is linked to another, and thus the total number of iterations
of the while loop over all iterations of the for loop is at most the number of roots
in the root list
...
n/ C t
...
Thus, the total actual work in extracting the
minimum node is O
...
n/ C t
...

The potential before extracting the minimum node is t
...
H /, and the
potential afterward is at most
...
n/ C 1/ C 2 m
...
n/ C 1 roots
remain and no nodes become marked during the operation
...
D
...
H // C
...
n/ C 1/ C 2 m
...
D
...
t
...
H /
D O
...
n// ;


...
H / C 2 m
...
t
...
Intuitively, the cost of performing each link is paid for by the reduction in potential due to the link’s reducing the number of roots by one
...
4 that D
...
lg n/, so that the amortized cost of extracting
the minimum node is O
...

Exercises
19
...
4(m)
...
3 Decreasing a key and deleting a node
In this section, we show how to decrease the key of a node in a Fibonacci heap
in O
...
D
...
In Section 19
...
3 Decreasing a key and deleting a node

519

mum degree D
...
lg n/, which will imply that F IB -H EAP -E XTRACT-M IN
and F IB -H EAP -D ELETE run in O
...

Decreasing a key
In the following pseudocode for the operation F IB -H EAP -D ECREASE -K EY, we
assume as before that removing a node from a linked list does not change any of
the structural attributes in the removed node
...
H; x; k/
1 if k > x:key
2
error “new key is greater than current key”
3 x:key D k
4 y D x:p
5 if y ¤ NIL and x:key < y:key
6
C UT
...
H; y/
8 if x:key < H:min:key
9
H:min D x
C UT
...
H; y/
1 ´ D y:p
2 if ´ ¤ NIL
3
if y:mark == FALSE
4
y:mark D TRUE
5
else C UT
...
H; ´/
The F IB -H EAP -D ECREASE -K EY procedure works as follows
...
If x is a root or if x:key
y:key, where y is x’s parent, then no structural
changes need occur, since min-heap order has not been violated
...

If min-heap order has been violated, many changes may occur
...
The C UT procedure “cuts” the link between x and its parent y,
making x a root
...
They record a little
piece of the history of each node
...
at some time, x was a root,
2
...
then two children of x were removed by cuts
...
The attribute x:mark is TRUE if steps 1 and 2 have occurred and one child
of x has been cut
...
(We can now see why line 3 of F IB -H EAP -L INK clears y:mark:
node y is being linked to another node, and so step 2 is being performed
...
)
We are not yet done, because x might be the second child cut from its parent y
since the time that y was linked to another node
...
If y is a
root, then the test in line 2 of C ASCADING -C UT causes the procedure to just return
...
If y is marked, however, it has just lost its second child; y is cut
in line 5, and C ASCADING -C UT calls itself recursively in line 6 on y’s parent ´
...

Once all the cascading cuts have occurred, lines 8–9 of F IB -H EAP -D ECREASE K EY finish up by updating H:min if necessary
...
Thus, the new minimum node is either the
original minimum node or node x
...
5 shows the execution of two calls of F IB -H EAP -D ECREASE -K EY,
starting with the Fibonacci heap shown in Figure 19
...
The first call, shown
in Figure 19
...
The second call, shown in Figures 19
...

We shall now show that the amortized cost of F IB -H EAP -D ECREASE -K EY is
only O
...
We start by determining its actual cost
...
1/ time, plus the time to perform the cascading cuts
...
Each call of C ASCADING C UT takes O
...
Thus, the actual cost of F IB H EAP -D ECREASE -K EY, including all recursive calls, is O
...

We next compute the change in potential
...
The call to C UT in line 6 of

19
...
5 Two calls of F IB -H EAP -D ECREASE -K EY
...
(b) The
node with key 46 has its key decreased to 15
...
(c)–(e) The node with key 35 has its key
decreased to 5
...
Its parent, with key 26,
is marked, so a cascading cut occurs
...
Another cascading cut occurs, since the node with key 24 is marked as well
...
The cascading cuts stop
at this point, since the node with key 7 is a root
...
) Part (e) shows the result of the F IB -H EAP -D ECREASE -K EY
operation, with H: min pointing to the new minimum node
...
Each call of C ASCADING -C UT,
except for the last one, cuts a marked node and clears the mark bit
...
H /Cc trees (the original t
...
H / c C2 marked nodes
(c 1 were unmarked by cascading cuts and the last call of C ASCADING -C UT may
have marked a node)
...
t
...
m
...
t
...
H // D 4

c:

522

Chapter 19 Fibonacci Heaps

Thus, the amortized cost of F IB -H EAP -D ECREASE -K EY is at most
O
...
1/ ;

since we can scale up the units of potential to dominate the constant hidden in O
...

You can now see why we defined the potential function to include a term that is
twice the number of marked nodes
...
One unit of potential
pays for the cut and the clearing of the mark bit, and the other unit compensates
for the unit increase in potential due to node y becoming a root
...
D
...
We assume that there is no key value of 1 currently
in the Fibonacci heap
...
H; x/
1 F IB -H EAP -D ECREASE -K EY
...
H /
F IB -H EAP -D ELETE makes x become the minimum node in the Fibonacci heap by
giving it a uniquely small key of 1
...
The amortized time of F IB -H EAP D ELETE is the sum of the O
...
D
...
Since we shall see
in Section 19
...
n/ D O
...
lg n/
...
3-1
Suppose that a root x in a Fibonacci heap is marked
...
Argue that it doesn’t matter to the analysis that x is marked, even
though it is not a root that was first linked to another node and then lost one child
...
3-2
Justify the O
...


19
...
4 Bounding the maximum degree
To prove that the amortized time of F IB -H EAP -E XTRACT-M IN and F IB -H EAP D ELETE is O
...
n/ on the degree of
any node of an n-node Fibonacci heap is O
...
In particular, we shall show that
˘
D
...
24) as
p
D
...
For each node x within a Fibonacci heap,
define size
...
(Note that x need not be in the root list—it can be any node at all
...
x/ is exponential in x:degree
...

Lemma 19
...
Let
y1 ; y2 ; : : : ; yk denote the children of x in the order in which they were linked to x,
0 and yi :degree
i 2 for
from the earliest to the latest
...

Proof Obviously, y1 :degree 0
...
Because node yi is
linked to x (by C ONSOLIDATE) only if x:degree D yi :degree, we must have also
i 1 at that time
...
We conclude that yi :degree i 2
...
” Recall from Section 3
...


524

Chapter 19 Fibonacci Heaps

Lemma 19
...
When k D 0,

Proof
1C

0
X

Fi

D 1 C F0

i D0

D 1C0
D F2 :
We now assume the inductive hypothesis that FkC1 D 1 C
have
FkC2 D Fk C FkC1
D Fk C 1 C

k 1
X

Pk

1
i D0

Fi , and we

!
Fi

i D0

D 1C

k
X

Fi :

i D0

Lemma 19
...
k C 2/nd Fibonacci number satisfies FkC2

k


...
The base cases are for k D 0 and k D 1
...
The inductive step is for k
i D 0; 1; : : : ; k 1
...
23), x 2 D x C1
...
C 1/
D
k 2
2
(by equation (3
...


19
...
4
Let x be any node in a Fibonacci heap, and let k D x:degree
...
x/
p
k
, where D
...

FkC2
Proof Let sk denote the minimum possible size of any node of degree k in any
Fibonacci heap
...
The number sk is at most size
...
Consider some node ´, in any Fibonacci
heap, such that ´:degree D k and size
...
Because sk Ä size
...
x/ by computing a lower bound on sk
...
1, let y1 ; y2 ; : : : ; yk denote the children of ´ in the order in which they
were linked to ´
...
y1 / 1), giving
sk

size
...
1 (so that yi :degree i 2) and the
monotonicity of sk (so that syi : degree si 2 )
...

The bases, for k D 0 and k D 1, are trivial
...
We have
sk

2C

k
X

si

2

i D2

2C

k
X

Fi

i D2

D 1C

k
X

Fi

i D0

D FkC2
k

(by Lemma 19
...
3)
...
x/

sk

FkC2

k


...
5
The maximum degree D
...
lg n/
...

k

...
4, we have n
size
...
(In fact, because k is an integer, k Ä log n
...
n/ of any node is thus O
...

Exercises
19
...
lg n/
...

19
...
(The rule in Section 19
...
) For what values of k is D
...
lg n/?

Problems
19-1 Alternative implementation of deletion
Professor Pisano has proposed the following variant of the F IB -H EAP -D ELETE
procedure, claiming that it runs faster when the node being deleted is not the node
pointed to by H:min
...
H; x/
1 if x == H:min
2
F IB -H EAP -E XTRACT-M IN
...
H; x; y/
6
C ASCADING -C UT
...
The professor’s claim that this procedure runs faster is based partly on the assumption that line 7 can be performed in O
...
What is wrong with
this assumption?
b
...
Your bound should be in terms of x:degree and the number c of
calls to the C ASCADING -C UT procedure
...
Suppose that we call P ISANO -D ELETE
...
Assuming that node x is not a root, bound the potential of H 0 in
terms of x:degree, c, t
...
H /
...
Conclude that the amortized time for P ISANO -D ELETE is asymptotically no
better than for F IB -H EAP -D ELETE, even when x ¤ H:min
...
5
...

As shown in Figure 19
...
The
binomial tree Bk consists of two binomial trees Bk 1 that are linked together so
that the root of one is the leftmost child of the root of the other
...
6(b)
shows the binomial trees B0 through B4
...
Show that for the binomial tree Bk ,
1
...

3
...


there are 2k nodes,
the height of the tree is k,
there are exactly k nodes at depth i for i D 0; 1; : : : ; k, and
i
the root has degree k, which is greater than that of any other node; moreover,
as Figure 19
...


A binomial heap H is a set of binomial trees that satisfies the following properties:
1
...

2
...

3
...

b
...
Discuss the relationship
between the binomial trees that H contains and the binary representation of n
...


528

Chapter 19 Fibonacci Heaps

(a)
Bk–1

Bk–1
B0

Bk

depth
0
1
2

(b)

3
4
B0

B1

B2

B3

B4

(c)
B0

Bk–1

B2

Bk–2

B1

Bk

Figure 19
...
Triangles represent rooted subtrees
...
Node depths in B4 are shown
...


Suppose that we represent a binomial heap as follows
...
4 represents each binomial tree within a binomial
heap
...
The roots form a
singly linked root list, ordered by the degrees of the roots (from low to high), and
we access the binomial heap by a pointer to the first node on the root list
...
Complete the description of how to represent a binomial heap (i
...
, name the
attributes, describe when attributes have the value NIL, and define how the root
list is organized), and show how to implement the same seven operations on
binomial heaps as this chapter implemented on Fibonacci heaps
...
lg n/ worst-case time, where n is the number of nodes in

Problems for Chapter 19

529

the binomial heap (or in the case of the U NION operation, in the two binomial
heaps that are being united)
...

d
...
e
...
How would the trees in a Fibonacci heap resemble those in a binomial
heap? How would they differ? Show that the maximum degree in an n-node
Fibonacci heap would be at most blg nc
...
Professor McGee has devised a new data structure based on Fibonacci heaps
...
The implementations of the operations are the
same as for Fibonacci heaps, except that insertion and union consolidate the
root list as their last step
...

a
...
H; x; k/ changes the key of node x
to the value k
...

b
...
H; r/, which deletes
q D min
...
You may choose any q nodes to delete
...
(Hint: You may need
to modify the data structure and potential function
...
In
this problem, we shall implement 2-3-4 heaps, which support the mergeable-heap
operations
...
In 2-3-4 heaps,
only leaves store keys, and each leaf x stores exactly one key in the attribute x:key
...
Each internal node x contains
a value x:small that is equal to the smallest key stored in any leaf in the subtree
rooted at x
...
Finally, 2-3-4 heaps are designed to be kept in main memory, so that disk
reads and writes are not needed
...
In parts (a)–(e), each operation
should run in O
...
The U NION operation
in part (f) should run in O
...

a
...

b
...

c
...

d
...

e
...

f
...


Chapter notes
Fredman and Tarjan [114] introduced Fibonacci heaps
...

Subsequently, Driscoll, Gabow, Shrairman, and Tarjan [96] developed “relaxed
heaps” as an alternative to Fibonacci heaps
...
One gives the same amortized time bounds as Fibonacci heaps
...
1/ worst-case (not amortized) time and
E XTRACT-M IN and D ELETE to run in O
...
Relaxed heaps
also have some advantages over Fibonacci heaps in parallel algorithms
...


20

van Emde Boas Trees

In previous chapters, we saw data structures that support the operations of a priority
queue—binary heaps in Chapter 6, red-black trees in Chapter 13,1 and Fibonacci
heaps in Chapter 19
...
lg n/ time, either worst case or amortized
...
n lg n/ lower
bound for sorting in Section 8
...
lg n/ time
...
lg n/ time, then we could sort n keys in o
...

We saw in Chapter 8, however, that sometimes we can exploit additional information about the keys to sort in o
...
In particular, with counting sort
we can sort n keys, each an integer in the range 0 to k, in time ‚
...
n/ when k D O
...

Since we can circumvent the
...
lg n/ time in a similar scenario
...
lg lg n/ worst-case time
...

Specifically, van Emde Boas trees support each of the dynamic set operations
listed on page 230—S EARCH, I NSERT, D ELETE, M INIMUM, M AXIMUM, S UC CESSOR , and P REDECESSOR —in O
...
In this chapter, we will omit
discussion of satellite data and focus only on storing keys
...


532

Chapter 20 van Emde Boas Trees

operation, we will implement the simpler operation M EMBER
...

So far, we have used the parameter n for two distinct purposes: the number of
elements in the dynamic set, and the range of the possible values
...
lg lg u/ time
...
We assume
throughout this chapter that u is an exact power of 2, i
...
, u D 2k for some integer
k 1
...
1 starts us out by examining some simple approaches that will get
us going in the right direction
...
2,
introducing proto van Emde Boas structures, which are recursive but do not achieve
our goal of O
...
Section 20
...
lg lg u/ time
...
1 Preliminary approaches
In this section, we shall examine various approaches for storing a dynamic set
...
lg lg u/ time bounds that we desire, we will gain
insights that will help us understand van Emde Boas trees when we see them later
in this chapter
...
1, provides the simplest approach to
storing a dynamic set
...
1-2
...
The
entry AŒx holds a 1 if the value x is in the dynamic set, and it holds a 0 otherwise
...
1/ time with a bit vector, the remaining operations—M INIMUM, M AXIMUM,
S UCCESSOR, and P REDECESSOR—each take ‚
...
1 Preliminary approaches

533

1
1

1

1

1

0

1

0

1

1

1

0

0

0

A 0

0

1

1

1

1

0 1

0

0

0

0

1

2

3

4

5

6

8

9

10 11 12 13 14 15

7

0

0

1
0

1

1

Figure 20
...
Each internal node contains a 1 if and only if some leaf in
its subtree contains a 1
...


we might have to scan through ‚
...
2 For example, if a set contains only
the values 0 and u 1, then to find the successor of 0, we would have to scan
entries 1 through u 2 before finding a 1 in AŒu 1
...
Figure 20
...
The entries of the bit vector form the
leaves of the binary tree, and each internal node contains a 1 if and only if any leaf
in its subtree contains a 1
...

The operations that took ‚
...

To find the maximum value in the set, start at the root and head down toward
the leaves, always taking the rightmost node containing a 1
...


534

Chapter 20 van Emde Boas Trees

To find the successor of x, start at the leaf indexed by x, and head up toward the
root until we enter a node from the left and this node has a 1 in its right child ´
...
e
...

To find the predecessor of x, start at the leaf indexed by x, and head up toward
the root until we enter a node from the right and this node has a 1 in its left
child ´
...
e
...

Figure 20
...

We also augment the I NSERT and D ELETE operations appropriately
...
When deleting a value, we go from the appropriate leaf up to
the root, recomputing the bit in each internal node on the path as the logical-or of
its two children
...
lg u/
time in the worst case
...
We can
still perform the M EMBER operation in O
...
lg n/ time
...

Superimposing a tree of constant height
What happens if we superimpose a tree with greater degree? Let us assume that
p
the size of the universe is u D 22k for some integer k, so that u is an integer
...
Figure 20
...
1
...

As before, each internal node stores the logical-or of the bits within p subits
p
tree, so that the u internal nodes at depth 1 summarize each group of u values
...
2(b) demonstrates, we can think of these nodes as an array
p
summaryŒ0 : : u 1, where summaryŒi contains a 1 if and only if the subarp
p
p
ray AŒi u : :
...
We call this u-bit subarray of A
the ith p
cluster
...
1/-time operation: to insert x, set
ber bx= uc
...
We can use the summary array to perform

20
...
2 (a) A tree of degree u superimposed on top of the same bit vector as in Figure 20
...

Each internal node stores the logical-or of the bits in its subtree
...
i C 1/ u 1
...
u/ time:
To find the minimum (maximum) value, find the leftmost (rightmost) entry in
summary that contains a 1, say summaryŒi, and then do a linear search within
the ith cluster for the leftmost (rightmost) 1
...
If we find a 1, that position gives the result
...
The first
position that holds a 1 gives the index of a cluster
...
That position holds the successor (predecessor)
...
Set AŒx to 0 and then set summaryŒi
to the logical-or of the bits in the ith cluster
...
u/ time
...
Superimposing a binary tree gave us O
...
u/ time
...
We continue down this path in the next section
...
1-1
Modify the data structures in this section to support duplicate keys
...
1-2
Modify the data structures in this section to support keys that have associated satellite data
...
1-3
Observe that, using the structures in this section, the way we find the successor and
predecessor of a value x does not depend on whether x is in the set at the time
...

20
...
What would be the height of
such a tree, and how long would each of the operations take?

20
...
In the previous section, we used a summary structure of size u, with
p
each entry pointing to another stucture of size u
...

level
Starting with a universe of size u, we make structures holding u D u1=2 items,
which themselves hold structures of u1=4 items, which hold structures of u1=8 items,
and so on, down to a base size of 2
...
This restriction would be quite severe in practice,
allowing only values of u in the sequence 2; 4; 16; 256; 65536; : : :
...
Since the structure we examine in this section is only a precursor
to the true van Emde Boas tree structure, we tolerate this restriction in favor of
aiding our understanding
...
lg lg u/ for the operations, let’s think about how we might obtain such running times
...
3, we saw that by changing variables, we could show that the recurrence
p ˘
n C lg n
(20
...
n/ D 2T
has the solution T
...
lg n lg lg n/
...
2)
T
...
u/ C O
...
2 A recursive structure

537

If we use the same technique, changing variables, we can show that recurrence (20
...
u/ D O
...
Let m D lg u, so that u D 2m
and we have
T
...
2m=2 / C O
...
m/ D T
...
m/ D S
...
1/ :
By case 2 of the master method, this recurrence has the solution S
...
lg m/
...
m/ to T
...
u/ D T
...
m/ D O
...
lg lg u/
...
2) will guide our search for a data structure
...

When an operation traverses this data structure, it will spend a constant amount of
time at each level before recursing to the level below
...
2) will then
characterize the running time of the operation
...
2)
...
If we consider how many bits
we need to store the universe size at each level, we need lg u at the top level, and
each level needs half the bits of the previous level
...
Since b D lg u, we see that after lg lg u levels, we have a universe
size of 2
...
2, a given value x resides in
the
cluster number bx= uc
...
lg u/=2 bits of x
...
lg u/=2 bits of x
...
x/ D x= u ;
p
low
...
x; y/ D x u C y :
The function high
...
lg u/=2 bits of x, producing the
number of x’s cluster
...
x/ gives the least significant
...
The function index
...
lg u/=2 bits of the
element number and y as the least significant
...
We have the identity
x D index
...
x/; low
...
The value of u used by each of these functions will

538

Chapter 20 van Emde Boas Trees

proto- EB
...
u/ structure

p
p
u proto- EB
...
3 The information in a proto- EB
...
The structure contains the
p
p
universe size u, a pointer summary to a proto- EB
...
u/ structures
...

20
...
1

Proto van Emde Boas structures

Taking our cue from recurrence (20
...
Although this data structure will fail to achieve our goal of
O
...
3
...
u/, recursively as
follows
...
u/ structure contains an attribute u giving its universe
size
...

k

Otherwise, u D 22 for some integer k
1, so that u
4
...
u/ contains the following
attributes, illustrated in Figure 20
...
u/ structure and
p
p
p
an array clusterŒ0 : : u 1 of u pointers, each to a proto- EB
...

The element x, where 0 Ä x < u, is recursively stored in the cluster numbered
high
...
x/ within that cluster
...
From thep
entry, we can compute the starting index of the subarray of size u that the bit
summarizes
...
2 A recursive structure

539

proto-vEB(16)

elements 0,1

u
2
A
0
1 0
0

proto-vEB(2)

proto-vEB(4)
u
summary
4

u
2
A
0
1 0
0

u
2
A
1
1 1
0

proto-vEB(4)
u
summary
4

u
2
A
1
1 1
0

elements 2,3

cluster
0

1

u
2
A
0
1 0
0

elements 8,9 elements 10,11

u
2
A
1
1 1
0

elements 4,5

proto-vEB(4)
u
summary
4

u
2
A
0
1 1
0

u
2
A
0
1 0
0

cluster
0

proto-vEB(2)

A
0
1 0
0

1

proto-vEB(2)

u
2

0

1

u
2
A
0
1 1
0

elements 6,7

cluster
0

proto-vEB(2)

clusters 2,3

3

proto-vEB(2)

A
0
1 1
0

2

proto-vEB(2)

u
2

cluster

proto-vEB(2)

A
0
1 1
0

proto-vEB(4)
u
summary
4

proto-vEB(2)

u
2

1

cluster

proto-vEB(2)

clusters 0,1

1

proto-vEB(2)

A
1
1 1
0

0

proto-vEB(2)

A
1
1 1
0

u
2

cluster

proto-vEB(2)

u
2

proto-vEB(2)

proto-vEB(2)

proto-vEB(4)
u
summary
4

0

summary

proto-vEB(2)

u 16

1

u
2
A
1
1 1
0

elements 12,13 elements 14,15

Figure 20
...
16/ structure representing the set f2; 3; 4; 5; 7; 14; 15g
...
4/ structures in clusterŒ0 : : 3, and to a summary structure, which is also a proto- EB
...

Each proto- EB
...
2/ structures in clusterŒ0 : : 1, and to a
proto- EB
...
Each proto- EB
...

The proto- EB
...
2/ structures above “clusters i,j ” store the summary bits for clusters i and j in the
top-level proto- EB
...
For clarity, heavy shading indicates the top level of a proto-vEB
structure that stores summary information for its parent structure; such a proto-vEB structure is
otherwise identical to any other proto-vEB structure with the same universe size
...
The array summary contains the summary bits stored recursively in a
p
proto-vEB structure, and the array cluster contains u pointers
...
4 shows a fully expanded proto- EB
...
If the value i is in the proto-vEB structure pointed to by
summary, then the ith cluster contains some value in the set being p
represented
...
i C 1/ u 1, which form the ith cluster
...
2/ structures, and the remaining proto- EB
...
Beneath each of the non-summary base structures, the figure indicates which bits it stores
...
2/ structure labeled
“elements 6,7” stores bit 6 (0, since element 6 is not in the set) in its AŒ0 and
bit 7 (1, since element 7 is in the set) in its AŒ1
...
u/ structure
...
16/ structure are in the leftmost proto- EB
...
2/ structures
...
2/ structure labeled “clusters 2,3” has AŒ0 D 0, indicating that
cluster 2 of the proto- EB
...
Each proto- EB
...
2/ structure
...
2/ structure just to the left of the one labeled “elements 0,1
...

20
...
2

Operations on a proto van Emde Boas structure

We shall now describe how to perform operations on a proto-vEB structure
...
We then discuss
I NSERT and D ELETE
...
2-1
...
Each of these
operations assumes that 0 Ä x < V:u
...
x/, we need to find the bit corresponding to x within the
appropriate proto- EB
...
We can do so in O
...
2 A recursive structure

541

the summary structures altogether
...

P ROTO - V EB-M EMBER
...
V:clusterŒhigh
...
x//
The P ROTO - V EB-M EMBER procedure works as follows
...
2/ structure
...
Line 3 deals with the
recursive case, “drilling down” into the appropriate smaller proto-vEB structure
...
x/ deThe value high
...
u/ p
termines which element within that proto- EB
...

Let’s see what happens when we call P ROTO - V EB-M EMBER
...
16/ structure in Figure 20
...
Since high
...
4/ structure in the upper right, and we ask about element low
...
In this recursive call, u D 4, and so we recurse
again
...
2/ D 1 and low
...
2/ structure in the upper right
...
Thus, we get the result that P ROTO - V EB-M EMBER
...

To determine the running time of P ROTO - V EB-M EMBER, let T
...
u/ structure
...

When P ROTO - V EB-M EMBER makes a recursive call, it makes a call on a
p
proto- EB
...
Thus, we can characterize the running time by the recurp
rence T
...
u/ C O
...
2)
...
u/ D O
...
lg lg u/
...
The procedure
P ROTO - V EB-M INIMUM
...


542

Chapter 20 van Emde Boas Trees

P ROTO - V EB-M INIMUM
...
V:summary/
8
if min-cluster == NIL
9
return NIL
10
else offset D P ROTO - V EB-M INIMUM
...
min-cluster; offset/
This procedure works as follows
...
Lines 7–11 handle the recursive case
...
It does so by recurp
sively calling P ROTO - V EB-M INIMUM on V:summary, which is a proto- EB
...
Line 7 assigns this cluster number to the variable min-cluster
...
Otherwise,
the minimum element of the set is somewhere in cluster number min-cluster
...
Finally, line 11 constructs the value of the minimum element from
the cluster number and offset, and it returns this value
...
u/ structures, it does not run in O
...
Letting T
...
u/ structure, we have the recurrence
p
(20
...
u/ D 2T
...
1/ :
Again, we use a change of variables to solve this recurrence, letting m D lg u,
which gives
T
...
2m=2 / C O
...
m/ D T
...
m/ D 2S
...
1/ ;
which, by case 1 of the master method, has the solution S
...
m/
...
m/ to T
...
u/ D T
...
m/ D ‚
...
lg u/
...
lg u/ time rather than the desired O
...


20
...
In the worst case, it makes two recursive
calls, along with a call to P ROTO - V EB-M INIMUM
...
V; x/ returns the smallest element in the proto-vEB structure V that
is greater than x, or NIL if no element in V is greater than x
...

P ROTO - V EB-S UCCESSOR
...
V:clusterŒhigh
...
x//
6
if offset ¤ NIL
7
return index
...
x/; offset/
8
else succ-cluster D P ROTO - V EB-S UCCESSOR
...
x//
9
if succ-cluster == NIL
10
return NIL
11
else offset D P ROTO - V EB-M INIMUM
...
succ-cluster; offset/
The P ROTO - V EB-S UCCESSOR procedure works as follows
...
2/ structure is when x D 0 and AŒ1
is 1
...
Line 5 searches for a successor to x
within x’s cluster, assigning the result to offset
...
Otherwise, we have to search in other clusters
...
Line 9 tests whether succ-cluster is NIL, with line 10 returning NIL
if all succeeding clusters are empty
...

In the worst case, P ROTO - V EB-S UCCESSOR calls itself recursively twice on
p
proto- EB
...
u/ structure
...
u/ of P ROTO - V EB-S UCCESSOR is
p
p
T
...
u/ C ‚
...
u/ C ‚
...
1) to show
that this recurrence has the solution T
...
lg u lg lg u/
...

Inserting an element
To insert an element, we need to insert it into the appropriate cluster and also set
the summary bit for that cluster to 1
...
V; x/
inserts the value x into the proto-vEB structure V
...
V; x/
1 if V:u == 2
2
V:AŒx D 1
3 else P ROTO - V EB-I NSERT
...
x/; low
...
V:summary; high
...
In the recursive
case, the recursive call in line 3 inserts x into the appropriate cluster, and line 4
sets the summary bit for that cluster to 1
...
3) characterizes its running time
...
lg u/ time
...
Whereas we can always
set a summary bit to 1 when inserting, we cannot always reset the same summary
bit to 0 when deleting
...
As we have defined proto-vEB structures, we would have to examine
p
all u bits within a cluster to determine whether any of them are 1
...
We leave implementation of P ROTO - V EB-D ELETE as Exercises
20
...
2-3
...
We will see in the next section how to do so
...
2-1
Write pseudocode for the procedures P ROTO - V EB-M AXIMUM and P ROTO - V EBP REDECESSOR
...
3 The van Emde Boas tree

545

20
...
It should update the appropriate
summary bit by scanning the related bits within the cluster
...
2-3
Add the attribute n to each proto-vEB structure, giving the number of elements
currently in the set it represents, and write pseudocode for P ROTO - V EB-D ELETE
that uses the attribute n to decide when to reset summary bits to 0
...
2-4
Modify the proto-vEB structure to support duplicate keys
...
2-5
Modify the proto-vEB structure to support keys that have associated satellite data
...
2-6
Write pseudocode for a procedure that creates a proto- EB
...

20
...

20
...
What would the running times of each operation be?

20
...
lg lg u/ running times
...
In this section, we shall design a data structure that
is similar to the proto-vEB structure but stores a little more information, thereby
removing the need for some of the recursion
...
2, we observed that the assumption that we made about the unik
verse size—that u D 22 for some integer k—is unduly restrictive, confining the
possible values of u an overly sparse set
...
u/

min

max
0

summary

p
EB
...
# u/ trees

Figure 20
...
u/ tree when u > 2
...
" u/ tree, and an array
p
p
clusterŒ0 : : " u 1 of " u pointers to EB
...


ger—that is, if u is an odd power of 2 (u D 22kC1 for some integer k 0)—then
we will divide the lg u bits of a number into the most significant d
...
lg u/=2c bits
...
lg u/=2e (the “upp
p
per square root” of u) by " u and 2b
...
Because we now allow u to be an odd power of 2,
we must redefine our helpful functions from Section 20
...
x/ D x= # u ;
p
low
...
x; y/ D x # u C y :
20
...
1

van Emde Boas trees

The van Emde Boas tree, or vEB tree, modifies the proto-vEB structure
...
u/ and, unless u equals the
p
base size of 2, the attribute summary points to a EB
...
# u/ trees
...
5 illustrates, a
vEB tree contains two attributes not found in a proto-vEB structure:
min stores the minimum element in the vEB tree, and
max stores the maximum element in the vEB tree
...
# u/ trees that the cluster array points to
...
u/ tree V , therefore, are V:min plus all the elements recursively stored in
p
p
the EB
...
Note that when a vEB
tree contains two or more elements, we treat min and max differently: the element

20
...

Since the base size is 2, a EB
...
2/ structure has
...
In a vEB tree with no elements, regardless of its
universe size u, both min and max are NIL
...
6 shows a EB
...
Because the smallest element is 2, V:min equals 2, and even though high
...
4/ tree pointed to by V:clusterŒ0: notice
that V:clusterŒ0:min equals 3, and so 2 is not in this vEB tree
...
2/ clusters within V:clusterŒ0 are empty
...
These attributes will help us in
four ways:
1
...

2
...
x/
...
A symmetric argument holds for P REDECESSOR and
min
...
We can tell whether a vEB tree has no elements, exactly one element, or at least
two elements in constant time from its min and max values
...
If min and max are both NIL, then
the vEB tree has no elements
...
Otherwise, both min and max
are non-NIL but are unequal, and the vEB tree has two or more elements
...
If we know that a vEB tree is empty, we can insert an element into it by updating
only its min and max attributes
...
Similarly, if we know that a vEB tree has only one element, we
can delete that element in constant time by updating only min and max
...

Even if the universe size u is an odd power of 2, the difference in the sizes
of the summary vEB tree and the clusters will not turn out to affect the asymptotic
running times of the vEB-tree operations
...
4)
T
...
" u/ C O
...
6 A EB
...
4
...
Slashes indicate NIL values
...
Heavy shading serves the same purpose here as in Figure 20
...


20
...
2), and we will solve it in a similar
fashion
...
2m / Ä T
...
1/ :
Noting that dm=2e Ä 2m=3 for all m

2, we have

T
...
22m=3 / C O
...
m/ D T
...
m/ Ä S
...
1/ ;
which, by case 2 of the master method, has the solution S
...
lg m/
...
u/ D T
...
m/ D
O
...
lg lg u/
...
As Problem 20-1 asks you to show, the total space requirement of
a van Emde Boas tree is O
...
u/ time
...

Therefore, we might not want to use a van Emde Boas tree when we perform only
a small number of operations, since the time to create the data structure would
exceed the time saved in the individual operations
...

20
...
2

Operations on a van Emde Boas tree

We are now ready to see how to perform operations on a van Emde Boas tree
...
Due to the slight asymmetry between
the minimum and maximum elements in a vEB tree—when a vEB tree contains
at least two elements, the minumum element does not appear within a cluster but
the maximum element does—we will provide pseudocode for all five querying operations
...

Finding the minimum and maximum elements
Because we store the minimum and maximum in the attributes min and max, two
of the operations are one-liners, taking constant time:

550

Chapter 20 van Emde Boas Trees

V EB-T REE -M INIMUM
...
V /

1

return V:max

Determining whether a value is in the set
The procedure V EB-T REE -M EMBER
...
We also check directly whether x equals the minimum or maximum element
...

V EB-T REE -M EMBER
...
V:clusterŒhigh
...
x//

Line 1 checks to see whether x equals either the minimum or maximum element
...
Otherwise, line 3 tests for the base case
...
2/ tree has no elements other than those in min and max, if it is the base
case, line 4 returns FALSE
...

Recurrence (20
...
lg lg u/ time
...
Recall that the procedure P ROTO - V EB-S UCCESSOR
...
Because we can access the
maximum value in a vEB tree quickly, we can avoid making two recursive calls,
and instead make one recursive call on either a cluster or on the summary, but not
on both
...
3 The van Emde Boas tree

551

V EB-T REE -S UCCESSOR
...
V:clusterŒhigh
...
x/ < max-low
offset D V EB-T REE -S UCCESSOR
...
x/; low
...
high
...
V:summary; high
...
V:clusterŒsucc-cluster/
return index
...
We start with the
base case in lines 2–4, which returns 1 in line 3 if we are trying to find the successor
of 0 and 1 is in the 2-element set; otherwise, the base case returns NIL in line 4
...
If so, then we simply return the minimum element in
line 6
...
Line 7 assigns to
max-low the maximum element in x’s cluster
...
Line 8 tests for this condition
...

We get to line 11 if x is greater than or equal to the greatest element in its
cluster
...

It is easy to see how recurrence (20
...
Depending on the result of the test in line 7, the procedure
p
calls itself recursively in either line 9 (on pvEB tree with universe size # u) or
a
In
line 11 (on a vEB tree with universe size " u)
...
The remainder of the procedure, including the calls to V EB-T REE -M INIMUM and V EB-T REE -M AXIMUM,
takes O
...
Hence, V EB-T REE -S UCCESSOR runs in O
...


552

Chapter 20 van Emde Boas Trees

The V EB-T REE -P REDECESSOR procedure is symmetric to the V EB-T REE S UCCESSOR procedure, but with one additional case:
V EB-T REE -P REDECESSOR
...
V:clusterŒhigh
...
x/ > min-low
9
offset D V EB-T REE -P REDECESSOR
...
x/; low
...
high
...
V:summary; high
...
V:clusterŒpred-cluster/
17
return index
...
This case occurs when x’s predecessor,
if it exists, does not reside in x’s cluster
...
But if x’s predecessor is the minimum value in vEB
tree V , then the successor resides in no cluster at all
...

This extra case does not affect the asymptotic running time of V EB-T REE P REDECESSOR when compared with V EB-T REE -S UCCESSOR, and so V EBT REE -P REDECESSOR runs in O
...

Inserting an element
Now we examine how to insert an element into a vEB tree
...
The V EB-T REE -I NSERT procedure will make only one recursive call
...
If the cluster already has another element, then the cluster number
is already in the summary, and so we do not need to make that recursive call
...
3 The van Emde Boas tree

553

the cluster does not already have another element, then the element being inserted
becomes the only element in the cluster, and we do not need to recurse to insert an
element into an empty vEB tree:
V EB-E MPTY-T REE -I NSERT
...
V; x/,
which assumes that x is not already an element in the set represented by vEB
tree V :
V EB-T REE -I NSERT
...
V; x/
else if x < V:min
exchange x with V:min
if V:u > 2
if V EB-T REE -M INIMUM
...
x// == NIL
V EB-T REE -I NSERT
...
x//
V EB-E MPTY-T REE -I NSERT
...
x/; low
...
V:clusterŒhigh
...
x//
if x > V:max
V:max D x

This procedure works as follows
...
Lines 3–11 assume that V is not
empty, and therefore some element will be inserted into one of V ’s clusters
...

If x < min, as tested in line 3, then x needs to become the new min
...
In this case, line 4 exchanges x with min, so that we insert the original
min into one of V ’s clusters
...
Line 6 determines
whether the cluster that x will go into is currently empty
...
If x’s cluster is not currently empty, then line 9
inserts x into its cluster
...

Finally, lines 10–11 take care of updating max if x > max
...


554

Chapter 20 van Emde Boas Trees

Once again, we can easily see how recurrence (20
...
Depending on the result of the test in line 6, either the recursive call in line 7
p
(run on a vEB tree with universe size " u) or the recursive call in line 9 (run on
p
a vEB with universe size # u) executes
...
Because the remainder of V EBT REE -I NSERT takes O
...
4) applies, and so the running time
is O
...

Deleting an element
Finally, we look at how to delete an element from a vEB tree
...
V; x/ assumes that x is currently an element in the set represented by the vEB tree V
...
V; x/

1 if V:min == V:max
2
V:min D NIL
3
V:max D NIL
4 elseif V:u == 2
5
if x == 0
6
V:min D 1
7
else V:min D 0
8
V:max D V:min
9 else if x == V:min
10
first-cluster D V EB-T REE -M INIMUM
...
first-cluster;
V EB-T REE -M INIMUM
...
V:clusterŒhigh
...
x//
13
14
if V EB-T REE -M INIMUM
...
x// == NIL
15
V EB-T REE -D ELETE
...
x//
16
if x == V:max
17
summary-max D V EB-T REE -M AXIMUM
...
summary-max;
V EB-T REE -M AXIMUM
...
high
...
V:clusterŒhigh
...
3 The van Emde Boas tree

555

The V EB-T REE -D ELETE procedure works as follows
...
Lines 1–3 handle this case
...
Line 4 tests whether V is a base-case vEB
tree and, if so, lines 5–8 set min and max to the one remaining element
...
In this
case, we will have to delete an element from a cluster
...
If the test in line 9 reveals
that we are in this case, then line 10 sets first-cluster to the number of the cluster
that contains the lowest element other than min, and line 11 sets x to the value of
the lowest element in that cluster
...

When we reach line 13, we know that we need to delete element x from its
cluster, whether x was the value originally passed to V EB-T REE -D ELETE or x
is the element becoming the new minimum
...

That cluster might now become empty, which line 14 tests, and if it does, then
we need to remove x’s cluster number from the summary, which line 15 handles
...
Line 16 checks to see
whether we are deleting the maximum element in V and, if we are, then line 17 sets
summary-max to the number of the highest-numbered nonempty cluster
...
V:summary/ works because we have already recursively
called V EB-T REE -D ELETE on V:summary, and therefore V:summary:max has already been updated as necessary
...
Otherwise, line 20 sets max to the maximum element in the
highest-numbered cluster
...
)
Finally, we have to handle the case in which x’s cluster did not become empty
due to x being deleted
...
Line 21 tests for this case, and if we have to
update max, line 22 does so (again relying on the recursive call to have corrected
max in the cluster)
...
lg lg u/ time in the worst
case
...
4) does not always apply,
because a single call of V EB-T REE -D ELETE can make two recursive calls: one
on line 13 and one on line 15
...
In order for the recursive call on

556

Chapter 20 van Emde Boas Trees

line 15 to occur, the test on line 14 must show that x’s cluster is empty
...
But if x was the only element in its cluster,
then that recursive call took O
...
Thus,
we have two mutually exclusive possibilities:
The recursive call on line 13 took constant time
...

In either case, recurrence (20
...
lg lg u/
...
3-1
Modify vEB trees to support duplicate keys
...
3-2
Modify vEB trees to support keys that have associated satellite data
...
3-3
Write pseudocode for a procedure that creates an empty van Emde Boas tree
...
3-4
What happens if you call V EB-T REE -I NSERT with an element that is already in
the vEB tree? What happens if you call V EB-T REE -D ELETE with an element that
is not in the vEB tree? Explain why the procedures exhibit the behavior that they
do
...

20
...
If we were to modify the operations appropriately, what would be their
running times? For the purpose of analysis, assume that u1=k and u1 1=k are always
integers
...
3-6
Creating a vEB tree with universe size u requires O
...
Suppose we wish to
explicitly account for that time
...
lg lg u/?

Problems for Chapter 20

557

Problems
20-1 Space requirements for van Emde Boas trees
This problem explores the space requirements for van Emde Boas trees and suggests a way to modify the data structure to make its space requirement depend on
the number n of elements actuallyp
stored in the tree, rather than on the universe
size u
...

a
...
u/
of a van Emde Boas tree with universe size u:
p
p
p
(20
...
u/ D
...
u/ C ‚
...
Prove that recurrence (20
...
u/ D O
...

In order to reduce the space requirements, let us define a reduced-space van Emde
Boas tree, or RS-vEB tree, as a vEB tree V but with the following changes:
The attribute V:cluster, rather than being stored as a simple array of pointers to
p
vEB trees with universe size u, is a hash table (see Chapter 11) stored as a dynamic table (see Section 17
...
Corresponding to the array version p V:cluster,
of
the hash table stores pointers to RS-vEB trees with universe size u
...

The hash table stores only pointers to nonempty clusters
...

The attribute V:summary is NIL if all clustersp empty
...

Because the hash table is implemented with a dynamic table, the space it requires
is proportional to the number of nonempty clusters
...
u/
1 allocate a new vEB tree V
2 V:u D u
3 V:min D NIL
4 V:max D NIL
5 V:summary D NIL
6 create V:cluster as an empty dynamic hash table
7 return V

558

Chapter 20 van Emde Boas Trees

c
...
V; x/, which inserts x into the RS-vEB tree V ,
calling C REATE -N EW-RS- V EB-T REE as appropriate
...
Modify the V EB-T REE -S UCCESSOR procedure to produce pseudocode for
the procedure RS- V EB-T REE -S UCCESSOR
...

e
...
lg lg u/
expected time
...
Assuming that elements are never deleted from a vEB tree, prove that the space
requirement for the RS-vEB tree structure is O
...

g
...
How long does it take to create an empty RS-vEB tree?
20-2 y-fast tries
This problem investigates D
...
lg lg u/ worst-case time
...
lg lg u/
amortized time
...
n/ space to store n elements
...
5)
...
For example, if u D 16, so that lg u D 4, and
x D 13 is in the set, then because the binary representation of 13 is 1101, the
perfect hash table would contain the strings 1, 11, 110, and 1101
...

a
...
Show how to perform the M INIMUM and M AXIMUM operations in O
...
lg lg u/ time;
and the I NSERT and D ELETE operations in O
...

To reduce the space requirement to O
...
(Assume for now
that lg u divides n
...

We designate a “representative” value for each group
...
i C1/st group
...
) Note that a representative
might be a value not currently in the set
...
Each representative points to the balanced binary search
tree for its group, and each balanced binary search tree points to its group’s
representative
...

We call this structure a y-fast trie
...
Show that a y-fast trie requires only O
...

d
...
lg lg u/
time with a y-fast trie
...
Show how to perform the M EMBER operation in O
...

f
...
lg lg u/ time
...
Explain why the I NSERT and D ELETE operations take


...


h
...
lg lg u/ amortized time
without affecting the asymptotic running times of the other operations
...
van Emde Boas, who described
an early form of the idea in 1975 [339]
...

Mehlhorn and N¨ her [252] subsequently extended the ideas to apply to universe
a

560

Chapter 20 van Emde Boas Trees

sizes that are prime
...

Using the ideas behind van Emde Boas trees, Dementiev et al
...

Wang and Lin [347] designed a hardware-pipelined version of van Emde Boas
trees, which achieves constant amortized time per operation and uses O
...

A lower bound by Pˇ trascu and Thorup [273, 274] for finding the predecessor
a ¸
shows that van Emde Boas trees are optimal for this operation, even if randomization is allowed
...
These applications often need to perform two operations in particular: finding
the unique set that contains a given element and uniting two sets
...

Section 21
...
In Section 21
...
Section 21
...
The running time using the tree representation is theoretically superlinear, but for all practical purposes it is linear
...
4 defines
and discusses a very quickly growing function and its very slowly growing inverse,
which appears in the running time of operations on the tree-based implementation,
and then, by a complex amortized analysis, proves an upper bound on the running
time that is just barely superlinear
...
1 Disjoint-set operations
A disjoint-set data structure maintains a collection S D fS1 ; S2 ; : : : ; Sk g of disjoint dynamic sets
...
In some applications, it doesn’t matter which member is used as the
representative; we care only that if we ask for the representative of a dynamic set
twice without modifying the set between the requests, we get the same answer both
times
...

As in the other dynamic-set implementations we have studied, we represent each
element of a set by an object
...
x/ creates a new set whose only member (and thus representative)
is x
...

U NION
...
We assume that the two sets are disjoint prior to the operation
...
Since we require
the sets in the collection to be disjoint, conceptually we destroy sets Sx and Sy ,
removing them from the collection S
...

F IND -S ET
...

Throughout this chapter, we shall analyze the running times of disjoint-set data
structures in terms of two parameters: n, the number of M AKE -S ET operations,
and m, the total number of M AKE -S ET, U NION, and F IND -S ET operations
...

After n 1 U NION operations, therefore, only one set remains
...
Note also that since the M AKE -S ET
operations are included in the total number of operations m, we have m n
...

An application of disjoint-set data structures
One of the many applications of disjoint-set data structures arises in determining the connected components of an undirected graph (see Section B
...
Figure 21
...

The procedure C ONNECTED -C OMPONENTS that follows uses the disjoint-set
operations to compute the connected components of a graph
...
1
(In pseudocode, we denote the set of vertices of a graph G by G:V and the set of
edges by G:E
...
3-12)
...
In
this case, the implementation given here can be more efficient than running a new depth-first search
for each new edge
...
1 Disjoint-set operations

a

b

e

c

d

563

f

h

g

j

i
(a)

Edge processed
initial sets
(b,d)
(e,g)
(a,c)
(h,i)
(a,b)
(e, f )
(b,c)

{a}
{a}
{a}
{a,c}
{a,c}
{a,b,c,d}
{a,b,c,d}
{a,b,c,d}

Collection of disjoint sets
{b}
{c} {d} {e}
{f} {g}
{b,d} {c}
{e}
{f} {g}
{b,d} {c}
{e,g}
{f}
{b,d}
{e,g}
{f}
{b,d}
{e,g}
{f}
{e,g}
{f}
{e, f,g}
{e, f,g}

{h}
{h}
{h}
{h}
{h,i}
{h,i}
{h,i}
{h,i}

{i}
{i}
{i}
{i}

{j}
{j}
{j}
{j}
{j}
{j}
{j}
{j}

(b)

Figure 21
...

(b) The collection of disjoint sets after processing each edge
...
G/
1 for each vertex 2 G:V
2
M AKE -S ET
...
u; / 2 G:E
4
if F IND -S ET
...
/
5
U NION
...
u; /
1 if F IND -S ET
...
/
2
return TRUE
3 else return FALSE
The procedure C ONNECTED -C OMPONENTS initially places each vertex in its
own set
...
u; /, it unites the sets containing u and
...
1-2, after processing all the edges, two vertices are in the same connected component if and only if the corresponding objects are in the same set
...
Figure 21
...

In an actual implementation of this connected-components algorithm, the representations of the graph and the disjoint-set data structure would need to reference
each other
...
These programming details
depend on the implementation language, and we do not address them further here
...
1-1
Suppose that C ONNECTED -C OMPONENTS is run on the undirected graph G D

...
d; i/;
...
g; i/;
...
a; h/;
...
d; k/;
...
d; f /;

...
a; e/
...

21
...

21
...
V; E/ with k connected components, how many times is F IND -S ET called? How
many times is U NION called? Express your answers in terms of jV j, jEj, and k
...
2 Linked-list representation of disjoint sets
Figure 21
...
The object for each set has attributes head,
pointing to the first object in the list, and tail, pointing to the last object
...
Within each linked list, the objects may appear in
any order
...

With this linked-list representation, both M AKE -S ET and F IND -S ET are easy,
requiring O
...
To carry out M AKE -S ET
...
For F IND -S ET
...
For
example, in Figure 21
...
g/ would return f
...
2 Linked-list representation of disjoint sets

(a)

f

g

565

d

c

head

h

e

b

head

S1

S2
tail

tail

f

(b)

g

d

c

h

e

b

head
S1
tail

Figure 21
...
Set S1 contains members d , f , and g, with
representative f , and set S2 contains members b, c, e, and h, with representative c
...
Each set object has pointers head and tail to the first and last objects, respectively
...
g; e/, which appends the linked list containing e to the linked list containing g
...
The set object for e’s list, S2 , is destroyed
...
As Figure 21
...
x; y/ by appending y’s list onto the end
of x’s list
...
We use the tail pointer for x’s list to quickly find where to append y’s list
...

Unfortunately, we must update the pointer to the set object for each object originally on y’s list, which takes time linear in the length of y’s list
...
2, for
example, the operation U NION
...

In fact, we can easily construct a sequence of m operations on n objects that
requires ‚
...
Suppose that we have objects x1 ; x2 ; : : : ; xn
...
3, so that m D 2n 1
...
n/ time performing the n
M AKE -S ET operations
...
x1 /
M AKE -S ET
...
xn /
U NION
...
x3 ; x2 /
U NION
...
xn ; xn 1 /

Number of objects updated
1
1
:
:
:
1
1
2
3
:
:
:
n 1

Figure 21
...
n2 / time, or ‚
...

n 1
X

i D ‚
...
n/ time
...
n/
...
n/ time per call because we may be appending a longer list onto
a shorter list; we must update the pointer to the set object for each member of
the longer list
...
With this simple weighted-union heuristic, a single U NION operation can still take
...
n/ members
...
m C n lg n/
time
...
1
Using the linked-list representation of disjoint sets and the weighted-union heuristic, a sequence of m M AKE -S ET, U NION, and F IND -S ET operations, n of which
are M AKE -S ET operations, takes O
...


21
...
We now bound the total time taken by these
U NION operations
...
Consider a
particular object x
...
The first time x’s pointer was updated, therefore, the
resulting set must have had at least 2 members
...
Continuing on,
we observe that for any k Ä n, after x’s pointer has been updated dlg ke times,
the resulting set must have at least k members
...
Thus the total time spent updating object pointers over all U NION
operations is O
...
We must also account for updating the tail pointers and
the list lengths, which take only ‚
...
The total time
spent in all U NION operations is thus O
...

The time for the entire sequence of m operations follows easily
...
1/ time, and there are O
...
The
total time for the entire sequence is thus O
...

Exercises
21
...
Make sure to specify the attributes
that you assume for set objects and list objects
...
2-2
Show the data structure that results and the answers returned by the F IND -S ET
operations in the following program
...

1
2
3
4
5
6
7
8
9
10
11

for i D 1 to 16
M AKE -S ET
...
xi ; xi C1 /
for i D 1 to 13 by 4
U NION
...
x1 ; x5 /
U NION
...
x1 ; x10 /
F IND -S ET
...
x9 /

568

Chapter 21 Data Structures for Disjoint Sets

Assume that if the sets containing xi and xj have the same size, then the operation
U NION
...

21
...
1 to obtain amortized time bounds
of O
...
lg n/ for U NION using the linkedlist representation and the weighted-union heuristic
...
2-4
Give a tight asymptotic bound on the running time of the sequence of operations in
Figure 21
...

21
...
Show that the professor’s suspicion is well founded
by describing how to represent each set by a linked list such that each operation
has the same running time as the operations described in this section
...
Your scheme should allow for the weighted-union
heuristic, with the same effect as described in this section
...
)
21
...
Whether
or not the weighted-union heuristic is used, your change should not change the
asymptotic running time of the U NION procedure
...
)

21
...
In a disjointset forest, illustrated in Figure 21
...
The
root of each tree contains the representative and is its own parent
...


21
...
4 A disjoint-set forest
...
2
...
(b) The result of U NION
...


We perform the three disjoint-set operations as follows
...
We perform a F IND -S ET operation by
following parent pointers until we find the root of the tree
...
A U NION operation,
shown in Figure 21
...

Heuristics to improve the running time
So far, we have not improved on the linked-list implementation
...
By
using two heuristics, however, we can achieve a running time that is almost linear
in the total number of operations m
...
The obvious approach would be to make
the root of the tree with fewer nodes point to the root of the tree with more nodes
...
For each node, we maintain a
rank, which is an upper bound on the height of the node
...

The second heuristic, path compression, is also quite simple and highly effective
...
5, we use it during F IND -S ET operations to make each
node on the find path point directly to the root
...


570

Chapter 21 Data Structures for Disjoint Sets

f

e
f
d

c

a

b

c

d

e

b

a

(a)

(b)

Figure 21
...
Arrows and self-loops at roots are
omitted
...
a/
...
Each node has a pointer to its parent
...
a/
...


Pseudocode for disjoint-set forests
To implement a disjoint-set forest with the union-by-rank heuristic, we must keep
track of ranks
...
When M AKE -S ET creates a singleton set, the
single node in the corresponding tree has an initial rank of 0
...
The U NION operation has two cases, depending
on whether the roots of the trees have equal rank
...
If, instead, the roots have equal ranks, we
arbitrarily choose one of the roots as the parent and increment its rank
...
We designate the parent of node x
by x:p
...


21
...
x/
1 x:p D x
2 x:rank D 0
U NION
...
F IND -S ET
...
y//
L INK
...
x/
1 if x ¤ x:p
2
x:p D F IND -S ET
...
Each
call of F IND -S ET
...
If x is the root, then F IND -S ET skips
line 2 and instead returns x:p, which is x; this is the case in which the recursion
bottoms out
...
Line 2 updates node x to point directly to the root,
and line 3 returns this pointer
...
Alone, union by rank yields a running time
of O
...
4-4), and this bound is tight (see Exercise 21
...

Although we shall not prove it here, for a sequence of n M AKE -S ET operations (and hence at most n 1 U NION operations) and f F IND -S ET operations, the path-compression heuristic alone gives a worst-case running time of

...
1 C log2Cf =n n//
...
m ˛
...
n/ is a very slowly growing function, which we define in Section 21
...
In any conceivable application of a disjoint-set data structure,
˛
...
Strictly speaking, however, it is superlinear
...
4, we prove this
upper bound
...
3-1
Redo Exercise 21
...

21
...

21
...
m lg n/ time when we use union by rank
only
...
3-4
Suppose that we wish to add the operation P RINT-S ET
...
Show how we can add just
a single attribute to each node in a disjoint-set forest so that P RINT-S ET
...
Assume that we can print each member of
the set in O
...

21
...
m/ time if we use both path compression and union by rank
...
4 Analysis of union by rank with path compression

573

? 21
...
3, the combined union-by-rank and path-compression heuristic runs in time O
...
n// for m disjoint-set operations on n elements
...
Then we
prove this running time using the potential method of amortized analysis
...
j / as
(
j C1
if k D 0 ;
Ak
...
j C1/
Ak 1
...
j C1/
...

tion 3
...
Specifically, A
...
j / D j and A
...
j / D Ak 1
...
i 1
...

The function Ak
...
To see just how quickly
this function grows, we first obtain closed-form expressions for A1
...
j /
...
2
For any integer j

1, we have A1
...


Proof We first use induction on i to show that A
...
j / D j Ci
...
0/
...
For the inductive step, assume that A
...
j / D
0
0
j C
...
Then A
...
j / D A0
...
i 1/
...
j C
...
Finally,
0
0
we note that A1
...
j C1/
...
j C 1/ D 2j C 1
...
3
For any integer j

1, we have A2
...
j C 1/

1
...
i /
...
j C 1/ 1
...
0/
...
j C 1/ 1
...
i 1/
...
j C 1/ 1
...
i /
...
A
...
j // D
1
1
1
A1
...
j C 1/ 1/ D 2
...
j C1/ 1/C1 D 2i
...
j C1/ 1
...
j / D A
...
j / D 2j C1
...

1
Now we can see how quickly Ak
...
1/ for levels
k D 0; 1; 2; 3; 4
...
k/ and the above lemmas, we have
A0
...
1/ D 2 1 C 1 D 3, and A2
...
1 C 1/ 1 D 7
...
2/
...
A2
...
7/
28 8 1
211 1
2047

A3
...
2/
...
A3
...
2047/

D

A4
...
2048/
...
2047/
22048 2048 1
22048

...
(The symbol
“ ” denotes the “much-greater-than” relation
...
n/, for integer n 0, by
˛
...
1/

˚

ng :

In words, ˛
...
1/ is at least n
...
1/, we see that

˛
...
1/ :

It is only for values of n so large that the term “astronomical” understates them
(greater than A4
...
n/ > 4, and so ˛
...


21
...
m ˛
...
In order to
prove this bound, we first prove some simple properties of ranks
...
4
For all nodes x, we have x:rank Ä x:p:rank, with strict inequality if x ¤ x:p
...
The value of x:p:rank monotonically increases
over time
...
3
...
4-1
...
5
As we follow the simple path from any node toward a root, the node ranks strictly
increase
...
6
Every node has rank at most n

1
...

Because there are at most n 1 U NION operations, there are also at most n 1
L INK operations
...

Lemma 21
...
In fact, every node has rank at
most blg nc (see Exercise 21
...
The looser bound of Lemma 21
...

Proving the time bound
We shall use the potential method of amortized analysis (see Section 17
...
m ˛
...
In performing the amortized analysis, we will find it
convenient to assume that we invoke the L INK operation rather than the U NION
operation
...
The following lemma shows that even if we count the extra F IND -S ET operations induced by U NION calls, the asymptotic running time remains unchanged
...
7
Suppose we convert a sequence S 0 of m0 M AKE -S ET, U NION, and F IND -S ET operations into a sequence S of m M AKE -S ET, L INK, and F IND -S ET operations by
turning each U NION into two F IND -S ET operations followed by a L INK
...
m ˛
...
m0 ˛
...

Proof Since each U NION operation in sequence S 0 is converted into three operations in S, we have m0 Ä m Ä 3m0
...
m0 /, an O
...
n// time bound
for the converted sequence S implies an O
...
n// time bound for the original
sequence S 0
...
We now prove an O
...
n//
time bound for the converted sequence and appeal to Lemma 21
...
m0 ˛
...

Potential function
The potential function we use assigns a potential q
...
We sum the node potentials for the potenP
tial of the entire forest: ˆq D x q
...
The forest is empty prior to the first operation, and we
arbitrarily set ˆ0 D 0
...

The value of q
...

If it is, or if x:rank D 0, then q
...
n/ x:rank
...

We need to define two auxiliary functions on x before we can define q
...
First
we define
level
...
x:rank/g :

That is, level
...

We claim that
0 Ä level
...
n/ ;

(21
...
We have
x:p:rank

x:rank C 1 (by Lemma 21
...
x:rank/ (by definition of A0
...
x/

0, and we have

21
...
n/
...
n/
...
j / is strictly increasing)
n
(by the definition of ˛
...
6) ,

which implies that level
...
n/
...
x/
...
x/ D max i W x:p:rank A
...
x:rank/ :
level
...
x/ is the largest number of times we can iteratively apply Alevel
...

We claim that when x:rank 1, we have
1 Ä iter
...
2)

which we see as follows
...
x/
...
x/)
D A
...
x:rank/ (by definition of functional iteration) ,
level
...
x/

1, and we have

A
...
x:rank/ D Alevel
...
x:rank/ (by definition of Ak
...
x/
> x:p:rank
(by definition of level
...
x/ Ä x:rank
...
x/ to decrease, level
...
As long
as level
...
x/ must either increase or remain unchanged
...
n/ x:rank
if x is a root or x:rank D 0 ;
q
...
n/ level
...
x/ if x is not a root and x:rank 1 :
We next investigate some useful properties of node potentials
...
8
For every node x, and for all operation counts q, we have


q
...
n/ x:rank :

578

Chapter 21 Data Structures for Disjoint Sets

Proof If x is a root or x:rank D 0, then q
...
n/ x:rank by definition
...
We obtain a lower bound on q
...
x/ and iter
...
By the bound (21
...
x/ Ä ˛
...
2), iter
...
Thus,
q
...
n/ level
...
x/

...
n/ 1// x:rank x:rank
D x:rank x:rank
D 0:

Similarly, we obtain an upper bound on q
...
x/ and iter
...

By the bound (21
...
x/ 0, and by the bound (21
...
x/ 1
...
x/

Ä
...
n/ x:rank 1
< ˛
...
9
If node x is not a root and x:rank > 0, then

q
...
n/ x:rank
...

With an understanding of the change in potential due to each operation, we can
determine each operation’s amortized cost
...
10
Let x be a node that is not a root, and suppose that the qth operation is either a
L INK or F IND -S ET
...
x/ Ä q 1
...
Moreover, if
x:rank
1 and either level
...
x/ changes due to the qth operation, then
1
...
x/ Ä q 1
...
x/ or iter
...

Proof Because x is not a root, the qth operation does not change x:rank, and
because n does not change after the initial n M AKE -S ET operations, ˛
...
Hence, these components of the formula for x’s potential remain the same after the qth operation
...
x/ D q 1
...

Now assume that x:rank 1
...
x/ monotonically increases over time
...
x/ unchanged, then iter
...

If both level
...
x/ are unchanged, then q
...
x/
...
x/

21
...
x/ increases, then it increases by at least 1, and so
1
...
x/ Ä q 1
...
x/, it increases by at least 1, so that
the value of the term
...
x// x:rank drops by at least x:rank
...
x/ increased, the value of iter
...
2), the drop is by at most x:rank 1
...
x/ is less than the decrease in potential due to the
change in level
...
x/ Ä q 1
...

Our final three lemmas show that the amortized cost of each M AKE -S ET, L INK,
and F IND -S ET operation is O
...
Recall from equation (17
...

Lemma 21
...
1/
...
x/
...
x/ D 0
...
Noting that the actual cost of the M AKE -S ET operation is O
...

Lemma 21
...
n//
...
x; y/
...
1/
...

To determine the change in potential due to the L INK, we note that the only
nodes whose potentials may change are x, y, and the children of y just prior to the
operation
...
n/:
By Lemma 21
...

From the definition of q
...
x/ D ˛
...
If x:rank D 0, then q
...
x/ D 0
...
x/

< ˛
...
9)
D q 1
...


580

Chapter 21 Data Structures for Disjoint Sets

Because y is a root prior to the L INK, q 1
...
n/ y:rank
...
Therefore, either q
...
y/ or q
...
y/ C ˛
...

The increase in potential due to the L INK operation, therefore, is at most ˛
...

The amortized cost of the L INK operation is O
...
n/ D O
...

Lemma 21
...
n//
...
The actual cost of the F IND -S ET operation is O
...
We shall
show that no node’s potential increases due to the F IND -S ET and that at least
max
...
n/ C 2// nodes on the find path have their potential decrease by
at least 1
...
10 for all
nodes other than the root
...
n/ x:rank, which
does not change
...
0; s
...
Let x be a node on the find path such that x:rank > 0
and x is followed somewhere on the find path by another node y that is not a root,
where level
...
x/ just before the F IND -S ET operation
...
) All but at most ˛
...
Those that do not satisfy them are the first node
on the find path (if it has rank 0), the last node on the path (i
...
, the root), and the
last node w on the path for which level
...
n/ 1
...
Let k D level
...
y/
...
iter
...
x:rank/ (by definition of iter
...
y/) ,
Ak
...
5 and because
y follows x on the find path)
...
x/ before path
compression, we have
y:p:rank

Ak
...
x:p:rank/
D

(because Ak
...
A
...
x//
...
i C1/
...
4 Analysis of union by rank with path compression

581

Because path compression will make x and y have the same parent, we know
that after path compression, x:p:rank D y:p:rank and that the path compression
does not decrease y:p:rank
...
i C1/
...
Thus, path compression will cause eik
ther iter
...
x/ to increase (which occurs if
iter
...
In either case, by Lemma 21
...
Hence, x’s potential decreases by at least 1
...
x/ Ä q 1
...
The actual cost is O
...
0; s
...
The amortized cost, therefore, is at
most O
...
s
...
s/ s C O
...
n//, since we can
scale up the units of potential to dominate the constant hidden in O
...

Putting the preceding lemmas together yields the following theorem
...
14
A sequence of m M AKE -S ET, U NION, and F IND -S ET operations, n of which are
M AKE -S ET operations, can be performed on a disjoint-set forest with union by
rank and path compression in worst-case time O
...
n//
...
7, 21
...
12, and 21
...


Exercises
21
...
4
...
4-2
Prove that every node has rank at most blg nc
...
4-3
In light of Exercise 21
...
4-4
Using Exercise 21
...
m lg n/ time
...
4-5
Professor Dante reasons that because node ranks increase strictly along a simple
path to the root, node levels must monotonically increase along the path
...
x/ Ä level
...
Is the
professor correct?
21
...
n/ D min fk W Ak
...
n C 1/g
...
n/ Ä 3
for all practical values of n and, using Exercise 21
...
m ˛ 0
...


Problems
21-1 Off-line minimum
The off-line minimum problem asks us to maintain a dynamic set T of elements
from the domain f1; 2; : : : ; ng under the operations I NSERT and E XTRACT-M IN
...
We wish to determine which key
is returned by each E XTRACT-M IN call
...
The problem is “off-line” in the sense that we are
allowed to process the entire sequence S before determining any of the returned
keys
...
In the following instance of the off-line minimum problem, each operation
I NSERT
...

To develop an algorithm for this problem, we break the sequence S into homogeneous subsequences
...
For each subsequence Ij , we initially place
the keys inserted by these operations into a set Kj , which is empty if Ij is empty
...
m; n/
1 for i D 1 to n
2
determine j such that i 2 Kj
3
if j ¤ m C 1
4
extractedŒj  D i
5
let l be the smallest value greater than j
for which set Kl exists
6
Kl D Kj [ Kl , destroying Kj
7 return extracted
b
...

c
...
Give a tight bound on the worst-case running time of your
implementation
...
/ creates a tree whose only node is
...
/ returns the depth of node

within its tree
...
r; / makes node r, which is assumed to be the root of a tree, become the
child of node , which is assumed to be in a different tree than r but may or may
not itself be a root
...
Suppose that we use a tree representation similar to a disjoint-set forest: :p
is the parent of node , except that :p D if is a root
...
r; / by setting r:p D and F IND -D EPTH
...
Show that the worst-case running time of a sequence of m M AKE T REE, F IND -D EPTH, and G RAFT operations is ‚
...

By using the union-by-rank and path-compression heuristics, we can reduce the
worst-case running time
...
The tree
structure within a set Si , however, does not necessarily correspond to that of Ti
...

The key idea is to maintain in each node a “pseudodistance” :d, which is
defined so that the sum of the pseudodistances along the simple path from to the

584

Chapter 21 Data Structures for Disjoint Sets

root of its set Si equals the depth of in Ti
...

b
...

c
...
Your implementation should perform path compression, and its running time should be linear
in the length of the find path
...

d
...
r; /, which combines the sets containing r
and , by modifying the U NION and L INK procedures
...
Note that the root of a set Si
is not necessarily the root of the corresponding tree Ti
...
Give a tight bound on the worst-case running time of a sequence of m M AKE T REE, F IND -D EPTH, and G RAFT operations, n of which are M AKE -T REE operations
...
In the
off-line least-common-ancestors problem, we are given a rooted tree T and an
arbitrary set P D ffu; gg of unordered pairs of nodes in T , and we wish to determine the least common ancestor of each pair in P
...
T:root/
...

LCA
...
u/
2 F IND -S ET
...
/
5
U NION
...
u/:ancestor D u
7 u:color D BLACK
8 for each node such that fu; g 2 P
9
if :color == BLACK
10
print “The least common ancestor of”
u “and” “is” F IND -S ET
...
Argue that line 10 executes exactly once for each pair fu; g 2 P
...
Argue that at the time of the call LCA
...

c
...


for each

d
...
3
...
E
...
Using aggregate analysis, Tarjan [328, 330] gave the first tight
upper bound in terms of the very slowly growing inverse ˛
...
(The function Ak
...
4 is similar to Ackermann’s
function, and the function ˛
...
Both ˛
...
m; n/
y
are at most 4 for all conceivable values of m and n
...
m lg n/ upper bound
was proven earlier by Hopcroft and Ullman [5, 179]
...
4
is adapted from a later analysis by Tarjan [332], which is in turn based on an analysis by Kozen [220]
...

Tarjan and van Leeuwen [333] discuss variants on the path-compression heuristic, including “one-pass methods,” which sometimes offer better constant factors
in their performance than do two-pass methods
...
Harfst and Reingold [161] later showed how to make a small change
to the potential function to adapt their path-compression analysis to these one-pass
variants
...
m/ time
...
m ˛
...

This lower bound was later generalized by Fredman and Saks [113], who showed
that in the worst case,
...
m; n//
...

y

VI

Graph Algorithms

Introduction
Graph problems pervade computer science, and algorithms for working with them
are fundamental to the field
...
In this part, we touch on a few of the more significant
ones
...
The chapter gives two applications of depth-first search: topologically
sorting a directed acyclic graph and decomposing a directed graph into its strongly
connected components
...
The algorithms for computing minimum spanning
trees serve as good examples of greedy algorithms (see Chapter 16)
...
” Chapter 24 shows how to
find shortest paths from a given source vertex to all other vertices, and Chapter 25
examines methods to compute shortest paths between every pair of vertices
...
This general problem arises in many forms, and a
good algorithm for computing maximum flows can help solve a variety of related
problems efficiently
...
V; E/, we usually measure the size of the input in terms of the number of
vertices jV j and the number of edges jEj of the graph
...
We adopt a common notational
convention for these parameters
...
For example, we might say, “the algorithm runs in
time O
...
jV j jEj/
...

Another convention we adopt appears in pseudocode
...
That is, the pseudocode views vertex
and edge sets as attributes of a graph
...

Searching a graph means systematically following the edges of the graph so as to
visit the vertices of the graph
...
Many algorithms begin by searching their input
graph to obtain this structural information
...
Techniques for searching a graph lie at the heart of
the field of graph algorithms
...
1 discusses the two most common computational representations of
graphs: as adjacency lists and as adjacency matrices
...
2 presents a simple graph-searching algorithm called breadth-first search and shows how to create a breadth-first tree
...
3 presents depth-first search and proves some
standard results about the order in which depth-first search visits vertices
...
4 provides our first real application of depth-first search: topologically sorting a directed acyclic graph
...
5
...
1 Representations of graphs
We can choose between two standard ways to represent a graph G D
...
Either way applies
to both directed and undirected graphs
...
Most of the graph algorithms
presented in this book assume that an input graph is represented in adjacencylist form
...
For example, two of the all-pairs

590

Chapter 22 Elementary Graph Algorithms

1

1
2
3
4
5

2
3

5

4

2
1
2
2
4

(a)

5
5
4
5
1

3

1
2
3
4
5

4

3
2

1
0
1
0
0
1

2
1
0
1
1
1

(b)

3
0
1
0
1
0

4
0
1
1
0
1

5
1
1
0
1
0

(c)

Figure 22
...
(a) An undirected graph G with 5 vertices
and 7 edges
...
(c) The adjacency-matrix representation
of G
...
2 Two representations of a directed graph
...
(b) An adjacency-list representation of G
...


shortest-paths algorithms presented in Chapter 25 assume that their input graphs
are represented by adjacency matrices
...
V; E/ consists of an array Adj of jV j lists, one for each vertex in V
...
u; / 2 E
...
(Alternatively, it may contain
pointers to these vertices
...
In pseudocode, therefore, we will see notation such as G:AdjŒu
...
1(b) is an adjacency-list representation of the undirected graph in Figure 22
...
Similarly, Figure 22
...
2(a)
...
u; / is represented by having appear in AdjŒu
...
1 Representations of graphs

591

an undirected graph, the sum of the lengths of all the adjacency lists is 2 jEj, since
if
...

For both directed and undirected graphs, the adjacency-list representation has the
desirable property that the amount of memory it requires is ‚
...

We can readily adapt adjacency lists to represent weighted graphs, that is, graphs
for which each edge has an associated weight, typically given by a weight function
w W E ! R
...
V; E/ be a weighted graph with weight
function w
...
u; / of the edge
...
The adjacency-list representation is quite robust in
that we can modify it to support many other graph variants
...
u; / is present in the graph
than to search for in the adjacency list AdjŒu
...
(See Exercise 22
...
)
For the adjacency-matrix representation of a graph G D
...
Then the
adjacency-matrix representation of a graph G consists of a jV j jV j matrix
A D
...
i; j / 2 E ;
aij D
0 otherwise :
Figures 22
...
2(c) are the adjacency matrices of the undirected and directed graphs in Figures 22
...
2(a), respectively
...
V 2 / memory, independent of the number of edges in the graph
...
1(c)
...
u; / and
...

In some applications, it pays to store only the entries on and above the diagonal of
the adjacency matrix, thereby cutting the memory needed to store the graph almost
in half
...
For example, if G D
...
u; / of the edge
...
If an edge does not
exist, we can store a NIL value as its corresponding matrix entry, though for many
problems it is convenient to use a value such as 0 or 1
...
Moreover, adja-

592

Chapter 22 Elementary Graph Algorithms

cency matrices carry a further advantage for unweighted graphs: they require only
one bit per entry
...
We indicate these attributes using our usual notation, such as :d
for an attribute d of a vertex
...
For example, if edges have an attribute f , then we
denote this attribute for edge
...
u; /:f
...

Implementing vertex and edge attributes in real programs can be another story
entirely
...

For a given situation, your decision will likely depend on the programming language you are using, the algorithm you are implementing, and how the rest of your
program uses the graph
...
If the vertices adjacent to u are in AdjŒu, then what we call
the attribute u:d would actually be stored in the array entry d Œu
...
For example, in an object-oriented programming language, vertex attributes might be represented as instance variables
within a subclass of a Vertex class
...
1-1
Given an adjacency-list representation of a directed graph, how long does it take
to compute the out-degree of every vertex? How long does it take to compute the
in-degrees?
22
...
Give
an equivalent adjacency-matrix representation
...

22
...
V; E/ is the graph G T D
...
; u/ 2 V V W
...
Thus, G T is G with all its edges reversed
...
Analyze the running times of your
algorithms
...
1 Representations of graphs

593

22
...
V; E/, describe an
O
...
V; E 0 /, where E 0 consists of the edges in E
with all multiple edges between two vertices replaced by a single edge and with all
self-loops removed
...
1-5
The square of a directed graph G D
...
V; E 2 / such that

...

Describe efficient algorithms for computing G 2 from G for both the adjacencylist and adjacency-matrix representations of G
...

22
...
V 2 /, but there are some exceptions
...
V /, given an adjacency matrix for G
...
1-7
The incidence matrix of a directed graph G D
...
bij / such that

€

bij D

1 if edge j leaves vertex i ;
1
if edge j enters vertex i ;
0
otherwise :

Describe what the entries of the matrix product BB T represent, where B T is the
transpose of B
...
1-8
Suppose that instead of a linked list, each array entry AdjŒu is a hash table containing the vertices for which
...
If all edge lookups are equally likely, what
is the expected time to determine whether an edge is in the graph? What disadvantages does this scheme have? Suggest an alternate data structure for each edge list
that solves these problems
...
2 Breadth-first search
Breadth-first search is one of the simplest algorithms for searching a graph and
the archetype for many important graph algorithms
...
2) and Dijkstra’s single-source shortest-paths algorithm
(Section 24
...

Given a graph G D
...
It computes the distance (smallest number of edges) from s
to each reachable vertex
...
For any vertex reachable from s, the simple path
in the breadth-first tree from s to corresponds to a “shortest path” from s to
in G, that is, a path containing the smallest number of edges
...

Breadth-first search is so named because it expands the frontier between discovered and undiscovered vertices uniformly across the breadth of the frontier
...

To keep track of progress, breadth-first search colors each vertex white, gray, or
black
...
A
vertex is discovered the first time it is encountered during the search, at which time
it becomes nonwhite
...
1 If
...
Gray vertices may have some adjacent white vertices; they represent
the frontier between discovered and undiscovered vertices
...
Whenever the search discovers a white vertex
in the course of scanning the adjacency list of an already discovered vertex u, the
vertex and the edge
...
We say that u is the predecessor
or parent of in the breadth-first tree
...
Ancestor and descendant relationships in the breadth-first
tree are defined relative to the root s as usual: if u is on the simple path in the tree
from the root s to vertex , then u is an ancestor of and is a descendant of u
...
In fact, as Exercise 22
...


22
...
V; E/ is represented using adjacency lists
...
We store the color of each vertex u 2 V
in the attribute u:color and the predecessor of u in the attribute u:
...

The attribute u:d holds the distance from the source s to vertex u computed by the
algorithm
...
1)
to manage the set of gray vertices
...
G; s/
1 for each vertex u 2 G:V fsg
2
u:color D WHITE
3
u:d D 1
4
u: D NIL
5 s:color D GRAY
6 s:d D 0
7 s: D NIL
8 QD;
9 E NQUEUE
...
Q/
12
for each 2 G:AdjŒu
13
if :color == WHITE
14
:color D GRAY
15
:d D u:d C 1
16
: Du
17
E NQUEUE
...
3 illustrates the progress of BFS on a sample graph
...
With the exception of the source vertex s,
lines 1–4 paint every vertex white, set u:d to be infinity for each vertex u, and set
the parent of every vertex to be NIL
...
Line 6 initializes s:d to 0, and line 7 sets the
predecessor of the source to be NIL
...

The while loop of lines 10–18 iterates as long as there remain gray vertices,
which are discovered vertices that have not yet had their adjacency lists fully examined
...


596

Chapter 22 Elementary Graph Algorithms

r


s
0

t



v


w


x

s
0

t
2


v

1
w

2
x

r
1

s
0

t
2

2
v

1
w

2
x

r
1

s
0

t
2

(g)

Q
2
v

1
w

2
x

3
y

r
1

s
0

t
2


y

s
0

t
2

u


1
w

2
x


y

s
0

t
2

u
3

2
v

u
3

Q


x

1
w

2
x

3
y

r
1


y

(e)

r
1

1
w

2
v

u
3

Q

u


r
1


y

(c)

s
0

t


r
1

Q

0


v

u


(a)

s

1


y

r
1

r

u


s
0

t
2

u
3

(b)

t x
2 2

x v u
2 2 3

u y
3 3

Q

Q

(h)

Q
2
v

1
w

2
x

1
w

2
x

3
y

y
3

u
3

(i)

2
v

v u y
2 3 3

Q

(f)

t x v
2 2 2

Q

(d)

w r
1 1

;

3
y

Figure 22
...
Tree edges are shown shaded as they
are produced by BFS
...
The queue Q is shown at the
beginning of each iteration of the while loop of lines 10–18
...


Although we won’t use this loop invariant to prove correctness, it is easy to see
that it holds prior to the first iteration and that each iteration of the loop maintains
the invariant
...
Line 11 determines the gray vertex u at the head of
the queue Q and removes it from Q
...
If is white, then it has not yet been discovered,
and the procedure discovers it by executing lines 14–17
...
Once the procedure has examined all the vertices on u’s

22
...
The loop invariant is maintained because
whenever a vertex is painted gray (in line 14) it is also enqueued (in line 17), and
whenever a vertex is dequeued (in line 11) it is also painted black (in line 18)
...
(See Exercise 22
...
)
Analysis
Before proving the various properties of breadth-first search, we take on the somewhat easier job of analyzing its running time on an input graph G D
...
We
use aggregate analysis, as we saw in Section 17
...
After initialization, breadth-first
search never whitens a vertex, and thus the test in line 13 ensures that each vertex
is enqueued at most once, and hence dequeued at most once
...
1/ time, and so the total time devoted to queue
operations is O
...
Because the procedure scans the adjacency list of each vertex
only when the vertex is dequeued, it scans each adjacency list at most once
...
E/, the total time spent in
scanning adjacency lists is O
...
The overhead for initialization is O
...
V C E/
...

Shortest paths
At the beginning of this section, we claimed that breadth-first search finds the distance to each reachable vertex in a graph G D
...
Define the shortest-path distance ı
...
s; / D 1
...
s; / from s to a shortest path2
from s to
...


2 In

Chapters 24 and 25, we shall generalize our study of shortest paths to weighted graphs, in which
every edge has a real-valued weight and the weight of a path is the sum of the weights of its constituent edges
...


598

Chapter 22 Elementary Graph Algorithms

Lemma 22
...
V; E/ be a directed or undirected graph, and let s 2 V be an arbitrary
vertex
...
u; / 2 E,
ı
...
s; u/ C 1 :
Proof If u is reachable from s, then so is
...
u; /,
and thus the inequality holds
...
s; u/ D 1, and
the inequality holds
...
s; / for each vertex 2 V
...
s; / from above
...
2
Let G D
...
Then upon termination, for each vertex 2 V , the value :d computed by BFS satisfies :d ı
...

Proof We use induction on the number of E NQUEUE operations
...
s; / for all 2 V
...
The inductive hypothesis holds here, because s:d D 0 D ı
...
s; / for all 2 V fsg
...
The inductive hypothesis implies that u:d ı
...
From
the assignment performed by line 15 and from Lemma 22
...
s; u/ C 1
ı
...
Thus, the
value of :d never changes again, and the inductive hypothesis is maintained
...
s; /, we must first show more precisely how the queue Q
operates during the course of BFS
...


22
...
3
Suppose that during the execution of BFS on a graph G D
...

Then, r :d Ä 1 :d C 1 and i :d Ä i C1 :d for i D 1; 2; : : : ; r 1
...
Initially,
when the queue contains only s, the lemma certainly holds
...
If the head 1 of the queue is dequeued, 2 becomes the
new head
...
) By the
inductive hypothesis, 1 :d Ä 2 :d
...
Thus, the lemma follows with 2 as
the head
...
When we enqueue a vertex in line 17 of BFS, it
becomes rC1
...
Thus, rC1 :d D :d D u:d C1 Ä 1 :d C1
...
Thus, the lemma
follows when is enqueued
...

Corollary 22
...
Then i :d Ä j :d at the time that j is enqueued
...
3 and the property that each vertex receives a
finite d value at most once during the course of BFS
...

Theorem 22
...
V; E/ be a directed or undirected graph, and suppose that BFS is run
on G from a given source vertex s 2 V
...
s; / for all 2 V
...


is a shortest path from s to :

Proof Assume, for the purpose of contradiction, that some vertex receives a d
value not equal to its shortest-path distance
...
s; / that receives such an incorrect d value; clearly
¤ s
...
2, :d ı
...
s; /
...
s; / D 1
:d
...
s; / D ı
...

Because ı
...
s; /, and because of how we chose , we have u:d D ı
...

Putting these properties together, we have
:d > ı
...
s; u/ C 1 D u:d C 1 :

(22
...
At this time, vertex is either white, gray, or black
...
1)
...
1)
...
4, we have
:d Ä u:d, again contradicting inequality (22
...
If is gray, then it was painted
gray upon dequeuing some vertex w, which was removed from Q earlier than u
and for which :d D w:d C 1
...
4, however, w:d Ä u:d, and so we
have :d D w:d C 1 Ä u:d C 1, once again contradicting inequality (22
...

Thus we conclude that :d D ı
...
All vertices reachable
from s must be discovered, for otherwise they would have 1 D :d > ı
...
To
conclude the proof of the theorem, observe that if : D u, then :d D u:d C 1
...


Breadth-first trees
The procedure BFS builds a breadth-first tree as it searches the graph, as Figure 22
...
The tree corresponds to the attributes
...
V; E/ with source s, we define the predecessor subgraph of G as
G D
...
: ; / W

2V

fsgg :

The predecessor subgraph G is a breadth-first tree if V consists of the vertices
reachable from s and, for all 2 V , the subgraph G contains a unique simple

22
...
A breadth-first tree
is in fact a tree, since it is connected and jE j D jV j 1 (see Theorem B
...
We
call the edges in E tree edges
...

Lemma 22
...
V; E/, procedure BFS constructs so that the predecessor subgraph G D
...

Proof Line 16 of BFS sets : D u if and only if
...
s; / < 1—
that is, if is reachable from s—and thus V consists of the vertices in V reachable
from s
...
2, it contains a unique simple path
from s to each vertex in V
...
5 inductively, we conclude
that every such path is a shortest path in G
...
G; s; /
1 if == s
2
print s
3 elseif : == NIL
4
print “no path from” s “to”
5 else P RINT-PATH
...

Exercises
22
...
2(a), using vertex 3 as the source
...
2-2
Show the d and values that result from running breadth-first search on the undirected graph of Figure 22
...


602

Chapter 22 Elementary Graph Algorithms

22
...

22
...
2-5
Argue that in a breadth-first search, the value u:d assigned to a vertex u is independent of the order in which the vertices appear in each adjacency list
...
3 as an example, show that the breadth-first tree computed by BFS can
depend on the ordering within adjacency lists
...
2-6
Give an example of a directed graph G D
...
V; E / from s to is a shortest path in G, yet the set of edges E
cannot be produced by running BFS on G, no matter how the vertices are ordered
in each adjacency list
...
2-7
There are two types of professional wrestlers: “babyfaces” (“good guys”) and
“heels” (“bad guys”)
...
Suppose we have n professional wrestlers and we have a list
of r pairs of wrestlers for which there are rivalries
...
n C r/-time algorithm that determines whether it is possible to designate some of the wrestlers as
babyfaces and the remainder as heels such that each rivalry is between a babyface
and a heel
...

22
...
V; E/ is defined as maxu; 2V ı
...
Give an efficient algorithm to
compute the diameter of a tree, and analyze the running time of your algorithm
...
2-9
Let G D
...
Give an O
...
Describe how you can find your way out of a maze if you are given a
large supply of pennies
...
3 Depth-first search

603

22
...
Depth-first search explores edges out
of the most recently discovered vertex that still has unexplored edges leaving it
...
This process continues until we
have discovered all the vertices that are reachable from the original source vertex
...
The algorithm repeats this
entire process until it has discovered every vertex
...
Unlike breadth-first search,
whose predecessor subgraph forms a tree, the predecessor subgraph produced by
a depth-first search may be composed of several trees, because the search may
repeat from multiple sources
...
V; E /, where
E D f
...
The edges in E are tree edges
...
Each vertex is initially white, is grayed when it is discovered
in the search, and is blackened when it is finished, that is, when its adjacency list
has been examined completely
...

Besides creating a depth-first forest, depth-first search also timestamps each vertex
...
These timestamps

3 It

may seem arbitrary that breadth-first search is limited to only one source whereas depth-first
search may search from multiple sources
...
Breadth-first search usually serves to find shortestpath distances (and the associated predecessor subgraph) from a given source
...


604

Chapter 22 Elementary Graph Algorithms

provide important information about the structure of the graph and are generally
helpful in reasoning about the behavior of depth-first search
...
These timestamps are integers
between 1 and 2 jV j, since there is one discovery event and one finishing event for
each of the jV j vertices
...
2)

Vertex u is WHITE before time u:d, GRAY between time u:d and time u:f , and
BLACK thereafter
...
The input
graph G may be undirected or directed
...

DFS
...
G; u/
DFS-V ISIT
...
G; /
8 u:color D BLACK
9 time D time C 1
10 u:f D time

/ white vertex u has just been discovered
/

/ explore edge
...
4 illustrates the progress of DFS on the graph shown in Figure 22
...

Procedure DFS works as follows
...
Line 4 resets the global time counter
...
Every time DFS-V ISIT
...
3 Depth-first search
u
1/

v

w

u
1/

x

y
(a)

z

x

u
1/

v
2/

w

u
1/

B

4/
x

v
2/

u
1/

v
2/

y
(b)

z

x

3/
y
(c)

v
2/

z

4/5
x

w

u
1/

v
2/

4/5
x

z

w

B

u
1/8
F

u
1/

3/
y
(d)

u
1/

w

v
2/

4/
x

z

v
2/7

4/5
x

z

w

u
1/8

B

F

z

w

3/6
y

(g)

v
2/7

w

B

3/6
y

(f)

v
2/7

w

B

3/
y

(e)

F

w

B

3/
y

u
1/

605

z

(h)

v
2/7

w
9/

u
1/8

B

F

v
2/7
B

w
9/
C

4/5
x

3/6
y
(i)

z

4/5
x

3/6
y
(j)

z

4/5
x

3/6
y
(k)

z

4/5
x

3/6
y
(l)

z

u
1/8

v
2/7

w
9/

u
1/8

v
2/7

w
9/

u
1/8

v
2/7

w
9/

u
1/8

v
2/7

w
9/12

F

B

C

F

B

C

F

B

C

F

B

4/5
x

3/6
y
(m)

10/
z

4/5
x

3/6
y
(n)

10/
z

B

C

B

4/5
x

3/6
y
(o)

10/11
z

B

4/5
x

3/6
y

10/11
z

(p)

Figure 22
...
As edges
are explored by the algorithm, they are shown as either shaded (if they are tree edges) or dashed
(otherwise)
...
Timestamps within vertices indicate discovery time/finishing times
...
When DFS returns, every vertex u
has been assigned a discovery time u:d and a finishing time u:f
...
G; u/, vertex u is initially white
...
Lines 4–7 examine each vertex adjacent to u
and recursively visit if it is white
...
u; / is explored by the depth-first search
...

Note that the results of depth-first search may depend upon the order in which
line 5 of DFS examines the vertices and upon the order in which line 4 of DFSV ISIT visits the neighbors of a vertex
...

What is the running time of DFS? The loops on lines 1–3 and lines 5–7 of DFS
take time ‚
...
As we did
for breadth-first search, we use aggregate analysis
...

During an execution of DFS-V ISIT
...
Since
X
jAdjŒ j D ‚
...
E/
...
V C E/
...
Perhaps the most basic property of depth-first search is that the predecessor subgraph G does indeed form a forest of trees, since the structure of the depthfirst trees exactly mirrors the structure of recursive calls of DFS-V ISIT
...
G; / was called during a search of u’s adjacency list
...

Another important property of depth-first search is that discovery and finishing
times have parenthesis structure
...
u” and represent its finishing by a right parenthesis “u/”, then
the history of discoveries and finishings makes a well-formed expression in the
sense that the parentheses are properly nested
...
5(a) corresponds to the parenthesization shown in Figure 22
...
The
following theorem provides another way to characterize the parenthesis structure
...
7 (Parenthesis theorem)
In any depth-first search of a (directed or undirected) graph G D
...


22
...
5 Properties of depth-first search
...
Vertices are timestamped and edge types are indicated as in Figure 22
...
(b) Intervals for
the discovery time and finishing time of each vertex correspond to the parenthesization shown
...

Only tree edges are shown
...

(c) The graph of part (a) redrawn with all tree and forward edges going down within a depth-first tree
and all back edges going up from a descendant to an ancestor
...
We consider two subcases,
according to whether :d < u:f or not
...
Moreover, since was discovered more recently than u, all of its outgoing edges are explored, and is finished, before the search returns to and finishes u
...
In the other subcase, u:f < :d, and by inequality (22
...

Because the intervals are disjoint, neither vertex was discovered while the other
was gray, and so neither vertex is a descendant of the other
...

Corollary 22
...

Proof

Immediate from Theorem 22
...


The next theorem gives another important characterization of when one vertex
is a descendant of another in the depth-first forest
...
9 (White-path theorem)
In a depth-first forest of a (directed or undirected) graph G D
...

Proof ): If D u, then the path from u to contains just vertex u, which is still
white when we set the value of u:d
...
By Corollary 22
...
Since can be any descendant of u, all vertices on the unique simple
path from u to in the depth-first forest are white at time u:d
...
Without loss of generality, assume that every vertex other than along the path becomes a descendant of u
...
) Let w be the predecessor of in the path, so that w is a descendant
of u (w and u may in fact be the same vertex)
...
8, w:f Ä u:f
...
Theorem 22
...
3 Depth-first search

is contained entirely within the interval Œu:d; u:f 
...
8,
all be a descendant of u
...
V; E/
...
For example, in the next section, we
shall see that a directed graph is acyclic if and only if a depth-first search yields no
“back” edges (Lemma 22
...

We can define four edge types in terms of the depth-first forest G produced by
a depth-first search on G:
1
...
Edge
...
u; /
...
Back edges are those edges
...
We consider self-loops, which may occur in directed graphs, to
be back edges
...
Forward edges are those nontree edges
...

4
...
They can go between vertices in the same
depth-first tree, as long as one vertex is not an ancestor of the other, or they can
go between vertices in different depth-first trees
...
4 and 22
...
Figure 22
...
5(a) so that all tree and forward edges head
downward in a depth-first tree and all back edges go up
...

The DFS algorithm has enough information to classify some edges as it encounters them
...
u; /, the color of
vertex tells us something about the edge:
1
...
GRAY indicates a back edge, and
3
...

The first case is immediate from the specification of the algorithm
...
Exploration always proceeds from the deepest gray vertex, so

610

Chapter 22 Elementary Graph Algorithms

an edge that reaches another gray vertex has reached an ancestor
...
3-5 asks you to show that such an
edge
...

An undirected graph may entail some ambiguity in how we classify edges,
since
...
; u/ are really the same edge
...
Equivalently (see Exercise 22
...
u; / or
...

We now show that forward and cross edges never occur in a depth-first search of
an undirected graph
...
10
In a depth-first search of an undirected graph G, every edge of G is either a tree
edge or a back edge
...
u; / be an arbitrary edge of G, and suppose without loss of generality
that u:d < :d
...
If the first time that the search
explores edge
...
Thus,
...
If the
search explores
...
u; / is a back edge,
since u is still gray at the time the edge is first explored
...

Exercises
22
...
In
each cell
...

For each possible edge, indicate what edge types it can be
...

22
...
6
...
Show the
discovery and finishing times for each vertex, and show the classification of each
edge
...
3 Depth-first search

611

q
s
v

r
t

w

x

u
y

z

Figure 22
...
3-2 and 22
...


22
...
4
...
3-4
Show that using a single bit to store each vertex color suffices by arguing that
the DFS procedure would produce the same result if line 3 of DFS-V ISIT was
removed
...
3-5
Show that edge
...
a tree edge or forward edge if and only if u:d < :d < :f < u:f ,
b
...
a cross edge if and only if :d < :f < u:d < u:f
...
3-6
Show that in an undirected graph, classifying an edge
...
u; / or
...

22
...

22
...


612

Chapter 22 Elementary Graph Algorithms

22
...

22
...
Show what modifications, if any, you need
to make if G is undirected
...
3-11
Explain how a vertex u of a directed graph can end up in a depth-first tree containing only u, even though u has both incoming and outgoing edges in G
...
3-12
Show that we can use a depth-first search of an undirected graph G to identify the
connected components of G, and that the depth-first forest contains as many trees
as G has connected components
...

22
...
V; E/ is singly connected if u ; implies that G contains
at most one simple path from u to for all vertices u; 2 V
...


22
...
A topological sort
of a dag G D
...
u; /, then u appears before in the ordering
...
) We can view a topological sort of a graph as
an ordering of its vertices along a horizontal line so that all directed edges go from
left to right
...

Many applications use directed acyclic graphs to indicate precedences among
events
...
7 gives an example that arises when Professor Bumstead gets
dressed in the morning
...
g
...
Other items may be put on in any order (e
...
, socks and

22
...
7 (a) Professor Bumstead topologically sorts his clothing when getting dressed
...
u; / means that garment u must be put on before garment
...
(b) The same graph shown
topologically sorted, with its vertices arranged from left to right in order of decreasing finishing time
...


pants)
...
u; / in the dag of Figure 22
...
A topological sort of this dag therefore gives an
order for getting dressed
...
7(b) shows the topologically sorted dag as an
ordering of vertices along a horizontal line such that all directed edges go from left
to right
...
G/
1 call DFS
...
7(b) shows how the topologically sorted vertices appear in reverse order
of their finishing times
...
V C E/, since depth-first search
takes ‚
...
1/ time to insert each of the jV j vertices onto
the front of the linked list
...


614

Chapter 22 Elementary Graph Algorithms

Lemma 22
...

Proof ): Suppose that a depth-first search produces a back edge
...
Then
vertex is an ancestor of vertex u in the depth-first forest
...
u; / completes a cycle
...
We show that a depth-first search of G
yields a back edge
...
u; / be
the preceding edge in c
...
By the white-path theorem, vertex u becomes a descendant of in the
depth-first forest
...
u; / is a back edge
...
12
T OPOLOGICAL -S ORT produces a topological sort of the directed acyclic graph
provided as its input
...
V; E/ to determine finishing times for its vertices
...
Consider any
edge
...
G/
...
u; / would be a back edge, contradicting Lemma 22
...
Therefore, must be either white or black
...
If is black, it has already been
finished, so that :f has already been set
...
Thus, for any edge
...

Exercises
22
...
8, under the assumption of Exercise 22
...

22
...
V; E/ and two vertices s and t, and returns the number of simple paths from s
to t in G
...
8 contains exactly
four simple paths from vertex p to vertex : po , pory , posry , and psry
...
)

22
...
8 A dag for topological sorting
...
4-3
Give an algorithm that determines whether or not a given undirected graph G D

...
Your algorithm should run in O
...

22
...
G/ produces a vertex ordering that minimizes the number of “bad” edges
that are inconsistent with the ordering produced
...
4-5
Another way to perform topological sorting on a directed acyclic graph G D

...
Explain how to implement this idea so
that it runs in time O
...
What happens to this algorithm if G has cycles?

22
...
This section shows how to do
so using two depth-first searches
...
After decomposing the graph into strongly connected components, such algorithms run separately on each one and then combine
the solutions according to the structure of connections among components
...
V; E/ is a maximal set of vertices C Â V such that for every pair
of vertices u and in C , we have both u ; and ; u; that is, vertices u and
are reachable from each other
...
9 shows an example
...
9 (a) A directed graph G
...

Each vertex is labeled with its discovery and finishing times in a depth-first search, and tree edges
are shaded
...
Each strongly connected
component corresponds to one depth-first tree
...
(c) The acyclic component
graph G SCC obtained by contracting all edges within each strongly connected component of G so
that only a single vertex remains in each component
...
V; E/ uses the transpose of G, which we defined in Exercise 22
...
V; E T /, where E T D f
...
; u/ 2 Eg
...
Given an adjacency-list representation of G, the time to create G T is O
...
It is interesting to observe that G
and G T have exactly the same strongly connected components: u and are reachable from each other in G if and only if they are reachable from each other in G T
...
9(b) shows the transpose of the graph in Figure 22
...


22
...
e
...
V; E/ using two depth-first
searches, one on G and one on G T
...
G/
1 call DFS
...
G T /, but in the main loop of DFS, consider the vertices
in order of decreasing u:f (as computed in line 1)
4 output the vertices of each tree in the depth-first forest formed in line 3 as a
separate strongly connected component
The idea behind this algorithm comes from a key property of the component
graph G SCC D
...
Suppose that G
has strongly connected components C1 ; C2 ; : : : ; Ck
...
There is an edge
...
x; y/
for some x 2 Ci and some y 2 Cj
...
Figure 22
...
9(a)
...

Lemma 22
...
V; E/, let u; 2 C , let u0 ; 0 2 C 0 , and suppose that G contains a path u ; u0
...

Proof If G contains a path 0 ; , then it contains paths u ; u0 ; 0 and
0
; ; u
...

We shall see that by considering vertices in the second depth-first search in decreasing order of the finishing times that were computed in the first depth-first
search, we are, in essence, visiting the vertices of the component graph (each of
which corresponds to a strongly connected component of G) in topologically sorted
order
...
In this section, these values always refer to the discovery and finishing
times as computed by the first call of DFS, in line 1
...

If U Â V , then we define d
...
U / D maxu2U fu:f g
...
U / and f
...

The following lemma and its corollary give a key property relating strongly connected components and finishing times in the first depth-first search
...
14
Let C and C 0 be distinct strongly connected components in directed graph G D

...
Suppose that there is an edge
...
Then
f
...
C 0 /
...

If d
...
C 0 /, let x be the first vertex discovered in C
...
At that time, G contains a path from x to each vertex
in C consisting only of white vertices
...
u; / 2 E, for any vertex w 2 C 0 ,
there is also a path in G at time x:d from x to w consisting only of white vertices:
x ; u ! ; w
...
By Corollary 22
...
C / > f
...

If instead we have d
...
C 0 /, let y be the first vertex discovered in C 0
...
By the white-path theorem, all vertices in C 0 become descendants of y in the depth-first tree, and by Corollary 22
...
C 0 /
...
Since there is an edge
...
13 implies that there cannot be a path from C 0 to C
...
At time y:f , therefore, all vertices in C
are still white
...
C / > f
...

The following corollary tells us that each edge in G T that goes between different
strongly connected components goes from a component with an earlier finishing
time (in the first depth-first search) to a component with a later finishing time
...
15
Let C and C 0 be distinct strongly connected components in directed graph G D

...
Suppose that there is an edge
...
Then
f
...
C 0 /
...
5 Strongly connected components

619

Proof Since
...
; u/ 2 E
...
14 implies that
f
...
C 0 /
...
15 provides the key to understanding why the strongly connected
components algorithm works
...
We start with the strongly connected
component C whose finishing time f
...
The search starts from
some vertex x 2 C , and it visits all vertices in C
...
15, G T contains
no edges from C to any other strongly connected component, and so the search
from x will not visit vertices in any other component
...
Having completed visiting all vertices in C ,
the search in line 3 selects as a root a vertex from some other strongly connected
component C 0 whose finishing time f
...
Again, the search will visit all vertices in C 0 , but by Corollary 22
...
In general, when the depth-first search of G T in line 3 visits
any strongly connected component, any edges out of that component must be to
components that the search already visited
...
The following theorem formalizes this
argument
...
16
The S TRONGLY-C ONNECTED -C OMPONENTS procedure correctly computes the
strongly connected components of the directed graph G provided as its input
...
The inductive hypothesis is that the first k trees produced
in line 3 are strongly connected components
...

In the inductive step, we assume that each of the first k depth-first trees produced
in line 3 is a strongly connected component, and we consider the
...
Let the root of this tree be vertex u, and let u be in strongly connected
component C
...
C / > f
...
By the inductive hypothesis, at the time that the search
visits u, all other vertices of C are white
...
Moreover, by the
inductive hypothesis and by Corollary 22
...
Thus, no vertex

620

Chapter 22 Elementary Graph Algorithms

in any strongly connected component other than C will be a descendant of u during
the depth-first search of G T
...

Here is another way to look at how the second depth-first search operates
...
G T /SCC of G T
...
G T /SCC , the second depth-first search visits vertices of
...
If we reverse the edges of
...
G T /SCC /T
...
G T /SCC /T D G SCC (see Exercise 22
...

Exercises
22
...
5-2
Show how the procedure S TRONGLY-C ONNECTED -C OMPONENTS works on the
graph of Figure 22
...
Specifically, show the finishing times computed in line 1 and
the forest produced in line 3
...

22
...
Does this simpler algorithm always produce correct results?
22
...
G T /SCC /T D G SCC
...

22
...
V C E/-time algorithm to compute the component graph of a directed
graph G D
...
Make sure that there is at most one edge between two vertices
in the component graph your algorithm produces
...
5-6
Given a directed graph G D
...
V; E 0 / such that (a) G 0 has the same strongly connected components as G, (b) G 0
has the same component graph as G, and (c) E 0 is as small as possible
...

22
...
V; E/ is semiconnected if, for all pairs of vertices u; 2 V ,
we have u ; or ; u
...
Prove that your algorithm is correct, and analyze its
running time
...
A breadth-first tree can also be used to classify the edges reachable
from the source of the search into the same four categories
...
Prove that in a breadth-first search of an undirected graph, the following properties hold:
1
...

2
...
u; /, we have :d D u:d C 1
...
For each cross edge
...

b
...

2
...

4
...

For each tree edge
...

For each cross edge
...

For each back edge
...


22-2 Articulation points, bridges, and biconnected components
Let G D
...
An articulation point of G is
a vertex whose removal disconnects G
...
A biconnected component of G is a maximal set of edges such
that any two edges in the set lie on a common simple cycle
...
10 illustrates

622

Chapter 22 Elementary Graph Algorithms

2
1

6

4
3
5

Figure 22
...
The articulation points are the heavily shaded vertices, the
bridges are the heavily shaded edges, and the biconnected components are the edges in the shaded
regions, with a bcc numbering shown
...
We can determine articulation points, bridges, and biconnected
components using depth-first search
...
V; E / be a depth-first tree of G
...
Prove that the root of G is an articulation point of G if and only if it has at
least two children in G
...
Let be a nonroot vertex of G
...

c
...
u; w/ is a back edge for some descendant u of

Show how to compute :low for all vertices

:

2 V in O
...


d
...
E/ time
...
Prove that an edge of G is a bridge if and only if it does not lie on any simple
cycle of G
...
Show how to compute all the bridges of G in O
...

g
...

h
...
E/-time algorithm to label each edge e of G with a positive integer e:bcc such that e:bcc D e 0 :bcc if and only if e and e 0 are in the same
biconnected component
...
V; E/ is a cycle that
traverses each edge of G exactly once, although it may visit a vertex more than
once
...
Show that G has an Euler tour if and only if in-degree
...
/ for
each vertex 2 V
...
Describe an O
...
(Hint:
Merge edge-disjoint cycles
...
V; E/ be a directed graph in which each vertex u 2 V is labeled with
a unique integer L
...
For each vertex u 2 V , let
R
...
Define
min
...
u/ whose label is minimum, i
...
, min
...
/ D min fL
...
u/g
...
V C E/-time algorithm that
computes min
...


Chapter notes
Even [103] and Tarjan [330] are excellent references for graph algorithms
...
Lee [226] independently discovered the same algorithm in
the context of routing wires on circuit boards
...
Depth-first search has
been widely used since the late 1950s, especially in artificial intelligence programs
...
The algorithm for strongly connected components in Section 22
...
R
...
Sharir [314]
...
Knuth [209] was the first to give a linear-time algorithm for
topological sorting
...
To interconnect a set of n pins, we can
use an arrangement of n 1 wires, each connecting two pins
...

We can model this wiring problem with a connected, undirected graph G D

...
u; / 2 E, we have a weight w
...
We then wish to find an
acyclic subset T Â E that connects all of the vertices and whose total weight
X
w
...
T / D

...
Since T is acyclic and connects all of the vertices, it must form a tree,
which we call a spanning tree since it “spans” the graph G
...
1 Figure 23
...

In this chapter, we shall examine two algorithms for solving the minimumspanning-tree problem: Kruskal’s algorithm and Prim’s algorithm
...
E lg V / using ordinary binary heaps
...
E C V lg V /, which improves
over the binary-heap implementation if jV j is much smaller than jEj
...
Each
step of a greedy algorithm must make one of several possible choices
...
Such a strategy does not generally guarantee that it will always find globally optimal solutions

1 The phrase “minimum spanning tree” is a shortened form of the phrase “minimum-weight spanning
tree
...
2
...
1 Growing a minimum spanning tree

4
a

8

b
11

i
7

8
h

7

c
2

4

d
14

6
1

625

9
e
10

g

2

f

Figure 23
...
The weights on edges are shown,
and the edges in a minimum spanning tree are shaded
...
This
minimum spanning tree is not unique: removing the edge
...
a; h/
yields another spanning tree with weight 37
...
For the minimum-spanning-tree problem, however, we can prove that
certain greedy strategies do yield a spanning tree with minimum weight
...

Section 23
...
Section 23
...
The first algorithm, due to Kruskal, is similar
to the connected-components algorithm from Section 21
...
The second, due to
Prim, resembles Dijkstra’s shortest-paths algorithm (Section 24
...

Because a tree is a type of graph, in order to be precise we must define a tree in
terms of not just its edges, but its vertices as well
...


23
...
V; E/ with a weight
function w W E ! R, and we wish to find a minimum spanning tree for G
...

This greedy strategy is captured by the following generic method, which grows
the minimum spanning tree one edge at a time
...

At each step, we determine an edge
...
u; /g is also a subset of a minimum spanning

626

Chapter 23 Minimum Spanning Trees

tree
...

G ENERIC -MST
...
u; / that is safe for A
4
A D A [ f
...

Maintenance: The loop in lines 2–4 maintains the invariant by adding only safe
edges
...

The tricky part is, of course, finding a safe edge in line 3
...
Within the while loop body, A must be a proper subset of T , and
therefore there must be an edge
...
u; / 62 A and
...

In the remainder of this section, we provide a rule (Theorem 23
...
The next section describes two algorithms that use this rule to find
safe edges efficiently
...
A cut
...
V; E/ is a partition of V
...
2 illustrates this notion
...
u; / 2 E crosses the cut
...
We say that a cut respects a set A of edges if no edge in A crosses the
cut
...
Note that there can be more than one light edge crossing a cut in
the case of ties
...

Our rule for recognizing safe edges is given by the following theorem
...
1
Let G D
...
Let A be a subset of E that is included in some minimum
spanning tree for G, let
...
u; /
be a light edge crossing
...
Then, edge
...


23
...
2 Two ways of viewing a cut
...
1
...
The edges crossing the cut are those
connecting white vertices with black vertices
...
d; c/ is the unique light edge crossing the
cut
...
S; V S/ respects A, since no edge of A
crosses the cut
...
An edge crosses the cut if it connects a vertex on the left with a vertex on the
right
...
u; /, since if it does, we are done
...
u; /g by using a
cut-and-paste technique, thereby showing that
...

The edge
...
3 illustrates
...
S; V S/, at least one edge in T lies on the simple path p and also crosses
the cut
...
x; y/ be any such edge
...
x; y/ is not in A, because the cut
respects A
...
x; y/ is on the unique simple path from u to in T , removing
...
Adding
...
x; y/g [ f
...

We next show that T 0 is a minimum spanning tree
...
u; / is a light edge
crossing
...
x; y/ also crosses this cut, w
...
x; y/
...
T 0 / D w
...
x; y/ C w
...
T / :

628

Chapter 23 Minimum Spanning Trees

x
p

u

y

v

Figure 23
...
1
...

The edges in the minimum spanning tree T are shown, but the edges in the graph G are not
...
u; / is a light edge crossing the cut
...
The edge
...
To form a minimum spanning tree T 0 that
contains
...
x; y/ from T and add the edge
...


But T is a minimum spanning tree, so that w
...
T 0 /; thus, T 0 must be a
minimum spanning tree also
...
u; / is actually a safe edge for A
...
x; y/ 62 A; thus, A [ f
...
Consequently, since T 0 is a
minimum spanning tree,
...

Theorem 23
...
V; E/
...
At any point in the execution, the graph
GA D
...

(Some of the trees may contain just one vertex, as is the case, for example, when
the method begins: A is empty and the forest contains jV j trees, one for each
vertex
...
u; / for A connects distinct components of GA ,
since A [ f
...

The while loop in lines 2–4 of G ENERIC -MST executes jV j 1 times because
it finds one of the jV j 1 edges of a minimum spanning tree in each iteration
...
When the forest contains only a single tree, the method terminates
...
2 use the following corollary to Theorem 23
...


23
...
2
Let G D
...
Let A be a subset of E that is included in some minimum
spanning tree for G, and let C D
...
V; A/
...
u; / is a light edge connecting C to some other component
in GA , then
...

VC / respects A, and
...

Proof The cut
...
u; / is safe for A
...
1-1
Let
...
Show that
...

23
...
1
...
V; E/ be a connected, undirected graph with a real-valued weight function w defined on E
...
S; V S/ be any cut of G that respects A, and let
...
S; V S/
...
u; / is a light edge for the cut
...

23
...
u; / is contained in some minimum spanning tree, then it is
a light edge crossing some cut of the graph
...
1-4
Give a simple example of a connected graph such that the set of edges f
...
S; V S/ such that
...
S; V S/g
does not form a minimum spanning tree
...
1-5
Let e be a maximum-weight edge on some cycle of connected graph G D
...

Prove that there is a minimum spanning tree of G 0 D
...
That is, there is a minimum spanning tree of G that
does not include e
...
1-6
Show that a graph has a unique minimum spanning tree if, for every cut of the
graph, there is a unique light edge crossing the cut
...

23
...
Give an example
to show that the same conclusion does not follow if we allow some weights to be
nonpositive
...
1-8
Let T be a minimum spanning tree of a graph G, and let L be the sorted list of the
edge weights of T
...

23
...
V; E/, and let V 0 be a subset
of V
...
Show that if T 0 is connected, then T 0 is a minimum spanning tree
of G 0
...
1-10
Given a graph G and a minimum spanning tree T , suppose that we decrease the
weight of one of the edges in T
...
More formally, let T be a minimum spanning tree for G with edge weights
given by weight function w
...
x; y/ 2 T and a positive number k,
and define the weight function w 0 by
(
w
...
u; / ¤
...
u; / D
w
...
u; / D
...

23
...
Give an algorithm for finding the minimum
spanning tree in the modified graph
...
2 The algorithms of Kruskal and Prim

631

23
...
They each use a specific rule to determine a safe edge in line 3
of G ENERIC -MST
...
The safe edge added to A is always a least-weight
edge in the graph that connects two distinct components
...
The safe edge added to A is always a least-weight edge
connecting the tree to a vertex not in the tree
...
u; / of least weight
...
u; /
...
u; / must
be a light edge connecting C1 to some other tree, Corollary 23
...
u; /
is a safe edge for C1
...

Our implementation of Kruskal’s algorithm is like the algorithm to compute
connected components from Section 21
...
It uses a disjoint-set data structure to
maintain several disjoint sets of elements
...
The operation F IND -S ET
...
Thus, we can determine whether two vertices u and
belong to the same tree by testing whether F IND -S ET
...
To
combine trees, Kruskal’s algorithm calls the U NION procedure
...
G; w/
1 AD;
2 for each vertex 2 G:V
3
M AKE -S ET
...
u; / 2 G:E, taken in nondecreasing order by weight
6
if F IND -S ET
...
/
7
A D A [ f
...
u; /
9 return A
Figure 23
...
Lines 1–3 initialize the set A
to the empty set and create jV j trees, one containing each vertex
...
The loop

632

Chapter 23 Minimum Spanning Trees

4

8

b

7

c

d

9

4

8

b

2
(a)

a

11
8

4

h

4

e

(b)

a

2
7

c

8

f

d

11

i
7

10
g

8

b

14

6
1

9

4

h

a

11
8

4

h

4

e

(d)

a

8

b

2
7

c

8

f

d

11

a

11
8

4

h

4

9

4
e

(f)

a

h

a

11

2
7

c

8

h

9

14

e
10

g

8

b

d

11
8

f

4

9

14

6
1

d

2
7

c

f

d

9

i

4

h

4

8

e
10

g

1

b

14

6
2
7

c

f

d

9

2

i
7

4

1

2
(g)

7

f

6

7

10
g

8

b

14

6
1

e

2

i
7

2

c

i

2
(e)

14
10

g

1

7

10
g

8

b

14

6
1

9

2

i
7

4
6

2
(c)

d

2

i
7

7

c

e
10

g

2

f

(h)

a

11

i
7

8

h

4

14

6
1

e
10

g

2

f

Figure 23
...
1
...
The algorithm considers each edge in sorted order by weight
...
If the edge joins two
distinct trees in the forest, it is added to the forest, thereby merging the two trees
...
u; /, whether the endpoints u and belong to the same
tree
...
u; / cannot be added to the forest without creating
a cycle, and the edge is discarded
...
In this case, line 7 adds the edge
...


23
...
4, continued Further steps in the execution of Kruskal’s algorithm
...
V; E/ depends
on how we implement the disjoint-set data structure
...
3 with the union-by-rank and
path-compression heuristics, since it is the asymptotically fastest implementation
known
...
1/ time, and the time to sort the
edges in line 4 is O
...
(We will account for the cost of the jV j M AKE -S ET
operations in the for loop of lines 2–3 in a moment
...
E/ F IND -S ET and U NION operations on the disjoint-set forest
...
V C E/ ˛
...
4
...
E ˛
...
Moreover, since ˛
...
lg V / D O
...
E lg E/
...
lg V /, and so we can restate the running time of Kruskal’s
algorithm as O
...


634

Chapter 23 Minimum Spanning Trees

Prim’s algorithm
Like Kruskal’s algorithm, Prim’s algorithm is a special case of the generic minimum-spanning-tree method from Section 23
...
Prim’s algorithm operates much
like Dijkstra’s algorithm for finding shortest paths in a graph, which we shall see in
Section 24
...
Prim’s algorithm has the property that the edges in the set A always
form a single tree
...
5 shows, the tree starts from an arbitrary root
vertex r and grows until the tree spans all the vertices in V
...
By Corollary 23
...
This strategy qualifies as greedy since at each step it adds to the tree an edge
that contributes the minimum amount possible to the tree’s weight
...
In the pseudocode below,
the connected graph G and the root r of the minimum spanning tree to be grown
are inputs to the algorithm
...
For
each vertex , the attribute :key is the minimum weight of any edge connecting
to a vertex in the tree; by convention, :key D 1 if there is no such edge
...
The algorithm implicitly maintains
the set A from G ENERIC -MST as
A D f
...
; : / W

2V

frgg :

MST-P RIM
...
Q/
8
for each 2 G:AdjŒu
9
if 2 Q and w
...
u; /

23
...
5 The execution of Prim’s algorithm on the graph from Figure 23
...
The root vertex
is a
...
At each step of
the algorithm, the vertices in the tree determine a cut of the graph, and a light edge crossing the cut
is added to the tree
...
b; c/ or edge
...


636

Chapter 23 Minimum Spanning Trees

Figure 23
...
Lines 1–5 set the key of each
vertex to 1 (except for the root r, whose key is set to 0 so that it will be the
first vertex processed), set the parent of each vertex to NIL, and initialize the minpriority queue Q to contain all the vertices
...
A D f
...

2
...

3
...
; : / connecting to some vertex already
placed into the minimum spanning tree
...
V Q; Q/ (with the exception of the first iteration, in which u D r due to line 4)
...
u; u: / to A
...

The running time of Prim’s algorithm depends on how we implement the minpriority queue Q
...
V / time
...
lg V / time, the total time for all calls to E XTRACT-M IN is O
...

The for loop in lines 8–11 executes O
...
Within the for loop, we can implement the
test for membership in Q in line 9 in constant time by keeping a bit for each vertex
that tells whether or not it is in Q, and updating the bit when the vertex is removed
from Q
...
lg V / time
...
V lg V C E lg V / D O
...

We can improve the asymptotic running time of Prim’s algorithm by using Fibonacci heaps
...
lg V / amortized time and a D ECREASE -K EY
operation (to implement line 11) takes O
...
Therefore, if we use a
Fibonacci heap to implement the min-priority queue Q, the running time of Prim’s
algorithm improves to O
...


23
...
2-1
Kruskal’s algorithm can return different spanning trees for the same input graph G,
depending on how it breaks ties when the edges are sorted into order
...

23
...
V; E/ as an adjacency matrix
...
V 2 / time
...
2-3
For a sparse graph G D
...
V /, is the implementation of
Prim’s algorithm with a Fibonacci heap asymptotically faster than the binary-heap
implementation? What about for a dense graph, where jEj D ‚
...
2-4
Suppose that all edge weights in a graph are integers in the range from 1 to jV j
...
2-5
Suppose that all edge weights in a graph are integers in the range from 1 to jV j
...
2-6 ?
Suppose that the edge weights in a graph are uniformly distributed over the halfopen interval Œ0; 1/
...
2-7 ?
Suppose that a graph G has a minimum spanning tree already computed
...
2-8
Professor Borden proposes a new divide-and-conquer algorithm for computing
minimum spanning trees, which goes as follows
...
V; E/,
partition the set V of vertices into two sets V1 and V2 such that jV1 j and jV2 j differ

638

Chapter 23 Minimum Spanning Trees

by at most 1
...
Recursively solve
a minimum-spanning-tree problem on each of the two subgraphs G1 D
...
V2 ; E2 /
...
V1 ; V2 /, and use this edge to unite the resulting two minimum spanning trees
into a single spanning tree
...


Problems
23-1 Second-best minimum spanning tree
Let G D
...

We define a second-best minimum spanning tree as follows
...
Then
a second-best minimum spanning tree is a spanning tree T such that w
...
T 00 /g
...
Show that the minimum spanning tree is unique, but that the second-best minimum spanning tree need not be unique
...
Let T be the minimum spanning tree of G
...
u; / 2 T and
...
u; /g [ f
...

c
...
Describe an O
...

d
...

23-2 Minimum spanning tree in sparse graphs
For a very sparse connected graph G D
...
E C V lg V / running time of Prim’s algorithm with Fibonacci heaps by preprocessing G to decrease the number of vertices before running Prim’s algorithm
...
u; / incident
on u, and we put
...
We

Problems for Chapter 23

639

then contract all chosen edges (see Section B
...
Rather than contracting these
edges one at a time, we first identify sets of vertices that are united into the same
new vertex
...
Several edges from the original graph may
be renamed the same as each other
...

Initially, we set the minimum spanning tree T being constructed to be empty,
and for each edge
...
u; /:orig D
...
u; /:c D w
...
We use the orig attribute to reference the edge from the
initial graph that is associated with an edge in the contracted graph
...
The procedure MST-R EDUCE takes
inputs G and T , and it returns a contracted graph G 0 with updated attributes orig0
and c 0
...

MST-R EDUCE
...
/
4 for each u 2 G:V
5
if u:mark == FALSE
6
choose 2 G:AdjŒu such that
...
u; /
8
T D T [ f
...
/ W 2 G:Vg
11 G 0 :E D ;
12 for each
...
x/
14
D F IND -S ET
...
u; / 62 G 0 :E
16
G 0 :E D G 0 :E [ f
...
u; /:orig0 D
...
u; /:c 0 D
...
x; y/:c <
...
u; /:orig0 D
...
u; /:c 0 D
...
Let T be the set of edges returned by MST-R EDUCE, and let A be the minimum
spanning tree of the graph G 0 formed by the call MST-P RIM
...
Prove
that T [ f
...
x; y/ 2 Ag is a minimum spanning tree of G
...
Argue that jG 0 :Vj Ä jV j =2
...
Show how to implement MST-R EDUCE so that it runs in O
...
(Hint:
Use simple data structures
...
Suppose that we run k phases of MST-R EDUCE, using the output G 0 produced
by one phase as the input G to the next phase and accumulating edges in T
...
kE/
...
Suppose that after running k phases of MST-R EDUCE, as in part (d), we run
Prim’s algorithm by calling MST-P RIM
...
Show how
to pick k so that the overall running time is O
...
Argue that your
choice of k minimizes the overall asymptotic running time
...
For what values of jEj (in terms of jV j) does Prim’s algorithm with preprocessing asymptotically beat Prim’s algorithm without preprocessing?
23-3 Bottleneck spanning tree
A bottleneck spanning tree T of an undirected graph G is a spanning tree of G
whose largest edge weight is minimum over all spanning trees of G
...

a
...

Part (a) shows that finding a bottleneck spanning tree is no harder than finding
a minimum spanning tree
...

b
...

c
...
(Hint: You may want to use a subroutine
that contracts sets of edges, as in the MST-R EDUCE procedure described in
Problem 23-2
...
Each one takes
a connected graph and a weight function as input and returns a set of edges T
...
Also describe the most efficient implementation of
each algorithm, whether or not it computes a minimum spanning tree
...
M AYBE -MST-A
...
M AYBE -MST-B
...
M AYBE -MST-C
...
Graham and Hell [151] compiled a history of the minimumspanning-tree problem
...
Bor˙ vka
...
lg V / iterations of the
u
u

642

Chapter 23 Minimum Spanning Trees

procedure MST-R EDUCE described in Problem 23-2
...
The algorithm commonly known as Prim’s
algorithm was indeed invented by Prim [285], but it was also invented earlier by
V
...

ı
The reason underlying why greedy algorithms are effective at finding minimum
spanning trees is that the set of forests of a graph forms a graphic matroid
...
4
...
V lg V /, Prim’s algorithm, implemented with Fibonacci heaps,
runs in O
...
For sparser graphs, using a combination of the ideas from
Prim’s algorithm, Kruskal’s algorithm, and Bor˙ vka’s algorithm, together with adu
vanced data structures, Fredman and Tarjan [114] give an algorithm that runs in
O
...
Gabow, Galil, Spencer, and Tarjan [120] improved this algorithm to run in O
...
Chazelle [60] gives an algorithm that runs
in O
...
E; V // time, where ˛
...
(See the chapter notes for Chapter 21 for a brief discussion of Ackermann’s function and its inverse
...

A related problem is spanning-tree verification, in which we are given a graph
G D
...
King [203] gives a linear-time algorithm to verify a spanning
tree, building on earlier work of Koml´ s [215] and Dixon, Rauch, and Tarjan [90]
...
Karger, Klein, and Tarjan [195] give a randomized
minimum-spanning-tree algorithm that runs in O
...
This
algorithm uses recursion in a manner similar to the linear-time selection algorithm
in Section 9
...
Another recursive call
on E E 0 then finds the minimum spanning tree
...

u
Fredman and Willard [116] showed how to find a minimum spanning tree in
O
...
Their
algorithm assumes that the data are b-bit integers and that the computer memory
consists of addressable b-bit words
...
Given a road map of the United States on which the distance between
each pair of adjacent intersections is marked, how can she determine this shortest
route?
One possible way would be to enumerate all the routes from Phoenix to Indianapolis, add up the distances on each route, and select the shortest
...
For example, a route from Phoenix to Indianapolis
that passes through Seattle is obviously a poor choice, because Seattle is several
hundred miles out of the way
...
In a shortest-paths problem, we are given a weighted, directed graph
G D
...
The weight w
...
p/ D

k
X

w
...
u; / from u to by
(
p
minfw
...
u; / D
1
otherwise :

;

A shortest path from vertex u to vertex is then defined as any path p with weight
w
...
u; /
...
Our goal is to find a shortest path
from a given intersection in Phoenix to a given intersection in Indianapolis
...

The breadth-first-search algorithm from Section 22
...
Because many of the concepts from breadth-first search arise in the study
of shortest paths in weighted graphs, you might want to review Section 22
...

Variants
In this chapter, we shall focus on the single-source shortest-paths problem: given
a graph G D
...
The algorithm for the single-source problem can
solve many other problems, including the following variants
...
By reversing the direction of each edge in
the graph, we can reduce this problem to a single-source problem
...
If we solve the single-source problem with source vertex u,
we solve this problem also
...

All-pairs shortest-paths problem: Find a shortest path from u to for every pair
of vertices u and
...
Additionally, its structure is interesting in its own right
...

Optimal substructure of a shortest path
Shortest-paths algorithms typically rely on the property that a shortest path between two vertices contains other shortest paths within it
...
) Recall
that optimal substructure is one of the key indicators that dynamic programming
(Chapter 15) and the greedy method (Chapter 16) might apply
...
3, is a greedy algorithm, and the FloydWarshall algorithm, which finds shortest paths between all pairs of vertices (see
Section 25
...
The following lemma states
the optimal-substructure property of shortest paths more precisely
...
1 (Subpaths of shortest paths are shortest paths)
Given a weighted, directed graph G D
...
Then, pij is a shortest path from i to j
...
p/ D w
...
pij / C w
...
Now, assume that there is a path pij from i

p

0
pij

pj k

0i
0
to j with weight w
...
pij /
...
p0i /Cw
...
pjk / is less than w
...


Negative-weight edges
Some instances of the single-source shortest-paths problem may include edges
whose weights are negative
...
V; E/ contains no negativeweight cycles reachable from the source s, then for all 2 V , the shortest-path
weight ı
...
If the graph
contains a negative-weight cycle reachable from s, however, shortest-path weights
are not well defined
...
If there is a negativeweight cycle on some path from s to , we define ı
...

Figure 24
...
Because there is only one path from s to a (the
path hs; ai), we have ı
...
s; a/ D 3
...
s; b/ D w
...
a; b/ D 3 C
...
There are
infinitely many paths from s to c: hs; ci, hs; c; d; ci, hs; c; d; c; d; ci, and so on
...
3/ D 3 > 0, the shortest path from s
to c is hs; ci, with weight ı
...
s; c/ D 5
...
s; d / D w
...
c; d / D 11
...
Because the cycle he; f; ei has weight 3 C
...
By traversing the negative-weight cycle he; f; ei
arbitrarily many times, we can find paths from s to e with arbitrarily large negative
weights, and so ı
...
Similarly, ı
...
Because g is reachable
from f , we can also find paths with arbitrarily large negative weights from s to g,
and so ı
...
Vertices h, i, and j also form a negative-weight cycle
...
s; h/ D ı
...
s; j / D 1
...
1 Negative edge weights in a directed graph
...
Because vertices e and f form a negative-weight cycle reachable from s,
they have shortest-path weights of 1
...
Vertices such as h, i, and j are not
reachable from s, and so their shortest-path weights are 1, even though they lie on a negative-weight
cycle
...
Others, such as the Bellman-Ford algorithm, allow negative-weight edges in the input graph and produce a correct answer as long as no negative-weight cycles are
reachable from the source
...

Cycles
Can a shortest path contain a cycle? As we have just seen, it cannot contain a
negative-weight cycle
...
That is, if p D h 0 ; 1 ; : : : ; k i is a path and
c D h i ; i C1 ; : : : ; j i is a positive-weight cycle on this path (so that i D j and
w
...
p 0 / D w
...
c/ < w
...

That leaves only 0-weight cycles
...
Thus, if there is a shortest
path from a source vertex s to a destination vertex that contains a 0-weight cycle,
then there is another shortest path from s to without this cycle
...
Therefore, without loss of
generality we can assume that when we are finding shortest paths, they have no
cycles, i
...
, they are simple paths
...
V; E/

Chapter 24

Single-Source Shortest Paths

647

contains at most jV j distinct vertices, it also contains at most jV j 1 edges
...

Representing shortest paths
We often wish to compute not only shortest-path weights, but the vertices on shortest paths as well
...
2
...
V; E/, we maintain for
each vertex 2 V a predecessor : that is either another vertex or NIL
...

Thus, given a vertex for which : ¤ NIL , the procedure P RINT-PATH
...
2 will print a shortest path from s to
...
As in breadth-first search, we shall be interested in the
predecessor subgraph G D
...
Here again, we
define the vertex set V to be the set of vertices of G with non-NIL predecessors,
plus the source s:
V Df 2V W :

¤ NIL g [ fsg :

The directed edge set E is the set of edges induced by the
in V :
E D f
...
A shortest-paths tree is like the breadth-first tree from Section 22
...
To be precise, let G D
...
A shortest-paths tree rooted at s is a directed subgraph G 0 D
...
V 0 is the set of vertices reachable from s in G,
2
...
for all 2 V 0 , the unique simple path from s to
to in G
...
2 (a) A weighted, directed graph with shortest-path weights from source s
...
(c) Another shortest-paths tree with
the same root
...
For
example, Figure 24
...

Relaxation
The algorithms in this chapter use the technique of relaxation
...
We call :d a shortest-path estimate
...
V /-time
procedure:
I NITIALIZE -S INGLE -S OURCE
...

The process of relaxing an edge
...
A relaxation step1 may decrease the value of the shortest-path

1 It may seem strange that the term “relaxation” is used for an operation that tightens an upper

bound
...
The outcome of a relaxation step can be viewed as a relaxation
of the constraint : d Ä u: d C w
...
10), must be
satisfied if u: d D ı
...
s; /
...
u; /, there is no “pressure”
to satisfy this constraint, so the constraint is “relaxed
...
3 Relaxing an edge
...
u; / D 2
...
(a) Because : d > u: d C w
...
(b) Here, : d Ä u: d C w
...


estimate :d and update ’s predecessor attribute :
...
u; / in O
...
u; ; w/
1 if :d > u:d C w
...
u; /
3
: Du
Figure 24
...

Each algorithm in this chapter calls I NITIALIZE -S INGLE -S OURCE and then repeatedly relaxes edges
...
The algorithms in this chapter differ in
how many times they relax each edge and the order in which they relax edges
...
The Bellman-Ford algorithm relaxes each edge jV j 1
times
...
We state these properties here, and Section 24
...
For your reference, each property stated here includes the appropriate lemma or corollary number from Section 24
...
The latter
five of these properties, which refer to shortest-path estimates or the predecessor
subgraph, implicitly assume that the graph is initialized with a call to I NITIALIZE S INGLE -S OURCE
...


650

Chapter 24 Single-Source Shortest Paths

Triangle inequality (Lemma 24
...
u; / 2 E, we have ı
...
s; u/ C w
...

Upper-bound property (Lemma 24
...
s; / for all vertices
value ı
...


2 V , and once :d achieves the

No-path property (Corollary 24
...
s; / D 1
...
14)
If s ; u ! is a shortest path in G for some u; 2 V , and if u:d D ı
...
u; /, then :d D ı
...

Path-relaxation property (Lemma 24
...
0 ; 1 /;
...
k 1 ; k /, then k :d D ı
...

This property holds regardless of any other relaxation steps that occur, even if
they are intermixed with relaxations of the edges of p
...
17)
Once :d D ı
...

Chapter outline
Section 24
...

The Bellman-Ford algorithm is remarkably simple, and it has the further benefit
of detecting whether a negative-weight cycle is reachable from the source
...
2 gives a linear-time algorithm for computing shortest paths from a single
source in a directed acyclic graph
...
3 covers Dijkstra’s algorithm, which
has a lower running time than the Bellman-Ford algorithm but requires the edge
weights to be nonnegative
...
4 shows how we can use the Bellman-Ford
algorithm to solve a special case of linear programming
...
5
proves the properties of shortest paths and relaxation stated above
...
We shall assume that for any real number a ¤ 1, we have a C 1 D 1 C a D 1
...
1/ D
...

All algorithms in this chapter assume that the directed graph G is stored in the
adjacency-list representation
...
1/
time per edge
...
1 The Bellman-Ford algorithm

651

24
...
Given a weighted, directed graph G D
...
If there is such a cycle, the algorithm indicates that no solution exists
...

The algorithm relaxes edges, progressively decreasing an estimate :d on the
weight of a shortest path from the source s to each vertex 2 V until it achieves
the actual shortest-path weight ı
...
The algorithm returns TRUE if and only if
the graph contains no negative-weight cycles that are reachable from the source
...
G; w; s/
1 I NITIALIZE -S INGLE -S OURCE
...
u; / 2 G:E
4
R ELAX
...
u; / 2 G:E
6
if :d > u:d C w
...
4 shows the execution of the Bellman-Ford algorithm on a graph
with 5 vertices
...
Each pass is
one iteration of the for loop of lines 2–4 and consists of relaxing each edge of the
graph once
...
4(b)–(e) show the state of the algorithm after each of the
four passes over the edges
...
(We’ll see a little
later why this check works
...
VE/, since the initialization in
line 1 takes ‚
...
E/ time, and the for loop of lines 5–7 takes O
...

To prove the correctness of the Bellman-Ford algorithm, we start by showing that
if there are no negative-weight cycles, the algorithm computes correct shortest-path
weights for all vertices reachable from the source
...
4 The execution of the Bellman-Ford algorithm
...
The d values appear within the vertices, and shaded edges indicate predecessor values: if edge
...
In this particular example, each pass relaxes the edges in the order

...
t; y/;
...
x; t/;
...
y; ´/;
...
´; s/;
...
s; y/
...
(b)–(e) The situation after each successive pass over the edges
...
The Bellman-Ford algorithm returns TRUE in this
example
...
2
Let G D
...
Then, after the jV j 1 iterations of the for loop of lines 2–4
of B ELLMAN -F ORD, we have :d D ı
...

Proof We prove the lemma by appealing to the path-relaxation property
...
Because shortest paths are
simple, p has at most jV j 1 edges, and so k Ä jV j 1
...
Among the edges relaxed in
the ith iteration, for i D 1; 2; : : : ; k, is
...
By the path-relaxation property,
therefore, :d D k :d D ı
...
s; /
...
1 The Bellman-Ford algorithm

653

Corollary 24
...
V; E/ be a weighted, directed graph with source vertex s and weight
function w W E ! R, and assume that G contains no negative-weight cycles that
are reachable from s
...

Proof

The proof is left as Exercise 24
...


Theorem 24
...
V; E/ with
source s and weight function w W E ! R
...
s; /
for all vertices 2 V , and the predecessor subgraph G is a shortest-paths tree
rooted at s
...

Proof Suppose that graph G contains no negative-weight cycles that are reachable from the source s
...
s; /
for all vertices 2 V
...
2 proves this
claim
...
Thus, the claim is proven
...
Now we use the claim to show that
B ELLMAN -F ORD returns TRUE
...
u; / 2 E,
:d D ı
...
s; u/ C w
...
u; / ;
and so none of the tests in line 6 causes B ELLMAN -F ORD to return FALSE
...

Now, suppose that graph G contains a negative-weight cycle that is reachable
from the source s; let this cycle be c D h 0 ; 1 ; : : : ; k i, where 0 D k
...


i 1;

i/

<0:

(24
...
Thus, i :d Ä i 1 :d C w
...
Summing the
inequalities around cycle c gives us

654

Chapter 24 Single-Source Shortest Paths
k
X

i :d

Ä

k
X

i D1


...


i //

k
X

w
...
3,


i 1;

i D1

k
X

w
...
Thus,

;

i D1

which contradicts inequality (24
...
We conclude that the Bellman-Ford algorithm
returns TRUE if graph G contains no negative-weight cycles reachable from the
source, and FALSE otherwise
...
1-1
Run the Bellman-Ford algorithm on the directed graph of Figure 24
...
In each pass, relax edges in the same order as in the figure, and
show the d and values after each pass
...
´; x/
to 4 and run the algorithm again, using s as the source
...
1-2
Prove Corollary 24
...

24
...
V; E/ with no negative-weight cycles,
let m be the maximum over all vertices 2 V of the minimum number of edges
in a shortest path from the source s to
...
) Suggest a simple change to the Bellman-Ford algorithm that
allows it to terminate in m C 1 passes, even if m is not known in advance
...
1-4
Modify the Bellman-Ford algorithm so that it sets :d to 1 for all vertices
which there is a negative-weight cycle on some path from the source to
...
2 Single-source shortest paths in directed acyclic graphs

655

24
...
V; E/ be a weighted, directed graph with weight function w W E ! R
...
VE/-time algorithm to find, for each vertex 2 V , the value ı
...
u; /g
...
1-6 ?
Suppose that a weighted, directed graph G D
...

Give an efficient algorithm to list the vertices of one such cycle
...


24
...
V; E/
according to a topological sort of its vertices, we can compute shortest paths from
a single source in ‚
...
Shortest paths are always well defined in a dag,
since even if there are negative-weight edges, no negative-weight cycles can exist
...
4) to impose a linear ordering on the vertices
...
We make just one pass over the
vertices in the topologically sorted order
...

DAG -S HORTEST-PATHS
...
G; s/
3 for each vertex u, taken in topologically sorted order
4
for each vertex 2 G:AdjŒu
5
R ELAX
...
5 shows the execution of this algorithm
...
As shown in Section 22
...
V C E/ time
...
V / time
...
Altogether, the for loop of lines 4–5 relaxes each edge exactly
once
...
) Because each iteration of the
inner for loop takes ‚
...
V C E/, which is linear
in the size of an adjacency-list representation of the graph
...


656

Chapter 24 Single-Source Shortest Paths

r


5

s
0

2

6
t


7

3

x


–1

1
y


4

–2

z


r


5

2

s
0

2

6
t


3

5

s
0

2

6
t
2

7

3

5

s
0

x
6

–1

1
y


4

2

6
t
2

7

3

–2

z


r


5

2

s
0

2

6
t
2

5

s
0

2

7

3

z


2

x
6

–1

1
y
6

4

–2

z
4

2

(d)

x
6

–1

1
y
5

4

6
t
2

7

3

–2

z
4

2

r


5

s
0

2

6
t
2

7

3

(e)

r


–2

(b)

(c)

r


–1

1
y


4

(a)

r


7

x


x
6
4

–1

1
y
5

–2

z
3

2

(f)

x
6
4

–1

1
y
5

–2

z
3

2

(g)

Figure 24
...
The
vertices are topologically sorted from left to right
...
The d values appear
within the vertices, and shaded edges indicate the values
...
(b)–(g) The situation after each iteration of the for loop of lines 3–5
...
The values shown in
part (g) are the final values
...
5
If a weighted, directed graph G D
...
s; / for all
vertices 2 V , and the predecessor subgraph G is a shortest-paths tree
...
s; / for all vertices
2 V at termination
...
s; / D 1 by the no-path
property
...
Because we pro-

24
...
0 ; 1 /;
...
k 1 ; k /
...
s; i / at termination for i D 0; 1; : : : ; k
...

An interesting application of this algorithm arises in determining critical paths
in PERT chart2 analysis
...
If edge
...
; x/ leaves , then job
...
; x/
...
A critical path is a longest path through the dag, corresponding
to the longest time to perform any sequence of jobs
...
We can find
a critical path by either
negating the edge weights and running DAG -S HORTEST-PATHS, or
running DAG -S HORTEST-PATHS, with the modification that we replace “1”
by “ 1” in line 2 of I NITIALIZE -S INGLE -S OURCE and “>” by “<” in the
R ELAX procedure
...
2-1
Run DAG -S HORTEST-PATHS on the directed graph of Figure 24
...

24
...

24
...
In a more natural structure, vertices would represent jobs and edges would represent sequencing
constraints; that is, edge
...
We would then assign weights to vertices, not edges
...

2 “PERT” is

an acronym for “program evaluation and review technique
...
2-4
Give an efficient algorithm to count the total number of paths in a directed acyclic
graph
...


24
...
V; E/ for the case in which all edge weights are nonnegative
...
u; / 0 for each edge
...
As
we shall see, with a good implementation, the running time of Dijkstra’s algorithm
is lower than that of the Bellman-Ford algorithm
...
The algorithm repeatedly selects the vertex u 2 V S with the minimum shortest-path estimate, adds u
to S, and relaxes all edges leaving u
...

D IJKSTRA
...
G; s/
2 S D;
3 Q D G:V
4 while Q ¤ ;
5
u D E XTRACT-M IN
...
u; ; w/
Dijkstra’s algorithm relaxes edges as shown in Figure 24
...
Line 1 initializes
the d and values in the usual way, and line 2 initializes the set S to the empty
set
...
Line 3 initializes the min-priority queue Q
to contain all the vertices in V ; since S D ; at that time, the invariant is true after
line 3
...
(The first
time through this loop, u D s
...
Then, lines 7–8 relax each edge
...
Observe that the algorithm
never inserts vertices into Q after line 3 and that each vertex is extracted from Q

24
...
6 The execution of Dijkstra’s algorithm
...
The
shortest-path estimates appear within the vertices, and shaded edges indicate predecessor values
...
(a) The
situation just before the first iteration of the while loop of lines 4–8
...
(b)–(f) The situation after each successive iteration
of the while loop
...

The d values and predecessors shown in part (f) are the final values
...

Because Dijkstra’s algorithm always chooses the “lightest” or “closest” vertex
in V S to add to set S, we say that it uses a greedy strategy
...
Greedy strategies do not always yield optimal results in general, but as the following theorem and its corollary show, Dijkstra’s algorithm does
indeed compute shortest paths
...
s; u/
...
6 (Correctness of Dijkstra’s algorithm)
Dijkstra’s algorithm, run on a weighted, directed graph G D
...
s; u/ for all
vertices u 2 V
...
7 The proof of Theorem 24
...
Set S is nonempty just before vertex u is added to it
...
Vertices x and y are distinct,
but we may have s D x or y D u
...


Proof

We use the following loop invariant:

At the start of each iteration of the while loop of lines 4–8, :d D ı
...

It suffices to show for each vertex u 2 V , we have u:d D ı
...
Once we show that u:d D ı
...

Initialization: Initially, S D ;, and so the invariant is trivially true
...
s; u/ for the vertex
added to set S
...
s; u/ when it is added to set S
...
s; u/ at that time by
examining a shortest path from s to u
...
s; s/ D 0 at that time
...
There must be some
path from s to u, for otherwise u:d D ı
...
s; u/
...
Prior to adding u to S,
path p connects a vertex in S, namely s, to a vertex in V S, namely u
...
Thus, as Figure 24
...
(Either of paths p1 or p2 may have no edges
...
s; y/ when u is added to S
...
Then, because we chose u as the first vertex for which
u:d ¤ ı
...
s; x/ when x was added

24
...
Edge
...

We can now obtain a contradiction to prove that u:d D ı
...
Because y
appears before u on a shortest path from s to u and all edge weights are nonnegative (notably those on path p2 ), we have ı
...
s; u/, and thus
y:d D ı
...
s; u/
Ä u:d
(by the upper-bound property)
...
2)

But because both vertices u and y were in V S when u was chosen in line 5,
we have u:d Ä y:d
...
2) are in fact equalities,
giving
y:d D ı
...
s; u/ D u:d :
Consequently, u:d D ı
...
We conclude
that u:d D ı
...

Termination: At termination, Q D ; which, along with our earlier invariant that
Q D V S, implies that S D V
...
s; u/ for all vertices u 2 V
...
7
If we run Dijkstra’s algorithm on a weighted, directed graph G D
...

Proof

Immediate from Theorem 24
...


Analysis
How fast is Dijkstra’s algorithm? It maintains the min-priority queue Q by calling three priority-queue operations: I NSERT (implicit in line 3), E XTRACT-M IN
(line 5), and D ECREASE -K EY (implicit in R ELAX, which is called in line 8)
...
Because each
vertex u 2 V is added to set S exactly once, each edge in the adjacency list AdjŒu
is examined in the for loop of lines 7–8 exactly once during the course of the algorithm
...
(Observe once again that we are using aggregate analysis
...
Consider first the case in which we maintain the min-priority

662

Chapter 24 Single-Source Shortest Paths

queue by taking advantage of the vertices being numbered 1 to jV j
...
Each I NSERT and D ECREASE -K EY operation
takes O
...
V / time (since we
have to search through the entire array), for a total time of O
...
V 2 /
...
V 2 = lg V /—we can
improve the algorithm by implementing the min-priority queue with a binary minheap
...
5, the implementation should make sure that
vertices and corresponding heap elements maintain handles to each other
...
lg V /
...
The time to build the binary min-heap is O
...
Each D ECREASE -K EY
operation takes time O
...
The
total running time is therefore O
...
E lg V / if all vertices
are reachable from the source
...
V 2 /-time implementation if E D o
...

We can in fact achieve a running time of O
...
The amortized cost
of each of the jV j E XTRACT-M IN operations is O
...
1/ amortized time
...
lg V / without increasing the amortized time of
E XTRACT-M IN would yield an asymptotically faster implementation than with binary heaps
...
2) and
Prim’s algorithm for computing minimum spanning trees (see Section 23
...
It is
like breadth-first search in that set S corresponds to the set of black vertices in a
breadth-first search; just as vertices in S have their final shortest-path weights, so
do black vertices in a breadth-first search have their correct breadth-first distances
...

Exercises
24
...
2, first using vertex s
as the source and then using vertex ´ as the source
...
6,
show the d and values and the vertices in set S after each iteration of the while
loop
...
3 Dijkstra’s algorithm

663

24
...
Why doesn’t the proof of Theorem 24
...
3-3
Suppose we change line 4 of Dijkstra’s algorithm to the following
...
Is

24
...
The program produces :d and : for each vertex 2 V
...
V CE/-time algorithm to check the output of the professor’s program
...

You may assume that all edge weights are nonnegative
...
3-5
Professor Newman thinks that he has worked out a simpler proof of correctness
for Dijkstra’s algorithm
...
Show that the professor is mistaken by constructing a directed graph for
which Dijkstra’s algorithm could relax the edges of a shortest path out of order
...
3-6
We are given a directed graph G D
...
u; / 2 E has an
associated value r
...
u; / Ä 1 that
represents the reliability of a communication channel from vertex u to vertex
...
u; / as the probability that the channel from u to will not fail,
and we assume that these probabilities are independent
...

24
...
V; E/ be a weighted, directed graph with positive weight function
w W E ! f1; 2; : : : ; W g for some positive integer W , and assume that no two vertices have the same shortest-path weights from source vertex s
...
V [ V 0 ; E 0 / by replacing each
edge
...
u; / unit-weight edges in series
...
Show that

664

Chapter 24 Single-Source Shortest Paths

the order in which the breadth-first search of G 0 colors vertices in V black is the
same as the order in which Dijkstra’s algorithm extracts the vertices of V from the
priority queue when it runs on G
...
3-8
Let G D
...
Modify Dijkstra’s algorithm to compute the shortest paths from a given source vertex s in O
...

24
...
3-8 to run in O
...

(Hint: How many distinct shortest-path estimates can there be in V
S at any
point in time?)
24
...
V; E/ in which edges
that leave the source vertex s may have negative weights, all other edge weights
are nonnegative, and there are no negative-weight cycles
...


24
...
In this section, we
investigate a special case of linear programming that we reduce to finding shortest
paths from a single source
...

Linear programming
In the general linear-programming problem, we are given an m n matrix A,
an m-vector b, and an n-vector c
...

Although the simplex algorithm, which is the focus of Chapter 29, does not
always run in time polynomial in the size of its input, there are other linearprogramming algorithms that do run in polynomial time
...
First, if we know that we

24
...
Second,
faster algorithms exist for many special cases of linear programming
...
4-4) and the maximum-flow
problem (Exercise 26
...

Sometimes we don’t really care about the objective function; we just wish to find
any feasible solution, that is, any vector x that satisfies Ax Ä b, or to determine
that no feasible solution exists
...

Systems of difference constraints
In a system of difference constraints, each row of the linear-programming matrix A
contains one 1 and one 1, and all other entries of A are 0
...

For example, consider the problem of finding a 5-vector x D
...


(24
...
4)
(24
...
6)
(24
...
8)
(24
...
10)

666

Chapter 24 Single-Source Shortest Paths

One solution to this problem is x D
...
In fact, this problem has more than one solution
...
0; 2; 5; 4; 1/
...
This fact is not mere
coincidence
...
8
Let x D
...
Then x C d D
...

Proof For each xi and xj , we have
...



...
Thus, if x

Systems of difference constraints occur in many different applications
...
Each constraint
states that at least a certain amount of time, or at most a certain amount of time,
must elapse between two events
...
If we apply an adhesive that takes 2 hours to set at
time x1 and we have to wait until it sets to install a part at time x2 , then we have the
constraint that x2 x1 C 2 or, equivalently, that x1 x2 Ä 2
...
In this case, we get the pair of
constraints x2 x1 and x2 Ä x1 C 1 or, equivalently, x1 x2 Ä 0 and x2 x1 Ä 1
...
In a system Ax Ä b of difference constraints, we view the m n
linear-programming matrix A as the transpose of an incidence matrix (see Exercise 22
...
Each vertex i in the graph,
for i D 1; 2; : : : ; n, corresponds to one of the n unknown variables xi
...

More formally, given a system Ax Ä b of difference constraints, the corresponding constraint graph is a weighted, directed graph G D
...
i ; j / W xj xi Ä bk is a constraintg
[ f
...
0 ; 2 /;
...
0 ;

n /g

:

24
...
8 The constraint graph corresponding to the system (24
...
10) of difference constraints
...
0 ; i / appears in each vertex i
...
5; 3; 0; 1; 4/
...
Thus,
the vertex set V consists of a vertex i for each unknown xi , plus an additional
vertex 0
...
0 ; i / for each unknown xi
...
i ; j / is w
...
The weight of each edge leaving 0 is 0
...
8 shows the constraint graph for the system (24
...
10)
of difference constraints
...

Theorem 24
...
V; E/ be the corresponding constraint graph
...
0 ;

1 /; ı
...
0 ;

3 /; : : : ; ı
...
11)

is a feasible solution for the system
...

Proof We first show that if the constraint graph contains no negative-weight
cycles, then equation (24
...
Consider any edge

...
By the triangle inequality, ı
...
0 ; i / C w
...
0 ; j / ı
...
i ; j /
...
0 ; i / and

668

Chapter 24 Single-Source Shortest Paths

xj D ı
...
i ; j / that corresponds to edge
...

Now we show that if the constraint graph contains a negative-weight cycle, then
the system of difference constraints has no feasible solution
...

(The vertex 0 cannot be on cycle c, because it has no entering edges
...
1 ; 2 / ;
x2 Ä w
...
k 2 ; k 1 / ;
xk 1 Ä w
...
The solution must also satisfy the inequality that results
when we sum the k inequalities together
...
The right-hand side
sums to w
...
c/
...
c/ < 0, and we obtain the contradiction that 0 Ä w
...

Solving systems of difference constraints
Theorem 24
...
Because the constraint graph contains edges
from the source vertex 0 to all other vertices, any negative-weight cycle in the
constraint graph is reachable from 0
...
In
Figure 24
...
5; 3; 0; 1; 4/, and by Lemma 24
...
d 5; d 3; d; d 1; d 4/
is also a feasible solution for any constant d
...

A system of difference constraints with m constraints on n unknowns produces
a graph with n C 1 vertices and n C m edges
...
n C 1/
...
n2 C nm/ time
...
4-5 asks you to modify the algorithm to run in O
...


24
...
4-1
Find a feasible solution or determine that no feasible solution exists for the following system of difference constraints:
x1
x1
x2
x2
x2
x3
x4
x5
x5
x6

x2
x4
x3
x5
x6
x6
x2
x1
x4
x3

Ä
Ä
Ä
Ä
Ä
Ä
Ä
Ä
Ä
Ä

1,
4,
2,
7,
5,
10 ,
2,
1,
3,
8
...
4-2
Find a feasible solution or determine that no feasible solution exists for the following system of difference constraints:
x1
x1
x2
x3
x4
x4
x4
x5
x5

x2
x5
x4
x2
x1
x3
x5
x3
x4

Ä
Ä
Ä
Ä
Ä
Ä
Ä
Ä
Ä

4,
5,
6,
1,
3,
5,
10 ,
4,
8
...
4-3
Can any shortest-path weight from the new vertex
tive? Explain
...
4-4
Express the single-pair shortest-path problem as a linear program
...
4-5
Show how to modify the Bellman-Ford algorithm slightly so that when we use it
to solve a system of difference constraints with m inequalities on n unknowns, the
running time is O
...

24
...
Show how to adapt the BellmanFord algorithm to solve this variety of constraint system
...
4-7
Show how to solve a system of difference constraints by a Bellman-Ford-like algorithm that runs on a constraint graph without the extra vertex 0
...
4-8 ?
Let Ax Ä b be a system of m difference constraints in n unknowns
...

24
...
max fxi g min fxi g/
subject to Ax Ä b
...

24
...
Show how to adapt the Bellman-Ford
algorithm to solve this variety of constraint system
...
4-11
Give an efficient algorithm to solve a system Ax Ä b of difference constraints
when all of the elements of b are real-valued and all of the unknowns xi must be
integers
...
4-12 ?
Give an efficient algorithm to solve a system Ax Ä b of difference constraints
when all of the elements of b are real-valued and a specified subset of some, but
not necessarily all, of the unknowns xi must be integers
...
5 Proofs of shortest-paths properties

671

24
...
We stated these properties
without proof at the beginning of this chapter
...

The triangle inequality
In studying breadth-first search (Section 22
...
1 a simple property of shortest distances in unweighted graphs
...

Lemma 24
...
V; E/ be a weighted, directed graph with weight function w W E ! R
and source vertex s
...
u; / 2 E, we have
ı
...
s; u/ C w
...
Then p has
no more weight than any other path from s to
...
u; /
...
5-3 asks you to handle the case in which there is no shortest path
from s to
...

Lemma 24
...
V; E/ be a weighted, directed graph with weight function w W E ! R
...
s; / for all 2 V , and this invariant is
S INGLE -S OURCE
...
Then, :d
maintained over any sequence of relaxation steps on the edges of G
...
s; /, it never changes
...
s; / for all vertices 2 V by induction
over the number of relaxation steps
...
s; / is certainly true after initialization, since :d D 1
implies :d ı
...
s; s/ (note that
ı
...

For the inductive step, consider the relaxation of an edge
...
By the inductive
hypothesis, x:d ı
...
The only d value
that may change is :d
...
u; /
ı
...
u; / (by the inductive hypothesis)
ı
...

To see that the value of :d never changes once :d D ı
...
s; /, and it cannot increase because relaxation steps do not increase d
values
...
12 (No-path property)
Suppose that in a weighted, directed graph G D
...

Then, after the graph is initialized by I NITIALIZE -S INGLE -S OURCE
...
s; / D 1, and this equality is maintained as an invariant over
any sequence of relaxation steps on the edges of G
...
s; / Ä :d, and
thus :d D 1 D ı
...

Lemma 24
...
V; E/ be a weighted, directed graph with weight function w W E ! R,
and let
...
Then, immediately after relaxing edge
...
u; ; w/, we have :d Ä u:d C w
...

Proof If, just prior to relaxing edge
...
u; /, then
:d D u:d C w
...
If, instead, :d Ä u:d C w
...
u; /
afterward
...
14 (Convergence property)
Let G D
...
5 Proofs of shortest-paths properties

673

some vertices u; 2 V
...
G; s/ and then a sequence of relaxation steps that includes the call
R ELAX
...
If u:d D ı
...
s; / at all times after the call
...
s; u/ at some point prior to relaxing edge
...
In particular, after relaxing
edge
...
u; /
(by Lemma 24
...
s; u/ C w
...
s; /
(by Lemma 24
...

By the upper-bound property, :d
ı
...
s; /, and this equality is maintained thereafter
...
15 (Path-relaxation property)
Let G D
...
Consider any shortest path p D h 0 ; 1 ; : : : ; k i
from s D 0 to k
...
G; s/ and
then a sequence of relaxation steps occurs that includes, in order, relaxing the edges

...
1 ; 2 /; : : : ;
...
s; k / after these relaxations and
at all times afterward
...


Proof We show by induction that after the ith edge of path p is relaxed, we have
i :d D ı
...
For the basis, i D 0, and before any edges of p have been relaxed,
we have from the initialization that 0 :d D s:d D 0 D ı
...
By the upper-bound
property, the value of s:d never changes after initialization
...
s; i 1 /, and we examine
what happens when we relax edge
...
By the convergence property, after
relaxing this edge, we have i :d D ı
...

Relaxation and shortest-paths trees
We now show that once a sequence of relaxations has caused the shortest-path estimates to converge to shortest-path weights, the predecessor subgraph G induced
by the resulting values is a shortest-paths tree for G
...


674

Chapter 24 Single-Source Shortest Paths

Lemma 24
...
V; E/ be a weighted, directed graph with weight function w W E ! R,
let s 2 V be a source vertex, and assume that G contains no negative-weight
cycles that are reachable from s
...
G; s/, the predecessor subgraph G forms a rooted tree with
root s, and any sequence of relaxation steps on edges of G maintains this property
as an invariant
...
Consider a predecessor subgraph G that arises after a sequence of
relaxation steps
...
Suppose for the sake of
contradiction that some relaxation step creates a cycle in the graph G
...
Then, i : D i 1 for i D 1; 2; : : : ; k
and, without loss of generality, we can assume that relaxing edge
...

We claim that all vertices on cycle c are reachable from the source s
...
By the
upper-bound property, each vertex on cycle c has a finite shortest-path weight,
which implies that it is reachable from s
...
k 1 ; k ; w/ and show that c is a negative-weight cycle, thereby contradicting the assumption that G contains no negative-weight cycles that are reachable
from the source
...

Thus, for i D 1; 2; : : : ; k 1, the last update to i :d was by the assignment
i :d D i 1 :dCw
...
If i 1 :d changed since then, it decreased
...
k 1 ; k ; w/, we have
i :d

i 1 :d

C w
...
12)

Because k : is changed by the call, immediately beforehand we also have the
strict inequality
k :d

>

k 1 :d

C w
...
12), we obtain the
sum of the shortest-path estimates around cycle c:
k
X

i :d

>

i D1

k
X


...


i 1;

i //

i D1

D

k
X
i D1

i 1 :d

C

k
X
i D1

w
...
5 Proofs of shortest-paths properties

675

x
z

s

u

v

y

Figure 24
...
If there are two
paths p1 (s ; u ; x ! ´ ; ) and p2 (s ; u ; y ! ´ ; ), where x ¤ y, then ´: D x
and ´: D y, a contradiction
...
This
equality implies
0>

k
X

w
...

We have now proven that G is a directed, acyclic graph
...
5-2) to prove that for each
vertex 2 V , there is a unique simple path from s to in G
...
The vertices in V are those with non-NIL values, plus s
...
We leave the details as
Exercise 24
...

To complete the proof of the lemma, we must now show that for any vertex
2 V , the graph G contains at most one simple path from s to
...
That is, suppose that, as Figure 24
...
But then, ´: D x and ´: D y, which implies
the contradiction that x D y
...

We can now show that if, after we have performed a sequence of relaxation steps,
all vertices have been assigned their true shortest-path weights, then the predecessor subgraph G is a shortest-paths tree
...
17 (Predecessor-subgraph property)
Let G D
...
Let us call I NITIALIZE -S INGLE -S OURCE
...
s; /
for all 2 V
...

Proof We must prove that the three properties of shortest-paths trees given on
page 647 hold for G
...
By definition, a shortest-path weight ı
...
But a vertex 2 V fsg has been assigned
a finite value for :d if and only if : ¤ NIL
...

The second property follows directly from Lemma 24
...

It remains, therefore, to prove the last property of shortest-paths trees: for each
p
vertex 2 V , the unique simple path s ; in G is a shortest path from s to
in G
...
For i D 1; 2; : : : ; k,
we have both i :d D ı
...
i 1 ; i /, from which we
conclude w
...
s; i / ı
...
Summing the weights along path p
yields
w
...


i 1;

i/

i D1

Ä

k
X


...
s;

i 1 //

i D1

D ı
...
s;

k/
k/

ı
...
s; 0 / D ı
...


Thus, w
...
s; k /
...
s; k / is a lower bound on the weight of any path
from s to k , we conclude that w
...
s; k /, and thus p is a shortest path
from s to D k
...
5-1
Give two shortest-paths trees for the directed graph of Figure 24
...


24
...
5-2
Give an example of a weighted, directed graph G D
...
u; / 2 E, there is a shortest-paths tree rooted at s that contains
...
u; /
...
5-3
Embellish the proof of Lemma 24
...

24
...
V; E/ be a weighted, directed graph with source vertex s, and let G
be initialized by I NITIALIZE -S INGLE -S OURCE
...
Prove that if a sequence of
relaxation steps sets s: to a non-NIL value, then G contains a negative-weight
cycle
...
5-5
Let G D
...
Let
s 2 V be the source vertex, and suppose that we allow : to be the predecessor
of on any shortest path to from source s if 2 V fsg is reachable from s,
and NIL otherwise
...
(By Lemma 24
...
)
24
...
V; E/ be a weighted, directed graph with weight function w W E ! R
and no negative-weight cycles
...
G; s/
...

24
...
V; E/ be a weighted, directed graph that contains no negative-weight
cycles
...
G; s/
...
s; / for all 2 V
...
5-8
Let G be an arbitrary weighted, directed graph with a negative-weight cycle reachable from the source vertex s
...


678

Chapter 24 Single-Source Shortest Paths

Problems
24-1 Yen’s improvement to Bellman-Ford
Suppose that we order the edge relaxations in each pass of the Bellman-Ford algorithm as follows
...
V; E/
...
i ; j / 2 E W i < j g and
Eb D f
...
(Assume that G contains no self-loops, so that every
edge is in either Ef or Eb
...
V; Ef / and Gb D
...

a
...


2;

:::;

jV j i

and that Gb is

Suppose that we implement each pass of the Bellman-Ford algorithm in the following way
...
We then visit each vertex in the order jV j ; jV j 1 ; : : : ; 1 ,
relaxing edges of Eb that leave the vertex
...
Prove that with this scheme, if G contains no negative-weight cycles that are
reachable from the source vertex s, then after only djV j =2e passes over the
edges, :d D ı
...

c
...
x1 ; x2 ; : : : ; xd / nests within another box
with dimensions
...
1/ < y1 , x
...
, x
...

a
...

b
...

c
...

Give an efficient algorithm to find the longest sequence hBi1 ; Bi2 ; : : : ; Bik i of
boxes such that Bij nests within Bij C1 for j D 1; 2; : : : ; k 1
...


Problems for Chapter 24

679

24-3 Arbitrage
Arbitrage is the use of discrepancies in currency exchange rates to transform one
unit of a currency into more than one unit of the same currency
...
S
...
S
...
Then, by converting currencies,
a trader can start with 1 U
...
dollar and buy 49 2 0:0107 D 1:0486 U
...
dollars,
thus turning a profit of 4:86 percent
...

a
...

b
...
Analyze
the running time of your algorithm
...
It then refines the
initial solution by looking at the two highest-order bits
...

In this problem, we examine an algorithm for computing the shortest paths from
a single source by scaling edge weights
...
V; E/
with nonnegative integer edge weights w
...
u; /2E fw
...
Our
goal is to develop an algorithm that runs in O
...
We assume that all
vertices are reachable from the source
...
Specifically,
let k D dlg
...
u; / D w
...
That is, wi
...
u; / given by the i most significant bits of w
...

(Thus, wk
...
u; / for all
...
) For example, if k D 5 and
w
...
u; / D
h110i D 6
...
u; / D h00100i D 4, then
w3
...
Let us define ıi
...
Thus, ık
...
u; / for all
u; 2 V
...
s; / for all 2 V , then computes ı2
...
s; / for all 2 V
...
E/ time, so
that the entire algorithm takes O
...
E lg W / time
...
Suppose that for all vertices 2 V , we have ı
...
Show that we can
compute ı
...
E/ time
...
Show that we can compute ı1
...
E/ time
...

c
...
u; / D 2wi
wi
...
u; / C 1
...
s; / Ä ıi
...
s; / C jV j
for all

1
...


d
...
u; / 2 E,
wi
...
u; / C 2ıi 1
...
s; / :

Prove that for i D 2; 3; : : : ; k and all u; 2 V , the “reweighted” value wi
...
u; / is a nonnegative integer
...
Now, define ıi
...
Prove that for i D 2; 3; : : : ; k and all 2 V ,
y
y
ıi
...
s; / C 2ıi

1
...
s; / Ä jEj
...
Show how to compute ıi
...
s; / for all 2 V in O
...
s; / for all 2 V in O
...

24-5 Karp’s minimum mean-weight cycle algorithm
Let G D
...
We define the mean weight of a cycle c D he1 ; e2 ; : : : ; ek i of edges in E
to be
k
1X
w
...
c/ D
k i D1

Problems for Chapter 24

681

Let
D minc
...
We call a cycle c
a minimum mean-weight cycle
...
c/ D
an efficient algorithm for computing
...
Let ı
...
s; / be the weight of a shortest path from s to consisting of exactly k edges
...
s; / D 1
...
s; / D
a
...
s; / for all vertices 2 V
...
Show that if
max

0ÄkÄn

D 0, then

ın
...
s; /
k

0

2 V
...
)

for all vertices

c
...
Suppose
D 0 and that the weight of the simple path from u to along the cycle
that
is x
...
s; / D ı
...
(Hint: The weight of the simple path
from to u along the cycle is x
...
Show that if
vertex such that
max

0ÄkÄn

ın
...
s; /
D0:
k

(Hint: Show how to extend a shortest path to any vertex on a minimum meanweight cycle along the cycle to make a shortest path to the next vertex on the
cycle
...
Show that if
min max
2V 0ÄkÄn

D 0, then

ın
...
s; /
D0:
k

f
...
Use this fact to show that
D min max

2V 0ÄkÄn 1

ın
...
s; /
:
k

g
...
VE/-time algorithm to compute


...
For example the sequences h1; 4; 6; 8; 3; 2i, h9; 2; 4; 10; 5i, and
h1; 2; 3; 4i are bitonic, but h1; 3; 12; 4; 2; 10i is not bitonic
...
)
Suppose that we are given a directed graph G D
...
We are given one additional piece of information: for each vertex 2 V , the weights of the edges along any shortest path
from s to form a bitonic sequence
...


Chapter notes
Dijkstra’s algorithm [88] appeared in 1959, but it contained no mention of a priority
queue
...
Bellman describes the relation of shortest paths to difference
constraints
...

When edge weights are relatively small nonnegative integers, we have more efficient algorithms to solve the single-source shortest-paths problem
...
As discussed in the chapter notes for Chapter 6, in
this case several data structures can implement the various priority-queue operations more efficiently than a binary heap or a Fibonacci heap
...
E C V lg W / time on
graphs with nonnegative edge weights, where W is the largest weight of any edge
in the graph
...
E lg lg V / time, and by Raman [291], who gives an algorithm that runs
«
˚
time
...
lg V /1=3C ;
...
Although the amount of space used can be unbounded in the size of the input, it can
be reduced to be linear in the size of the input using randomized hashing
...
V C E/time algorithm for single-source shortest paths
...

For graphs with negative edge weights, an algorithm due to Gabow and Tarp
janp
[122] runs in O
...
V W // time, and one by Goldberg [137] runs in
O
...
u; /2E fjw
...

Cherkassky, Goldberg, and Radzik [64] conducted extensive experiments comparing various shortest-path algorithms
...
This problem might arise in making a table of distances between all pairs of cities for a road atlas
...
V; E/ with a weight function w W E ! R that maps edges
to real-valued weights
...
We typically want the output in tabular form:
the entry in u’s row and ’s column should be the weight of a shortest path from u
to
...
If all
edge weights are nonnegative, we can use Dijkstra’s algorithm
...
V 3 C VE/ D O
...
The binary min-heap implementation of the min-priority
queue yields a running time of O
...
Alternatively, we can implement the min-priority queue with a Fibonacci
heap, yielding a running time of O
...

If the graph has negative-weight edges, we cannot use Dijkstra’s algorithm
...
The
resulting running time is O
...
V 4 /
...
We also investigate the relation of the all-pairs
shortest-paths problem to matrix multiplication and study its algebraic structure
...
(Johnson’s algorithm for sparse graphs, in Section 25
...
) For convenience, we assume that the vertices are numbered
1; 2; : : : ; jV j, so that the input is an n n matrix W representing the edge weights
of an n-vertex directed graph G D
...
That is, W D
...
i; j / if i ¤ j and
...
i; j / 62 E :

(25
...

The tabular output of the all-pairs shortest-paths algorithms presented in this
chapter is an n n matrix D D
...
That is, if we let ı
...
i; j / at
termination
...
ij /, where ij is NIL if either i D j or there is no path from i to j ,
and otherwise ij is the predecessor of j on some shortest path from i
...
For each vertex i 2 V , we define the predecessor
subgraph of G for i as G ;i D
...


ij ; j /

Wj 2V

;i

figg :

If G ;i is a shortest-paths tree, then the following procedure, which is a modified
version of the P RINT-PATH procedure from Chapter 22, prints a shortest path from
vertex i to vertex j
...
…; i; j /
1 if i == j
2
print i
3 elseif ij == NIL
4
print “no path from” i “to” j “exists”
5 else P RINT-A LL -PAIRS -S HORTEST-PATH
...
Some of the exercises cover
the basics
...
1 presents a dynamic-programming algorithm based on matrix multiplication to solve the all-pairs shortest-paths problem
...
V 3 lg V /
...
2 gives
another dynamic-programming algorithm, the Floyd-Warshall algorithm, which
runs in time ‚
...
Section 25
...
Finally, Section 25
...
V 2 lg V C VE/ time and is a good choice for
large, sparse graphs
...
First, we shall generally assume that the input graph G D
...
Second, we shall use the convention of denoting
matrices by uppercase letters, such as W , L, or D, and their individual elements
by subscripted lowercase letters, such as wij , lij , or dij
...
m/

...
m/ D dij , to indicate
parenthesized superscripts, as in L
...
Finally, for a given n n matrix A, we shall assume that the value of n is
stored in the attribute A:rows
...
1 Shortest paths and matrix multiplication
This section presents a dynamic-programming algorithm for the all-pairs shortestpaths problem on a directed graph G D
...
Each major loop of the dynamic
program will invoke an operation that is very similar to matrix multiplication, so
that the algorithm will look like repeated matrix multiplication
...
V 4 /-time algorithm for the all-pairs shortest-paths problem and
then improve its running time to ‚
...

Before proceeding, let us briefly recap the steps given in Chapter 15 for developing a dynamic-programming algorithm
...
Characterize the structure of an optimal solution
...
Recursively define the value of an optimal solution
...
Compute the value of an optimal solution in a bottom-up fashion
...


25
...
For the all-pairs
shortest-paths problem on a graph G D
...
1)
that all subpaths of a shortest path are shortest paths
...
wij /
...
Assuming that
there are no negative-weight cycles, m is finite
...
If vertices i and j are distinct, then we decompose path p into
p0

i ; k ! j , where path p 0 now contains at most m 1 edges
...
1,
p 0 is a shortest path from i to k, and so ı
...
i; k/ C wkj
...
m/
Now, let lij be the minimum weight of any path from vertex i to vertex j that
contains at most m edges
...
Thus,
(
0 if i D j ;

...
m/

...
Thus, we recursively define
˚
«Á

...
m
D min lij 1/ ; min li
...
m 1/
«
C wkj :
D min li k
(25
...

What are the actual shortest-path weights ı
...
i; j / < 1, there is a shortest path from i to j that is simple and thus contains at
most n 1 edges
...
The actual shortest-path
weights are therefore given by

...
i; j / D lij

1/


...
nC1/
D lij D lij
D

:

(25
...
wij /, we now compute a series of matrices

...
1/ ; L
...
n 1/ , where for m D 1; 2; : : : ; n 1, we have L
...

The final matrix L
...
Observe that

...
1/ D W
...
m 1/ and W , returns the matrix L
...
That is, it extends the shortest paths computed so far by one more edge
...
L; W /
1 n D L:rows
0
2 let L0 D lij be a new n n matrix
3 for i D 1 to n
4
for j D 1 to n
0
5
lij D 1
6
for k D 1 to n
0
0
7
lij D min
...
lij /, which it returns at the end
...
2) for all i and j , using L for L
...
m/
...
) Its running time is ‚
...

Now we can see the relation to matrix multiplication
...
Then, for
i; j D 1; 2; : : : ; n, we compute

cij D

n
X

ai k bkj :

(25
...
m

1/

!
w !

...
2), we obtain equation (25
...
Thus, if we make these changes to
E XTEND -S HORTEST-PATHS and also replace 1 (the identity for min) by 0 (the

25
...
n3 /-time procedure for multiplying square
matrices that we saw in Section 4
...
A; B/
1 n D A:rows
2 let C be a new n n matrix
3 for i D 1 to n
4
for j D 1 to n
5
cij D 0
6
for k D 1 to n
7
cij D cij C ai k bkj
8 return C
Returning to the all-pairs shortest-paths problem, we compute the shortest-path
weights by extending shortest paths edge by edge
...
A; B/, we compute the sequence of n 1 matrices
L
...
2/ D
L
...
n

1/

L
...
1/ W
L
...
n

2/

W

D W ;
D W2 ;
D W3 ;
D Wn

1

:

As we argued above, the matrix L
...

The following procedure computes this sequence in ‚
...

S LOW-A LL -PAIRS -S HORTEST-PATHS
...
1/ D W
3 for m D 2 to n 1
4
let L
...
m/ D E XTEND -S HORTEST-PATHS
...
m
6 return L
...
1 shows a graph and the matrices L
...

Improving the running time
Our goal, however, is not to compute all the L
...
n 1/
...
1/ D

0
1
1
2
1

3
0
4
1
1

8
1
0
5
1

L
...
2/ D

0
3
1
2
8

3
0
4
1
1

L
...
1 A directed graph and the sequence of matrices L
...
You might want to verify that L
...
4/ W , equals L
...
m/ D L
...


tion (25
...
m/ D L
...
Just as traditional matrix multiplication is associative, so is matrix multiplication defined by
the E XTEND -S HORTEST-PATHS procedure (see Exercise 25
...
Therefore, we
can compute L
...
n 1/e matrix products by computing the sequence
L
...
2/
L
...
8/
dlg
...
2

D
D
D
D

W ;
W2
W4
W8
:
:
:
dlg
...
n 1/e 1

D W2

dlg
...
n 1/e

:

/
is equal to L
...

Since 2dlg
...
2
The following procedure computes the above sequence of matrices by using this
technique of repeated squaring
...
1 Shortest paths and matrix multiplication

1

1
–4

2
4

2

–1
3

7
5

2

5

691

3
10

–8
6

Figure 25
...
1-1, 25
...
3-1
...
W /
1 n D W:rows
2 L
...
2m/ be a new n n matrix
6
L
...
L
...
m/ /
7
m D 2m
8 return L
...
2m/ D L
...
At the end of each iteration, we double the value
of m
...
n 1/ by actually computing L
...
By equation (25
...
2m/ D L
...
The next time the test
in line 4 is performed, m has been doubled, so now m n 1, the test fails, and
the procedure returns the last matrix it computed
...
n 1/e matrix products takes ‚
...
n3 lg n/ time
...

Exercises
25
...
2, showing the matrices that result for each iteration of the loop
...

25
...
1-3
What does the matrix

L
...
1-4
Show that matrix multiplication defined by E XTEND -S HORTEST-PATHS is associative
...
1-5
Show how to express the single-source shortest-paths problem as a product of matrices and a vector
...
1)
...
1-6
Suppose we also wish to compute the vertices on shortest paths in the algorithms of
this section
...
n3 / time
...
1-7
We can also compute the vertices on shortest paths as we compute the shortest
...
Define ij as the predecessor of vertex j on any minimum-weight
path from i to j that contains at most m edges
...
1/ ; …
...
n 1/ as the matrices L
...
2/ ; : : : ; L
...

25
...
n 1/e matrices, each with n2 elements, for a total space requirement of

...
Modify the procedure to require only ‚
...

25
...


25
...
1-10
Give an efficient algorithm to find the length (number of edges) of a minimumlength negative-weight cycle in a graph
...
2 The Floyd-Warshall algorithm
In this section, we shall use a different dynamic-programming formulation to solve
the all-pairs shortest-paths problem on a directed graph G D
...
The resulting algorithm, known as the Floyd-Warshall algorithm, runs in ‚
...
As
before, negative-weight edges may be present, but we assume that there are no
negative-weight cycles
...
1, we follow the dynamic-programming
process to develop the algorithm
...

The structure of a shortest path
In the Floyd-Warshall algorithm, we characterize the structure of a shortest path
differently from how we characterized it in Section 25
...
The Floyd-Warshall algorithm considers the intermediate vertices of a shortest path, where an intermediate
vertex of a simple path p D h 1 ; 2 ; : : : ; l i is any vertex of p other than 1 or l ,
that is, any vertex in the set f 2 ; 3 ; : : : ; l 1 g
...
Under our
assumption that the vertices of G are V D f1; 2; : : : ; ng, let us consider a subset
f1; 2; : : : ; kg of vertices for some k
...
(Path p is simple
...
The relationship
depends on whether or not k is an intermediate vertex of path p
...
Thus, a shortest path from vertex i
to vertex j with all intermediate vertices in the set f1; 2; : : : ; k 1g is also a
shortest path from i to j with all intermediate vertices in the set f1; 2; : : : ; kg
...
3 illustrates
...
1, p1 is a shortest path from i to k
with all intermediate vertices in the set f1; 2; : : : ; kg
...
Because vertex k is not an intermediate vertex of
path p1 , all intermediate vertices of p1 are in the set f1; 2; : : : ; k 1g
...
3 Path p is a shortest path from vertex i to vertex j , and k is the highest-numbered
intermediate vertex of p
...
The same holds for path p2 from vertex k to vertex j
...
Similarly, p2 is a shortest path from vertex k to vertex j with
all intermediate vertices in the set f1; 2; : : : ; k 1g
...
k/
path estimates that differs from the one in Section 25
...
Let dij be the weight
of a shortest path from vertex i to vertex j for which all intermediate vertices
are in the set f1; 2; : : : ; kg
...


...
Following the above

...
k/
(25
...
k 1/

...
k 1/
; di k
C dkj
if k 1 :
min dij
Because for any path, all intermediate vertices are in the set f1; 2; : : : ; ng, the ma
...
n/
trix D
...
i; j / for all i; j 2 V
...
5), we can use the following bottom-up procedure to com
...
Its input is an n n matrix W
defined as in equation (25
...
The procedure returns the matrix D
...


25
...
W /
1 n D W:rows
2 D
...
k/
4
let D
...
k/

...
k
7
dij D min dij 1/ ; di
...
n/

695

1/

Figure 25
...
k/ computed by the Floyd-Warshall algorithm
for the graph in Figure 25
...

The running time of the Floyd-Warshall algorithm is determined by the triply
nested for loops of lines 3–7
...
1/ time,
the algorithm runs in time ‚
...
As in the final algorithm in Section 25
...
Thus, the Floyd-Warshall algorithm is quite practical for even
moderate-sized input graphs
...
One way is to compute the matrix D of shortest-path weights
and then construct the predecessor matrix … from the D matrix
...
1-6
asks you to implement this method so that it runs in O
...
Given the predecessor matrix …, the P RINT-A LL -PAIRS -S HORTEST-PATH procedure will print
the vertices on a given shortest path
...
k/
...
k/

...
1/ ; : : : ; …
...
n/ and we define ij as the predecessor of
vertex j on a shortest path from vertex i with all intermediate vertices in the set
f1; 2; : : : ; kg
...
k/
We can give a recursive formulation of ij
...
Thus,
(
NIL if i D j or wij D 1 ;

...
6)
ij D
i
if i ¤ j and wij < 1 :
For k
1, if we take the path i ; k ; j , where k ¤ j , then the predecessor
of j we choose is the same as the predecessor of j we chose on a shortest path
from k with all intermediate vertices in the set f1; 2; : : : ; k 1g
...
0/ D

D
...
2/ D

D
...
4/ D

D
...
5/ D

NIL

4

˘

NIL

NIL


...
3/ D

4

NIL

˘

3

4
4
4
4

NIL

4
4

5
2
2
NIL

1
1
1
1

5

NIL

3
3
3

1

NIL

4
4

NIL

1
2

˘

NIL

NIL

4


...
1/ D

NIL

3

NIL

˘

NIL

NIL

4


...
4 The sequence of matrices D
...
k/ computed by the Floyd-Warshall algorithm
for the graph in Figure 25
...


25
...
Formally, for k 1,
(

...
k

...
k 1/ C dkj 1/ ;

...
7)
D

...
k

...
k 1/ C dkj 1/ :
kj
k
We leave the incorporation of the …
...
2-3
...
4 shows the sequence of …
...
1
...
Exercise 25
...

Transitive closure of a directed graph
Given a directed graph G D
...
We define the transitive closure of G as the graph G D
...
i; j / W there is a path from vertex i to vertex j in Gg :
One way to compute the transitive closure of a graph in ‚
...
If there is a
path from vertex i to vertex j , we get dij < n
...

There is another, similar way to compute the transitive closure of G in ‚
...
This method substitutes the logical
operations _ (logical OR) and ^ (logical AND) for the arithmetic operations min

...
For i; j; k D 1; 2; : : : ; n, we define tij to
be 1 if there exists a path in graph G from vertex i to vertex j with all intermediate
vertices in the set f1; 2; : : : ; kg, and 0 otherwise
...
n/
G D
...
i; j / into E if and only if tij D 1
...
k/
definition of tij , analogous to recurrence (25
...
i; j / 62 E ;

...
i; j / 2 E ;
and for k

...
k
tij D tij

1,
1/

_ ti
...
k
^ tkj

1/

:

(25
...
k/
As in the Floyd-Warshall algorithm, we compute the matrices T
...


698

Chapter 25 All-Pairs Shortest Paths

1

2

4

3

T
...
3/

D

1
0
0
1

0
1
1
0

0
1
1
1

0
1
0
1

T
...
4/

Figure 25
...
2/

D

1
0
0
1

0
1
1
0

0
1
1
1

0
1
1
1

A directed graph and the matrices T
...


T RANSITIVE -C LOSURE
...
0/
2 let T
...
i; j / 2 G:E

...
0/
7
else tij D 0
8 for k D 1 to n

...
k/ D tij be a new n n matrix
10
for i D 1 to n
11
for j D 1 to n

...
k

...
k 1/ ^ tkj 1/
k
13 return T
...
5 shows the matrices T
...
The T RANSITIVE -C LOSURE procedure, like the
Floyd-Warshall algorithm, runs in ‚
...
On some computers, though, logical operations on single-bit values execute faster than arithmetic operations on
integer words of data
...
2 The Floyd-Warshall algorithm

699

than the Floyd-Warshall algorithm’s by a factor corresponding to the size of a word
of computer storage
...
2-1
Run the Floyd-Warshall algorithm on the weighted, directed graph of Figure 25
...

Show the matrix D
...

25
...
1
...
2-3
Modify the F LOYD -WARSHALL procedure to compute the …
...
6) and (25
...
Prove rigorously that for all i 2 V , the predecessor
subgraph G ;i is a shortest-paths tree with root i
...
k/

...
k/ C wlj , according to the
acyclic, first show that ij D l implies dij
l

...
Then, adapt the proof of Lemma 24
...
)
25
...
n3 / space, since we

...
Show that the following procedure, which
simply drops all the superscripts, is correct, and thus only ‚
...

F LOYD -WARSHALL0
...
dij ; di k C dkj /
7 return D
25
...
7) handles equality:
(

...
k

...
k 1/ C dkj 1/ ;

...
k 1/

...
k
ij
if dij 1/ di
...
2-6
How can we use the output of the Floyd-Warshall algorithm to detect the presence
of a negative-weight cycle?
25
...
k/

...
k/
in the set f1; 2; : : : ; kg
...
k/
WARSHALL procedure to compute the ij values, and rewrite the P RINT-A LL
...

PAIRS -S HORTEST-PATH procedure to take the matrix ˆ D
ij
How is the matrix ˆ like the s table in the matrix-chain multiplication problem of
Section 15
...
2-8
Give an O
...
V; E/
...
2-9
Suppose that we can compute the transitive closure of a directed acyclic graph in
f
...

Show that the time to compute the transitive closure G D
...
V; E/ is then f
...
V C E /
...
3 Johnson’s algorithm for sparse graphs
Johnson’s algorithm finds shortest paths between all pairs in O
...
For sparse graphs, it is asymptotically faster than either repeated squaring of
matrices or the Floyd-Warshall algorithm
...
Johnson’s algorithm uses as subroutines both Dijkstra’s
algorithm and the Bellman-Ford algorithm, which Chapter 24 describes
...

If all edge weights w in a graph G D
...
V 2 lg V C VE/
...
3 Johnson’s algorithm for sparse graphs

701

that allows us to use the same method
...
For all pairs of vertices u; 2 V , a path p is a shortest path from u to
weight function w if and only if p is also a shortest path from u to
weight function w
...
For all edges
...
u; / is nonnegative
...
VE/ time
...
We use ı to denote shortest-path weights derived from weight
y
function w and ı to denote shortest-path weights derived from weight function w
...
1 (Reweighting does not change shortest paths)
Given a weighted, directed graph G D
...
For each edge

...
u; / D w
...
u/
y

h
...
9)

Let p D h 0 ; 1 ; : : : ; k i be any path from vertex 0 to vertex k
...
That is, w
...
0 ; k / if and only if w
...
0 ; k /
...

y
Proof

We start by showing that

w
...
p/ C h
...
k / :

(25
...
p/ D
y

k
X

w
...
w
...


i 1/

h
...


i 1;

i/

C h
...
p/ C h
...
k / :

h
...
p/ D w
...
0 / h
...
Bey
cause h
...
k / do not depend on the path, if one path from 0 to k is
shorter than another using weight function w, then it is also shorter using w
...

y
w
...
0 ; k / if and only if w
...

Finally, we show that G has a negative-weight cycle using weight function w if
and only if G has a negative-weight cycle using weight function w
...
By equation (25
...
c/ D w
...
0 /
y
D w
...
k /

and thus c has negative weight using w if and only if it has negative weight using w
...
u; / to be
y
nonnegative for all edges
...
Given a weighted, directed graph G D

...
V 0 ; E 0 /,
where V 0 D V [ fsg for some new vertex s 62 V and E 0 D E [ f
...

We extend the weight function w so that w
...
Note that
because s has no edges that enter it, no shortest paths in G 0 , other than those with
source s, contain s
...
Figure 25
...
1
...
Let us define
h
...
s; / for all
2 V 0
...
10),
we have h
...
u/ C w
...
u; / 2 E 0
...
9), we have
y
w
...
u; / C h
...
/ 0, and we have satisfied the second property
...
6(b) shows the graph G 0 from Figure 25
...

Computing all-pairs shortest paths
Johnson’s algorithm to compute all-pairs shortest paths uses the Bellman-Ford algorithm (Section 24
...
3) as subroutines
...
The algorithm returns the usual jV j jV j matrix D D dij , where dij D ı
...
As is typical for an all-pairs
shortest-paths algorithm, we assume that the vertices are numbered from 1 to jV j
...
3 Johnson’s algorithm for sparse graphs

0
3
0

–4

–4
0

4

–5

4

0

0

0
3
2/–3

4

10

0

10

2/2

2/–1

4

5

5

(c)

0

0

4

13

2
0

2/–2
5

3
0/–5

(f)

0 4

0

4
1
2/7

2
(d)

13

2
0

0

2/3
5

0

10

0/1
4

3
0/0

0
2
(e)

0/5
4

0

4

13

2
0

2

3
0/–4

0

1
4/8
0

0

10

2

2
2/5

2
0/–1
1
2/2

13

2
0

2

0

2
0/4
0

1
2/3

13

0/–4

5

2
0/0

2
0

0

–5 3

(b)

2
2/1
4

10
–4

0 4

(a)

1
0/0

13

1
0

0

1
6

5

0

0

2

–5 3

8

7

2
–1

1
4

1
0
2

0

5

2
–1

0

0

703

0/0

4

5

0

10

0/0

3
2/1

0
2
(g)

2/6
4

Figure 25
...
1
...
(a) The graph G 0 with the original weight function w
...
Within each vertex is h
...
s; /
...
u; / with weight function w
...
u; / C h
...
(c)–(g) The result of running
y
Dijkstra’s algorithm on each vertex of G using weight function w In each part, the source vertex u
y
...
Within each
y
vertex are the values ı
...
u; /, separated by a slash
...
u; / is equal to
y
ı
...
/ h
...


704

Chapter 25 All-Pairs Shortest Paths

J OHNSON
...
s; / W 2 G:Vg, and
w
...
G 0 ; w; s/ == FALSE
3
print “the input graph contains a negative-weight cycle”
4 else for each vertex 2 G 0 :V
5
set h
...
s; /
computed by the Bellman-Ford algorithm
6
for each edge
...
u; / D w
...
u/ h
...
du / be a new n n matrix
9
for each vertex u 2 G:V
y
y
10
run D IJKSTRA
...
u; / for all 2 G:V
11
for each vertex 2 G:V
y
12
du D ı
...
/ h
...
Line 1 produces G 0
...
If G 0 , and hence G, contains a negative-weight cycle, line 3 reports the
problem
...
Lines 4–5
set h
...
s; / computed by the Bellman-Ford algoy
rithm for all 2 V 0
...
For each pair of very
tices u; 2 V , the for loop of lines 9–12 computes the shortest-path weight ı
...
Line 12 stores in
matrix entry du the correct shortest-path weight ı
...
10)
...
Figure 25
...

If we implement the min-priority queue in Dijkstra’s algorithm by a Fibonacci
heap, Johnson’s algorithm runs in O
...
The simpler binary minheap implementation yields a running time of O
...

Exercises
25
...
2
...

y

Problems for Chapter 25

705

25
...
3-3
Suppose that w
...
u; / 2 E
...
3-4
Professor Greenstreet claims that there is a simpler way to reweight edges than
the method used in Johnson’s algorithm
...
u; /2E fw
...
u; / D w
...
u; / 2 E
...
3-5
Suppose that we run Johnson’s algorithm on a directed graph G with weight function w
...
u; / D 0 for every
y
edge
...

25
...
He claims that instead we can just use G 0 D G and let s be any
vertex
...
Then show that if G
is strongly connected (every vertex is reachable from every other vertex), the results
returned by J OHNSON with the professor’s modification are correct
...
V; E/ as we insert edges into E
...
Assume that the
graph G has no edges initially and that we represent the transitive closure as a
boolean matrix
...
Show how to update the transitive closure G D
...
V; E/
in O
...

b
...
V 2 / time is required
to update the transitive closure after the insertion of e into G, no matter what
algorithm is used
...
Describe an efficient algorithm for updating the transitive closure as edges are
inserted into the graph
...
V 3 /, where ti is the time to update the transitive
closure upon inserting the ith edge
...

25-2 Shortest paths in -dense graphs
A graph G D
...
V 1C / for some constant in the
range 0 < Ä 1
...

a
...
n˛ / for some
constant 0 < ˛ Ä 1? Compare these running times to the amortized costs of
these operations for a Fibonacci heap
...
Show how to compute shortest paths from a single source on an -dense directed
graph G D
...
E/ time
...
)
c
...
V; E/ with no negative-weight edges in O
...

d
...
VE/ time on an
-dense directed graph G D
...


Chapter notes
Lawler [224] has a good discussion of the all-pairs shortest-paths problem, although he does not analyze solutions for sparse graphs
...
The Floyd-Warshall algorithm is due to
Floyd [105], who based it on a theorem of Warshall [349] that describes how to
compute the transitive closure of boolean matrices
...

Several researchers have given improved algorithms for computing shortest
paths via matrix multiplication
...
V 5=2 / comparisons between sums of edge

Notes for Chapter 25

707

weights and obtains an algorithm that runs in O
...
lg lg V = lg V /1=3 / time, which
is slightly better than the running time of the Floyd-Warshall algorithm
...
V 3
...
Another line of research
demonstrates that we can apply algorithms for fast matrix multiplication (see the
chapter notes for Chapter 4) to the all-pairs shortest paths problem
...
n! / be
the running time of the fastest algorithm for multiplying n n matrices; currently
! < 2:376 [78]
...
V ! p
...
n/ denotes a particular function that is polylogarithmically bounded in n
...
VE/ time needed to perform jV j breadth-first searches
...
The asymptotically fastest such algorithm, by Shoshan and
Zwick [316], runs in time O
...
V W //
...
Given a graph with nonnegative edge weights, their algorithms run in
O
...
E/
...
Decremental algorithms allow a sequence of intermixed edge deletions and queries; by
comparison, Problem 25-1, in which edges are inserted, asks for an incremental algorithm
...
The query times are O
...
For transitive closure, the amortized time for each update is
O
...
For all-pairs shortest paths, the update times depend on the
queries
...

update is O
...
O
...
V 3 =E lg2 V //
...

Aho, Hopcroft, and Ullman [5] defined an algebraic structure known as a “closed
semiring,” which serves as a general framework for solving path problems in directed graphs
...
2 are instantiations of an all-pairs algorithm based on closed
semirings
...


26

Maximum Flow

Just as we can model a road map as a directed graph in order to find the shortest
path from one point to another, we can also interpret a directed graph as a “flow
network” and use it to answer questions about material flows
...
The source produces the material at some steady
rate, and the sink consumes the material at the same rate
...

Flow networks can model many problems, including liquids flowing through pipes,
parts through assembly lines, current through electrical networks, and information
through communication networks
...
Each conduit has a stated capacity, given as a maximum rate at which the material can flow through the conduit, such as 200 gallons of liquid per hour through
a pipe or 20 amperes of electrical current through a wire
...
In other words, the rate at which material enters a vertex must equal the rate at which it leaves the vertex
...

In the maximum-flow problem, we wish to compute the greatest rate at which
we can ship material from the source to the sink without violating any capacity
constraints
...

Moreover, we can adapt the basic techniques used in maximum-flow algorithms to
solve other network-flow problems
...
Section 26
...
Section 26
...
An application of this method,

26
...
3
...
4 presents the push-relabel method, which underlies many of
the fastest algorithms for network-flow problems
...
5 covers the “relabelto-front” algorithm, a particular implementation of the push-relabel method that
runs in time O
...
Although this algorithm is not the fastest algorithm known,
it illustrates some of the techniques used in the asymptotically fastest algorithms,
and it is reasonably efficient in practice
...
1 Flow networks
In this section, we give a graph-theoretic definition of flow networks, discuss their
properties, and define the maximum-flow problem precisely
...

Flow networks and flows
A flow network G D
...
u; / 2 E
has a nonnegative capacity c
...
We further require that if E contains an
edge
...
; u/ in the reverse direction
...
) If
...
u; / D 0, and we disallow self-loops
...
For convenience, we assume that each
vertex lies on some path from the source to the sink
...
The graph is therefore connected
and, since each vertex other than s has at least one entering edge, jEj jV j 1
...
1 shows an example of a flow network
...
Let G D
...
Let s be the source of the network, and let t be
the sink
...
u; / Ä c
...

Flow conservation: For all u 2 V
X
2V

f
...
u; / :

2V

When
...
u; / D 0
...
1 (a) A flow network G D
...

The Vancouver factory is the source s, and the Winnipeg warehouse is the sink t
...
u; / crates per day can go from city u to city
...
(b) A flow f in G with value jf j D 19
...
u; / is labeled
by f
...
u; /
...


We call the nonnegative quantity f
...
The
value jf j of a flow f is defined as
X
X
f
...
; s/ ;
(26
...
(Here, the j j
notation denotes flow value, not absolute value or cardinality
...
; s/, will be 0
...
In the maximum-flow problem, we are given a flow network G
with source s and sink t, and we wish to find a flow of maximum value
...
The capacity constraint simply
says that the flow from one vertex to another must be nonnegative and must not
exceed the given capacity
...

An example of flow
A flow network can model the trucking problem shown in Figure 26
...
The
Lucky Puck Company has a factory (source s) in Vancouver that manufactures
hockey pucks, and it has a warehouse (sink t) in Winnipeg that stocks them
...
1 Flow networks

13

v2

14

s

10

v′
10

v4

4

(a)

13

v3

v2

14

20

t

7

16

t

7

4

10

s

9

16

12

v1

20

9

v3

4

12

v1

711

v4

4

(b)

Figure 26
...
(a) A flow network containing both the edges
...
2 ; 1 /
...
We add the new vertex 0 , and we replace edge
...
1 ; 0 / and
...
1 ; 2 /
...
Because the trucks travel over specified routes (edges) between
cities (vertices) and have a limited capacity, Lucky Puck can ship at most c
...
1(a)
...
1(a)
...

Lucky Puck is not concerned with how long it takes for a given puck to get from
the factory to the warehouse; they care only that p crates per day leave the factory
and p crates per day arrive at the warehouse
...
Additionally, the model must obey flow conservation, for in a steady
state, the rate at which pucks enter an intermediate city must equal the rate at which
they leave
...

Modeling problems with antiparallel edges
Suppose that the trucking firm offered Lucky Puck the opportunity to lease space
for 10 crates in trucks going from Edmonton to Calgary
...
2(a)
...
1 ; 2 / 2 E, then
...
We call the two edges
...
2 ; 1 / antiparallel
...
Figure 26
...
We choose
one of the two antiparallel edges, in this case
...
1 ; 2 / with the pair of edges
...
0 ; 2 /
...

The resulting network satisfies the property that if an edge is in the network, the
reverse edge is not
...
1-1 asks you to prove that the resulting network is
equivalent to the original one
...
It will be convenient to disallow antiparallel edges, however, and so we have a straightforward way to convert a network
containing antiparallel edges into an equivalent one with no antiparallel edges
...
The Lucky Puck Company, for example, might actually have a set
of m factories fs1 ; s2 ; : : : ; sm g and a set of n warehouses ft1 ; t2 ; : : : ; tn g, as shown
in Figure 26
...
Fortunately, this problem is no harder than ordinary maximum
flow
...
Figure 26
...
We add a supersource s and add a
directed edge
...
s; si / D 1 for each i D 1; 2; : : : ; m
...
ti ; t/ with capacity c
...
Intuitively, any flow in the network in (a) corresponds to
a flow in the network in (b), and vice versa
...
Exercise 26
...

Exercises
26
...
More
formally, suppose that flow network G contains edge
...
u; / by new edges

...
x; / with c
...
x; / D c
...
Show that a maximum flow
in G 0 has the same value as a maximum flow in G
...
1 Flow networks

713

s1

s1
10

10

s2

12

t1

15

5



8

6

s3

8

t2

s



14

13

11



18

t3

7

13

s4



t

t3

18

11

2

t2

20

14



7

s4

6

s3

20

t1

15



5

3

s2



3



12

2

s5

s5
(a)

(b)

Figure 26
...
(a) A flow network with five sources S D fs1 ; s2 ; s3 ; s4 ; s5 g
and three sinks T D ft1 ; t2 ; t3 g
...
We add
a supersource s and an edge with infinite capacity from s to each of the multiple sources
...


26
...
Show that any flow in a multiple-source, multiple-sink flow network
corresponds to a flow of identical value in the single-source, single-sink network
obtained by adding a supersource and a supersink, and vice versa
...
1-3
Suppose that a flow network G D
...
Let u be a vertex for which there
is no path s ; u ; t
...
u; / D f
...


714

Chapter 26 Maximum Flow

26
...
The scalar flow product,
denoted ˛f , is a function from V V to R defined by

...
u; / D ˛ f
...
That is, show that if f1 and f2
are flows, then so is ˛f1 C
...

26
...

26
...
The problem is so severe that not only do they refuse to walk to school together, but in fact
each one refuses to walk on any block that the other child has stepped on that day
...
Fortunately
both the professor’s house and the school are on corners, but beyond that he is not
sure if it is going to be possible to send both of his children to the same school
...
Show how to formulate the problem of determining whether both his children can go to the same school as a maximum-flow
problem
...
1-7
Suppose that, in addition to edge capacities, a flow network has vertex capacities
...
/ on how much flow can pass though
...
V; E/ with vertex capacities into an equivalent flow network G 0 D
...
How many vertices and
edges does G 0 have?

26
...
We call it a “method” rather than an “algorithm” because it encompasses
several implementations with differing running times
...
These ideas are essential to the important max-flow min-cut theorem (Theorem 26
...
2 The Ford-Fulkerson method

715

the flow network
...

The Ford-Fulkerson method iteratively increases the value of the flow
...
u; / D 0 for all u; 2 V , giving an initial flow of value 0
...
Once we know the edges of an augmenting
path in Gf , we can easily identify specific edges in G for which we can change
the flow so that we increase the value of the flow
...
We repeatedly augment the flow until the residual network has
no more augmenting paths
...

F ORD -F ULKERSON -M ETHOD
...

Residual networks
Intuitively, given a flow network G and a flow f , the residual network Gf consists
of edges with capacities that represent how we can change the flow on edges of G
...
If that value is positive, we place
that edge into Gf with a “residual capacity” of cf
...
u; / f
...

The only edges of G that are in Gf are those that can admit more flow; those
edges
...
u; / D 0, and they are not
in Gf
...

As an algorithm manipulates the flow, with the goal of increasing the total flow, it
might need to decrease the flow on a particular edge
...
u; / on an edge in G, we place an edge
...
; u/ D f
...
u; /, at most canceling out the flow on
...

These reverse edges in the residual network allow an algorithm to send back flow

716

Chapter 26 Maximum Flow

it has already sent along an edge
...

More formally, suppose that we have a flow network G D
...
Let f be a flow in G, and consider a pair of vertices u; 2 V
...
u; / by

€ c
...
u; / D

/ f
...
u; / 2 E ;
f
...
; u/ 2 E ;
0
otherwise :

(26
...
u; / 2 E implies
...
2) applies to each ordered pair of vertices
...
2), if c
...
u; / D 11, then we
can increase f
...
u; / D 5 units before we exceed the capacity
constraint on edge
...
We also wish to allow an algorithm to return up to 11
units of flow from to u, and hence cf
...

Given a flow network G D
...
V; Ef /, where
Ef D f
...
u; / > 0g :

(26
...
Figure 26
...
1(b), and Figure 26
...
The edges in Ef are either edges in E or their reversals, and thus
jEf j Ä 2 jEj :
Observe that the residual network Gf is similar to a flow network with capacities
given by cf
...
u; / and its reversal
...
Other than this difference, a
residual network has the same properties as a flow network, and we can define a
flow in the residual network as one that satisfies the definition of a flow, but with
respect to capacities cf in the network Gf
...
If f is a flow in G and f 0 is a flow in the corresponding residual
network Gf , we define f " f 0 , the augmentation of flow f by f 0 , to be a function
from V V to R, defined by
(
f
...
u; / f 0
...
u; / 2 E ;
(26
...
f " f 0 /
...
2 The Ford-Fulkerson method

3

v2

11/14

v4

4/4

8

v3

12/

13

v2

11/14
(c)

19/

v4

t
4/4

11
1

s
12

12

v1

5

20

7/7

9

1/4

11

s

t
4

v4

(b)

3

12/12

15

11

1

v1

7
3

v2

(a)

/16

5

5

11
5

s

v3

v2

3

v3
7

8/1

t

12

v1

5

20

7/7

s

4/
9

1/4

11

15/

4

v3

9

12/12

3

v1

1

/16

717

v4

1
19

t
4

11
(d)

Figure 26
...
1(b)
...
p/ D cf
...
Edges with
residual capacity equal to 0, such as
...
(c) The flow in G that results from augmenting along path p by its residual capacity 4
...
3 ; 2 /, are labeled only by their capacity, another convention we
follow throughout
...


The intuition behind this definition follows the definition of the residual network
...
u; / by f 0
...
; u/ because
pushing flow on the reverse edge in the residual network signifies decreasing the
flow in the original network
...
For example, if we send 5 crates of hockey
pucks from u to and send 2 crates from to u, we could equivalently (from the
perspective of the final result) just send 3 creates from u to and none from to u
...

Lemma 26
...
V; E/ be a flow network with source s and sink t, and let f be a flow
in G
...
Then the function f " f 0 defined in equation (26
...

Proof We first verify that f " f 0 obeys the capacity constraint for each edge in E
and flow conservation at each vertex in V fs; tg
...
u; / 2 E, then cf
...
u; /
...
; u/ Ä cf
...
u; /, and hence

...
u; / D f
...
u; /
f
...
u; /
D f 0
...
; u/ (by equation (26
...
u; / (because f 0
...
u; /)

In addition,

...
u; /
D
Ä
Ä
D
D

f
...
u;
f
...
u;
c
...
u; / f 0
...
u; /
/ C cf
...
u; / f
...
4))
(because flows are nonnegative)
(capacity constraint)
(definition of cf )

For flow conservation, because both f and f 0 obey flow conservation, we have
that for all u 2 V fs; tg,
X
X

...
u; / D

...
u; / C f 0
...
; u//
2V

2V

D

X

f
...
; u/ C

X

2V

D

X

X

f 0
...
; u/

2V

X

0

f
...
f
...
; u/

f 0
...
u; //

2V

D

X


...
; u/ ;

2V

where the third line follows from the second by flow conservation
...
Recall that we disallow antiparallel
edges in G (but not in Gf ), and hence for each vertex 2 V , we know that there
can be an edge
...
; s/, but never both
...
s; / 2 Eg
to be the set of vertices with edges from s, and V2 D f W
...
We have V1 [ V2 Â V and, because we disallow
antiparallel edges, V1 \ V2 D ;
...
f " f 0 /
...
f " f 0 /
...
f " f /
...
f " f 0 /
...
5)

26
...
f " f 0 /
...
w; x/ 62 E
...
5), and then reorder and group terms
to obtain
jf " f 0 j
X

...
s; / C f 0
...
s; / C

2V1

D

X

X
2V2

f
...
s; /

2V1

X

f
...
; s/
X

X


...
; s/ C f 0
...
s; //

2V2
0

f
...
; s/ C

2V2

X

f 0
...
; s/

2V2

f 0
...
; s//

X
2V2

f
...
s; /
X

2V1
0

f
...
; s/
X

X

f 0
...
; s/ :

(26
...
6), we can extend all four summations to sum over V , since each
additional term has value 0
...
2-1 asks you to prove this formally
...
s; /
f
...
s; /
f 0
...
7)
jf " f 0 j D
2V

2V

2V

2V

0

D jf j C jf j :
Augmenting paths
Given a flow network G D
...
By the definition of the residual network, we may increase the flow on an edge
...
u; / without violating the capacity constraint on whichever of
...
; u/ is in the original flow network G
...
4(b) is an augmenting path
...
2 ; 3 / D 4
...
p/ D min fcf
...
u; / is on pg :

720

Chapter 26 Maximum Flow

The following lemma, whose proof we leave as Exercise 26
...

Lemma 26
...
V; E/ be a flow network, let f be a flow in G, and let p be an augmenting
path in Gf
...
p/ if
...
8)
fp
...
p/ > 0
...
Figure 26
...
4(a) by the flow fp in Figure 26
...
4(d) shows the ensuing residual network
...
3
Let G D
...
Let fp be defined as in equation (26
...
Then the function f " fp is a flow in G with value
jf " fp j D jf j C jfp j > jf j
...
1 and 26
...


Cuts of flow networks
The Ford-Fulkerson method repeatedly augments the flow along augmenting paths
until it has found a maximum flow
...
To prove this theorem, though, we
must first explore the notion of a cut of a flow network
...
S; T / of flow network G D
...
(This definition is similar to the definition of “cut” that we used for minimum spanning trees in Chapter 23, except
that here we are cutting a directed graph rather than an undirected graph, and we
insist that s 2 S and t 2 T
...
S; T / across the
cut
...
S; T / D
f
...
; u/ :
(26
...
2 The Ford-Fulkerson method

/16

12/12

v1

v3

8/1

3

v2

11/14

15/

20

7/7

s

4/
9

1/4

11

721

t
4/4

v4

S T

Figure 26
...
S; T / in the flow network of Figure 26
...
The vertices in S are black, and the vertices in T are white
...
S; T / is f
...
S; T / D 26
...
S; T / is
XX
c
...
S; T / D
u2S

(26
...

The asymmetry between the definitions of flow and capacity of a cut is intentional and important
...
For flow, we consider the
flow going from S to T minus the flow going in the reverse direction from T to S
...

Figure 26
...
fs; 1 ; 2 g ; f 3 ; 4 ; tg/ in the flow network of Figure 26
...
The net flow across this cut is
f
...
2;

4/

f
...
1 ;

3/

C c
...

Lemma 26
...
S; T / be any
cut of G
...
S; T / is f
...


722

Chapter 26 Maximum Flow

Proof We can rewrite the flow-conservation condition for any node u 2 V fs; tg
as
X
X
f
...
; u/ D 0 :
(26
...
1) and adding the left-hand side of
equation (26
...
s; /
f
...
u; /
f
...
s; /
f
...
u; /
f
...
s; / C

2V

D

XX

u2S fsg 2V

X

!

f
...
; s/ C

2V

u2S fsg

f
...
; u/

u2S fsg

f
...
u; / C
f
...
; u/
f
...
u; /

2T u2S

XX

2S u2S

f
...
u; /

2S u2S

2T u2S

XX

!
f
...
x; y/ appears once in each summation
...
u; /
f
...
S; T / :
A corollary to Lemma 26
...


26
...
5
The value of any flow f in a flow network G is bounded from above by the capacity
of any cut of G
...
S; T / be any cut of G and let f be any flow
...
4 and the
capacity constraint,
jf j D f
...
u; /
D
u2S

u2S

Ä
Ä

2T

XX

f
...
u; /
c
...
S; T / :
Corollary 26
...
The important max-flow min-cut theorem, which we now state and
prove, says that the value of a maximum flow is in fact equal to the capacity of a
minimum cut
...
6 (Max-flow min-cut theorem)
If f is a flow in a flow network G D
...
f is a maximum flow in G
...
The residual network Gf contains no augmenting paths
...
jf j D c
...
S; T / of G
...
1/ )
...
Then, by Corollary 26
...
8), is a flow
in G with value strictly greater than jf j, contradicting the assumption that f is a
maximum flow
...
2/ )
...
Define
S D f 2 V W there exists a path from s to

in Gf g

and T D V S
...
S; T / is a cut: we have s 2 S trivially and t 62 S
because there is no path from s to t in Gf
...
If
...
u; / D c
...
u; / 2 Ef , which would place in set S
...
; u/ 2 E, we must
have f
...
u; / D f
...
u; / 2 Ef , which would place in S
...
u; /
nor
...
u; / D f
...
We thus have
XX
XX
f
...
; u/
f
...
u; /

XX

0

2T u2S

D c
...
4, therefore, jf j D f
...
S; T /
...
3/ )
...
5, jf j Ä c
...
S; T /
...
S; T / thus implies that f is a maximum flow
...
As Lemma 26
...
3 suggest, we
replace f by f " fp , obtaining a new flow whose value is jf j C jfp j
...
V; E/ by updating the flow attribute
...
u; / 2 E
...
u; / 62 E, we assume implicitly that
...
We also assume that we
are given the capacities c
...
u; / D 0
if
...
We compute the residual capacity cf
...
2)
...
p/ in the code is just a temporary variable that
stores the residual capacity of the path p
...
G; s; t/
1 for each edge
...
u; /:f D 0
3 while there exists a path p from s to t in the residual network Gf
4
cf
...
u; / W
...
u; / in p
6
if
...
u; /:f D
...
p/
8
else
...
; u/:f cf
...
1 that we represent an attribute f for edge
...
u; /: f —that we use for an attribute of any other object
...
2 The Ford-Fulkerson method

725

The F ORD -F ULKERSON algorithm simply expands on the F ORD -F ULKERSON M ETHOD pseudocode given earlier
...
6 shows the result of each iteration
in a sample run
...
The while loop of lines 3–8
repeatedly finds an augmenting path p in Gf and augments flow f along p by
the residual capacity cf
...
Each residual edge in path p is either an edge in the
original network or the reversal of an edge in the original network
...
When no augmenting paths exist, the
flow f is a maximum flow
...
If we choose it poorly, the algorithm might not even terminate: the
value of the flow will increase with successive augmentations, but it need not even
converge to the maximum flow value
...
2), however, the algorithm runs in
polynomial time
...

In practice, the maximum-flow problem often arises with integral capacities
...
If f denotes a maximum flow in the transformed
network, then a straightforward implementation of F ORD -F ULKERSON executes
the while loop of lines 3–8 at most jf j times, since the flow value increases by at
least one unit in each iteration
...
V; E/ with the right data structure and find an augmenting
path by a linear-time algorithm
...
V; E 0 /, where E 0 D f
...
u; / 2 E or

...
Edges in the network G are also edges in G 0 , and therefore we can
easily maintain capacities and flows in this data structure
...
u; / of G 0 such that
cf
...
2)
...
V C E 0 / D O
...
Each iteration of the while loop thus takes O
...
E jf j/
...


Chapter 26 Maximum Flow

20

/16

v1

20

v2

t

7
10

s
4/1

4

v4

3

4

16

v2

/16

v1

s

4

7

t

10
4

v4

4

4/1

3

4/2

8/12

t

v4

v3

4/4

8/2

0

5
v2

4/14

8

4

4

4
9

s

v3

8
4

(c)

4

v1

v3

4/4

0

4
12

8/12

v4

4

5
13

v1

4/14

7

4
4

/16

v2

t

v2

4/14

t

7

v3

4
4

s

8

v1

12

(b)

14

13

4

v4

20

7

s

4/
9

t

4/4

v2

v3

4

9
13

4/12

4/
9

v3

9

s

4

(a)

12

4

v1

16

7

726

v4

4/4

Figure 26
...
(a)–(e) Successive iterations of
the while loop
...
The right side of each part shows the new flow f that results from augmenting f
by fp
...


When the capacities are integral and the optimal flow value jf j is small, the
running time of the Ford-Fulkerson algorithm is good
...
7(a) shows an example of what can happen on a simple flow network for which jf j is large
...
If
the first augmenting path found by F ORD -F ULKERSON is s ! u ! ! t , shown
in Figure 26
...
The resulting residual network appears in Figure 26
...
If the second iteration finds the augmenting path s ! ! u ! t, as shown in Figure 26
...

Figure 26
...
We can continue, choosing
the augmenting path s ! u ! ! t in the odd-numbered iterations and the augmenting path s ! ! u ! t in the even-numbered iterations
...


26
...
6, continued (f) The residual network at the last while loop test
...
The value of the maximum flow
found is 23
...
That is, we choose the augmenting
path as a shortest path from s to t in the residual network, where each edge has
unit distance (weight)
...
We now prove that the Edmonds-Karp algorithm runs
in O
...

The analysis depends on the distances to vertices in the residual network Gf
...
u; / for the shortest-path distance
from u to in Gf , where each edge has unit distance
...
7
If the Edmonds-Karp algorithm is run on a flow network G D
...
s; /
in the residual network Gf increases monotonically with each flow augmentation
...
7 (a) A flow network for which F ORD -F ULKERSON can take ‚
...
The shaded path is an augmenting path with residual capacity 1
...
(c) The resulting residual network
...
Let f be the flow just before the first augmentation
that decreases some shortest-path distance, and let f 0 be the flow just afterward
...
s; / whose distance was decreased by
the augmentation, so that ıf 0
...
s; /
...
u; / 2 Ef 0 and
ıf 0
...
s; /

1:

(26
...
e
...
s; u/

ıf
...
13)

We claim that
...
Why? If we had
...
s; / Ä ıf
...
10, the triangle inequality)
Ä ıf 0
...
13))
(by equation (26
...
s; /
which contradicts our assumption that ıf 0
...
s; /
...
u; / 62 Ef and
...
The Edmonds-Karp algorithm always augments flow along shortest paths, and therefore the shortest path from s to u in Gf
has
...
Therefore,
ıf
...
s; u/ 1
Ä ıf 0
...
13))
D ıf 0
...
12)) ,

26
...
s; / < ıf
...
We conclude that our
assumption that such a vertex exists is incorrect
...

Theorem 26
...
V; E/ with source s
and sink t, then the total number of flow augmentations performed by the algorithm
is O
...

Proof We say that an edge
...
u; /, that
is, if cf
...
u; /
...
Moreover, at
least one edge on any augmenting path must be critical
...

Let u and be vertices in V that are connected by an edge in E
...
u; / is critical for the first time, we have
ıf
...
s; u/ C 1 :
Once the flow is augmented, the edge
...

It cannot reappear later on another augmenting path until after the flow from u to
is decreased, which occurs only if
...
If f 0 is
the flow in G when this event occurs, then we have
ıf 0
...
s; / C 1 :
Since ıf
...
s; / by Lemma 26
...
s; u/ D ıf 0
...
s; / C 1
D ıf
...
u; / becomes critical to the time when it next
becomes critical, the distance of u from the source increases by at least 2
...
The intermediate vertices on a
shortest path from s to u cannot contain s, u, or t (since
...
Therefore, until u becomes unreachable from the source,
if ever, its distance is at most jV j 2
...
u; / becomes
critical, it can become critical at most
...
Since there are O
...
VE/
...

Because we can implement each iteration of F ORD -F ULKERSON in O
...
VE 2 /
...
The algorithm of Section 26
...
V 2 E/ running time, which forms the basis for the O
...
5
...
2-1
Prove that the summations in equation (26
...
7)
...
2-2
In Figure 26
...
fs;
the capacity of this cut?

2;

4g ; f 1;

3 ; tg/?

What is

26
...
1(a)
...
2-4
In the example of Figure 26
...
2-5
Recall that the construction in Section 26
...
Prove that any flow in the resulting network has a finite value
if the edges of the original network with multiple sources and sinks have finite
capacity
...
2-6
Suppose that each source si in a flow network with multiple sources and sinks
P
produces exactly pi units of flow, so that
2V f
...
Suppose also
P
that each P tj consumes exactly qj units, so that
sink
2V f
...
Show how to convert the problem of finding a flow f that obeys
i

26
...

26
...
2
...
2-8
Suppose that we redefine the residual network to disallow edges into s
...

26
...

Does the augmented flow satisfy the flow conservation property? Does it satisfy
the capacity constraint?
26
...
V; E/ by a sequence of at
most jEj augmenting paths
...
)
26
...
For example, the edge connectivity
of a tree is 1, and the edge connectivity of a cyclic chain of vertices is 2
...
V; E/ by
running a maximum-flow algorithm on at most jV j flow networks, each having
O
...
E/ edges
...
2-12
Suppose that you are given a flow network G, and G has edges entering the
source s
...
; s/ entering the source
has f
...
Prove that there must exist another flow f 0 with f 0
...
Give an O
...

26
...
Show how to
modify the capacities of G to create a new flow network G 0 in which any minimum
cut in G 0 is a minimum cut with the smallest number of edges in G
...
3 Maximum bipartite matching
Some combinatorial problems can easily be cast as maximum-flow problems
...
1 gave us
one example
...

This section presents one such problem: finding a maximum matching in a bipartite
graph
...
We shall also see how to use
the Ford-Fulkerson method to solve the maximum-bipartite-matching problem on
a graph G D
...
VE/ time
...
V; E/, a matching is a subset of edges M Â E
such that for all vertices 2 V , at most one edge of M is incident on
...
A maximum matching is a matching
of maximum cardinality, that is, a matching M such that for any matching M 0 ,
we have jM j jM 0 j
...
We further assume that every vertex in V has at least one
incident edge
...
8 illustrates the notion of a matching in a bipartite graph
...
As an example, we might consider matching a set L of machines with a set R of tasks to be performed simultaneously
...
u; / in E to mean that a particular machine u 2 L is capable of performing a particular task 2 R
...

Finding a maximum bipartite matching
We can use the Ford-Fulkerson method to find a maximum matching in an undirected bipartite graph G D
...
The trick is
to construct a flow network in which flows correspond to matchings, as shown in
Figure 26
...
We define the corresponding flow network G 0 D
...
We let the source s and sink t be new vertices not
in V , and we let V 0 D V [ fs; tg
...
3 Maximum bipartite matching

733

s

L

R
(a)

L

R
(b)

t

L

R
(c)

Figure 26
...
V; E/ with vertex partition V D L [ R
...
(b) A maximum matching with cardinality 3
...
Each edge has unit capacity
...
The shaded edges from L to R correspond
to those in the maximum matching from (b)
...
s; u/ W u 2 Lg [ f
...
u; / 2 Eg [ f
...
Since
each vertex in V has at least one incident edge, jEj jV j =2
...
E/
...
We say that a flow f on a flow network
G D
...
u; / is an integer for all
...

Lemma 26
...
V; E/ be a bipartite graph with vertex partition V D L [ R, and let
G 0 D
...
If M is a matching in G, then
there is an integer-valued flow f in G 0 with value jf j D jM j
...

Proof We first show that a matching M in G corresponds to an integer-valued
flow f in G 0
...
If
...
s; u/ D f
...
; t/ D 1
...
u; / 2 E 0 , we define f
...
It is simple
to verify that f satisfies the capacity constraint and flow conservation
...
u; / 2 M corresponds to one unit of flow in G 0 that
traverses the path s ! u ! ! t
...
The net flow across cut
...
4, the value of the flow is jf j D jM j
...
u; / W u 2 L;

2 R; and f
...
s; u/, and its capacity
is 1
...
Furthermore,
since f is integer-valued, for each u 2 L, the one unit of flow can enter on at most
one edge and can leave on at most one edge
...
u; / D 1, and at most one
edge leaving each u 2 L carries positive flow
...
The set M is therefore a matching
...
s; u/ D 1, and for every edge
...
u; / D 0
...
L [ fsg ; R [ ftg/, the net flow across cut
...
Applying Lemma 26
...
L [ fsg ; R [ ftg/ D jM j
...
9, we would like to conclude that a maximum matching
in a bipartite graph G corresponds to a maximum flow in its corresponding flow
network G 0 , and we can therefore compute a maximum matching in G by running
a maximum-flow algorithm on G 0
...
u; / is
not an integer, even though the flow value jf j must be an integer
...

Theorem 26
...

Moreover, for all vertices u and , the value of f
...

Proof The proof is by induction on the number of iterations
...
3-2
...
9
...
3 Maximum bipartite matching

735

Corollary 26
...

Proof We use the nomenclature from Lemma 26
...
Suppose that M is a maximum matching in G and that the corresponding flow f in G 0 is not maximum
...
Since the capacities in G 0 are integer-valued, by Theorem 26
...
Thus, f 0 corresponds to a matching M 0 in G with cardinality
jM 0 j D jf 0 j > jf j D jM j, contradicting our assumption that M is a maximum
matching
...

Thus, given a bipartite undirected graph G, we can find a maximum matching by
creating the flow network G 0 , running the Ford-Fulkerson method, and directly obtaining a maximum matching M from the integer-valued maximum flow f found
...
L; R/ D O
...
V /
...
VE 0 / D O
...
E/
...
3-1
Run the Ford-Fulkerson algorithm on the flow network in Figure 26
...
Number the vertices in L top
to bottom from 1 to 5 and in R top to bottom from 6 to 9
...

26
...
10
...
3-3
Let G D
...
Give a good upper bound on the length of any
augmenting path found in G 0 during the execution of F ORD -F ULKERSON
...
3-4 ?
A perfect matching is a matching in which every vertex is matched
...
V; E/ be an undirected bipartite graph with vertex partition V D L [ R, where
jLj D jRj
...
X / D fy 2 V W
...
Prove Hall’s theorem:
there exists a perfect matching in G if and only if jAj Ä jN
...

26
...
V; E/, where V D L [ R, is d-regular if every
vertex 2 V has degree exactly d
...

Prove that every d -regular bipartite graph has a matching of cardinality jLj by
arguing that a minimum cut of the corresponding flow network has capacity jLj
...
4 Push-relabel algorithms
In this section, we present the “push-relabel” approach to computing maximum
flows
...
Push-relabel methods also efficiently solve other flow problems, such as the minimum-cost flow problem
...
V 2 E/ time, thereby improving upon the
O
...
Section 26
...
V 3 / time
...
Rather than examine the entire residual network to find an augmenting path, push-relabel algorithms work on one vertex at a time, looking only
at the vertex’s neighbors in the residual network
...
They do, however, maintain a preflow, which
is a function f W V V ! R that satisfies the capacity constraint and the following
relaxation of flow conservation:
X
X
f
...
u; / 0
2V

2V

for all vertices u 2 V fsg
...
We call the quantity
X
X
f
...
u/ D
f
...
14)
2V

2V

the excess flow into vertex u
...
We say that a vertex u 2 V fs; tg is overflowing if
e
...


26
...
We shall then investigate the two operations employed by the method:
“pushing” preflow and “relabeling” a vertex
...

Intuition
You can understand the intuition behind the push-relabel method in terms of fluid
flows: we consider a flow network G D
...
Applying this analogy to the Ford-Fulkerson method,
we might say that each augmenting path in the network gives rise to an additional
stream of fluid, with no branch points, flowing from the source to the sink
...

The generic push-relabel algorithm has a rather different intuition
...
Vertices, which are pipe junctions, have two
interesting properties
...
Second,
each vertex, its reservoir, and all its pipe connections sit on a platform whose height
increases as the algorithm progresses
...
The flow from a lower vertex to a higher
vertex may be positive, but operations that push flow push it only downhill
...
All other vertex
heights start at 0 and increase with time
...
The amount it sends is exactly
enough to fill each outgoing pipe from the source to capacity; that is, it sends the
capacity of the cut
...
When flow first enters an intermediate vertex, it
collects in the vertex’s reservoir
...

We may eventually find that the only pipes that leave a vertex u and are not
already saturated with flow connect to vertices that are on the same level as u or
are uphill from u
...
We increase
its height to one unit more than the height of the lowest of its neighbors to which
it has an unsaturated pipe
...

Eventually, all the flow that can possibly get through to the sink has arrived there
...
To make the preflow
a “legal” flow, the algorithm then sends the excess collected in the reservoirs of
overflowing vertices back to the source by continuing to relabel vertices to above

738

Chapter 26 Maximum Flow

the fixed height jV j of the source
...

The basic operations
From the preceding discussion, we see that a push-relabel algorithm performs two
basic operations: pushing flow excess from a vertex to one of its neighbors and
relabeling a vertex
...

Let G D
...
A function h W V ! N is a height function3 if h
...
t/ D 0, and
h
...
/ C 1
for every residual edge
...
We immediately obtain the following lemma
...
12
Let G D
...
For any two vertices u; 2 V , if h
...
/ C 1, then
...

The push operation
The basic operation P USH
...
u; / > 0,
and h
...
/ C1
...
It assumes that we can compute residual capacity cf
...
We maintain the excess flow stored at a vertex u as
the attribute u:e and the height of u as the attribute u:h
...
u; /
is a temporary variable that stores the amount of flow that we can push from u to
...
” We use the term “height” because it is more suggestive of the intuition
behind the algorithm
...
The height of a vertex is related to its distance from the sink t, as would be
found in a breadth-first search of the transpose G T
...
4 Push-relabel algorithms

739

P USH
...
u; / > 0, and u:h D :h C 1
...
u; / D min
...
u; // units of flow from u to
...
u; / D min
...
u; //
4 if
...
u; /:f D
...
u; /
6 else
...
; u/:f f
...
u; /
8
:e D :e C f
...
Because vertex u has a positive excess u:e
and the residual capacity of
...
u; / D min
...
u; // without causing u:e to become negative or the
capacity c
...
Line 3 computes the value f
...
Line 5 increases the flow on edge
...
Line 6 decreases the flow on
edge
...
Finally, lines 7–8 update the excess flows into vertices u and
...

Observe that nothing in the code for P USH depends on the heights of u and ,
yet we prohibit it from being invoked unless u:h D :h C 1
...
By Lemma 26
...

We call the operation P USH
...
If a push operation applies to some edge
...
It is a saturating push if edge
...
u; / D 0 afterward); otherwise, it is a nonsaturating push
...
A simple lemma
characterizes one result of a nonsaturating push
...
13
After a nonsaturating push from u to , the vertex u is no longer overflowing
...
u; / actually
pushed must equal u:e prior to the push
...


740

Chapter 26 Maximum Flow

The relabel operation
The basic operation R ELABEL
...
u; / 2 Ef
...
(Recall that by definition,
neither the source s nor the sink t can be overflowing, and so s and t are ineligible
for relabeling
...
u/
1 / Applies when: u is overflowing and for all
/
we have u:h Ä :h
...

/
3 u:h D 1 C min f :h W
...
u; / 2 Ef ,

When we call the operation R ELABEL
...
Note
that when u is relabeled, Ef must contain at least one edge that leaves u, so that
the minimization in the code is over a nonempty set
...
; u/
f
...
; u/:f > 0
...
u; / > 0, which implies that
...
The
operation R ELABEL
...

The generic algorithm
The generic push-relabel algorithm uses the following subroutine to create an initial preflow in the flow network
...
G; s/
1 for each vertex 2 G:V
2
:h D 0
3
:e D 0
4 for each edge
...
u; /:f D 0
6 s:h D jG:Vj
7 for each vertex 2 s:Adj
8

...
s; /
9
:e D c
...
s; /

26
...
u; / if u D s ;

...
15)

That is, we fill to capacity each edge leaving the source s, and all other edges carry
no flow
...
s; /,
and we initialize s:e to the negative of the sum of these capacities
...
16)
0
otherwise :
Equation (26
...
u; / for which
u:h > :h C 1 are those for which u D s, and those edges are saturated, which
means that they are not in the residual network
...
G/
1 I NITIALIZE -P REFLOW
...

Lemma 26
...
V; E/ be a flow network with source s and sink t, let f be a preflow,
and let h be any height function for f
...

Proof For any residual edge
...
u/ Ä h
...
If a push operation does not apply to an overflowing vertex u,
then for all residual edges
...
u/ < h
...
u/ Ä h
...

Correctness of the push-relabel method
To show that the generic push-relabel algorithm solves the maximum-flow problem, we shall first prove that if it terminates, the preflow f is a maximum flow
...
We start with some observations about the
height function h
...
15 (Vertex heights never decrease)
During the execution of the G ENERIC -P USH -R ELABEL procedure on a flow network G D
...
Moreover, whenever a relabel operation is applied to a vertex u, its height u:h increases
by at least 1
...
If vertex u is about to be relabeled, then for all vertices such that
...
Thus,
u:h < 1 C min f :h W
...

Lemma 26
...
V; E/ be a flow network with source s and sink t
...

Proof The proof is by induction on the number of basic operations performed
...

We claim that if h is a height function, then an operation R ELABEL
...
If we look at a residual edge
...
u/ ensures that u:h Ä :h C 1 afterward
...
w; u/ that enters u
...
15, w:h Ä u:h C 1 before the
operation R ELABEL
...
Thus, the operation
R ELABEL
...

Now, consider an operation P USH
...
This operation may add the edge
...
u; / from Ef
...
In the latter case,
removing
...

The following lemma gives an important property of height functions
...
17
Let G D
...
Then there is no path from the source s
to the sink t in the residual network Gf
...
Without loss of generality, p
is a simple path, and so k < jV j
...
i ; i C1 / 2 Ef
...
i / Ä h
...
Combining these inequalities over path p yields h
...
t/Ck
...
t/ D 0,

26
...
s/ Ä k < jV j, which contradicts the requirement that h
...

We are now ready to show that if the generic push-relabel algorithm terminates,
the preflow it computes is a maximum flow
...
18 (Correctness of the generic push-relabel algorithm)
If the algorithm G ENERIC -P USH -R ELABEL terminates when run on a flow network G D
...

Proof

We use the following loop invariant:

Each time the while loop test in line 2 in G ENERIC -P USH -R ELABEL is
executed, f is a preflow
...

Maintenance: The only operations within the while loop of lines 2–3 are push and
relabel
...
As argued on page 739, if f is
a preflow prior to a push operation, it remains a preflow afterward
...
14 and the invariant that f is always a preflow, there are
no overflowing vertices
...
Lemma 26
...
17 tells us that there is no
path from s to t in the residual network Gf
...
6), therefore, f is a maximum flow
...
We bound separately each of the three types
of operations: relabels, saturating pushes, and nonsaturating pushes
...
V 2 E/ time
...
Recall that we allow edges into the source in the residual network
...
19
Let G D
...
Then, for any overflowing vertex x, there is a simple path from x to s in the
residual network Gf
...
Let U D V U
...
14), sum over all vertices
in U , and note that V D U [ U , to obtain
X
e
...
; u/
f
...
; u/

f
...
; u/

2U

u2U

P

f
...
u; / C

XX
u2U

2U

XX

X
2U

2U

2U

XX

u2U

f
...
u; /

2U

f
...
u; /

2U

f
...
u/ must be positive because e
...
Thus,
we have
XX
XX
f
...
u; / > 0 :
(26
...
17) to hold, we must have
P
P
u2U
2U f
...
Hence, there must exist at least one pair of vertices
u0 2 U and 0 2 U with f
...
But, if f
...
u0 ; 0 /, which means that there is a simple path from x to 0 (the
path x ; u0 ! 0 ), thus contradicting the definition of U
...

Lemma 26
...
V; E/ be a flow network with source s and sink t
...

Proof The heights of the source s and the sink t never change because these
vertices are by definition not overflowing
...

Now consider any vertex u 2 V fs; tg
...
We shall
show that after each relabeling operation, we still have u:h Ä 2 jV j 1
...
4 Push-relabel algorithms

745

relabeled, it is overflowing, and Lemma 26
...
Let p D h 0 ; 1 ; : : : ; k i, where 0 D u, k D s, and k Ä jV j 1
because p is simple
...
i ; i C1 / 2 Ef , and
therefore, by Lemma 26
...
Expanding these inequalities over
path p yields u:h D 0 :h Ä k :h C k Ä s:h C
...

Corollary 26
...
V; E/ be a flow network with source s and sink t
...
2 jV j 1/
...

Proof Only the jV j 2 vertices in V fs; tg may be relabeled
...

The operation R ELABEL
...
The value of u:h is initially 0 and by
Lemma 26
...
Thus, each vertex u 2 V
fs; tg
is relabeled at most 2 jV j 1 times, and the total number of relabel operations
performed is at most
...
jV j 2/ < 2 jV j2
...
20 also helps us to bound the number of saturating pushes
...
22 (Bound on saturating pushes)
During the execution of G ENERIC -P USH -R ELABEL on any flow network G D

...

Proof For any pair of vertices u; 2 V , we will count the saturating pushes
from u to and from to u together, calling them the saturating pushes between u
and
...
u; / and
...
Now, suppose that a saturating push from u to has occurred
...
In order for another push from u to to occur
later, the algorithm must first push flow from to u, which cannot happen until
:h D u:h C 1
...
Likewise, u:h must increase by at least 2
between saturating pushes from to u
...
20,
never exceed 2 jV j 1, which implies that the number of times any vertex can have
its height increase by 2 is less than jV j
...
Multiplying by the number of edges
gives a bound of less than 2 jV j jEj on the total number of saturating pushes
...


746

Chapter 26 Maximum Flow

Lemma 26
...
V; E/, the number of nonsaturating pushes is less than 4 jV j2
...

P
Proof Define a potential function ˆ D
We
...
Initially, ˆ D 0, and the
value of ˆ may change after each relabeling, saturating push, and nonsaturating
push
...
Then we will show that each nonsaturating push must
decrease ˆ by at least 1, and will use these bounds to derive an upper bound on the
number of nonsaturating pushes
...
First, relabeling a
vertex u increases ˆ by less than 2 jV j, since the set over which the sum is taken is
the same and the relabeling cannot increase u’s height by more than its maximum
possible height, which, by Lemma 26
...
Second, a saturating
push from a vertex u to a vertex increases ˆ by less than 2 jV j, since no heights
change and only vertex , whose height is at most 2 jV j 1, can possibly become
overflowing
...

Why? Before the nonsaturating push, u was overflowing, and may or may not
have been overflowing
...
13, u is no longer overflowing after the
push
...
Therefore, the potential function ˆ has decreased by exactly u:h, and it
:h D 1, the net effect is that the
has increased by either 0 or :h
...

Thus, during the course of the algorithm, the total amount of increase in ˆ is
due to relabelings and saturated pushes, and Corollary 26
...
22
constrain the increase to be less than
...
2 jV j2 / C
...
2 jV j jEj/ D
4 jV j2
...
Since ˆ 0, the total amount of decrease, and therefore the
total number of nonsaturating pushes, is less than 4 jV j2
...

Having bounded the number of relabelings, saturating pushes, and nonsaturating push, we have set the stage for the following analysis of the G ENERIC P USH -R ELABEL procedure, and hence of any algorithm based on the push-relabel
method
...
24
During the execution of G ENERIC -P USH -R ELABEL on any flow network G D

...
V 2 E/
...
21 and Lemmas 26
...
23
...
4 Push-relabel algorithms

747

Thus, the algorithm terminates after O
...
All that remains is
to give an efficient method for implementing each operation and for choosing an
appropriate operation to execute
...
25
There is an implementation of the generic push-relabel algorithm that runs in
O
...
V; E/
...
4-2 asks you to show how to implement the generic algorithm
with an overhead of O
...
1/ per push
...
1/ time
...

Exercises
26
...
G; s/ terminates, we have
s:e Ä jf j, where f is a maximum flow for G
...
4-2
Show how to implement the generic push-relabel algorithm using O
...
1/ time per push, and O
...
V 2 E/
...
4-3
Prove that the generic push-relabel algorithm spends a total of only O
...
V 2 / relabel operations
...
4-4
Suppose that we have found a maximum flow in a flow network G D
...
Give a fast algorithm to find a minimum cut in G
...
4-5
Give an efficient push-relabel algorithm to find a maximum matching in a bipartite
graph
...

26
...
V; E/ are in the set
f1; 2; : : : ; kg
...
(Hint: How many times can each edge support a nonsaturating push before it becomes saturated?)

748

Chapter 26 Maximum Flow

26
...

26
...
u; / be the distance (number of edges) from u to in the residual network Gf
...
u; t/ and that u:h
jV j implies
u:h jV j Ä ıf
...

26
...
u; / be the distance from u to in the residual
network Gf
...
u; t/ and that u:h
jV j implies
u:h jV j D ıf
...
The total time that your implementation dedicates to maintaining this property should be O
...

26
...
V; E/ is at most 4 jV j2 jEj for
jV j 4
...
5 The relabel-to-front algorithm
The push-relabel method allows us to apply the basic operations in any order at
all
...
V 2 E/
bound given by Corollary 26
...
We shall now examine the relabel-to-front algorithm, a push-relabel algorithm whose running time is O
...
V 2 E/, and even better for dense networks
...

Beginning at the front, the algorithm scans the list, repeatedly selecting an overflowing vertex u and then “discharging” it, that is, performing push and relabel
operations until u no longer has a positive excess
...


26
...
After proving some properties about the network of admissible
edges, we shall investigate the discharge operation and then present and analyze the
relabel-to-front algorithm itself
...
V; E/ is a flow network with source s and sink t, f is a preflow in G, and h
is a height function, then we say that
...
u; / > 0
and h
...
/ C 1
...
u; / is inadmissible
...
V; Ef;h /, where Ef;h is the set of admissible edges
...

The following lemma shows that this network is a directed acyclic graph (dag)
...
26 (The admissible network is acyclic)
If G D
...
V; Ef;h / is acyclic
...
Suppose that Gf;h contains a cycle p D
h 0 ; 1 ; : : : ; k i, where 0 D k and k > 0
...
i 1 / D h
...
Summing around the cycle gives
k
X

h
...
h
...
i / C k :

i D1

Because each vertex in cycle p appears once in each of the summations, we derive
the contradiction that 0 D k
...

Lemma 26
...
V; E/ be a flow network, let f be a preflow in G, and suppose that the
attribute h is a height function
...
u; / is an admissible edge, then P USH
...
The operation does not create any new
admissible edges, but it may cause
...


750

Chapter 26 Maximum Flow

Proof By the definition of an admissible edge, we can push flow from u to
...
u; / applies
...
; u/
...
; u/ cannot become admissible
...
u; / D 0 afterward and
...

Lemma 26
...
V; E/ be a flow network, let f be a preflow in G, and suppose that
the attribute h is a height function
...
u/ applies
...

Proof If u is overflowing, then by Lemma 26
...
If there are no admissible edges leaving u, then no flow
can be pushed from u and so R ELABEL
...
After the relabel operation,
u:h D 1 C min f :h W
...
Thus, if is a vertex that realizes the minimum in this set, the edge
...
Hence, after the relabel, there
is at least one admissible edge leaving u
...
; u/ is admissible
...
But by Lemma 26
...
Moreover, relabeling a vertex does not change the residual network
...
; u/ is not
in the residual network, and hence it cannot be in the admissible network
...
” Given
a flow network G D
...
Thus, vertex appears in the list u:N if

...
; u/ 2 E
...
u; /
...

The relabel-to-front algorithm cycles through each neighbor list in an arbitrary
order that is fixed throughout the execution of the algorithm
...

Initially, u:current is set to u:N:head
...
5 The relabel-to-front algorithm

751

Discharging an overflowing vertex
An overflowing vertex u is discharged by pushing all of its excess flow through
admissible edges to neighboring vertices, relabeling u as necessary to cause edges
leaving u to become admissible
...

D ISCHARGE
...
u/
5
u:current D u:N:head
6
elseif cf
...
u; /
8
else u:current D :next-neighbor
Figure 26
...
Each iteration performs exactly
one of three actions, depending on the current vertex in the neighbor list u:N
...
If is NIL, then we have run off the end of u:N
...

(Lemma 26
...
)
2
...
u; / is an admissible edge (determined by the test in
line 6), then line 7 pushes some (or possibly all) of u’s excess to vertex
...
If is non-NIL but
...

Observe that if D ISCHARGE is called on an overflowing vertex u, then the last
action performed by D ISCHARGE must be a push from u
...

We must be sure that when P USH or R ELABEL is called by D ISCHARGE, the
operation applies
...

Lemma 26
...
u; / in line 7, then a push operation applies to
...

If D ISCHARGE calls R ELABEL
...

Proof The tests in lines 1 and 6 ensure that a push operation occurs only if the
operation applies, which proves the first statement in the lemma
...
9 Discharging a vertex y
...
Only the neighbors of y and edges of the flow network that enter or leave y
are shown
...
The
neighbor list y: N at the beginning of each iteration appears on the right, with the iteration number
on top
...
(a) Initially, there are 19 units of excess to push from y,
and y: current D s
...
In iteration 4, y: current D NIL (shown by the shading being below the neighbor list),
and so y is relabeled and y: current is reset to the head of the neighbor list
...
In iterations 5 and 6, edges
...
y; x/ are found to be inadmissible, but
iteration 7 pushes 8 units of excess flow from y to ´
...
(c) Because the push in iteration 7 saturated edge
...
In iteration 9, y: current D NIL, and so vertex y is again relabeled and y: current is reset
...
5 The relabel-to-front algorithm

s
–26

y
11

8/8

14/

14

x
5

s
–26

5

y
6

8/8

y
6

14/14

15
s
x
z
z
8

y
0

8/14

x
5

14
s
x
z

z
8

x
5

s
–20

13
s
x
z

z
8

8/8

(g)

6
5
4
3
2
1
0

x
0

5/5

11
s
x
z

12
s
x
z

14

8/8

(f)

6
5
4
3
2
1
0

10
s
x
z

14/

5

(e)

6
5
4
3
2
1
0

s
–26

5

(d)

6
5
4
3
2
1
0

753

z
8

Figure 26
...
y; s/ is inadmissible, but iteration 11 pushes 5 units
of excess flow from y to x
...
y; x/ to be inadmissible
...
y; ´/ inadmissible, and iteration 14 relabels vertex y and resets y: current
...
(g) Vertex y
now has no excess flow, and D ISCHARGE terminates
...


754

Chapter 26 Maximum Flow

To prove the second statement, according to the test in line 1 and Lemma 26
...
If a call to
D ISCHARGE
...
It is possible, however, that during a
call to D ISCHARGE
...
Calls to D ISCHARGE on other vertices may then occur, but u:current will continue moving through the list during the next call to
D ISCHARGE
...
We now consider what happens during a complete pass through
the list, which begins at the head of u:N and finishes with u:current D NIL
...
For the u:current pointer to advance past a vertex 2 u:N during a pass, the
edge
...
Thus, by the time
the pass completes, every edge leaving u has been determined to be inadmissible
at some time during the pass
...
Why? By Lemma 26
...

Thus, any admissible edge must be created by a relabel operation
...
28, any other vertex that is
relabeled during the pass (resulting from a call of D ISCHARGE
...
Thus, at the end of the pass, all edges leaving u
remain inadmissible, which completes the proof
...
A key property is that the vertices in L are topologically sorted
according to the admissible network, as we shall see in the loop invariant that follows
...
26 that the admissible network is a dag
...
It also assumes that u:next
points to the vertex that follows u in list L and that, as usual, u:next D NIL if u is
the last vertex in the list
...
5 The relabel-to-front algorithm

755

R ELABEL -T O -F RONT
...
G; s/
2 L D G:V fs; tg, in any order
3 for each vertex u 2 G:V fs; tg
4
u:current D u:N:head
5 u D L:head
6 while u ¤ NIL
7
old-height D u:h
8
D ISCHARGE
...
Line 1 initializes the preflow
and heights to the same values as in the generic push-relabel algorithm
...

Lines 3–4 initialize the current pointer of each vertex u to the first vertex in u’s
neighbor list
...
10 illustrates, the while loop of lines 6–11 runs through the list L,
discharging vertices
...
Each
time through the loop, line 8 discharges a vertex u
...
We can determine
whether u was relabeled by comparing its height before the discharge operation,
saved into the variable old-height in line 7, with its height afterward, in line 9
...
If line 10 moved u to the front of the list, the vertex used in the next iteration
is the one following u in its new position in the list
...
First, observe that it performs push and relabel operations only when they apply, since
Lemma 26
...

It remains to show that when R ELABEL -T O -F RONT terminates, no basic operations apply
...
V; Ef;h /, and no vertex
before u in the list has excess flow
...
Since jV j

Chapter 26 Maximum Flow

s
–26
12

(a)

6
5
4
3

L:
N:

8

x
s
y
z
t

y
s
x
z

z
x
y
t

y
s
x
z

x
s
y
z
t

z
x
y
t

16

10

t
0

/14

x
0

5/5

y
19

z
0

10

t
7

8/8

/12

x
5

7/16
7
8

y
0

8/14

s
–20

z
x
y
t

L:
N:

z
0

14

12

(c)

6
5
4
3
2
1
0

5

s
–26
/12
12

(b)

x
12

y
14
7

6
5
4
3
2
1
0

y
s
x
z

/14

/12

2
1
0

x
s
y
z
t

L:
N:

14

5

756

7/16

7

z
8

10

t
7

Figure 26
...
(a) A flow network just before the first iteration
of the while loop
...
On the right is shown the initial list
L D hx; y; ´i, where initially u D x
...
Vertex x is discharged
...
Because x is relabeled, it moves
to the head of L, which in this case does not change the structure of L
...
Figure 26
...

Because y is relabeled, it is moved to the head of L
...
Because vertex x is not relabeled in this
discharge operation, it remains in place in list L
...
5 The relabel-to-front algorithm

7

z
x
y
t

L:
N:

z
8

10

z
x
y
t

y
s
x
z

x
s
y
z
t

t
12

8/8

x
0

x
s
y
z
t

y
0

8/14

5

s
–20

y
s
x
z

12/16

x
0

/12
12

(e)

6
5
4
3
2
1
0

L:
N:

8/8

2
1
0

y
0

8/14

5

s
–20
/12
12

(d)

6
5
4
3

757

7

12/16

z
0

8/10

t
20

Figure 26
...
It is relabeled
to height 1 and all 8 units of excess flow are pushed to t
...
(e) Vertex y now follows vertex ´ in L and is therefore discharged
...
Vertex x is then discharged
...
R ELABEL T O -F RONT has reached the end of list L and terminates
...


least s and t), no edge can be admissible
...

Because u is initially the head of the list L, there are no vertices before it and
so there are none before it with excess flow
...
By Lemma 26
...
Thus, only relabel operations can create admissible edges
...
28 states that there
are no admissible edges entering u but there may be admissible edges leaving u
...


758

Chapter 26 Maximum Flow

To see that no vertex preceding u in L has excess flow, we denote the vertex
that will be u in the next iteration by u0
...
When u
is discharged, it has no excess flow afterward
...
If u is not relabeled
during the discharge, no vertices before it on the list acquired excess flow during
this discharge, because L remained topologically sorted at all times during the
discharge (as just pointed out, admissible edges are created only by relabeling,
not pushing), and so each push operation causes excess flow to move only to
vertices further down the list (or to s or t)
...

Termination: When the loop terminates, u is just past the end of L, and so the
loop invariant ensures that the excess of every vertex is 0
...

Analysis
We shall now show that R ELABEL -T O -F RONT runs in O
...
V; E/
...
21, which provides an O
...
V 2 / bound on the total number of relabel operations overall
...
4-3 provides an O
...
22 provides an O
...

Theorem 26
...
V; E/
is O
...

Proof Let us consider a “phase” of the relabel-to-front algorithm to be the time
between two consecutive relabel operations
...
V 2 / phases, since there
are O
...
Each phase consists of at most jV j calls to D IS CHARGE, which we can see as follows
...
If D ISCHARGE does perform a relabel, the next
call to D ISCHARGE belongs to a different phase
...
V 2 / phases, the number of times
D ISCHARGE is called in line 8 of R ELABEL -T O -F RONT is O
...
Thus, the total

26
...
V 3 /
...
Each iteration of the while loop within D ISCHARGE
performs one of three actions
...

We start with relabel operations (lines 4–5)
...
4-3 provides an O
...
V 2 / relabels that are performed
...
This action
occurs O
...
u// times each time a vertex u is relabeled, and O
...
u//
times overall for the vertex
...
VE/ by the handshaking lemma
(Exercise B
...

The third type of action performed by D ISCHARGE is a push operation (line 7)
...
VE/
...
Thus, there can be at most one nonsaturating
push per call to D ISCHARGE
...
V 3 /
times, and thus the total time spent performing nonsaturating pushes is O
...

The running time of R ELABEL -T O -F RONT is therefore O
...
V 3 /
...
5-1
Illustrate the execution of R ELABEL -T O -F RONT in the manner of Figure 26
...
1(a)
...
5-2 ?
We would like to implement a push-relabel algorithm in which we maintain a firstin, first-out queue of overflowing vertices
...

After the vertex at the head of the queue is discharged, it is removed
...
Show how to implement this algorithm
to compute a maximum flow in O
...

26
...
How would this change affect the analysis of
R ELABEL -T O -F RONT?
26
...
V 3 / time
...
5-5
Suppose that at some point in the execution of a push-relabel algorithm, there exists
an integer 0 < k Ä jV j 1 for which no vertex has :h D k
...
If such a k exists,
the gap heuristic updates every vertex 2 V
fsg for which :h > k, to set
:h D max
...
Show that the resulting attribute h is a height function
...
)

Problems
26-1 Escape problem
An n n grid is an undirected graph consisting of n rows and n columns of vertices,
as shown in Figure 26
...
We denote the vertex in the ith row and the j th column
by
...
All vertices in a grid have exactly four neighbors, except for the boundary
vertices, which are the points
...

Given m Ä n2 starting points
...
x2 ; y2 /; : : : ;
...
For example,
the grid in Figure 26
...
11(b) does not
...
Consider a flow network in which vertices, as well as edges, have capacities
...
Show that determining the maximum flow in a network with edge
and vertex capacities can be reduced to an ordinary maximum-flow problem on
a flow network of comparable size
...
11 Grids for the escape problem
...
(a) A grid with an escape, shown by shaded paths
...


b
...

26-2 Minimum path cover
A path cover of a directed graph G D
...
Paths may start
and end anywhere, and they may be of any length, including 0
...

a
...
V; E/
...
V 0 ; E 0 /, where
V 0 D fx0 ; x1 ; : : : ; xn g [ fy0 ; y1 ; : : : ; yn g ;
E 0 D f
...
yi ; y0 / W i 2 V g [ f
...
i; j / 2 Eg ;
and run a maximum-flow algorithm
...
Does your algorithm work for directed graphs that contain cycles? Explain
...
He has identified n important subareas of algorithms (roughly corresponding to different portions of this textbook), which he represents by the set A D fA1 ; A2 ; : : : ; An g
...
The consulting
company has lined up a set J D fJ1 ; J2 ; : : : ; Jm g of potential jobs
...
Each expert can work on multiple jobs simultaneously
...

Professor Gore’s job is to determine which subareas to hire experts in and which
jobs to accept in order to maximize the net revenue, which is the total income from
jobs accepted minus the total cost of employing the experts
...
It contains a source vertex s, vertices
A1 ; A2 ; : : : ; An , vertices J1 ; J2 ; : : : ; Jm , and a sink vertex t
...
s; Ak / with capacity c
...
Ji ; t/ with capacity
c
...
For k D 1; 2; : : : ; n and i D 1; 2; : : : ; m, if Ak 2 Ri , then G
contains an edge
...
Ak ; Ji / D 1
...
Show that if Ji 2 T for a finite-capacity cut
...

b
...

c
...
Analyze the running time of your algorithm in terms of m, n, and
Pm
r D i D1 jRi j
...
V; E/ be a flow network with source s, sink t, and integer capacities
...

a
...
u; / 2 E by 1
...
V C E/-time algorithm to update the maximum flow
...
Suppose that we decrease the capacity of a single edge
...
Give
an O
...

26-5 Maximum flow by scaling
Let G D
...
u; / on each edge
...
Let C D max
...
u; /
...
Argue that a minimum cut of G has capacity at most C jEj
...
For a given number K, show how to find an augmenting path of capacity at
least K in O
...


Problems for Chapter 26

763

We can use the following modification of F ORD -F ULKERSON -M ETHOD to compute a maximum flow in G:
M AX -F LOW-B Y-S CALING
...
u; /2E c
...
Argue that M AX -F LOW-B Y-S CALING returns a maximum flow
...
Show that the capacity of a minimum cut of the residual network Gf is at most
2K jEj each time line 4 is executed
...
Argue that the inner while loop of lines 5–6 executes O
...

f
...
E 2 lg C / time
...
The algorithm runs in O
...
Given an undirected, bipartite graph G D
...
We say that
a simple path P in G is an augmenting path with respect to M if it starts at an
unmatched vertex in L, ends at an unmatched vertex in R, and its edges belong
alternately to M and E M
...
) In this problem,
we treat a path as a sequence of edges, rather than as a sequence of vertices
...

Given two sets A and B, the symmetric difference A˚B is defined as
...
B A/, that is, the elements that are in exactly one of the two sets
...
Show that if M is a matching and P is an augmenting path with respect to M ,
then the symmetric difference M ˚ P is a matching and jM ˚ P j D jM j C 1
...
P1 [ P2 [ [ Pk / is a matching
with cardinality jM j C k
...
G/
1 M D;
2 repeat
3
let P D fP1 ; P2 ; : : : ; Pk g be a maximal set of vertex-disjoint
shortest augmenting paths with respect to M
4
M D M ˚
...

b
...
V; M ˚ M / has degree at most 2
...
Argue that edges in each such simple path
or cycle belong alternately to M or M
...

Let l be the length of a shortest augmenting path with respect to a matching M , and
let P1 ; P2 ; : : : ; Pk be a maximal set of vertex-disjoint augmenting paths of length l
with respect to M
...
P1 [ [Pk /, and suppose that P is a shortest
augmenting path with respect to M 0
...
Show that if P is vertex-disjoint from P1 ; P2 ; : : : ; Pk , then P has more than l
edges
...
Now suppose that P is not vertex-disjoint from P1 ; P2 ; : : : ; Pk
...
M ˚ M 0 / ˚ P
...
P1 [ P2 [
that jAj
...
Conclude that P has more than l edges
...
Prove that if a shortest augmenting path with respect to M has l edges, the size
of the maximum matching is at most jM j C jV j =
...


Notes for Chapter 26

765

f
...
(Hint: By how much can M grow after iteration number jV j?)
g
...
E/ time to find a maximal set of vertexmatching M
...
V E/
...
Goldberg, Tardos, and Tarjan [139] also provide a nice survey of algorithms
for network-flow problems, and Schrijver [304] has written an interesting review
of historical developments in the field of network flows
...
Many early implementations
of the Ford-Fulkerson method found augmenting paths using breadth-first search;
Edmonds and Karp [102], and independently Dinic [89], proved that this strategy
yields a polynomial-time algorithm
...
Karzanov [202] first developed the idea of
preflows
...
Goldberg and Tarjan gave an O
...
VE lg
...
Several other researchers
have developed push-relabel maximum-flow algorithms
...
Cheriyan and
Maheshwari [62] proposed pushing flow from the overflowing vertex of maximum
height
...
The algorithm of King, Rao,
and Tarjan [204] is the fastest such algorithm and runs in O
...
V lg V / V /
time
...
min
...
V 2 =E C 2/ lg C /,
where C D max
...
u; /
...
All previous maximumflow algorithms, including the ones in this chapter, use some notion of distance
(the push-relabel algorithms use the analogous notion of height), with a length of 1

766

Chapter 26 Maximum Flow

assigned implicitly to each edge
...

Informally, with respect to these lengths, shortest paths from the source to the sink
tend have high capacity, which means that fewer iterations need be performed
...
A study
by Cherkassky and Goldberg [63] underscores the importance of using two heuristics when implementing a push-relabel algorithm
...
The second heuristic is the gap heuristic, described in
Exercise 26
...
Cherkassky and Goldberg conclude that the best choice of pushrelabel variants is the one that chooses to discharge the overflowing vertex with the
maximum height
...
V E/ time and is described in Problem 26-6
...


VII

Selected Topics

Introduction
This part contains a selection of algorithmic topics that extend and complement
earlier material in this book
...
Others cover specialized domains such as
computational geometry or number theory
...

Chapter 27 presents an algorithmic model for parallel computing based on dynamic multithreading
...
It then
investigates several interesting multithreaded algorithms, including algorithms for
matrix multiplication and merge sorting
...
It presents
two general methods—LU decomposition and LUP decomposition—for solving
linear equations by Gaussian elimination in O
...
It also shows that matrix
inversion and matrix multiplication can be performed equally fast
...

Chapter 29 studies linear programming, in which we wish to maximize or minimize an objective, given limited resources and competing constraints
...
This chapter covers how
to formulate and solve linear programs
...
In contrast
to many algorithms in this book, the simplex algorithm does not run in polynomial
time in the worst case, but it is fairly efficient and widely used in practice
...
n lg n/ time
...

Chapter 31 presents number-theoretic algorithms
...
Next, it studies algorithms for solving modular linear equations and for
raising one number to a power modulo another number
...

This cryptosystem can be used not only to encrypt messages so that an adversary
cannot read them, but also to provide digital signatures
...
Finally, the chapter covers Pollard’s “rho” heuristic for factoring integers and discusses the state of the art
of integer factorization
...
After examining the naive approach, the chapter presents an elegant approach due to Rabin and Karp
...

Chapter 33 considers a few problems in computational geometry
...
Two clever algorithms for finding the convex hull of a set of
points—Graham’s scan and Jarvis’s march—also illustrate the power of sweeping
methods
...

Chapter 34 concerns NP-complete problems
...
This chapter presents techniques for determining when a problem is
NP-complete
...
The chapter also proves that the famous travelingsalesman problem is NP-complete
...
For some NP-complete problems,
approximate solutions that are near optimal are quite easy to produce, but for others
even the best approximation algorithms known work progressively more poorly as

Part VII Selected Topics

771

the problem size increases
...
This chapter illustrates these possibilities with the vertex-cover
problem (unweighted and weighted versions), an optimization version of 3-CNF
satisfiability, the traveling-salesman problem, the set-covering problem, and the
subset-sum problem
...
In this chapter, we shall extend our algorithmic model to encompass parallel
algorithms, which can run on a multiprocessor computer that permits multiple
instructions to execute concurrently
...

Parallel computers—computers with multiple processing units—have become
increasingly common, and they span a wide range of prices and performance
...
At an intermediate price/performance point are clusters built from individual computers—often
simple PC-class machines—with a dedicated network interconnecting them
...

Multiprocessor computers have been around, in one form or another, for
decades
...
A major reason is that vendors have not agreed on a single architectural model for parallel
computers
...
Other parallel computers employ distributed memory, where each processor’s memory is private, and
an explicit message must be sent between processors in order for one processor to
access the memory of another
...
Although time
will tell, that is the approach we shall take in this chapter
...
Each
thread maintains an associated program counter and can execute code independently of the other threads
...
Although the
operating system allows programmers to create and destroy threads, these operations are comparatively slow
...

Unfortunately, programming a shared-memory parallel computer directly using
static threads is difficult and error-prone
...
For any but the simplest of applications, the programmer must use complex communication protocols
to implement a scheduler to load-balance the work
...
Some
concurrency platforms are built as runtime libraries, but others provide full-fledged
parallel languages with compiler and runtime support
...
Dynamic multithreading allows programmers to specify parallelism in applications without worrying about communication
protocols, load balancing, and other vagaries of static-thread programming
...
Although the
functionality of dynamic-multithreading environments is still evolving, almost all
support two features: nested parallelism and parallel loops
...
A parallel loop is like an ordinary
for loop, except that the iterations of the loop can execute concurrently
...
A key aspect of this model is that the programmer
needs to specify only the logical parallelism within a computation, and the threads
within the underlying concurrency platform schedule and load-balance the computation among themselves
...

Our model for dynamic multithreading offers several important advantages:
It is a simple extension of our serial programming model
...
Moreover, if we delete these concurrency keywords from the multithreaded pseudocode, the resulting text is serial
pseudocode for the same problem, which we call the “serialization” of the multithreaded algorithm
...

Many multithreaded algorithms involving nested parallelism follow naturally
from the divide-and-conquer paradigm
...

The model is faithful to how parallel-computing practice is evolving
...

Section 27
...
Section 27
...
3 tackles the tougher problem of multithreading merge sort
...
1 The basics of dynamic multithreading
We shall begin our exploration of dynamic multithreading using the example of
computing Fibonacci numbers recursively
...
22):
F0 D 0 ;
F1 D 1 ;
Fi D Fi

1

C Fi

2

for i

2:

Here is a simple, recursive, serial algorithm to compute the nth Fibonacci number:

27
...
6/

F IB
...
4/

F IB
...
3/

F IB
...
1/

F IB
...
2/

F IB
...
1/

F IB
...
2/

F IB
...
1/

F IB
...
3/

F IB
...
1/

F IB
...
2/

F IB
...
0/

F IB
...
0/

Figure 27
...
6/
...


F IB
...
n 1/
4
y D F IB
...
Figure 27
...
For example,
a call to F IB
...
5/ and then F IB
...
But, the call to F IB
...
4/
...
4/ return the same result
(F4 D 3)
...
4/
replicates the work that the first call performs
...
n/ denote the running time of F IB
...
Since F IB
...
n/ D T
...
n

2/ C ‚
...
n/ D ‚
...
For an inductive hypothesis, assume that T
...
Substituting, we obtain

776

Chapter 27 Multithreaded Algorithms

T
...
aFn 1 b/ C
...
1/
a
...
1/
aFn b
...
1//
aFn b

if we choose b large enough to dominate the constant in the ‚
...
We can then
choose a large enough to satisfy the initial condition
...
n/ D ‚
...
1)
p
where D
...
25)
...
(See Problem 31-3 for much faster ways
...
Observe that within F IB
...
n 1/ and F IB
...
Therefore, the two recursive calls can run in parallel
...
Here is how we can rewrite the F IB procedure to use
dynamic multithreading:
P-F IB
...
n
4
y D P-F IB
...
We define the serialization of a multithreaded algorithm to be the serial algorithm that results from deleting the multithreaded keywords: spawn, sync, and when we examine parallel loops, parallel
...

Nested parallelism occurs when the keyword spawn precedes a procedure call,
as in line 3
...
1 The basics of dynamic multithreading

777

for the child to complete, as would normally happen in a serial execution
...
n 1/, the parent may go on
to compute P-F IB
...
Since the
P-F IB procedure is recursive, these two subroutine calls themselves create nested
parallelism, as do their children, thereby creating a potentially vast tree of subcomputations, all executing in parallel
...
The concurrency keywords
express the logical parallelism of the computation, indicating which parts of the
computation may proceed in parallel
...
We shall discuss the theory behind
schedulers shortly
...
The keyword sync indicates that
the procedure must wait as necessary for all its spawned children to complete before proceeding to the statement after the sync
...
In addition to explicit
synchronization provided by the sync statement, every procedure executes a sync
implicitly before it returns, thus ensuring that all its children terminate before it
does
...
V; E/, called a computation dag
...
2
shows the computation dag that results from computing P-F IB
...
Conceptually,
the vertices in V are instructions, and the edges in E represent dependencies between instructions, where
...
For convenience, however, if a chain of instructions contains no
parallel control (no spawn, sync, or return from a spawn—via either an explicit
return statement or the return that happens implicitly upon reaching the end of
a procedure), we may group them into a single strand, each of which represents
one or more instructions
...
For example, if a strand
has two successors, one of them must have been spawned, and a strand with multiple predecessors indicates the predecessors joined because of a sync statement
...


778

Chapter 27 Multithreaded Algorithms

P-FIB(4)

P-FIB(3)

P-FIB(2)

P-FIB(1)

P-FIB(2)

P-FIB(1)

P-FIB(1)

P-FIB(0)

P-FIB(0)

Figure 27
...
4/
...
n 1/ in line 3, shaded circles representing the part of the procedure that calls P-F IB
...
n 1/ returns, and white circles representing the part of the procedure after the sync where
it sums x and y up to the point where it returns the result
...
Spawn edges and call edges point downward, continuation
edges point horizontally to the right, and return edges point upward
...


If G has a directed path from strand u to strand , we say that the two strands are
(logically) in series
...

We can picture a multithreaded computation as a dag of strands embedded in a
tree of procedure instances
...
1 shows the tree of procedure
instances for P-F IB
...
Figure 27
...
All directed edges connecting strands run either within a procedure or along
undirected edges in the procedure tree
...
A continuation edge
...
2, connects a strand u to its successor u0 within the same procedure
instance
...
u; /,
which points downward in the figure
...
Strand u spawning strand differs from u calling
in that a spawn induces a horizontal continuation edge from u to the strand u0 fol-

27
...
When a strand u returns to its calling
procedure and x is the strand immediately following the next sync in the calling
procedure, the computation dag contains return edge
...

A computation starts with a single initial strand—the black vertex in the procedure
labeled P-F IB
...
2—and ends with a single final strand—the white
vertex in the procedure labeled P-F IB
...

We shall study the execution of multithreaded algorithms on an ideal parallel computer, which consists of a set of processors and a sequentially consistent
shared memory
...
That is, the memory behaves as if the instructions
were executed sequentially according to some global linear order that preserves the
individual orders in which each processor issues its own instructions
...
Depending on scheduling, the ordering
could differ from one run of the program to another, but the behavior of any execution can be understood by assuming that the instructions are executed in some
linear order consistent with the computation dag
...
Specifically, it assumes that each
processor in the machine has equal computing power, and it ignores the cost of
scheduling
...

Performance measures
We can gauge the theoretical efficiency of a multithreaded algorithm by using two
metrics: “work” and “span
...
In other words, the work
is the sum of the times taken by each of the strands
...
The span is the longest time to execute the strands along any path in the dag
...
(Recall from Section 24
...
V; E/ in ‚
...
) For example, the
computation dag of Figure 27
...

The actual running time of a multithreaded computation depends not only on
its work and its span, but also on how many processors are available and how
the scheduler allocates strands to processors
...
For example,
we might denote the running time of an algorithm on P processors by TP
...
The span is the running time
if we could run each strand on its own processor—in other words, if we had an
unlimited number of processors—and so we denote the span by T1
...
Since the
total work to do is T1 , we have P TP T1
...
2)

A P -processor ideal parallel computer cannot run any faster than a machine
with an unlimited number of processors
...
Thus, the span law follows:
TP

T1 :

(27
...
By the work law, we have TP
T1 =TP Ä P
...
When the
speedup is linear in the number of processors, that is, when T1 =TP D ‚
...

The ratio T1 =T1 of the work to the span gives the parallelism of the multithreaded computation
...
As a
ratio, the parallelism denotes the average amount of work that can be performed in
parallel for each step along the critical path
...
Finally, and perhaps most important, the parallelism provides a limit on
the possibility of attaining perfect linear speedup
...
To see this last point, suppose that P > T1 =T1 , in which case

27
...
Moreover,
if the number P of processors in the ideal parallel computer greatly exceeds the
P , so that the speedup is
parallelism—that is, if P
T1 =T1 —then T1 =TP
much less than the number of processors
...

As an example, consider the computation P-F IB
...
2, and assume
that each strand takes unit time
...
Consequently, achieving much more
than double the speedup is impossible, no matter how many processors we employ to execute the computation
...
n/ exhibits substantial parallelism
...
T1 =T1 /=P D
T1 =
...
Thus, if the slackness is less than 1,
we cannot hope to achieve perfect linear speedup, because T1 =
...

Indeed, as the slackness decreases from 1 toward 0, the speedup of the computation
diverges further and further from perfect linear speedup
...
As we shall see,
as the slackness increases from 1, a good scheduler can achieve closer and closer
to perfect linear speedup
...
The
strands must also be scheduled efficiently onto the processors of the parallel machine
...
Instead, we rely on the concurrency platform’s scheduler to map the dynamically unfolding computation to individual processors
...
We can
just imagine that the concurrency platform’s scheduler maps strands to processors
directly
...
Moreover, a good scheduler operates in a distributed fashion,
where the threads implementing the scheduler cooperate to load-balance the computation
...


782

Chapter 27 Multithreaded Algorithms

Instead, to keep our analysis simple, we shall investigate an on-line centralized
scheduler, which knows the global state of the computation at any given time
...
If at least P strands are ready to execute
during a time step, we say that the step is a complete step, and a greedy scheduler
assigns any P of the ready strands to processors
...

From the work law, the best running time we can hope for on P processors
is TP D T1 =P , and from the span law the best we can hope for is TP D T1
...

Theorem 27
...
4)

Proof We start by considering the complete steps
...
Suppose for the purpose of
contradiction that the number of complete steps is strictly greater than bT1 =P c
...
bT1 =P c C 1/ D P bT1 =P c C P
D T1
...
8))
(by inequality (3
...


Thus, we obtain the contradiction that the P processors would perform more work
than the computation requires, which allows us to conclude that the number of
complete steps is at most bT1 =P c
...
Let G be the dag representing the entire
computation, and without loss of generality, assume that each strand takes unit
time
...
) Let G 0
be the subgraph of G that has yet to be executed at the start of the incomplete step,
and let G 00 be the subgraph remaining to be executed after the incomplete step
...
Since an
incomplete step of a greedy scheduler executes all strands with in-degree 0 in G 0 ,
the length of a longest path in G 00 must be 1 less than the length of a longest path
in G 0
...
Hence, the number of incomplete steps is at most T1
...


27
...
1 shows that a greedy scheduler always
performs well
...
2
The running time TP of any multithreaded computation scheduled by a greedy
scheduler on an ideal parallel computer with P processors is within a factor of 2
of optimal
...
Since the work and span laws—inequalities (27
...
3)—give
max
...
1 implies that
us TP
TP

Ä T1 =P C T1
Ä 2 max
...

Corollary 27
...
Then, if P
T1 =T1 , we
T1 =P , or equivalently, a speedup of approximately P
...
Since the work
hence Theorem 27
...
2) dictates that TP
P
...
Then, the span term in the
greedy bound, inequality (27
...
For example, if a computation runs on only 10 or 100 processors, it doesn’t make sense to value parallelism
of, say 1,000,000 over parallelism of 10,000, even with the factor of 100 difference
...


784

Chapter 27 Multithreaded Algorithms

A
A

B
B

Work: T1
...
A/ C T1
...
A [ B/ D T1
...
B/

Work: T1
...
A/ C T1
...
A [ B/ D max
...
A/; T1
...
3 The work and span of composed subcomputations
...
(b) When two subcomputations are joined in parallel, the
work of the composition remains the sum of their work, but the span of the composition is only the
maximum of their spans
...
Analyzing
the work is relatively straightforward, since it amounts to nothing more than analyzing the running time of an ordinary serial algorithm—namely, the serialization
of the multithreaded algorithm—which you should already be familiar with, since
that is what most of this textbook is about! Analyzing the span is more interesting,
but generally no harder once you get the hang of it
...

Analyzing the work T1
...
n/ poses no hurdles, because we’ve already
done it
...
n/ D T
...
n / from equation (27
...

Figure 27
...
If two subcomputations are
joined in series, their spans add to form the span of their composition, whereas
if they are joined in parallel, the span of their composition is the maximum of the
spans of the two subcomputations
...
n/, the spawned call to P-F IB
...
n 2/ in line 4
...
n/ as the recurrence
T1
...
T1
...
n
D T1
...
1/ ;

2// C ‚
...
n/ D ‚
...

The parallelism of P-F IB
...
n/=T1
...
n =n/, which grows dramatically as n gets large
...
1 The basics of dynamic multithreading

785

value for n suffices to achieve near perfect linear speedup for P-F IB
...

Parallel loops
Many algorithms contain loops all of whose iterations can operate in parallel
...
Our pseudocode provides this functionality via the parallel
concurrency keyword, which precedes the for keyword in a for loop statement
...
aij /
by an n-vector x D
...
The resulting n-vector y D
...
We can perform matrix-vector multiplication by computing all
the entries of y in parallel as follows:
M AT-V EC
...
A compiler can implement
each parallel for loop as a divide-and-conquer subroutine using nested parallelism
...
A; x; y; n; 1; n/, where the compiler produces the auxiliary subroutine M AT-V EC -M AIN -L OOP as follows:

786

Chapter 27 Multithreaded Algorithms

1,8

1,4

5,8

1,2

1,1

3,4

2,2

3,3

5,6

4,4

5,5

7,8

6,6

7,7

8,8

Figure 27
...
A; x; y; 8; 1; 8/
...
The black circles represent strands corresponding to either the base case or the part of the procedure up to the spawn of
M AT-V EC -M AIN -L OOP in line 5; the shaded circles represent strands corresponding to the part of
the procedure that calls M AT-V EC -M AIN -L OOP in line 6 up to the sync in line 7, where it suspends
until the spawned subroutine in line 5 returns; and the white circles represent strands corresponding
to the (negligible) part of the procedure after the sync up to the point where it returns
...
A; x; y; n; i; i 0 /
1 if i == i 0
2
for j D 1 to n
3
yi D yi C aij xj
4 else mid D b
...
A; x; y; n; i; mid/
6
M AT-V EC -M AIN -L OOP
...
4
...
n/ of M AT-V EC on an n n matrix, we simply compute
the running time of its serialization, which we obtain by replacing the parallel for
loops with ordinary for loops
...
n/ D ‚
...
This analysis

27
...
In fact, the overhead of recursive spawning does increase the work
of a parallel loop compared with that of its serialization, but not asymptotically
...
5-3)
...
n/ time in this case)
...

As a practical matter, dynamic-multithreading concurrency platforms sometimes
coarsen the leaves of the recursion by executing several iterations in a single leaf,
either automatically or under programmer control, thereby reducing the overhead
of recursive spawning
...

We must also account for the overhead of recursive spawning when analyzing the
span of a parallel-loop construct
...
i/, the span is
T1
...
lg n/ C max iter1
...
lg n/, because the recursive spawning dominates the constanttime work of each iteration
...
n/, because each iteration of the outer parallel for loop contains n iterations
of the inner (serial) for loop
...
n/ for the whole procedure
...
n2 /, the
parallelism is ‚
...
n/ D ‚
...
(Exercise 27
...
)
Race conditions
A multithreaded algorithm is deterministic if it always does the same thing on the
same input, no matter how the instructions are scheduled on the multicore computer
...
Often, a
multithreaded algorithm that is intended to be deterministic fails to be, because it
contains a “determinacy race
...
Famous race bugs include the
Therac-25 radiation therapy machine, which killed three people and injured sev-

788

Chapter 27 Multithreaded Algorithms

eral others, and the North American Blackout of 2003, which left over 50 million
people without power
...
You can
run tests in the lab for days without a failure only to discover that your software
sporadically crashes in the field
...
The
following procedure illustrates a race condition:
R ACE -E XAMPLE
...
Although it might seem that R ACE E XAMPLE should always print the value 2 (its serialization certainly does), it could
instead print the value 1
...

When a processor increments x, the operation is not indivisible, but is composed
of a sequence of instructions:
1
...

2
...

3
...

Figure 27
...
Recall that
since an ideal parallel computer supports sequential consistency, we can view the
parallel execution of a multithreaded algorithm as an interleaving of instructions
that respects the dependencies in the dag
...
The value x is stored
in memory, and r1 and r2 are processor registers
...
In steps 2 and 3, processor 1 reads x from memory into its register r1
and increments it, producing the value 1 in r1
...
Processor 2 reads x from memory into
register r2 ; increments it, producing the value 1 in r2 ; and then stores this value
into x, setting x to 1
...
Therefore, step 8 prints the
value 1, rather than 2, as the serialization would print
...
If the effect of the parallel execution were that
processor 1 executed all its instructions before processor 2, the value 2 would be

27
...
5 Illustration of the determinacy race in R ACE -E XAMPLE
...
The processor registers are r1 and r2
...
(b) An execution
sequence that elicits the bug, showing the values of x in memory and registers r1 and r2 for each
step in the execution sequence
...
Conversely, if the effect were that processor 2 executed all its instructions
before processor 1, the value 2 would still be printed
...

Of course, many executions do not elicit the bug
...
That’s the problem with determinacy races
...
But some orderings generate
improper results when the instructions interleave
...
You can run tests for days and never see the bug, only to
experience a catastrophic system crash in the field when the outcome is critical
...
Thus, in a parallel for construct, all the iterations
should be independent
...
Note that arguments to a
spawned child are evaluated in the parent before the actual spawn occurs, and thus
the evaluation of arguments to a spawned subroutine is in series with any accesses
to those arguments after the spawn
...
lg n/ by parallelizing the inner for loop:
M AT-V EC -W RONG
...
Exercise 27
...
lg n/ span
...
As an example, two parallel threads might store the same value into a shared variable, and it
wouldn’t matter which stored the value first
...

A chess lesson
We close this section with a true story that occurred during the development of
the world-class multithreaded chess-playing program ?Socrates [80], although the
timings below have been simplified for exposition
...
At one point, the developers incorporated an optimization into the program that reduced its running time on an important benchmark on the 32-processor
0
machine from T32 D 65 seconds to T32 D 40 seconds
...
As a result, they abandoned the “optimization
...
The original version of the program had work T1 D 2048
seconds and span T1 D 1 second
...
4) as an equation,
TP D T1 =P C T1 , and use it as an approximation to the running time on P processors, we see that indeed T32 D 2048=32 C 1 D 65
...
Again
0
using our approximation, we get T32 D 1024=32 C 8 D 40
...
In particular, we have T512 D 2048=512C1 D 5

27
...
The optimization that sped up
the program on 32 processors would have made the program twice as slow on 512
processors! The optimized version’s span of 8, which was not the dominant term in
the running time on 32 processors, became the dominant term on 512 processors,
nullifying the advantage from using more processors
...


Exercises
27
...
n 2/ in line 4 of P-F IB, rather than calling it
as is done in the code
...
1-2
Draw the computation dag that results from executing P-F IB
...
Assuming that
each strand in the computation takes unit time, what are the work, span, and parallelism of the computation? Show how to schedule the dag on 3 processors using
greedy scheduling by labeling each strand with the time step in which it is executed
...
1-3
Prove that a greedy scheduler achieves the following time bound, which is slightly
stronger than the bound proven in Theorem 27
...
5)

27
...
Describe how the two executions would proceed
...
1-5
Professor Karan measures her deterministic multithreaded algorithm on 4, 10,
and 64 processors of an ideal parallel computer using a greedy scheduler
...
Argue that the professor is either lying or incompetent
...
2), the span law (27
...
5) from Exercise 27
...
)

792

Chapter 27 Multithreaded Algorithms

27
...
n2 = lg n/ parallelism while maintaining ‚
...

27
...
A/
1 n D A:rows
2 parallel for j D 2 to n
3
parallel for i D 1 to j 1
4
exchange aij with aj i
Analyze the work, span, and parallelism of this algorithm
...
1-8
Suppose that we replace the parallel for loop in line 3 of P-T RANSPOSE (see Exercise 27
...
Analyze the work, span, and parallelism
of the resulting algorithm
...
1-9
For how many processors do the two versions of the chess programs run equally
fast, assuming that TP D T1 =P C T1 ?

27
...
2
...

Multithreaded matrix multiplication
The first algorithm we study is the straighforward algorithm based on parallelizing
the loops in the procedure S QUARE -M ATRIX -M ULTIPLY on page 75:

27
...
A; B/
1 n D A:rows
2 let C be a new n n matrix
3 parallel for i D 1 to n
4
parallel for j D 1 to n
5
cij D 0
6
for k D 1 to n
7
cij D cij C ai k bkj
8 return C
To analyze this algorithm, observe that since the serialization of the algorithm is
just S QUARE -M ATRIX -M ULTIPLY, the work is therefore simply T1
...
n3 /,
the same as the running time of S QUARE -M ATRIX -M ULTIPLY
...
n/ D ‚
...
lg n/ C ‚
...
n/ D ‚
...

Thus, the parallelism is ‚
...
n/ D ‚
...
Exercise 27
...
n3 = lg n/, which you cannot do
straightforwardly using parallel for, because you would create races
...
2, we can multiply n n matrices serially in time

...
n2:81 / using Strassen’s divide-and-conquer strategy, which motivates
us to look at multithreading such an algorithm
...
2,
with multithreading a simpler divide-and-conquer algorithm
...
6)

Thus, to multiply two n n matrices, we perform eight multiplications of n=2 n=2
matrices and one addition of n n matrices
...
Unlike the S QUARE M ATRIX -M ULTIPLY-R ECURSIVE procedure on which it is based, P-M ATRIX M ULTIPLY-R ECURSIVE takes the output matrix as a parameter to avoid allocating
matrices unnecessarily
...
C; A; B/
1 n D A:rows
2 if n == 1
3
c11 D a11 b11
4 else let T be a new n n matrix
5
partition A, B, C , and T into n=2 n=2 submatrices
A11 ; A12 ; A21 ; A22 ; B11 ; B12 ; B21 ; B22 ; C11 ; C12 ; C21 ; C22 ;
and T11 ; T12 ; T21 ; T22 ; respectively
6
spawn P-M ATRIX -M ULTIPLY-R ECURSIVE
...
C12 ; A11 ; B12 /
8
spawn P-M ATRIX -M ULTIPLY-R ECURSIVE
...
C22 ; A21 ; B12 /
10
spawn P-M ATRIX -M ULTIPLY-R ECURSIVE
...
T12 ; A12 ; B22 /
12
spawn P-M ATRIX -M ULTIPLY-R ECURSIVE
...
T22 ; A22 ; B22 /
14
sync
15
parallel for i D 1 to n
16
parallel for j D 1 to n
17
cij D cij C tij
Line 3 handles the base case, where we are multiplying 1 1 matrices
...
We allocate a temporary matrix T in line 4, and
line 5 partitions each of the matrices A, B, C , and T into n=2 n=2 submatrices
...
) The recursive call in line 6 sets the submatrix C11 to the submatrix
product A11 B11 , so that C11 equals the first of the two terms that form its sum in
equation (27
...
Similarly, lines 7–9 set C12 , C21 , and C22 to the first of the two
terms that equal their sums in equation (27
...
Line 10 sets the submatrix T11 to
the submatrix product A12 B21 , so that T11 equals the second of the two terms that
form C11 ’s sum
...
The first seven recursive
calls are spawned, and the last one runs in the main strand
...
2 Multithreaded matrix multiplication

795

after which we add the products from T into C in using the doubly nested parallel
for loops in lines 15–17
...
n/ of the P-M ATRIX -M ULTIPLY-R ECURSIVE
procedure, echoing the serial running-time analysis of its progenitor S QUARE M ATRIX -M ULTIPLY-R ECURSIVE
...
1/ time,
perform eight recursive multiplications of n=2 n=2 matrices, and finish up with
the ‚
...
Thus, the recurrence for the
work M1
...
n/ D 8M1
...
n2 /
D ‚
...
In other words, the work of our multithreaded algorithm is asymptotically the same as the running time of the procedure S QUARE M ATRIX -M ULTIPLY in Section 4
...

To determine the span M1
...
1/, which is dominated by the ‚
...
Because the eight
parallel recursive calls all execute on matrices of the same size, the maximum span
for any recursive call is just the span of any one
...
n/ of P-M ATRIX -M ULTIPLY-R ECURSIVE is
M1
...
n=2/ C ‚
...
7)

This recurrence does not fall under any of the cases of the master theorem, but
it does meet the condition of Exercise 4
...
By Exercise 4
...
7) is M1
...
lg2 n/
...
n/=M1
...
n3 = lg2 n/, which is very
high
...
Divide the input matrices A and B and output matrix C into n=2 n=2 submatrices, as in equation (27
...
This step takes ‚
...

2
...
We can create all 10 matrices
with ‚
...
lg n/ span by using doubly nested parallel for loops
...
Using the submatrices created in step 1 and the 10 matrices created in
step 2, recursively spawn the computation of seven n=2 n=2 matrix products
P1 ; P2 ; : : : ; P7
...
Compute the desired submatrices C11 ; C12 ; C21 ; C22 of the result matrix C by
adding and subtracting various combinations of the Pi matrices, once again
using doubly nested parallel for loops
...
n2 / work and ‚
...

To analyze this algorithm, we first observe that since the serialization is the
same as the original serial algorithm, the work is just the running time of the
serialization, namely, ‚
...
As for P-M ATRIX -M ULTIPLY-R ECURSIVE, we
can devise a recurrence for the span
...
7) as we did for P-M ATRIX -M ULTIPLY-R ECURSIVE,
which has solution ‚
...
Thus, the parallelism of multithreaded Strassen’s
method is ‚
...

Exercises
27
...
Use the convention that spawn and call edges point
downward, continuation edges point horizontally to the right, and return edges
point upward
...

27
...
2-1 for P-M ATRIX -M ULTIPLY-R ECURSIVE
...
2-3
Give pseudocode for a multithreaded algorithm that multiplies two n
with work ‚
...
lg n/
...


n matrices

27
...
Your algorithm should be highly parallel even if any of
p, q, and r are 1
...


27
...
2-5
Give pseudocode for an efficient multithreaded algorithm that transposes an n n
matrix in place by using divide-and-conquer to divide the matrix recursively into
four n=2 n=2 submatrices
...

27
...
2), which computes shortest paths between all
pairs of vertices in an edge-weighted graph
...


27
...
3
...
3
...
n lg n/
...
We can easily modify the pseudocode so that the first
recursive call is spawned:
M ERGE -S ORT0
...
p C r/=2c
3
spawn M ERGE -S ORT 0
...
A; q C 1; r/
5
sync
6
M ERGE
...
After the
two recursive subroutines in lines 3 and 4 have completed, which is ensured by the
sync statement in line 5, M ERGE -S ORT 0 calls the same M ERGE procedure as on
page 31
...
To do so, we first need to analyze M ERGE
...
n/
...
n/
...
n/ of M ERGE -S ORT 0 on n elements:
MS01
...
n=2/ C ‚
...
n lg n/ ;

798

Chapter 27 Multithreaded Algorithms

p1

T



q1

Äx

r1

x

x

merge

A



Äx
p3

p2

q2




copy

r2

merge

x
q3

x





x
r3

Figure 27
...
Letting x D T Œq1  be the median of T Œp1 : : r1  and q2
be the place in T Œp2 : : r2  such that x would fall between T Œq2 1 and T Œq2 , every element in
subarrays T Œp1 : : q1 1 and T Œp2 : : q2 1 (lightly shaded) is less than or equal to x, and every
element in the subarrays T Œq1 C 1 : : r1  and T Œq2 C 1 : : r2  (heavily shaded) is at least x
...


which is the same as the serial running time of merge sort
...
n/ D MS01
...
n/
D ‚
...
n/=MS01
...
lg n/,
which is an unimpressive amount of parallelism
...

You probably have already figured out where the parallelism bottleneck is in
this multithreaded merge sort: the serial M ERGE procedure
...

Our divide-and-conquer strategy for multithreaded merging, which is illustrated in Figure 27
...
Suppose that we
are merging the two sorted subarrays T Œp1 : : r1  of length n1 D r1 p1 C 1
and T Œp2 : : r2  of length n2 D r2 p2 C 1 into another subarray AŒp3 : : r3 , of
length n3 D r3 p3 C 1 D n1 C n2
...

We first find the middle element x D T Œq1  of the subarray T Œp1 : : r1 ,
where q1 D b
...
Because the subarray is sorted, x is a median
of T Œp1 : : r1 : every element in T Œp1 : : q1 1 is no more than x, and every element in T Œq1 C 1 : : r1  is no less than x
...
3 Multithreaded merge sort

799

index q2 in the subarray T Œp2 : : r2  so that the subarray would still be sorted if we
inserted x between T Œq2 1 and T Œq2 
...
Set q3 D p3 C
...
q2

p2 /
...
Copy x into AŒq3 
...
Recursively merge T Œp1 : : q1
the subarray AŒp3 : : q3 1
...
Recursively merge T Œq1 C 1 : : r1  with T Œq2 : : r2 , and place the result into the
subarray AŒq3 C 1 : : r3 
...
Thus, their sum is the number of elements that end up before x in
the subarray AŒp3 : : r3 
...
Since we have assumed that the subarray T Œp1 : : r1  is at least as long as T Œp2 : : r2 , that is, n1 n2 , we can check
for the base case by just checking whether n1 D 0
...

Now, let’s put these ideas into pseudocode
...
The procedure B INARY-S EARCH
...

If x Ä T Œp, and hence less than or equal to all the elements of T Œp : : r, then
it returns the index p
...

Here is the pseudocode:
B INARY-S EARCH
...
p; r C 1/
3 while low < high
4
mid D b
...
x; T; p; r/ takes ‚
...
(See Exercise 2
...
) Since B INARY-S EARCH is a serial procedure, its worst-case work and
span are both ‚
...

We are now prepared to write pseudocode for the multithreaded merging procedure itself
...
Unlike M ERGE, however, P-M ERGE does not assume that the two subarrays to
be merged are adjacent within the array
...
) Another difference between M ERGE and P-M ERGE is that
P-M ERGE takes as an argument an output subarray A into which the merged values should be stored
...
T; p1 ; r1 ; p2 ; r2 ; A; p3 / merges the sorted
subarrays T Œp1 : : r1  and T Œp2 : : r2  into the subarray AŒp3 : : r3 , where r3 D
p3 C
...
r2 p2 C 1/ 1 D p3 C
...
r2 p2 / C 1 and
is not provided as an input
...
T; p1 ; r1 ; p2 ; r2 ; A; p3 /
1 n1 D r 1 p 1 C 1
2 n2 D r 2 p 2 C 1
/ ensure that n1 n2
/
3 if n1 < n2
4
exchange p1 with p2
5
exchange r1 with r2
6
exchange n1 with n2
/ both empty?
/
7 if n1 == 0
8
return
9 else q1 D b
...
T Œq1 ; T; p2 ; r2 /
11
q3 D p3 C
...
q2 p2 /
12
AŒq3  D T Œq1 
13
spawn P-M ERGE
...
T; q1 C 1; r1 ; q2 ; r2 ; A; q3 C 1/
15
sync
The P-M ERGE procedure works as follows
...
Lines 3–6 enn2
...
Lines 9–15 implement the divide-and-conquer strategy
...
Line 11 com-

27
...

Then, we recurse using nested parallelism
...
The sync statement in line 15
ensures that the subproblems have completed before the procedure returns
...
) There
is some cleverness in the coding to ensure that when the subarray T Œp2 : : r2  is
empty, the code operates correctly
...

Analysis of multithreaded merging
We first derive a recurrence for the span PM 1
...
Because the spawn in line 13 and
the call in line 14 operate logically in parallel, we need examine only the costlier of
the two calls
...
Because lines 3–6 ensure that n2 Ä n1 , it follows that n2 D 2n2 =2 Ä

...
In the worst case, one of the two recursive calls merges
bn1 =2c elements of T Œp1 : : r1  with all n2 elements of T Œp2 : : r2 , and hence the
number of elements involved in the call is
bn1 =2c C n2

Ä
D
Ä
D

n1 =2 C n2 =2 C n2 =2

...
lg n/ cost of the call to B INARY-S EARCH in line 10, we obtain
the following recurrence for the worst-case span:
PM 1
...
3n=4/ C ‚
...
8)

(For the base case, the span is ‚
...
)
This recurrence does not fall under any of the cases of the master theorem, but it
meets the condition of Exercise 4
...
Therefore, the solution to recurrence (27
...
n/ D ‚
...

We now analyze the work PM1
...
n/
...
n/ D
...
Thus, it remains only to show that PM 1
...
n/
...
The binary search in
line 10 costs ‚
...
For the recursive calls, observe that although the recursive
calls in lines 13 and 14 might merge different numbers of elements, together the
two recursive calls merge at most n elements (actually n 1 elements, since T Œq1 
does not participate in either recursive call)
...
We therefore obtain the
recurrence
PM 1
...
˛ n/ C PM 1
...
lg n/ ;

(27
...

We prove that recurrence (27
...
n/ via the substitution
method
...
n/ Ä c1 n c2 lg n for some positive constants c1 and c2
...
n/ Ä
D
D
D
Ä


...
˛ n// C
...
1 ˛/n c2 lg
...
lg n/
c1
...
1 ˛//n c2
...
˛ n/ C lg
...
lg n/
c1 n c2
...
1 ˛/ C lg n/ C ‚
...
c2
...
1 ˛/// ‚
...
lg n C lg
...
lg n/ term
...
Since the work PM 1
...
n/
and O
...
n/ D ‚
...

The parallelism of P-M ERGE is PM 1
...
n/ D ‚
...

Multithreaded merge sort
Now that we have a nicely parallelized multithreaded merging procedure, we can
incorporate it into a multithreaded merge sort
...
In particular, the call P-M ERGE -S ORT
...


27
...
A; p; r; B; s/
1 n D r pC1
2 if n == 1
3
BŒs D AŒp
4 else let T Œ1 : : n be a new array
5
q D b
...
A; p; q; T; 1/
8
P-M ERGE -S ORT
...
T; 1; q 0 ; q 0 C 1; n; B; s/
After line 1 computes the number n of elements in the input subarray AŒp : : r,
lines 2–3 handle the base case when the array has only 1 element
...
In
particular, line 4 allocates a temporary array T with n elements to store the results
of the recursive merge sorting
...
At that point, the spawn and recursive
call are made, followed by the sync in line 9, which forces the procedure to wait
until the spawned procedure is done
...

Analysis of multithreaded merge sort
We start by analyzing the work PMS1
...
Indeed, the work is given by the
recurrence
PMS1
...
n=2/ C PM 1
...
n=2/ C ‚
...
4) for ordinary M ERGE -S ORT
from Section 2
...
1 and has solution PMS1
...
n lg n/ by case 2 of the master
theorem
...
n/
...
n/ D PMS1
...
n/
D PMS1
...
lg2 n/ :

(27
...
8), the master theorem does not apply to recurrence (27
...
6-2 does
...
n/ D ‚
...
lg3 n/
...
Recall that the parallelism of M ERGE -S ORT 0 , which calls the serial M ERGE procedure, is only ‚
...
For P-M ERGE -S ORT, the parallelism is
PMS1
...
n/ D ‚
...
lg3 n/
D ‚
...
A good implementation in
practice would sacrifice some parallelism by coarsening the base case in order to
reduce the constants hidden by the asymptotic notation
...

Exercises
27
...

27
...
3-8
...
Analyze your algorithm
...
3-3
Give an efficient multithreaded algorithm for partitioning an array around a pivot,
as is done by the PARTITION procedure on page 171
...
Make your algorithm as parallel as possible
...

(Hint: You may need an auxiliary array and may need to make more than one pass
over the input elements
...
3-4
Give a multithreaded version of R ECURSIVE -FFT on page 911
...
Analyze your algorithm
...
3-5 ?
Give a multithreaded version of R ANDOMIZED -S ELECT on page 216
...
Analyze your algorithm
...
3-3
...
3-6 ?
Show how to multithread S ELECT from Section 9
...
Make your implementation as
parallel as possible
...


Problems
27-1 Implementing parallel loops using nested parallelism
Consider the following multithreaded algorithm for performing pairwise addition
on n-element arrays AŒ1 : : n and BŒ1 : : n, storing the sums in C Œ1 : : n:
S UM -A RRAYS
...
Rewrite the parallel loop in S UM -A RRAYS using nested parallelism (spawn
and sync) in the manner of M AT-V EC -M AIN -L OOP
...

Consider the following alternative implementation of the parallel loop, which
contains a value grain-size to be specified:
S UM -A RRAYS0
...
A; B; C; k grain-size C 1;
min
...
A; B; C; i; j /
1 for k D i to j
2
C Œk D AŒk C BŒk

806

Chapter 27 Multithreaded Algorithms

b
...
What is the parallelism of this implementation?
c
...

Derive the best value for grain-size to maximize parallelism
...
The P-M ATRIX -M ULTIPLY-R ECURSIVE procedure does have high parallelism, however
...
Most parallel computers
approximately 10003 =102 D 107 , since lg 1000
have far fewer than 10 million processors
...
Describe a recursive multithreaded algorithm that eliminates the need for the
temporary matrix T at the cost of increasing the span to ‚
...
(Hint: Compute C D C C AB following the general strategy of P-M ATRIX -M ULTIPLYR ECURSIVE, but initialize C in parallel and insert a sync in a judiciously chosen location
...
Give and solve recurrences for the work and span of your implementation
...
Analyze the parallelism of your implementation
...
Compare with
the parallelism of P-M ATRIX -M ULTIPLY-R ECURSIVE
...
Parallelize the LU-D ECOMPOSITION procedure on page 821 by giving pseudocode for a multithreaded version of this algorithm
...

b
...

c
...

d
...
13) for inverting a symmetric positive-definite matrix
...

R EDUCE
...
Use nested parallelism to implement a multithreaded algorithm P-R EDUCE,
which performs the same function with ‚
...
lg n/ span
...

A related problem is that of computing a ˝-prefix computation, sometimes
called a ˝-scan, on an array xŒ1 : : n, where ˝ is once again an associative operator
...
The following
serial procedure S CAN performs a ˝-prefix computation:
S CAN
...
For example, changing
the for loop to a parallel for loop would create races, since each iteration of the
loop body depends on the previous iteration
...
x/
1 n D x:length
2 let yŒ1 : : n be a new array
3 P-S CAN -1-AUX
...
x; y; i; j /
1 parallel for l D i to j
2
yŒl D P-R EDUCE
...
Analyze the work, span, and parallelism of P-S CAN -1
...
x/
1 n D x:length
2 let yŒ1 : : n be a new array
3 P-S CAN -2-AUX
...
x; y; i; j /
1 if i == j
2
yŒi D xŒi
3 else k D b
...
x; y; i; k/
5
P-S CAN -2-AUX
...
Argue that P-S CAN -2 is correct, and analyze its work, span, and parallelism
...
On the first pass, we gather the
terms for various contiguous subarrays of x into a temporary array t, and on the
second pass we use the terms in t to compute the final result y
...
x/
1 n D x:length
2 let yŒ1 : : n and tŒ1 : : n be new arrays
3 yŒ1 D xŒ1
4 if n > 1
5
P-S CAN -U P
...
xŒ1; x; t; y; 2; n/
7 return y
P-S CAN -U P
...
i C j /=2c
5
tŒk D spawn P-S CAN -U P
...
x; t; k C 1; j /
7
sync
/ fill in the blank
/
8
return
P-S CAN -D OWN
...
i C j /=2c
; x; t; y; i; k/
5
spawn P-S CAN -D OWN
...

7
sync

/ fill in the blank
/
/ fill in the blank
/

d
...
Argue that with expressions you supplied, P-S CAN -3 is
correct
...
; x; t; y; i; j /
satisfies D xŒ1 ˝ xŒ2 ˝ ˝ xŒi 1
...
Analyze the work, span, and parallelism of P-S CAN -3
...
The pattern of neighboring entries does not change
during the computation and is called a stencil
...
4 presents

810

Chapter 27 Multithreaded Algorithms

a stencil algorithm to compute a longest common subsequence, where the value in
entry cŒi; j  depends only on the values in cŒi 1; j , cŒi; j 1, and cŒi 1; j 1,
as well as the elements xi and yj within the two sequences given as inputs
...

In this problem, we examine how to use nested parallelism to multithread a
simple stencil calculation on an n n array A in which, of the values in A, the
value placed into entry AŒi; j  depends only on values in AŒi 0 ; j 0 , where i 0 Ä i
and j 0 Ä j (and of course, i 0 ¤ i or j 0 ¤ j )
...
Furthermore, we assume throughout
this problem that once we have filled in the entries upon which AŒi; j  depends, we
can fill in AŒi; j  in ‚
...
4)
...
11)
AD
A21 A22
Observe now that we can fill in subarray A11 recursively, since it does not depend
on the entries of the other three subarrays
...
Finally, we can fill in A22 recursively
...
Give multithreaded pseudocode that performs this simple stencil calculation
using a divide-and-conquer algorithm S IMPLE -S TENCIL based on the decomposition (27
...
(Don’t worry about the details of the
base case, which depends on the specific stencil
...
What is the parallelism?
b
...
Analyze this
algorithm
...
Generalize your solutions to parts (a) and (b) as follows
...
Divide an n n array into b 2 subarrays, each of size n=b n=b, recursing
with as much parallelism as possible
...
n/ for any choice of b 2
...
)

Notes for Chapter 27

811

d
...
n= lg n/ parallelism
...
n/ inherent parallelism
...

27-6 Randomized multithreaded algorithms
Just as with ordinary serial algorithms, we sometimes want to implement randomized multithreaded algorithms
...

It also asks you to design and analyze a multithreaded algorithm for randomized
quicksort
...
Explain how to modify the work law (27
...
3), and greedy scheduler bound (27
...

b
...
Argue that the speedup of a randomized multithreaded algorithm should be defined as E ŒT1  =E ŒTP , rather than E ŒT1 =TP 
...
Argue that the parallelism of a randomized multithreaded algorithm should be
defined as the ratio E ŒT1  =E ŒT1 
...
Multithread the R ANDOMIZED -Q UICKSORT algorithm on page 179 by using
nested parallelism
...
) Give the
pseudocode for your P-R ANDOMIZED -Q UICKSORT algorithm
...
Analyze your multithreaded algorithm for randomized quicksort
...
)

Chapter notes
Parallel computers, models for parallel computers, and algorithmic models for parallel programming have been around in various forms for years
...
The data-parallel model [48, 168] is another popular algorithmic programming model, which features operations on vectors and matrices
as primitives
...
1
...
Blelloch
[47] developed an algorithmic programming model based on work and span (which
he called the “depth” of the computation) for data-parallel programming
...
T1 /
...

The multithreaded pseudocode and programming model were heavily influenced
by the Cilk [51, 118] project at MIT and the Cilk++ [71] extensions to C++ distributed by Cilk Arts, Inc
...
E
...
Prokop and have
been implemented in Cilk or Cilk++
...

The notion of sequential consistency is due to Lamport [223]
...
This chapter
focuses on how to multiply matrices and solve sets of simultaneous linear equations
...

Section 28
...
Then, Section 28
...
Finally, Section 28
...

One important issue that arises in practice is numerical stability
...
Although we shall briefly consider numerical stability on occasion, we do
not focus on it in this chapter
...


28
...
We
can formulate a linear system as a matrix equation in which each matrix or vector
element belongs to a field, typically the real numbers R
...

We start with a set of linear equations in n unknowns x1 ; x2 ; : : : ; xn :

814

Chapter 28 Matrix Operations

a11 x1 C a12 x2 C

C a1n xn D b1 ;

a21 x1 C a22 x2 C

C a2n xn D b2 ;
:
:
:

an1 x1 C an2 x2 C

(28
...
1) is a set of values for x1 ; x2 ; : : : ; xn that satisfy
all of the equations simultaneously
...

We can conveniently rewrite equations (28
...
aij /, x D
...
bi /, as
Ax D b :

(28
...
3)

is the solution vector
...
2)
as follows
...
A 1 A/x
A 1
...
Ax 0 /

...
1), the rank of A is equal to the
number n of unknowns
...
If the number of equations is less than the number n of unknowns—or,
more generally, if the rank of A is less than n—then the system is underdetermined
...
If the
number of equations exceeds the number n of unknowns, the system is overdetermined, and there may not exist any solutions
...
3 addresses the important

28
...

Let us return to our problem of solving the system Ax D b of n equations in n
unknowns
...
3), multiply b
by A 1 , yielding x D A 1 b
...
Fortunately, another approach—LUP decomposition—is numerically
stable and has the further advantage of being faster in practice
...
4)

where
L is a unit lower-triangular matrix,
U is an upper-triangular matrix, and
P is a permutation matrix
...
4) an LUP decomposition
of the matrix A
...

Computing an LUP decomposition for the matrix A has the advantage that we
can more easily solve linear systems when they are triangular, as is the case for
both matrices L and U
...
2), Ax D b, by solving only triangular linear systems, as
follows
...
1-4, amounts to permuting the equations (28
...

Using our decomposition (28
...
Let us
define y D Ux, where x is the desired solution vector
...
5)

for the unknown vector y by a method called “forward substitution
...
6)

816

Chapter 28 Matrix Operations

for the unknown x by a method called “back substitution
...
2-3), multiplying both sides of equation (28
...
7)

Hence, the vector x is our solution to Ax D b:
P 1 LUx (by equation (28
...
6))
P 1 Ly
1
(by equation (28
...

Forward and back substitution
Forward substitution can solve the lower-triangular system (28
...
n2 / time,
given L, P , and b
...
For i D 1; 2; : : : ; n, the entry Œi indicates that Pi; Œi  D 1
and Pij D 0 for j ¤ Œi
...
Since L is unit lower-triangular, we can rewrite equation (28
...
Knowing the value of y1 , we can
substitute it into the second equation, yielding
y2 D b

Œ2

l21 y1 :

Now, we can substitute both y1 and y2 into the third equation, obtaining
y3 D b

Œ3


...
1 Solving systems of linear equations

yi D b

i 1
X
Œi 

817

lij yj :

j D1

Having solved for y, we solve for x in equation (28
...
Here, we solve the nth equation first and
work backward to the first equation
...
n2 / time
...
6) as
u11 x1 C u12 x2 C

C

u1;n 2 xn

2

C

u1;n 1 xn

1

C

u1n xn D y1 ;

u22 x2 C

C

u2;n 2 xn

2

C

u2;n 1 xn

1

C

u2n xn D y2 ;
:
:
:

2;n 2 xn 2

C un

2;n 1 xn 1

C un

2;n xn

D yn

2

;

un

un

1;n 1 xn 1

C un

1;n xn

D yn

1

;

un;n xn D yn :
Thus, we can solve for xn ; xn 1 ; : : : ; x1 successively as follows:
xn D yn =un;n ;
xn 1 D
...
yn 2
...
The pseudocode assumes that the dimension n appears in the attribute L:rows and that the permutation matrix P is represented by
the array
...
L; U; ; b/
1 n D L:rows
2 let x be a new vector of length n
3 for i D 1 to n
Pi 1
4
yi D b Œi 
j D1 lij yj
5 for i D n downto 1
Pn
6
xi D y i
j Di C1 uij xj =ui i
7 return x

818

Chapter 28 Matrix Operations

Procedure LUP-S OLVE solves for y using forward substitution in lines 3–4, and
then it solves for x using backward substitution in lines 5–6
...
n2 /
...
The LUP decomposition is
L D

1
0 0
0:2
1 0
0:6 0:5 1

U

D

5
6
0 0:8
0
0

P

D

0 0 1
1 0 0
0 1 0

;

3
0:6
2:5

;

:

(You might want to verify that PA D LU
...
Using back substitution, we solve
Ux D y for x:

28
...

Computing an LU decomposition
We have now shown that if we can create an LUP decomposition for a nonsingular
matrix A, then forward and back substitution can solve the system Ax D b of
linear equations
...
We start with the case in which A is an n n nonsingular matrix and P is
absent (or, equivalently, P D In )
...
We call the
two matrices L and U an LU decomposition of A
...
We start by subtracting multiples of the first equation from the other equations
in order to remove the first variable from those equations
...
We continue this process
until the system that remains has an upper-triangular form—in fact, it is the matrix U
...

Our algorithm to implement this strategy is recursive
...
If n D 1, then we are done,
since we can choose L D I1 and U D A
...
n 1/-vector, w T is a row
...
n 1/
...
Then, using matrix algebra (verify the equations by

820

Chapter 28 Matrix Operations

simply multiplying through), we can factor A as
Ã
Â
a11 w T
A D
A0
ÃÂ
Ã
Â
wT
a11
1
0
:
D
0 A0
w T =a11
=a11 In 1

(28
...
8) are row and column
...
The term w T =a11 , formed by taking the
outer product of and w and dividing each element of the result by a11 , is an

...
n 1/ matrix, which conforms in size to the matrix A0 from which it is
subtracted
...
n 1/
...
9)

is called the Schur complement of A with respect to a11
...
Why? Suppose that the Schur complement, which is
...
n 1/, is
singular
...
1, it has row rank strictly less than n 1
...
The row rank of the entire matrix, therefore, is strictly less than n
...
2-8 to equation (28
...
1 we derive the contradiction that A is singular
...
Let us say that
A0

w T=a11 D L0 U 0 ;

where L0 is unit lower-triangular and U 0 is upper-triangular
...
(Note that because L0 is unit lowertriangular, so is L, and because U 0 is upper-triangular, so is U
...
1 Solving systems of linear equations

821

Of course, if a11 D 0, this method doesn’t work, because it divides by 0
...
The elements by
which we divide during LU decomposition are called pivots, and they occupy the
diagonal elements of the matrix U
...
When we use
permutations to avoid division by 0 (or by small numbers, which would contribute
to numerical instability), we are pivoting
...
Such matrices require
no pivoting, and thus we can employ the recursive strategy outlined above without fear of dividing by 0
...
3
...
(This transformation is a standard
optimization for a “tail-recursive” procedure—one whose last operation is a recursive call to itself
...
) It assumes that the attribute A:rows gives
the dimension of A
...

LU-D ECOMPOSITION
...
Within
this loop, line 6 determines the pivot to be ukk D akk
...
Line 8 determines the elements of the vector, storing i in li k , and line 9
computes the elements of the w T vector, storing wiT in uki
...
1 The operation of LU-D ECOMPOSITION
...
(b) The element a11 D 2
in the black circle is the pivot, the shaded column is =a11 , and the shaded row is w T
...
(c) We now
vertical line
...
The element a22 D 4 in the black
circle is the pivot, and the shaded column and row are =a22 and w T (in the partitioning of the Schur
complement), respectively
...
(d) After the
next step, the matrix A is factored
...
) (e) The factorization A D LU
...
(We don’t need to divide by akk in line 12 because we already did so when
we computed li k in line 8
...
n3 /
...
1 illustrates the operation of LU-D ECOMPOSITION
...
That is, we can set up a correspondence between
each element aij and either lij (if i > j ) or uij (if i Ä j ) and update the matrix A so that it holds both L and U when the procedure terminates
...

Computing an LUP decomposition
Generally, in solving a system of linear equations Ax D b, we must pivot on offdiagonal elements of A to avoid dividing by 0
...
But we also want to avoid dividing by a small value—even if A is

28
...
We therefore try to pivot
on a large value
...
Recall that we are given an $n \times n$ nonsingular matrix $A$, and we wish to find a permutation matrix $P$, a unit lower-triangular matrix $L$, and an upper-triangular matrix $U$ such that $PA = LU$. Before we partition the matrix $A$, as we did for LU decomposition, we move a nonzero element, say $a_{k1}$, from somewhere in the first column to the $(1,1)$ position of the matrix. (The first column cannot contain only 0s, for then $A$ would be singular, because its determinant would be 0, by Theorems D.4 and D.5.) In order to preserve the set of equations, we exchange row 1 with row $k$, which is equivalent to multiplying $A$ by a permutation matrix $Q$ on the left (Exercise D.1-4). Thus, we can write $QA$ as

$$QA = \begin{pmatrix} a_{k1} & w^T \\ v & A' \end{pmatrix} ,$$

where $v = (a_{21}, a_{31}, \ldots, a_{n1})^T$, except that $a_{11}$ replaces $a_{k1}$; $w^T = (a_{k2}, a_{k3}, \ldots, a_{kn})$; and $A'$ is an $(n-1) \times (n-1)$ matrix. Since $a_{k1} \ne 0$, we can now perform much the same linear algebra as for LU decomposition, but now guaranteeing that we do not divide by 0:

$$QA = \begin{pmatrix} a_{k1} & w^T \\ v & A' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ v/a_{k1} & I_{n-1} \end{pmatrix} \begin{pmatrix} a_{k1} & w^T \\ 0 & A' - vw^T/a_{k1} \end{pmatrix} .$$

As we saw for LU decomposition, if $A$ is nonsingular, then the Schur complement $A' - vw^T/a_{k1}$ is nonsingular, too. Therefore, we can recursively find an LUP decomposition for it, with unit lower-triangular matrix $L'$, upper-triangular matrix $U'$, and permutation matrix $P'$, such that

$$P'(A' - vw^T/a_{k1}) = L'U' .$$

Define

$$P = \begin{pmatrix} 1 & 0 \\ 0 & P' \end{pmatrix} Q ,$$

which is a permutation matrix, since it is the product of two permutation matrices (Exercise D.1-5).
We now have

$$\begin{aligned}
PA &= \begin{pmatrix} 1 & 0 \\ 0 & P' \end{pmatrix} QA \\
&= \begin{pmatrix} 1 & 0 \\ 0 & P' \end{pmatrix} \begin{pmatrix} 1 & 0 \\ v/a_{k1} & I_{n-1} \end{pmatrix} \begin{pmatrix} a_{k1} & w^T \\ 0 & A' - vw^T/a_{k1} \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ P'v/a_{k1} & P' \end{pmatrix} \begin{pmatrix} a_{k1} & w^T \\ 0 & A' - vw^T/a_{k1} \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ P'v/a_{k1} & I_{n-1} \end{pmatrix} \begin{pmatrix} a_{k1} & w^T \\ 0 & P'(A' - vw^T/a_{k1}) \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ P'v/a_{k1} & I_{n-1} \end{pmatrix} \begin{pmatrix} a_{k1} & w^T \\ 0 & L'U' \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ P'v/a_{k1} & L' \end{pmatrix} \begin{pmatrix} a_{k1} & w^T \\ 0 & U' \end{pmatrix} \\
&= LU .
\end{aligned}$$

Because $L'$ is unit lower-triangular, so is $L$, and because $U'$ is upper-triangular, so is $U$.
Here is the pseudocode for LUP decomposition:

LUP-DECOMPOSITION(A)
 1  n = A.rows
 2  let pi[1..n] be a new array
 3  for i = 1 to n
 4      pi[i] = i
 5  for k = 1 to n
 6      p = 0
 7      for i = k to n
 8          if |a_ik| > p
 9              p = |a_ik|
10              k' = i
11      if p == 0
12          error "singular matrix"
13      exchange pi[k] with pi[k']
14      for i = 1 to n
15          exchange a_ki with a_k'i
16      for i = k+1 to n
17          a_ik = a_ik / a_kk
18          for j = k+1 to n
19              a_ij = a_ij - a_ik a_kj

Like LU-DECOMPOSITION, our LUP-DECOMPOSITION procedure replaces the recursion with an iteration loop. As an improvement over a direct implementation of the recursion, we dynamically maintain the permutation matrix $P$ as an array $\pi$, where $\pi[i] = j$ means that the $i$th row of $P$ contains a 1 in column $j$. We also implement the code to compute $L$ and $U$ "in place" in the matrix $A$. Thus, when the procedure terminates, $a_{ij} = l_{ij}$ if $i > j$, and $a_{ij} = u_{ij}$ if $i \le j$.

Figure 28.2 illustrates how LUP-DECOMPOSITION factors a matrix. Lines 2-4 initialize the array $\pi$ to represent the identity permutation. The outer for loop beginning in line 5 implements the recursion. Each time through the outer loop, lines 6-10 determine the element $a_{k'k}$ with largest absolute value of those in the current first column (column $k$) of the $(n-k+1) \times (n-k+1)$ matrix whose LUP decomposition we must find. If all elements in the current first column are zero, lines 11-12 report that the matrix is singular. To pivot, we exchange $\pi[k']$ with $\pi[k]$ in line 13 and exchange the $k$th and $k'$th rows of $A$ in lines 14-15, thereby making the pivot element $a_{kk}$. (The entire rows are swapped because in the derivation of the method above, not only is $A' - vw^T/a_{k1}$ multiplied by $P'$, but so is $v/a_{k1}$.) Finally, lines 16-19 compute the Schur complement in much the same way as lines 7-12 of LU-DECOMPOSITION, except that here the operations are written to work in place.

Because of its triply nested loop structure, LUP-DECOMPOSITION has a running time of $\Theta(n^3)$, which is the same as that of LU-DECOMPOSITION. Thus, pivoting costs us at most a constant factor in time.
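The same in-place strategy with partial pivoting is a small change to the earlier Python sketch. This is again an illustrative implementation, not the text's pseudocode verbatim; it records the permutation as the array pi described above.

import numpy as np

def lup_decompose_in_place(A):
    # In-place LUP decomposition: returns pi, where pi[i] = j means that
    # row i of P has a 1 in column j; L and U are stored in A as in the text.
    n = A.shape[0]
    pi = np.arange(n)
    for k in range(n):
        k_prime = k + np.argmax(np.abs(A[k:, k]))    # pivot: largest |a_ik|
        if A[k_prime, k] == 0:
            raise ValueError("singular matrix")
        pi[[k, k_prime]] = pi[[k_prime, k]]          # record the row exchange
        A[[k, k_prime], :] = A[[k_prime, k], :]      # swap the entire rows
        A[k+1:, k] /= A[k, k]
        A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])
    return pi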
Exercises

28.1-1
Solve the equation

$$\begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ -6 & 5 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 14 \\ -7 \end{pmatrix}$$

by using forward substitution.

28.1-2
Find an LU decomposition of the matrix

$$\begin{pmatrix} 4 & -5 & 6 \\ 8 & -6 & 7 \\ 12 & -7 & 12 \end{pmatrix} .$$

[Figure 28.2 The operation of LUP-DECOMPOSITION. (a) The input matrix A with the identity permutation of the rows on the left. The first step of the algorithm determines that the element 5 in the black circle in the third row is the pivot for the first column. (b) Rows 1 and 3 are swapped and the permutation is updated; the shaded column and row represent v and w^T. (c) The vector v is replaced by v/5, and the lower right of the matrix is updated with the Schur complement; lines divide the matrix into the elements of U (above), the elements of L (left), and the elements of the Schur complement (lower right). (d)-(f) The second step. (g)-(i) The third step. No further changes occur on the fourth (and final) step. (j) The LUP decomposition PA = LU.]
28.1-3
Solve the equation

$$\begin{pmatrix} 1 & 5 & 4 \\ 2 & 0 & 3 \\ 5 & 8 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 12 \\ 9 \\ 5 \end{pmatrix}$$

by using an LUP decomposition.

28.1-4
Describe the LUP decomposition of a diagonal matrix.

28.1-5
Describe the LUP decomposition of a permutation matrix $A$, and prove that it is unique.

28.1-6
Show that for all $n \ge 1$, there exists a singular $n \times n$ matrix that has an LU decomposition.

28.1-7
In LU-DECOMPOSITION, is it necessary to perform the outermost for loop iteration when $k = n$? How about in LUP-DECOMPOSITION?

28.2 Inverting matrices

Although in practice we do not generally use matrix inverses to solve systems of linear equations, preferring instead the more numerically stable technique of LUP decomposition, sometimes we need to compute a matrix inverse. In this section, we show how to use LUP decomposition to compute a matrix inverse. We also prove that matrix multiplication and computing the inverse of a matrix are equivalently hard problems, in that (subject to technical conditions) we can use an algorithm for one to solve the other in the same asymptotic running time. Thus, we can use Strassen's algorithm (see Section 4.2) for matrix multiplication to invert a matrix. Indeed, Strassen's original paper was motivated by the problem of showing that a set of linear equations could be solved more quickly than by the usual method.

Computing a matrix inverse from an LUP decomposition

Suppose that we have an LUP decomposition of a matrix $A$ in the form of three matrices $L$, $U$, and $P$ such that $PA = LU$. Using LUP-SOLVE, we can solve an equation of the form $Ax = b$ in time $\Theta(n^2)$. Since the LUP decomposition depends on $A$ but not $b$, we can run LUP-SOLVE on a second set of equations of the form $Ax = b'$ in additional time $\Theta(n^2)$. In general, once we have the LUP decomposition of $A$, we can solve, in time $\Theta(kn^2)$, $k$ versions of the equation $Ax = b$ that differ only in $b$.

We can think of the equation

$$AX = I_n , \qquad (28.10)$$

which defines the matrix $X$, the inverse of $A$, as a set of $n$ distinct equations of the form $Ax = b$. To be precise, let $X_i$ denote the $i$th column of $X$, and recall that the unit vector $e_i$ is the $i$th column of $I_n$. We can then solve equation (28.10) for $X$ by using the LUP decomposition for $A$ to solve each equation

$$AX_i = e_i$$

separately for $X_i$. Once we have the LUP decomposition, we can solve each of the $n$ equations in time $\Theta(n^2)$, and so we can compute $X$ from the LUP decomposition of $A$ in time $\Theta(n^3)$. Since we can determine the LUP decomposition of $A$ in time $\Theta(n^3)$, we can compute the inverse $A^{-1}$ of a matrix $A$ in time $\Theta(n^3)$.
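The column-by-column idea is short in code. Here is a sketch that assumes SciPy's lu_factor and lu_solve as the factor-and-solve pair (any equivalent LUP routines would do); the function name is ours.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def invert_via_lup(A):
    # Compute A^{-1} by solving A X_i = e_i for each column, reusing a single
    # LUP decomposition: Theta(n^3) once, then Theta(n^2) per column.
    n = A.shape[0]
    lu_piv = lu_factor(A)          # one LUP decomposition of A
    X = np.empty((n, n))
    for i in range(n):
        e_i = np.zeros(n)
        e_i[i] = 1.0               # the i-th column of I_n
        X[:, i] = lu_solve(lu_piv, e_i)
    return X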
Matrix multiplication and matrix inversion

We now show that the theoretical speedups obtained for matrix multiplication translate to speedups for matrix inversion. In fact, we prove something stronger: matrix inversion is equivalent to matrix multiplication, in the following sense. If $M(n)$ denotes the time to multiply two $n \times n$ matrices, then we can invert a nonsingular $n \times n$ matrix in time $O(M(n))$. Moreover, if $I(n)$ denotes the time to invert a nonsingular $n \times n$ matrix, then we can multiply two $n \times n$ matrices in time $O(I(n))$. We prove these results as two separate theorems.

Theorem 28.1 (Multiplication is no harder than inversion)
If we can invert an $n \times n$ matrix in time $I(n)$, where $I(n) = \Omega(n^2)$ and $I(n)$ satisfies the regularity condition $I(3n) = O(I(n))$, then we can multiply two $n \times n$ matrices in time $O(I(n))$.

Proof  Let $A$ and $B$ be $n \times n$ matrices whose matrix product $C$ we wish to compute. We define the $3n \times 3n$ matrix $D$ by

$$D = \begin{pmatrix} I_n & A & 0 \\ 0 & I_n & B \\ 0 & 0 & I_n \end{pmatrix} .$$

The inverse of $D$ is

$$D^{-1} = \begin{pmatrix} I_n & -A & AB \\ 0 & I_n & -B \\ 0 & 0 & I_n \end{pmatrix} ,$$

and thus we can compute the product $AB$ by taking the upper right $n \times n$ submatrix of $D^{-1}$.

We can construct matrix $D$ in $\Theta(n^2)$ time, which is $O(I(n))$ because we assume that $I(n) = \Omega(n^2)$, and we can invert $D$ in $O(I(3n)) = O(I(n))$ time, by the regularity condition on $I(n)$. We thus have $M(n) = O(I(n))$.

Note that $I(n)$ satisfies the regularity condition whenever $I(n) = \Theta(n^c \lg^d n)$ for any constants $c > 0$ and $d \ge 0$.

The proof that matrix inversion is no harder than matrix multiplication relies on some properties of symmetric positive-definite matrices that we shall prove in Section 28.3.

Theorem 28.2 (Inversion is no harder than multiplication)
Suppose we can multiply two $n \times n$ real matrices in time $M(n)$, where $M(n) = \Omega(n^2)$ and $M(n)$ satisfies the two regularity conditions $M(n+k) = O(M(n))$ for any $k$ in the range $0 \le k \le n$, and $M(n/2) \le cM(n)$ for some constant $c < 1/2$. Then we can compute the inverse of any real nonsingular $n \times n$ matrix in time $O(M(n))$.

Proof  We can assume that $n$ is an exact power of 2, since we have

$$\begin{pmatrix} A & 0 \\ 0 & I_k \end{pmatrix}^{-1} = \begin{pmatrix} A^{-1} & 0 \\ 0 & I_k \end{pmatrix}$$

for any $k > 0$. Thus, by choosing $k$ such that $n+k$ is a power of 2, we enlarge the matrix to a size that is the next power of 2 and obtain the desired answer $A^{-1}$ from the answer to the enlarged problem. The first regularity condition on $M(n)$ ensures that this enlargement causes the running time to increase by at most a constant factor.
For the moment, let us assume that the $n \times n$ matrix $A$ is symmetric and positive-definite. We partition each of the matrices $A$ and its inverse $A^{-1}$ into four $n/2 \times n/2$ submatrices:

$$A = \begin{pmatrix} B & C^T \\ C & D \end{pmatrix} \quad\text{and}\quad A^{-1} = \begin{pmatrix} R & T \\ U & V \end{pmatrix} . \qquad (28.11)$$

Then, if we let

$$S = D - CB^{-1}C^T \qquad (28.12)$$

be the Schur complement of $A$ with respect to $B$ (we shall see more about this form of Schur complement in Section 28.3), we have

$$A^{-1} = \begin{pmatrix} R & T \\ U & V \end{pmatrix} = \begin{pmatrix} B^{-1} + B^{-1}C^TS^{-1}CB^{-1} & -B^{-1}C^TS^{-1} \\ -S^{-1}CB^{-1} & S^{-1} \end{pmatrix} , \qquad (28.13)$$

since $AA^{-1} = I_n$, as you can verify by performing the matrix multiplication. Because $A$ is symmetric and positive-definite, Lemmas 28.4 and 28.5 in Section 28.3 imply that $B$ and $S$ are both symmetric and positive-definite. By Lemma 28.3 in Section 28.3, therefore, the inverses $B^{-1}$ and $S^{-1}$ exist, and by Exercise D.2-6, $(B^{-1})^T = B^{-1}$ and $(S^{-1})^T = S^{-1}$. Therefore, we can compute the submatrices $R$, $T$, $U$, and $V$ of $A^{-1}$ as follows, where all matrices mentioned are $n/2 \times n/2$:

1. Form the submatrices $B$, $C$, $C^T$, and $D$ of $A$.

2. Recursively compute the inverse $B^{-1}$ of $B$.

3. Compute the matrix product $W = CB^{-1}$, and then compute its transpose $W^T$, which equals $B^{-1}C^T$ (by Exercise D.1-2 and $(B^{-1})^T = B^{-1}$).

4. Compute the matrix product $X = WC^T$, which equals $CB^{-1}C^T$, and then compute the matrix $S = D - X = D - CB^{-1}C^T$.

5. Recursively compute the inverse $S^{-1}$ of $S$; set $V$ to $S^{-1}$.

6. Compute the matrix product $Y = S^{-1}W$, which equals $S^{-1}CB^{-1}$; set $U$ to $-Y$ and set $T$ to $-Y^T$, which equals $-B^{-1}C^TS^{-1}$ (by Exercise D.1-2 and $(S^{-1})^T = S^{-1}$).

7. Compute the matrix product $Z = W^TY$, which equals $B^{-1}C^TS^{-1}CB^{-1}$; set $R$ to $B^{-1} + Z$.

Thus, we can invert an $n \times n$ symmetric positive-definite matrix by inverting two $n/2 \times n/2$ matrices in steps 2 and 5, performing four multiplications of $n/2 \times n/2$ matrices in steps 3, 4, 6, and 7, plus an additional cost of $O(n^2)$ for extracting submatrices from $A$ and performing a constant number of additions and subtractions on $n/2 \times n/2$ matrices. We get the recurrence

$$I(n) \le 2I(n/2) + 4M(n/2) + O(n^2) = 2I(n/2) + O(M(n)) = O(M(n)) .$$

The second line holds because the second regularity condition in the statement of the theorem implies that $4M(n/2) < 2M(n)$ and because we assume that $M(n) = \Omega(n^2)$. The third line holds because the second regularity condition allows us to apply case 3 of the master theorem (Theorem 4.1).

It remains to prove that we can achieve the same asymptotic bound when $A$ is invertible but not symmetric and positive-definite.
The basic idea is that for any nonsingular matrix $A$, the matrix $A^TA$ is symmetric (by Exercise D.2-6) and positive-definite (by Theorem D.6). The trick, then, is to reduce the problem of inverting $A$ to the problem of inverting $A^TA$.

The reduction is based on the observation that when $A$ is an $n \times n$ nonsingular matrix, we have

$$A^{-1} = (A^TA)^{-1}A^T ,$$

since $((A^TA)^{-1}A^T)A = (A^TA)^{-1}(A^TA) = I_n$ and a matrix inverse is unique. Therefore, we can compute $A^{-1}$ by first multiplying $A^T$ by $A$ to obtain $A^TA$, then inverting the symmetric positive-definite matrix $A^TA$ using the above divide-and-conquer algorithm, and finally multiplying the result by $A^T$. Each of these three steps takes $O(M(n))$ time, and thus we can invert any nonsingular matrix with real entries in $O(M(n))$ time.
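The seven steps of the proof translate almost line for line into code. The following Python sketch uses the proof's names W, Y, Z and ordinary @ multiplication in place of a fast multiplication routine; the function names are ours. The recursion works for any size, since every leading principal submatrix of a symmetric positive-definite matrix is itself symmetric positive-definite, though the running-time analysis above assumes n is a power of 2.

import numpy as np

def invert_spd(A):
    # Recursive inversion of a symmetric positive-definite matrix,
    # following steps 1-7 of the proof of Theorem 28.2.
    n = A.shape[0]
    if n == 1:
        return np.array([[1.0 / A[0, 0]]])
    m = n // 2
    B, CT = A[:m, :m], A[:m, m:]          # step 1: A = [[B, C^T], [C, D]]
    C, D = A[m:, :m], A[m:, m:]
    Binv = invert_spd(B)                  # step 2
    W = C @ Binv                          # step 3: W = C B^{-1}
    S = D - W @ CT                        # step 4: Schur complement
    Sinv = invert_spd(S)                  # step 5: V = S^{-1}
    Y = Sinv @ W                          # step 6: U = -Y, T = -Y^T
    R = Binv + W.T @ Y                    # step 7: R = B^{-1} + W^T Y
    return np.block([[R, -Y.T], [-Y, Sinv]])

def invert_nonsingular(A):
    # The reduction just described: A^{-1} = (A^T A)^{-1} A^T.
    return invert_spd(A.T @ A) @ A.T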
The proof of Theorem 28.2 suggests a means of solving the equation $Ax = b$ by using LU decomposition without pivoting, so long as $A$ is nonsingular. We multiply both sides of the equation by $A^T$, yielding $(A^TA)x = A^Tb$. This transformation doesn't affect the solution $x$, since $A^T$ is invertible, and so we can factor the symmetric positive-definite matrix $A^TA$ by computing an LU decomposition. We then use forward and back substitution to solve for $x$ with the right-hand side $A^Tb$. Although this method is theoretically correct, in practice the procedure LUP-DECOMPOSITION works much better. LUP decomposition requires fewer arithmetic operations by a constant factor, and it has somewhat better numerical properties.
Exercises

28.2-1
Let $M(n)$ be the time to multiply two $n \times n$ matrices, and let $S(n)$ denote the time required to square an $n \times n$ matrix. Show that multiplying and squaring matrices have essentially the same difficulty: an $M(n)$-time matrix-multiplication algorithm implies an $O(M(n))$-time squaring algorithm, and an $S(n)$-time squaring algorithm implies an $O(S(n))$-time matrix-multiplication algorithm.

28.2-2
Let $M(n)$ be the time to multiply two $n \times n$ matrices, and let $L(n)$ be the time to compute the LUP decomposition of an $n \times n$ matrix. Show that multiplying matrices and computing LUP decompositions of matrices have essentially the same difficulty: an $M(n)$-time matrix-multiplication algorithm implies an $O(M(n))$-time LUP-decomposition algorithm, and an $L(n)$-time LUP-decomposition algorithm implies an $O(L(n))$-time matrix-multiplication algorithm.

28.2-3
Let $M(n)$ be the time to multiply two $n \times n$ matrices, and let $D(n)$ denote the time required to find the determinant of an $n \times n$ matrix. Show that multiplying matrices and computing the determinant have essentially the same difficulty: an $M(n)$-time matrix-multiplication algorithm implies an $O(M(n))$-time determinant algorithm, and a $D(n)$-time determinant algorithm implies an $O(D(n))$-time matrix-multiplication algorithm.

28.2-4
Let $M(n)$ be the time to multiply two $n \times n$ boolean matrices, and let $T(n)$ be the time to find the transitive closure of $n$-vertex directed graphs. (See Section 25.2.) Show that an $M(n)$-time boolean matrix-multiplication algorithm implies an $O(M(n)\lg n)$-time transitive-closure algorithm, and a $T(n)$-time transitive-closure algorithm implies an $O(T(n))$-time boolean matrix-multiplication algorithm.

28.2-5
Does the matrix-inversion algorithm based on Theorem 28.2 work when matrix elements are drawn from the field of integers modulo 2? Explain.

28.2-6
Generalize the matrix-inversion algorithm of Theorem 28.2 to handle matrices of complex numbers, and prove that your generalization works correctly. (Hint: Instead of the transpose of $A$, use the conjugate transpose $A^*$, which you obtain from the transpose of $A$ by replacing every entry with its complex conjugate. Instead of symmetric matrices, consider Hermitian matrices, which are matrices $A$ such that $A = A^*$.)
28.3 Symmetric positive-definite matrices and least-squares approximation

Symmetric positive-definite matrices have many interesting and desirable properties. For example, they are nonsingular, and we can perform LU decomposition on them without having to worry about dividing by 0. In this section, we shall prove several other important properties of symmetric positive-definite matrices and show an interesting application to curve fitting by a least-squares approximation.

The first property we prove is perhaps the most basic.

Lemma 28.3
Any positive-definite matrix is nonsingular.

Proof  Suppose that a matrix $A$ is singular. Then by Corollary D.3, there exists a nonzero vector $x$ such that $Ax = 0$. Hence, $x^TAx = 0$, and $A$ cannot be positive-definite.

The proof that we can perform LU decomposition on a symmetric positive-definite matrix $A$ without dividing by 0 is more involved. We begin by proving properties about certain submatrices of $A$. Define the $k$th leading submatrix of $A$ to be the matrix $A_k$ consisting of the intersection of the first $k$ rows and first $k$ columns of $A$.

Lemma 28.4
If $A$ is a symmetric positive-definite matrix, then every leading submatrix of $A$ is symmetric and positive-definite.

Proof  That each leading submatrix $A_k$ is symmetric is obvious. To prove that $A_k$ is positive-definite, we assume that it is not and derive a contradiction. If $A_k$ is not positive-definite, then there exists a $k$-vector $x_k \ne 0$ such that $x_k^TA_kx_k \le 0$. Let $A$ be $n \times n$, and let us partition it as

$$A = \begin{pmatrix} A_k & B^T \\ B & C \end{pmatrix} \qquad (28.14)$$

for submatrices $B$ (which is $(n-k) \times k$) and $C$ (which is $(n-k) \times (n-k)$). Define the $n$-vector $x = (x_k^T \;\; 0)^T$, where $n-k$ 0s follow $x_k$. Then we have

$$x^TAx = (x_k^T \;\; 0) \begin{pmatrix} A_k & B^T \\ B & C \end{pmatrix} \begin{pmatrix} x_k \\ 0 \end{pmatrix} = (x_k^T \;\; 0) \begin{pmatrix} A_kx_k \\ Bx_k \end{pmatrix} = x_k^TA_kx_k \le 0 ,$$

which contradicts $A$ being positive-definite.

We now turn to some essential properties of the Schur complement. Let $A$ be a symmetric positive-definite matrix, and let $A_k$ be a leading $k \times k$ submatrix of $A$. Partition $A$ once again according to equation (28.14). We generalize equation (28.9) to define the Schur complement $S$ of $A$ with respect to $A_k$ as

$$S = C - BA_k^{-1}B^T . \qquad (28.15)$$

(By Lemma 28.4, $A_k$ is symmetric and positive-definite; therefore, $A_k^{-1}$ exists by Lemma 28.3, and $S$ is well defined.) Note that our earlier definition (28.9) of the Schur complement is consistent with equation (28.15), by letting $k = 1$.

The next lemma shows that the Schur-complement matrices of symmetric positive-definite matrices are themselves symmetric and positive-definite. We used this result in Theorem 28.2, and we need its corollary to prove the correctness of LU decomposition for symmetric positive-definite matrices.

Lemma 28.5 (Schur complement lemma)
If $A$ is a symmetric positive-definite matrix and $A_k$ is a leading $k \times k$ submatrix of $A$, then the Schur complement $S$ of $A$ with respect to $A_k$ is symmetric and positive-definite.

Proof  Because $A$ is symmetric, so is the submatrix $C$. By Exercise D.2-6, the product $BA_k^{-1}B^T$ is symmetric, and by Exercise D.1-1, $S$ is symmetric.

It remains to show that $S$ is positive-definite. Consider the partition of $A$ given in equation (28.14). For any nonzero vector $x$, we have $x^TAx > 0$ by the assumption that $A$ is positive-definite. Let us break $x$ into two subvectors $y$ and $z$ compatible with $A_k$ and $C$, respectively. Because $A_k^{-1}$ exists, we have

$$x^TAx = (y^T \;\; z^T) \begin{pmatrix} A_k & B^T \\ B & C \end{pmatrix} \begin{pmatrix} y \\ z \end{pmatrix} = (y + A_k^{-1}B^Tz)^TA_k(y + A_k^{-1}B^Tz) + z^T(C - BA_k^{-1}B^T)z , \qquad (28.16)$$

as you can verify by multiplying through. (This last equation amounts to "completing the square" of the quadratic form; see Exercise 28.3-2.)

Since $x^TAx > 0$ holds for any nonzero $x$, let us pick any nonzero $z$ and then choose $y = -A_k^{-1}B^Tz$, which causes the first term in equation (28.16) to vanish, leaving

$$z^T(C - BA_k^{-1}B^T)z = z^TSz$$

as the value of the expression. For any $z \ne 0$, we therefore have $z^TSz = x^TAx > 0$, and thus $S$ is positive-definite.

Corollary 28.6
LU decomposition of a symmetric positive-definite matrix never causes a division by 0.

Proof  Let $A$ be a symmetric positive-definite matrix. We shall prove something stronger than the statement of the corollary: every pivot is strictly positive. The first pivot is $a_{11}$. Let $e_1$ be the first unit vector, from which we obtain $a_{11} = e_1^TAe_1 > 0$. Since the first step of LU decomposition produces the Schur complement of $A$ with respect to $A_1 = (a_{11})$, Lemma 28.5 implies by induction that all pivots are positive.
Least-squares approximation

One important application of symmetric positive-definite matrices arises in fitting curves to given sets of data points. Suppose that we are given a set of $m$ data points

$$(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m) ,$$

where we know that the $y_i$ are subject to measurement errors. We would like to determine a function $F(x)$ such that the approximation errors

$$\eta_i = F(x_i) - y_i \qquad (28.17)$$

are small for $i = 1, 2, \ldots, m$. The form of the function $F$ depends on the problem at hand. Here, we assume that it has the form of a linearly weighted sum,

$$F(x) = \sum_{j=1}^{n} c_jf_j(x) ,$$

where the number of summands $n$ and the specific basis functions $f_j$ are chosen based on knowledge of the problem at hand. A common choice is $f_j(x) = x^{j-1}$, which means that

$$F(x) = c_1 + c_2x + c_3x^2 + \cdots + c_nx^{n-1}$$

is a polynomial of degree $n-1$ in $x$. Thus, given $m$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$, we wish to calculate $n$ coefficients $c_1, c_2, \ldots, c_n$ that minimize the approximation errors $\eta_1, \eta_2, \ldots, \eta_m$.

By choosing $n = m$, we can calculate each $y_i$ exactly in equation (28.17). Such a high-degree $F$ "fits the noise" as well as the data, however, and generally gives poor results when used to predict $y$ for previously unseen values of $x$. It is usually better to choose $n$ significantly smaller than $m$ and hope that by choosing the coefficients $c_j$ well, we can obtain a function $F$ that finds the significant patterns in the data points without paying undue attention to the noise. Some theoretical principles exist for choosing $n$, but they are beyond the scope of this text. In any case, once we choose a value of $n$ that is less than $m$, we end up with an overdetermined set of equations whose solution we wish to approximate. We now show how to do so.

Let

$$A = \begin{pmatrix} f_1(x_1) & f_2(x_1) & \cdots & f_n(x_1) \\ f_1(x_2) & f_2(x_2) & \cdots & f_n(x_2) \\ \vdots & \vdots & \ddots & \vdots \\ f_1(x_m) & f_2(x_m) & \cdots & f_n(x_m) \end{pmatrix}$$

denote the matrix of values of the basis functions at the given points; that is, $a_{ij} = f_j(x_i)$. Let $c = (c_k)$ denote the desired $n$-vector of coefficients. Then

$$Ac = \begin{pmatrix} F(x_1) \\ F(x_2) \\ \vdots \\ F(x_m) \end{pmatrix}$$

is the $m$-vector of "predicted values" for $y$, and thus

$$\eta = Ac - y$$

is the $m$-vector of approximation errors.

To minimize approximation errors, we choose to minimize the norm of the error vector $\eta$, which gives us a least-squares solution, since

$$\|\eta\| = \left( \sum_{i=1}^{m} \eta_i^2 \right)^{1/2} .$$

Because

$$\|\eta\|^2 = \|Ac - y\|^2 = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij}c_j - y_i \right)^2 ,$$

we can minimize $\|\eta\|$ by differentiating $\|\eta\|^2$ with respect to each $c_k$ and then setting the result to 0:

$$\frac{d\|\eta\|^2}{dc_k} = \sum_{i=1}^{m} 2\left( \sum_{j=1}^{n} a_{ij}c_j - y_i \right) a_{ik} = 0 . \qquad (28.18)$$

The $n$ equations (28.18) for $k = 1, 2, \ldots, n$ are equivalent to the single matrix equation

$$(Ac - y)^TA = 0$$

or, equivalently (using Exercise D.1-2), to

$$A^T(Ac - y) = 0 ,$$

which implies

$$A^TAc = A^Ty . \qquad (28.19)$$

In statistics, this is called the normal equation. The matrix $A^TA$ is symmetric by Exercise D.1-2, and if $A$ has full column rank, then by Theorem D.6, $A^TA$ is positive-definite as well. Hence, $(A^TA)^{-1}$ exists, and the solution to equation (28.19) is

$$c = ((A^TA)^{-1}A^T)y = A^+y , \qquad (28.20)$$

where the matrix $A^+ = (A^TA)^{-1}A^T$ is the pseudoinverse of the matrix $A$. The pseudoinverse naturally generalizes the notion of a matrix inverse to the case in which $A$ is not square. (Compare equation (28.20) as the approximate solution to $Ac = y$ with the solution $A^{-1}b$ as the exact solution to $Ax = b$.)

As an example of producing a least-squares fit, suppose that we have five data points

$$(x_1, y_1) = (-1, 2) , \; (x_2, y_2) = (1, 1) , \; (x_3, y_3) = (2, 1) , \; (x_4, y_4) = (3, 0) , \; (x_5, y_5) = (5, 3) ,$$

shown as black dots in Figure 28.3. We wish to fit these points with a quadratic polynomial

$$F(x) = c_1 + c_2x + c_3x^2 .$$

The matrix of basis-function values is

$$A = \begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 5 & 25 \end{pmatrix} ,$$

and multiplying $y$ by the pseudoinverse $A^+ = (A^TA)^{-1}A^T$ yields the coefficient vector $c \approx (1.200, -0.757, 0.214)^T$, which corresponds to the quadratic polynomial

$$F(x) = 1.200 - 0.757x + 0.214x^2$$

as the closest-fitting quadratic to the given data, in a least-squares sense.

[Figure 28.3 The least-squares fit of a quadratic polynomial to the set of five data points {(-1,2), (1,1), (2,1), (3,0), (5,3)}. The black dots are the data points, and the white dots are their estimated values predicted by the polynomial F(x) = 1.200 - 0.757x + 0.214x^2, the quadratic that minimizes the sum of the squared errors. Each shaded line shows the error for one data point.]

As a practical matter, we solve the normal equation (28.19) by multiplying $y$ by $A^T$ and then finding an LU decomposition of $A^TA$. If $A$ has full rank, the matrix $A^TA$ is guaranteed to be nonsingular, because it is symmetric and positive-definite. (See Exercise D.1-2 and Theorem D.6.)
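For the example above, the normal-equation computation is a few lines of NumPy; this is a sketch (in production code a QR- or SVD-based solver would be preferred for numerical stability).

import numpy as np

# Least-squares fit of F(x) = c1 + c2*x + c3*x^2 to the five data points
# of Figure 28.3, via the normal equation A^T A c = A^T y.
xs = np.array([-1.0, 1.0, 2.0, 3.0, 5.0])
ys = np.array([2.0, 1.0, 1.0, 0.0, 3.0])
A = np.vander(xs, 3, increasing=True)     # a_ij = f_j(x_i) = x_i^(j-1)
c = np.linalg.solve(A.T @ A, A.T @ ys)    # solve the normal equation
print(c)    # approximately [1.200, -0.757, 0.214]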
Exercises

28.3-1
Prove that every diagonal element of a symmetric positive-definite matrix is positive.

28.3-2
Let $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ be a $2 \times 2$ symmetric positive-definite matrix. Prove that its determinant $ac - b^2$ is positive by "completing the square" in a manner similar to that used in the proof of Lemma 28.5.

28.3-3
Prove that the maximum element in a symmetric positive-definite matrix lies on the diagonal.

28.3-4
Prove that the determinant of each leading submatrix of a symmetric positive-definite matrix is positive.

28.3-5
Let $A_k$ denote the $k$th leading submatrix of a symmetric positive-definite matrix $A$. Prove that $\det(A_k)/\det(A_{k-1})$ is the $k$th pivot during LU decomposition, where, by convention, $\det(A_0) = 1$.

28.3-6
Find the function of the form

$$F(x) = c_1 + c_2x\lg x + c_3e^x$$

that is the best least-squares fit to the data points

$$(1, 1), (2, 1), (3, 3), (4, 8) .$$

28.3-7
Show that the pseudoinverse $A^+$ satisfies the following four equations:

$$AA^+A = A , \quad A^+AA^+ = A^+ , \quad (AA^+)^T = AA^+ , \quad (A^+A)^T = A^+A .$$

Problems

28-1 Tridiagonal systems of linear equations
Consider the tridiagonal matrix

$$A = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ -1 & 2 & -1 & 0 & 0 \\ 0 & -1 & 2 & -1 & 0 \\ 0 & 0 & -1 & 2 & -1 \\ 0 & 0 & 0 & -1 & 2 \end{pmatrix} .$$

a. Find an LU decomposition of $A$.

b. Solve the equation $Ax = (1 \; 1 \; 1 \; 1 \; 1)^T$ by using forward and back substitution.

c. Find the inverse of $A$.

d. Show how, for any $n \times n$ symmetric positive-definite, tridiagonal matrix $A$ and any $n$-vector $b$, to solve the equation $Ax = b$ in $O(n)$ time by performing an LU decomposition. Argue that any method based on forming $A^{-1}$ is asymptotically more expensive in the worst case.

e. Show how, for any $n \times n$ nonsingular, tridiagonal matrix $A$ and any $n$-vector $b$, to solve the equation $Ax = b$ in $O(n)$ time by performing an LUP decomposition.
28-2 Splines
A practical method for interpolating a set of points with a curve is to use cubic splines. We are given a set $\{(x_i, y_i) : i = 0, 1, \ldots, n\}$ of $n+1$ point-value pairs, where $x_0 < x_1 < \cdots < x_n$. We wish to fit a piecewise-cubic curve (spline) $f(x)$ to the points. That is, the curve $f(x)$ is made up of $n$ cubic polynomials $f_i(x) = a_i + b_ix + c_ix^2 + d_ix^3$ for $i = 0, 1, \ldots, n-1$, where if $x$ falls in the range $x_i \le x \le x_{i+1}$, then the value of the curve is given by $f(x) = f_i(x - x_i)$. The points $x_i$ at which the cubic polynomials are "pasted" together are called knots. For simplicity, we shall assume that $x_i = i$ for $i = 0, 1, \ldots, n$.

To ensure continuity of $f(x)$, we require that for $i = 0, 1, \ldots, n-1$,

$$f(x_i) = f_i(0) = y_i ,$$
$$f(x_{i+1}) = f_i(1) = y_{i+1} .$$

To ensure that $f(x)$ is smooth enough, we also insist that the first derivative be continuous at each knot:

$$f'(x_{i+1}) = f_i'(1) = f_{i+1}'(0)$$

for $i = 0, 1, \ldots, n-2$.

a. Suppose that for $i = 0, 1, \ldots, n$, we are given not only the point-value pairs $\{(x_i, y_i)\}$ but also the first derivatives $D_i = f'(x_i)$ at each knot. Express each coefficient $a_i$, $b_i$, $c_i$, and $d_i$ in terms of the values $y_i$, $y_{i+1}$, $D_i$, and $D_{i+1}$. (Remember that $x_i = i$.) How quickly can we compute the $4n$ coefficients from the point-value pairs and first derivatives?

The question remains of how to choose the first derivatives of $f(x)$ at the knots. One method is to require the second derivatives to be continuous at the knots:

$$f''(x_{i+1}) = f_i''(1) = f_{i+1}''(0)$$

for $i = 0, 1, \ldots, n-2$. At the first and last knots, we assume that $f''(x_0) = f_0''(0) = 0$ and $f''(x_n) = f_{n-1}''(1) = 0$; these assumptions make $f(x)$ a natural cubic spline.

b. Use the continuity constraints on the second derivative to show that for $i = 1, 2, \ldots, n-1$,

$$D_{i-1} + 4D_i + D_{i+1} = 3(y_{i+1} - y_{i-1}) . \qquad (28.21)$$

c. Show that

$$2D_0 + D_1 = 3(y_1 - y_0) , \qquad (28.22)$$
$$D_{n-1} + 2D_n = 3(y_n - y_{n-1}) . \qquad (28.23)$$

d. Rewrite equations (28.21)-(28.23) as a matrix equation involving the vector $D = \langle D_0, D_1, \ldots, D_n \rangle$ of unknowns. What attributes does the matrix in your equation have?

e. Argue that a natural cubic spline can interpolate a set of $n+1$ point-value pairs in $O(n)$ time (see Problem 28-1).

f. Show how to determine a natural cubic spline that interpolates a set of $n+1$ points $(x_i, y_i)$ satisfying $x_0 < x_1 < \cdots < x_n$, even when $x_i$ is not necessarily equal to $i$. What matrix equation must your method solve, and how quickly does your algorithm run?
...
Chapter notes

Many excellent texts describe numerical and scientific computation in much greater detail than we have room for here. The following are especially readable: George and Liu [132], Golub and Van Loan [144], Press, Teukolsky, Vetterling, and Flannery [283, 284], and Strang [323, 324].

Golub and Van Loan [144] discuss numerical stability. They show why $\det(A)$ is not necessarily a good indicator of the stability of a matrix $A$, proposing instead to use $\|A\|_\infty \|A^{-1}\|_\infty$. They also address the question of how to compute this value without actually computing $A^{-1}$.

Gaussian elimination, upon which the LU and LUP decompositions are based, was the first systematic method for solving linear systems of equations. It was also one of the earliest numerical algorithms. Although it was known earlier, its discovery is commonly attributed to C. F. Gauss (1777-1855). In his famous paper [325], Strassen showed that an $n \times n$ matrix can be inverted in $O(n^{\lg 7})$ time. Winograd [358] originally proved that matrix multiplication is no harder than matrix inversion, and the converse is due to Aho, Hopcroft, and Ullman [5].

Another important matrix decomposition is the singular value decomposition, or SVD. The SVD factors an $m \times n$ matrix $A$ into $A = Q_1\Sigma Q_2^T$, where $\Sigma$ is an $m \times n$ matrix with nonzero values only on the diagonal, $Q_1$ is $m \times m$ with mutually orthonormal columns, and $Q_2$ is $n \times n$, also with mutually orthonormal columns. Two vectors are orthonormal if their inner product is 0 and each vector has norm 1. The books by Strang [323, 324] and Golub and Van Loan [144] contain good treatments of the SVD.

Strang [324] has an excellent presentation of symmetric positive-definite matrices and of linear algebra in general.


29

Linear Programming

Many problems take the form of maximizing or minimizing an objective, given limited resources and competing constraints. If we can specify the objective as a linear function of certain variables, and if we can specify the constraints on resources as equalities or inequalities on those variables, then we have a linear-programming problem. Linear programs arise in a variety of practical applications. We begin by studying an application in electoral politics.

A political problem

Suppose that you are a politician trying to win an election. Your district has three different types of areas: urban, suburban, and rural. These areas have, respectively, 100,000, 200,000, and 50,000 registered voters. You are honorable and would never consider supporting policies in which you do not believe. You realize, however, that certain issues may be more effective in winning votes in certain places. Your primary issues are building more roads, gun control, farm subsidies, and a gasoline tax dedicated to improved public transit. You can estimate how many votes you win or lose from each population segment by spending $1,000 on advertising on each issue. This information appears in the table of Figure 29.1. In this table, each entry indicates the number of thousands of either urban, suburban, or rural voters who would be won over by spending $1,000 on advertising in support of a particular issue. Negative entries denote votes that would be lost. Your task is to figure out the minimum amount of money that you need to spend in order to win 50,000 urban votes, 100,000 suburban votes, and 25,000 rural votes.

You could, by trial and error, devise a strategy that wins the required number of votes, but the strategy you devise might not be the least expensive one. For example, you could devote $20,000 of advertising to building roads, $0 to gun control, $4,000 to farm subsidies, and $9,000 to a gasoline tax.

[Figure 29.1 The effects of policies on voters. Each entry describes the number of thousands of urban, suburban, or rural voters who could be won over by spending $1,000 on advertising in support of a policy on a particular issue. Negative entries denote votes that would be lost.

policy             urban   suburban   rural
build roads          -2        5        3
gun control           8        2       -5
farm subsidies        0        0       10
gasoline tax         10        0       -2 ]

In that case, you would win $20(-2) + 0(8) + 4(0) + 9(10) = 50$ thousand urban votes, $20(5) + 0(2) + 4(0) + 9(0) = 100$ thousand suburban votes, and $20(3) + 0(-5) + 4(10) + 9(-2) = 82$ thousand rural votes. You would win the exact number of votes desired in the urban and suburban areas and more than enough votes in the rural area. (In fact, in the rural area, you would receive more votes than there are voters!) In order to garner these votes, you would have paid for $20 + 0 + 4 + 9 = 33$ thousand dollars of advertising.

Naturally, you may wonder whether this strategy is the best possible. That is, could you achieve your goals while spending less on advertising? Additional trial and error might help you to answer this question, but wouldn't you rather have a systematic method for answering such questions? In order to develop one, we shall formulate this question mathematically. We introduce four variables: $x_1$ is the number of thousands of dollars spent on advertising on building roads, $x_2$ the number spent on gun control, $x_3$ the number spent on farm subsidies, and $x_4$ the number spent on a gasoline tax.
We can write the requirement that we win at least 50,000 urban votes as

$$-2x_1 + 8x_2 + 0x_3 + 10x_4 \ge 50 . \qquad (29.1)$$

Similarly, we can write the requirements that we win at least 100,000 suburban votes and 25,000 rural votes as

$$5x_1 + 2x_2 + 0x_3 + 0x_4 \ge 100 \qquad (29.2)$$

and

$$3x_1 - 5x_2 + 10x_3 - 2x_4 \ge 25 . \qquad (29.3)$$

Any setting of the variables $x_1, x_2, x_3, x_4$ that satisfies inequalities (29.1)-(29.3) yields a strategy that wins a sufficient number of each type of vote. In order to keep costs as small as possible, you would like to minimize the amount spent on advertising. That is, you want to minimize the expression

$$x_1 + x_2 + x_3 + x_4 . \qquad (29.4)$$

Although negative advertising often occurs in political campaigns, there is no such thing as negative-cost advertising. Consequently, we require that

$$x_1, x_2, x_3, x_4 \ge 0 . \qquad (29.5)$$

Combining inequalities (29.1)-(29.3) and (29.5) with the objective of minimizing (29.4), we obtain what is known as a "linear program." We format this problem as

minimize     x1 + x2 + x3 + x4                        (29.6)
subject to
            -2x1 + 8x2 + 0x3 + 10x4  ≥  50            (29.7)
             5x1 + 2x2 + 0x3 +  0x4  ≥  100           (29.8)
             3x1 - 5x2 + 10x3 - 2x4  ≥  25            (29.9)
             x1, x2, x3, x4          ≥  0 .           (29.10)
...
Given a set of real numbers $a_1, a_2, \ldots, a_n$ and a set of variables $x_1, x_2, \ldots, x_n$, we define a linear function $f$ on those variables by

$$f(x_1, x_2, \ldots, x_n) = a_1x_1 + a_2x_2 + \cdots + a_nx_n = \sum_{j=1}^{n} a_jx_j .$$

If $b$ is a real number and $f$ is a linear function, then the equation

$$f(x_1, x_2, \ldots, x_n) = b$$

is a linear equality and the inequalities

$$f(x_1, x_2, \ldots, x_n) \le b \quad\text{and}\quad f(x_1, x_2, \ldots, x_n) \ge b$$

are linear inequalities. We use the general term linear constraints to denote either linear equalities or linear inequalities. In linear programming, we do not allow strict inequalities. Formally, a linear-programming problem is the problem of either minimizing or maximizing a linear function subject to a finite set of linear constraints. If we are to minimize, then we call the linear program a minimization linear program, and if we are to maximize, then we call the linear program a maximization linear program.

Although several polynomial-time algorithms for linear programming have been developed, we will not study them in this chapter. Instead, we shall study the simplex algorithm, which is the oldest linear-programming algorithm. The simplex algorithm does not run in polynomial time in the worst case, but it is fairly efficient and widely used in practice.

An overview of linear programming

In order to describe properties of and algorithms for linear programs, we find it convenient to have canonical forms in which to express them. We shall use two forms, standard and slack, in this chapter. We will define these forms precisely in Section 29.1. Informally, a linear program in standard form is the maximization of a linear function subject to linear inequalities, whereas a linear program in slack form is the maximization of a linear function subject to linear equalities. We shall typically use standard form for expressing linear programs, but we find it more convenient to use slack form when we describe the details of the simplex algorithm.
Let us first consider the following linear program with two variables:

maximize     x1 + x2                      (29.11)
subject to
             4x1 -  x2  ≤   8             (29.12)
             2x1 +  x2  ≤  10             (29.13)
             5x1 - 2x2  ≥  -2             (29.14)
             x1, x2     ≥   0 .           (29.15)

We call any setting of the variables $x_1$ and $x_2$ that satisfies all the constraints (29.12)-(29.15) a feasible solution to the linear program. If we graph the constraints in the $(x_1, x_2)$-Cartesian coordinate system, as in Figure 29.2(a), we see

[Figure 29.2 (a) The linear program given in (29.12)-(29.15). Each constraint is represented by a line and a direction; the intersection of the constraints, which is the set of feasible solutions, is shaded. (b) The dotted lines show, respectively, the points for which the objective value is 0, 4, and 8. The optimal solution to the linear program is x1 = 2 and x2 = 6, with objective value 8.]


that the set of feasible solutions (shaded in the figure) forms a convex region1 in
the two-dimensional space
...
Conceptually, we could evaluate the objective function x1 C x2 at each point in the feasible region; we call the
value of the objective function at a particular point the objective value
...

For this example (and for most linear programs), the feasible region contains an
infinite number of points, and so we need to determine an efficient way to find a
point that achieves the maximum objective value without explicitly evaluating the
objective function at every point in the feasible region
...
The set of points
for which x1 Cx2 D ´, for any ´, is a line with a slope of 1
...
2(b)
...
In this case, that intersection of the line with the
feasible region is the single point
...
More generally, for any ´, the intersection

1 An

intuitive definition of a convex region is that it fulfills the requirement that for any two points in
the region, all points on a line segment between them are also in the region
...
Figure 29
...
Because the feasible region in Figure 29
...
Any point at which this occurs is an optimal
solution to the linear program, which in this case is the point x1 D 2 and x2 D 6
with objective value 8
...
The maximum value of ´ for which the line x1 C x2 D ´
intersects the feasible region must be on the boundary of the feasible region, and
thus the intersection of this line with the boundary of the feasible region is either a
single vertex or a line segment
...
If the intersection is a line segment,
every point on that line segment must have the same objective value; in particular,
both endpoints of the line segment are optimal solutions
...

Although we cannot easily graph linear programs with more than two variables,
the same intuition holds
...
The intersection of these halfspaces forms the feasible region
...
If all
coefficients of the objective function are nonnegative, and if the origin is a feasible
solution to the linear program, then as we move this plane away from the origin, in
a direction normal to the objective function, we find points of increasing objective
value
...
) As in two
dimensions, because the feasible region is convex, the set of points that achieve
the optimal objective value must include a vertex of the feasible region
...
We call the feasible region formed by the intersection of these half-spaces a
simplex
...

The simplex algorithm takes as input a linear program and returns an optimal
solution
...
In each iteration, it moves along an edge of the simplex from a current vertex
to a neighboring vertex whose objective value is no smaller than that of the current
vertex (and usually is larger
...
Because the feasible region is convex and the objective
function is linear, this local optimum is actually a global optimum
...
4,

Chapter 29

Linear Programming

849

we shall use a concept called “duality” to show that the solution returned by the
simplex algorithm is indeed optimal
...
3
...
We
first write the given linear program in slack form, which is a set of linear equalities
...
” We move from one vertex
to another by making a basic variable become nonbasic and making a nonbasic
variable become basic
...

The two-variable example described above was particularly simple
...
These issues include identifying linear programs that have no solutions, linear programs that have no finite
optimal solution, and linear programs for which the origin is not a feasible solution
...
Any textbook on operations research is filled with examples of linear programming, and linear programming has become a standard tool taught to students in most business schools
...
Two more examples of linear programming are the following:
An airline wishes to schedule its flight crews
...
The airline wants to schedule
crews on all of its flights using as few crew members as possible
...
Siting a drill at a particular location has an associated cost and, based on geological surveys, an expected
payoff of some number of barrels of oil
...

With linear programs, we also model and solve graph and combinatorial problems, such as those appearing in this textbook
...
4
...
2, we shall study how to formulate several graph and
network-flow problems as linear programs
...
4, we shall use linear
programming as a tool to find an approximate solution to another graph problem
...
This algorithm, when implemented
carefully, often solves general linear programs quickly in practice
...
The first polynomial-time algorithm for linear programming was the ellipsoid
algorithm, which runs slowly in practice
...
In contrast to the simplex algorithm,
which moves along the exterior of the feasible region and maintains a feasible solution that is a vertex of the simplex at each iteration, these algorithms move through
the interior of the feasible region
...
For large
inputs, interior-point algorithms can run as fast as, and sometimes faster than, the
simplex algorithm
...

If we add to a linear program the additional requirement that all variables take
on integer values, we have an integer linear program
...
5-3 asks you
to show that just finding a feasible solution to this problem is NP-hard; since
no polynomial-time algorithms are known for any NP-hard problems, there is no
known polynomial-time algorithm for integer linear programming
...

In this chapter, if we have a linear program with variables x D
...
x1 ; x2 ; : : : ; xn /
...
1 Standard and slack forms
This section describes two formats, standard form and slack form, that are useful when we specify and work with linear programs
...

Standard form
In standard form, we are given n real numbers c1 ; c2 ; : : : ; cn ; m real numbers
b1 ; b2 ; : : : ; bm ; and mn real numbers aij for i D 1; 2; : : : ; m and j D 1; 2; : : : ; n
...
1 Standard and slack forms

maximize

n
X

851

cj xj

(29
...
17)

for j D 1; 2; : : : ; n :

(29
...
16) the objective function and the n C m inequalities in
lines (29
...
18) the constraints
...
18) are the
nonnegativity constraints
...
Sometimes we find it convenient
to express a linear program in a more compact form
...
aij /, an m-vector b D
...
cj /, and an n-vector x D
...
16)–(29
...
19)

Ax Ä b
x
0:

maximize
subject to

(29
...
21)

In line (29
...
In inequality (29
...
21), x 0 means that each entry
of the vector x must be nonnegative
...
A; b; c/, and we shall adopt the convention that A, b,
and c always have the dimensions given above
...
We used
some of this terminology in the earlier example of a two-variable linear program
...
We say that a solution x has objective value c T x
...

If a linear program has no feasible solutions, we say that the linear program is infeasible; otherwise it is feasible
...
Exercise 29
...


852

Chapter 29 Linear Programming

Converting linear programs into standard form
It is always possible to convert a linear program, given as minimizing or maximizing a linear function subject to linear constraints, into standard form
...
The objective function might be a minimization rather than a maximization
...
There might be variables without nonnegativity constraints
...
There might be equality constraints, which have an equal sign rather than a
less-than-or-equal-to sign
...
There might be inequality constraints, but instead of having a less-than-orequal-to sign, they have a greater-than-or-equal-to sign
...
To
capture this idea, we say that two maximization linear programs L and L0 are
equivalent if for each feasible solution x to L with objective value ´, there is
N
a corresponding feasible solution x 0 to L0 with objective value ´, and for each
N
feasible solution x 0 to L0 with objective value ´, there is a corresponding feasible
N
solution x to L with objective value ´
...
) A minimization linear program L
N
and a maximization linear program L0 are equivalent if for each feasible solution x
to L with objective value ´, there is a corresponding feasible solution x 0 to L0 with
N
objective value ´, and for each feasible solution x 0 to L0 with objective value ´,
N
there is a corresponding feasible solution x to L with objective value ´
...
After removing each one, we shall argue that the new linear program is
equivalent to the old one
...
Since
L and L0 have identical sets of feasible solutions and, for any feasible solution, the
objective value in L is the negative of the objective value in L0 , these two linear
programs are equivalent
...
1 Standard and slack forms

maximize
subject to

2x1

853

3x2

x1 C x2
2x2
x1
x1

D 7
Ä 4
0 :

Next, we show how to convert a linear program in which some of the variables
do not have nonnegativity constraints into one in which each variable has a nonnegativity constraint
...
Then, we replace each occurrence of xj by xj xj , and add the non0
00
0 and xj
0
...
Any feasible solution x to the new linear program cory0 y00
responds to a feasible solution x to the original linear program with xj D xj xj
N
N
and with the same objective value
...
The two
y00
N
y0
N
xj D xj and xj D 0 if xj
y0
linear programs have the same objective value regardless of the sign of xj
...
We apply this conversion scheme to each
variable that does not have a nonnegativity constraint to yield an equivalent linear
program in which all variables have nonnegativity constraints
...
Variable x1 has such a constraint, but variable x2 does
0
00
not
...
22)

Next, we convert equality constraints into inequality constraints
...
x1 ; x2 ; : : : ; xn / D b
...

pair of inequality constraints f
...
x1 ; x2 ; : : : ; xn /
Repeating this conversion for each equality constraint yields a linear program in
which all constraints are inequalities
...
That is, any
inequality of the form

854

Chapter 29 Linear Programming
n
X

aij xj

bi

j D1

is equivalent to
n
X

aij xj Ä

bi :

j D1

Thus, by replacing each coefficient aij by aij and each value bi by bi , we obtain
an equivalent less-than-or-equal-to constraint
...
22) by two inequalities, obtaining
maximize
subject to

0
3x2

00
C 3x2

0
x1 C x2
0
x1 C x2
0
x1
2x2
0
00
x1 ; x2 ; x2

00
x2
00
x2
00
C 2x2

2x1

Ä 7
7
Ä 4
0 :

(29
...
23)
...
24)
Ä
Ä
Ä

7
7
4
0 :

(29
...
26)
(29
...
28)

Converting linear programs into slack form
To efficiently solve a linear program with the simplex algorithm, we prefer to express it in a form in which some of the constraints are equality constraints
...
Let
n
X
j D1

aij xj Ä bi

(29
...
1 Standard and slack forms

855

be an inequality constraint
...
29) as the two constraints
s D bi

n
X

aij xj ;

(29
...
31)

We call s a slack variable because it measures the slack, or difference, between
the left-hand and right-hand sides of equation (29
...
(We shall soon see why we
find it convenient to write the constraint with only the slack variable on the lefthand side
...
29) is true if and only if both equation (29
...
31) are true, we can convert each inequality constraint of a linear program in this way to obtain an equivalent linear program in which the only
inequality constraints are the nonnegativity constraints
...
The ith constraint is therefore
xnCi D bi

n
X

aij xj ;

(29
...

By converting each constraint of a linear program in standard form, we obtain a
linear program in a different form
...
24)–(29
...
33)

x2 C x3
C x2
x3
C 2x2
2x3
0 :

(29
...
35)
(29
...
37)

In this linear program, all the constraints except for the nonnegativity constraints
are equalities, and each variable is subject to a nonnegativity constraint
...
Furthermore, each equation has the same
set of variables on the right-hand side, and these variables are also the only ones
that appear in the objective function
...

For linear programs that satisfy these conditions, we shall sometimes omit the
words “maximize” and “subject to,” as well as the explicit nonnegativity constraints
...
We call the resulting format slack form
...
33)–(29
...
38)
(29
...
40)
(29
...
As we shall see in Section 29
...
We use N to denote
the set of indices of the nonbasic variables and B to denote the set of indices of
the basic variables
...
The equations are indexed by the entries of B, and the variables
on the right-hand sides are indexed by the entries of N
...
We also use to denote
an optional constant term in the objective function
...
) Thus we can concisely define a slack form by a
tuple
...
42)
´ D
C
j 2N

xi

X

D bi

aij xj

for i 2 B ;

(29
...
Because we subtract
all
the sum j 2N aij xj in (29
...

For example, in the slack form
´ D 28
x1 D

8

x2 D

4

x4 D 18

C

x3
6
x3
6
8x3
3
x3
2

C

C

x5
6
x5
6
2x5
3
x5
2

C
;

we have B D f1; 2; 4g, N D f3; 5; 6g,

2x6
3
x6
3
x6
3

29
...
Note that the
D
c D c3 c5 c6
indices into A, b, and c are not necessarily sets of contiguous integers; they depend
on the index sets B and N
...

Exercises
29
...
24)–(29
...
19)–(29
...
1-2
Give three feasible solutions to the linear program in (29
...
28)
...
1-3
For the slack form in (29
...
41), what are N , B, A, b, c, and ?
29
...
1-5
Convert the following linear program into slack form:
maximize
subject to

6x3

2x1
x1 C x2
x2
3x1
x1 C 2x2
x1 ; x2 ; x3

x3
C 2x3

Ä 7
8
0
0 :

What are the basic and nonbasic variables?
29
...
1-7
Show that the following linear program is unbounded:
maximize
subject to

x1

x2

2x1 C x2
2x2
x1
x1 ; x2

Ä
Ä

1
2
0 :

29
...
Give an upper bound on the
number of variables and constraints in the resulting linear program
...
1-9
Give an example of a linear program for which the feasible region is not bounded,
but the optimal objective value is finite
...
2 Formulating problems as linear programs

859

29
...

Once we cast a problem as a polynomial-sized linear program, we can solve it
in polynomial time by the ellipsoid algorithm or interior-point methods
...

We shall look at several concrete examples of linear-programming problems
...

We then describe the minimum-cost-flow problem
...
Finally, we describe the multicommodityflow problem, for which the only known polynomial-time algorithm is based on
linear programming
...
u; /:f
...
Therefore, when we express variables in linear programs, we shall indicate vertices and edges through subscripts
...

Similarly, we denote the flow from vertex u to vertex not by
...

For quantities that are given as inputs to problems, such as edge weights or capacities, we shall continue to use notations such as w
...
u: /
...

In this section, we shall focus on how to formulate the single-pair shortest-path
problem, leaving the extension to the more general single-source shortest-paths
problem as Exercise 29
...

In the single-pair shortest-path problem, we are given a weighted, directed graph
G D
...
We wish to compute the
value d t , which is the weight of a shortest path from s to t
...
Fortunately, the Bellman-Ford algorithm does exactly this
...
u; / 2 E, we have d Ä du C w
...


860

Chapter 29 Linear Programming

The source vertex initially receives a value ds D 0, which never changes
...
44)

d Ä du C w
...
u; / 2 E ;
ds D 0 :

maximize
subject to

(29
...
46)

You might be surprised that this linear program maximizes an objective function
when it is supposed to compute shortest paths
...
We
N
maximize because an optimal«solution to the shortest-paths problem sets each d
˚
N
Nu C w
...
u; /2E d
«
˚
N
equal to all of the values in the set du C w
...
We want to maximize d
for all vertices on a shortest path from s to t subject to these constraints on all
vertices , and maximizing d t achieves this goal
...
It also
has jEj C 1 constraints: one for each edge, plus the additional constraint that the
source vertex’s shortest-path weight always has the value 0
...
Recall that we
are given a directed graph G D
...
u; / 2 E has a
nonnegative capacity c
...
As defined in Section 26
...
A
maximum flow is a flow that satisfies these constraints and maximizes the flow
value, which is the total flow coming out of the source minus the total flow into the
source
...
Recalling also that we assume that c
...
u; / 62 E and
that there are no antiparallel edges, we can express the maximum-flow problem as
a linear program:
X
X
fs
fs
(29
...
u; /
X
D
fu

for each u; 2 V ;

(29
...
49)

fs; tg ;

2V

fu

0

for each u; 2 V :

(29
...
2 Formulating problems as linear programs

861

This linear program has jV j2 variables, corresponding to the flow between each
pair of vertices, and it has 2 jV j2 C jV j 2 constraints
...
The linear
program in (29
...
50) has, for ease of notation, a flow and capacity of 0 for
each pair of vertices u; with
...
It would be more efficient to rewrite the
linear program so that it has O
...
Exercise 29
...

Minimum-cost flow
In this section, we have used linear programming to solve problems for which we
already knew efficient algorithms
...

The real power of linear programming comes from the ability to solve new problems
...

The problem of obtaining a sufficient number of votes, while not spending too
much money, is not solved by any of the algorithms that we have studied in this
book, yet we can solve it by linear programming
...
Linear programming is also
particularly useful for solving variants of problems for which we may not already
know of an efficient algorithm
...
Suppose that, in addition to a capacity c
...
u; /, we are
given a real-valued cost a
...
As in the maximum-flow problem, we assume that
c
...
u; / 62 E, and that there are no antiparallel edges
...
u; /, we incur a cost of a
...
We are also given a
flow demand d
...
u; /2E a
...
This problem is known as the
minimum-cost-flow problem
...
3(a) shows an example of the minimum-cost-flow problem
...
Any
particular legal flow, that is, a function f satisfying constraints (29
...
49),
P
incurs a total cost of
...
u; /fu
...
Figure 29
...
u; /2E a
...
2 2/ C
...
3 1/ C
...
1 3/ D 27:
There are polynomial-time algorithms specifically designed for the minimumcost-flow problem, but they are beyond the scope of this book
...
The linear program
looks similar to the one for the maximum-flow problem with the additional con-

[Figure 29.3 (a) An example of a minimum-cost-flow problem. We denote the capacities by c and the costs by a. Vertex s is the source and vertex t is the sink, and we wish to send 4 units of flow from s to t. (b) A solution to the minimum-cost-flow problem in which 4 units of flow are sent from s to t. For each edge, the flow and capacity are written as flow/capacity.]

straint that the value of the flow be exactly d units, and with the new objective
function of minimizing the cost:
X
a
...
51)
minimize
subject to


...
u; / for each u; 2 V ;

fu

D 0

for each u 2 V

fs; tg ;

2V

fs

2V

X

f

s

D d;

2V

fu

0

for each u; 2 V :

(29
...
Suppose that the Lucky
Puck company from Section 26
...
Each piece of
equipment is manufactured in its own factory, has its own warehouse, and must
be shipped, each day, from factory to warehouse
...
The capacity of the shipping network
does not change, however, and the different items, or commodities, must share the
same network
...
In this problem,
we are again given a directed graph G D
...
u; / 2 E
has a nonnegative capacity c
...
As in the maximum-flow problem, we implicitly assume that c
...
u; / 62 E, and that the graph has no antipar-

29
...
In addition, we are given k different commodities, K1 ; K2 ; : : : ; Kk ,
where we specify commodity i by the triple Ki D
...
Here, vertex si is
the source of commodity i, vertex ti is the sink of commodity i, and di is the demand for commodity i, which is the desired flow value for the commodity from si
to ti
...
We now define fu , the aggregate
Pk
flow, to be the sum of the various commodity flows, so that fu D i D1 fi u
...
u; / must be no more than the capacity of edge
...

We are not trying to minimize any objective function in this problem; we need
only determine whether such a flow exists
...
u; / for each u; 2 V ;

fi

D 0

for each i D 1; 2; : : : ; k and
for each u 2 V fsi ; ti g ;

D di

for each i D 1; 2; : : : ; k ;

0

for each u; 2 V and
for each i D 1; 2; : : : ; k :

i D1

fi;si ;

X

u

2V

X

fi;

;si

2V

fi u

The only known polynomial-time algorithm for this problem expresses it as a linear
program and then solves it with a polynomial-time linear-programming algorithm
...
2-1
Put the single-pair shortest-path linear program from (29
...
46) into standard
form
...
2-2
Write out explicitly the linear program corresponding to finding the shortest path
from node s to node y in Figure 24
...

29
...
Given a graph G, write a

864

Chapter 29 Linear Programming

linear program for which the solution has the property that d is the shortest-path
weight from s to for each vertex 2 V
...
2-4
Write out explicitly the linear program corresponding to finding the maximum flow
in Figure 26
...

29
...
47)–(29
...
V C E/ constraints
...
2-6
Write a linear program that, given a bipartite graph G D
...

29
...
V; E/ in which each edge
...
u; / 0
and a cost a
...
As in the multicommodity-flow problem, we are given k different commodities, K1 ; K2 ; : : : ; Kk , where we specify commodity i by the triple
Ki D
...
We define the flow fi for commodity i and the aggregate flow fu
on edge
...
A feasible flow is one
in which the aggregate flow on each edge
...
u; /
...
u; /fu , and the goal is to find the
feasible flow of minimum cost
...


29
...
In contrast to most of the other algorithms in this book, its running time is not polynomial
in the worst case
...

In addition to having a geometric interpretation, described earlier in this chapter,
the simplex algorithm bears some similarity to Gaussian elimination, discussed in
Section 28
...
Gaussian elimination begins with a system of linear equalities whose
solution is unknown
...
After some number of iterations, we have
rewritten the system so that the solution is simple to obtain
...


29
...

Associated with each iteration will be a “basic solution” that we can easily obtain
from the slack form of the linear program: set each nonbasic variable to 0 and
compute the values of the basic variables from the equality constraints
...
The objective value of the
associated basic feasible solution will be no less than that at the previous iteration,
and usually greater
...
The amount by which we can increase
the variable is limited by the other constraints
...
We then rewrite the slack form, exchanging the roles
of that basic variable and the chosen nonbasic variable
...
It simply rewrites
the linear program until an optimal solution becomes “obvious
...
Consider the following linear program in
standard form:
3x1 C

x2

C 2x3

x1 C x2
2x1 C 2x2
4x1 C x2
x1 ; x2 ; x3

maximize
subject to

C 3x3
C 5x3
C 2x3

(29
...
54)
(29
...
56)
(29
...
1
...
Recalling from Section 29
...
Similarly, a setting of the nonbasic variables that would make a basic variable become negative violates that
constraint
...

Associating the slack variables x4 , x5 , and x6 with inequalities (29
...
56),
respectively, and putting the linear program into slack form, we obtain

866

Chapter 29 Linear Programming

´
x4
x5
x6

D
D 30
D 24
D 36

3x1
x1
2x1
4x1

C

x2
x2
2x2
x2

C 2x3
3x3
5x3
2x3 :

(29
...
59)
(29
...
61)

The system of constraints (29
...
61) has 3 equations and 6 variables
...
A solution is
feasible if all of x1 ; x2 ; : : : ; x6 are nonnegative, and there can be an infinite number of feasible solutions as well
...
We focus on the basic solution: set all the (nonbasic) variables on the right-hand side to 0 and then compute
the values of the (basic) variables on the left-hand side
...
x1 ; x2 ; : : : ; x6 / D
...
3 0/ C
...
2 0/ D 0
...
An iteration of the simplex algorithm rewrites the set of equations
and the objective function so as to put a different set of variables on the righthand side
...

We emphasize that the rewrite does not in any way change the underlying linearprogramming problem; the problem at one iteration has the identical set of feasible
solutions as the problem at the previous iteration
...

If a basic solution is also feasible, we call it a basic feasible solution
...

We shall see in Section 29
...

Our goal, in each iteration, is to reformulate the linear program so that the basic
solution has a greater objective value
...
The variable xe becomes
basic, and some other variable xl becomes nonbasic
...

To continue the example, let’s think about increasing the value of x1
...
Because we have a nonnegativity constraint for each variable, we cannot allow any of them to become negative
...
The third constraint (29
...
Therefore, we
switch the roles of x1 and x6
...
61) for x1 and obtain
x2 x3 x6
:
(29
...
3 The simplex algorithm

867

To rewrite the other equations with x6 on the right-hand side, we substitute for x1
using equation (29
...
Doing so for equation (29
...
63)
D 21
4
2
4
Similarly, we combine equation (29
...
60) and with objective
function (29
...
64)
4
2
4
x3
x6
x2
(29
...
66)
x4 D 21
4
2
4
3x2
x6
4x3 C
:
(29
...
As demonstrated above, a pivot chooses a nonbasic
variable xe , called the entering variable, and a basic variable xl , called the leaving
variable, and exchanges their roles
...
64)–(29
...
58)–(29
...
We perform two operations
in the simplex algorithm: rewrite equations so that variables move between the lefthand side and the right-hand side, and substitute one equation into another
...
(See Exercise 29
...
)
To demonstrate this equivalence, observe that our original basic solution
...
65)–(29
...
1=4/ 0 C
...
3=4/ 36 D 0
...
9; 0; 0; 21; 6; 0/, with objective value ´ D 27
...
59)–(29
...
58), has
objective value
...
1 0/ C
...

Continuing the example, we wish to find a new variable whose value we wish to
increase
...
We can attempt to increase either x2 or x3 ; let us choose x3
...
65)
limits it to 18, constraint (29
...
67) limits
it to 3=2
...
We then substitute this new equation, x3 D 3=2 3x2 =8 x5 =4 C x6 =8, into
equations (29
...
66) and obtain the new, but equivalent, system
x2
x5
11x6
111
C
(29
...
69)
x1 D
4
16
8
16
3x2
x5
x6
3
C
(29
...
71)
x4 D
4
16
8
16
This system has the associated basic solution
...
Now the only way to increase the objective value is to increase x2
...

(We get an upper bound of 1 from constraint (29
...
This constraint, therefore, places
no restriction on how much we can increase x2
...
Then we solve equation (29
...
72)
6
6
3
x5
x6
x3
C
(29
...
74)
x2 D 4
3
3
3
x5
x3
C
:
(29
...
As we shall see
later in this chapter, this situation occurs only when we have rewritten the linear
program so that the basic solution is an optimal solution
...
8; 4; 0; 18; 0; 0/, with objective value 28, is optimal
...
53)–(29
...
The only variables
in the original linear program are x1 , x2 , and x3 , and so our solution is x1 D 8,
x2 D 4, and x3 D 0, with objective value
...
1 4/ C
...
Note
that the values of the slack variables in the final solution measure how much slack
remains in each inequality
...
54), the
left-hand side, with value 8 C 4 C 0 D 12, is 18 less than the right-hand side of 30
...
55) and (29
...
Observe also that even though the
coefficients in the original slack form are integral, the coefficients in the other
linear programs are not necessarily integral, and the intermediate solutions are not
´ D 28

29
...
Furthermore, the final solution to a linear program need not
be integral; it is purely coincidental that this example has an integral solution
...
The procedure PIVOT takes as input a slack form, given by the tuple $(N, B, A, b, c, v)$, the index $l$ of the leaving variable $x_l$, and the index $e$ of the entering variable $x_e$. It returns the tuple $(\hat{N}, \hat{B}, \hat{A}, \hat{b}, \hat{c}, \hat{v})$ describing the new slack form. (Recall again that the entries of the $m \times n$ matrices $A$ and $\hat{A}$ are actually the negatives of the coefficients that appear in the slack form.)

PIVOT(N, B, A, b, c, v, l, e)
 1  // Compute the coefficients of the equation for new basic variable x_e.
 2  let A-hat be a new m x n matrix
 3  b^_e = b_l / a_le
 4  for each j in N - {e}
 5      a^_ej = a_lj / a_le
 6  a^_el = 1 / a_le
 7  // Compute the coefficients of the remaining constraints.
 8  for each i in B - {l}
 9      b^_i = b_i - a_ie b^_e
10      for each j in N - {e}
11          a^_ij = a_ij - a_ie a^_ej
12      a^_il = -a_ie a^_el
13  // Compute the objective function.
14  v^ = v + c_e b^_e
15  for each j in N - {e}
16      c^_j = c_j - c_e a^_ej
17  c^_l = -c_e a^_el
18  // Compute new sets of basic and nonbasic variables.
19  N^ = N - {e} ∪ {l}
20  B^ = B - {l} ∪ {e}
21  return (N^, B^, A^, b^, c^, v^)

PIVOT works as follows. Lines 3-6 compute the coefficients in the new equation for $x_e$ by rewriting the equation that has $x_l$ on the left-hand side to instead have $x_e$ on the left-hand side. Lines 8-12 update the remaining equations by substituting the right-hand side of this new equation for each occurrence of $x_e$. Lines 14-17 do the same substitution for the objective function, and lines 19 and 20 update the sets of nonbasic and basic variables. Line 21 returns the new slack form. As given, if $a_{le} = 0$, PIVOT would cause an error by dividing by 0, but as we shall see in the proofs of Lemmas 29.2 and 29.12, we call PIVOT only when $a_{le} \ne 0$.
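A direct Python transcription of PIVOT may make the index bookkeeping clearer. This is a sketch that follows the pseudocode line by line; dictionaries keyed by variable index stand in for A, b, and c, and Python sets stand in for N and B (all names are ours).

def pivot(N, B, A, b, c, v, l, e):
    # One pivot step: x_e enters the basis and x_l leaves.
    # A[i][j] holds the coefficient a_ij for i in B, j in N.
    A_hat, b_hat = {}, {}
    # Compute the coefficients of the equation for new basic variable x_e.
    b_hat[e] = b[l] / A[l][e]
    A_hat[e] = {j: A[l][j] / A[l][e] for j in N if j != e}
    A_hat[e][l] = 1.0 / A[l][e]
    # Compute the coefficients of the remaining constraints.
    for i in B:
        if i == l:
            continue
        b_hat[i] = b[i] - A[i][e] * b_hat[e]
        A_hat[i] = {j: A[i][j] - A[i][e] * A_hat[e][j] for j in N if j != e}
        A_hat[i][l] = -A[i][e] * A_hat[e][l]
    # Compute the objective function.
    v_hat = v + c[e] * b_hat[e]
    c_hat = {j: c[j] - c[e] * A_hat[e][j] for j in N if j != e}
    c_hat[l] = -c[e] * A_hat[e][l]
    # Compute new sets of basic and nonbasic variables.
    N_hat = (N - {e}) | {l}
    B_hat = (B - {l}) | {e}
    return N_hat, B_hat, A_hat, b_hat, c_hat, v_hat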
...

Lemma 29
...
N; B; A; b; c; ; l; e/ in which ale ¤ 0
...
N ; B; A; b; c ; y/, and let x denote the basic solution after
N
the call
...
xj D 0 for each j 2 N
...
xe D bl =ale
...
xi D bi
N

y
y
ai e be for each i 2 B

feg
...
When we set each nonbasic variable to 0 in a constraint
X
y
aij xj ;
y
xi D bi
y
j 2N

y
y
y
we have that xi D bi for each i 2 B
...
Similarly, using line 9 for each i 2 B
we have
y
y
xi D bi D bi ai e be ;
N

feg,

which proves the third statement
...
That example was a particularly nice one, and we could have had several
other issues to address:
How do we determine whether a linear program is feasible?
What do we do if the linear program is feasible, but the initial basic solution is
not feasible?
How do we determine whether a linear program is unbounded?
How do we choose the entering and leaving variables?

29
...
5, we shall show how to determine whether a problem is feasible,
and if so, how to find a slack form in which the initial basic solution is feasible
...
A; b; c/
that takes as input a linear program in standard form, that is, an m n matrix
A D
...
bi /, and an n-vector c D
...
If the problem is
infeasible, the procedure returns a message that the program is infeasible and then
terminates
...

The procedure S IMPLEX takes as input a linear program in standard form, as just
described
...
xj / that is an optimal solution to the linear
N
N
program described in (29
...
21)
...
A; b; c/
1
...
A; b; c/
2 let  be a new vector of length n
3 while some index j 2 N has cj > 0
4
choose an index e 2 N for which ce > 0
5
for each index i 2 B
6
if ai e > 0
7
i D bi =ai e
8
else i D 1
9
choose an index l 2 B that minimizes i
10
if l == 1
11
return “unbounded”
12
else
...
N; B; A; b; c; ; l; e/
13 for i D 1 to n
14
if i 2 B
15
xi D bi
N
16
else xi D 0
N
N
17 return
...
In line 1, it calls the procedure
I NITIALIZE -S IMPLEX
...
The while loop of lines 3–12 forms the main part of the algorithm
...

Otherwise, line 4 selects a variable xe , whose coefficient in the objective function
is positive, as the entering variable
...

Next, lines 5–9 check each constraint and pick the one that most severely limits
the amount by which we can increase xe without violating any of the nonnegativ-

872

Chapter 29 Linear Programming

ity constraints; the basic variable associated with this constraint is xl
...
If none of the constraints limits the amount by which the entering variable can increase, the algorithm returns
“unbounded” in line 11
...
N; B; A; b; c; ; l; e/, as described above
...

To show that S IMPLEX is correct, we first show that if S IMPLEX has an initial
feasible solution and eventually terminates, then it either returns a feasible solution
or determines that the linear program is unbounded
...
Finally, in Section 29
...
10) we show that the solution
returned is optimal
...
2
Given a linear program
...

Then if S IMPLEX returns a solution in line 17, that solution is a feasible solution to
the linear program
...

Proof

We use the following three-part loop invariant:

At the start of each iteration of the while loop of lines 3–12,
1
...
for each i 2 B, we have bi 0, and
3
...

Initialization: The equivalence of the slack forms is trivial for the first iteration
...
Thus, the third part of the invariant is true
...
Furthermore, since the
0 for all
basic solution sets each basic variable xi to bi , we have that bi
i 2 B
...

Maintenance: We shall show that each iteration of the while loop maintains the
loop invariant, assuming that the return statement in line 11 does not execute
...


29
...
By Exercise 29
...

We now demonstrate the second part of the loop invariant
...
Since
the only changes to the variables bi and the set B of basic variables occur in this
assignment, it suffices to show that line 12 maintains this part of the invariant
...

y
First, we observe that be 0 because bl 0 by the loop invariant, ale > 0 by
y
lines 6 and 9 of S IMPLEX, and be D bl =ale by line 3 of P IVOT
...
bl =ale / (by line 3 of P IVOT)
...
76)

We have two cases to consider, depending on whether ai e > 0 or ai e Ä 0
...
77)

we have
y
bi

D bi ai e
...
76))
bi ai e
...
77))
D bi bi
D 0;

y
0
...
76) implies that bi must be nonnegative, too
...
e
...
The nonbasic variables are set to 0 and thus are nonnegative
...
Using the second part of the loop invariant, we
N
conclude that each basic variable xi is nonnegative
...
If it terminates
because of the condition in line 3, then the current basic solution is feasible and
line 17 returns this solution
...
In this case, for each iteration of the for loop in lines 5–8,
N
when line 6 is executed, we find that ai e Ä 0
...
e
...
The nonbasic variables other than xe are 0, and xe D 1 > 0; thus all
N
nonbasic variables are nonnegative
...

N

0, and we have ai e Ä 0 and xe D 1 > 0
...
From
N
equation (29
...

It remains to show that S IMPLEX terminates, and when it does terminate, the
solution it returns is optimal
...
4 will address optimality
...

Termination
In the example given in the beginning of this section, each iteration of the simplex
algorithm increased the objective value associated with the basic solution
...
3-2 asks you to show, no iteration of S IMPLEX can decrease the objective
value associated with the basic solution
...
This phenomenon is called degeneracy,
and we shall now study it in greater detail
...
3 The simplex algorithm

875

y
The assignment in line 14 of P IVOT, y D C ce be , changes the objective value
...
e
...
This value is assigned
y
as be D bl =ale in line 3 of P IVOT
...

Indeed, this situation can occur
...

After pivoting, we obtain
´ D 8
x1 D 8
x5 D

C x3
x2
x2

x4
x4

x3 :

At this point, our only choice is to pivot with x3 entering and x5 leaving
...
Fortunately, if we
pivot again, with x2 entering and x1 leaving, the objective value increases (to 16),
and the simplex algorithm can continue
...
Because of degeneracy, S IMPLEX could choose a
sequence of pivot operations that leave the objective value unchanged but repeat
a slack form within the sequence
...

Cycling is the only reason that S IMPLEX might not terminate
...

At each iteration, S IMPLEX maintains A, b, c, and in addition to the sets
N and B
...
In other words, the sets of basic and nonbasic variables suffice to uniquely
determine the slack form
...


876

Chapter 29 Linear Programming

Lemma 29
...
For each j 2 I , let ˛j and ˇj be real numbers, and let xj
be a real-valued variable
...
Suppose that for any settings
of the xj , we have
X
X
˛j xj D C
ˇj xj :
(29
...


Proof Since equation (29
...
If we let xj D 0 for each j 2 I ,
we conclude that D 0
...
Then we must have ˛j D ˇj
...

A particular linear program has many different slack forms; recall that each slack
form has the same set of feasible and optimal solutions as the original linear program
...
That is, given the set of basic variables, a unique slack
form (unique set of coefficients and right-hand sides) is associated with those basic
variables
...
4
Let
...
Given a set B of basic variables,
the associated slack form is uniquely determined
...
The slack forms must also have
identical sets N D f1; 2; : : : ; n C mg B of nonbasic variables
...
79)
´ D
C
j 2N

xi

D bi

X

aij xj for i 2 B ;

(29
...
81)

j 2N

xi

D bi0

X

j 2N

0
aij xj for i 2 B :

(29
...
3 The simplex algorithm

877

Consider the system of equations formed by subtracting each equation in
line (29
...
80)
...
aij aij /xj for i 2 B
0 D
...
bi
j 2N

bi0 / C

X

0
aij xj for i 2 B :

j 2N

0
Now, for each i 2 B, apply Lemma 29
...
Since ˛i D ˇi , we have that aij D aij for each j 2 N , and since D 0,
0
we have that bi D bi
...
Using a similar argument, Exercise 29
...


We can now show that cycling is the only possible reason that S IMPLEX might
not terminate
...
5
If S IMPLEX fails to terminate in at most

nCm
m

iterations, then it cycles
...
4, the set B of basic variables uniquely determines a slack
form
...
Thus, there are only at most nCm unique slack forms
...

m
Cycling is theoretically possible, but extremely rare
...
One option is to
perturb the input slightly so that it is impossible to have two solutions with the
same objective value
...
We omit the proof
that these strategies avoid cycling
...
6
If lines 4 and 9 of S IMPLEX always break ties by choosing the variable with the
smallest index, then S IMPLEX must terminate
...


878

Chapter 29 Linear Programming

Lemma 29
...

m
Proof Lemmas 29
...
6 show that if I NITIALIZE -S IMPLEX returns a slack
form for which the basic solution is feasible, S IMPLEX either reports that a linear
program is unbounded, or it terminates with a feasible solution
...
5, if S IMPLEX terminates with a feasible solution, then it
terminates in at most nCm iterations
...
3-1
Complete the proof of Lemma 29
...

29
...

29
...

29
...
A; b; c/ in standard form to slack form
...

29
...
4 Duality

879

29
...
3-7
Solve the following linear program using S IMPLEX:
minimize
subject to

x1 C

x2

2x1 C 7:5x2
5x2
20x1 C
x1 ; x2 ; x3

C

x3

C 3x3
C 10x3

10000
30000
0 :

29
...
5, we argued that there are at most mCn ways to choose
n
a set B of basic variables
...

n

29
...
We have not
yet shown that it actually finds an optimal solution to a linear program, however
...

Duality enables us to prove that a solution is indeed optimal
...
6, the max-flow min-cut theorem
...
How do we know whether f is a maximum flow? By the max-flow
min-cut theorem, if we can find a cut whose value is also jf j, then we have verified that f is indeed a maximum flow
...

Given a linear program in which the objective is to maximize, we shall describe
how to formulate a dual linear program in which the objective is to minimize and

880

Chapter 29 Linear Programming

whose optimal value is identical to that of the original linear program
...

Given a primal linear program in standard form, as in (29
...
18), we define
the dual linear program as
minimize

m
X

bi y i

(29
...
84)

yi

0

for i D 1; 2; : : : ; m :

(29
...
Each of the m constraints
in the primal has an associated variable yi in the dual, and each of the n constraints
in the dual has an associated variable xj in the primal
...
53)–(29
...
The dual of this linear program is
minimize
subject to

30y1 C 24y2
y1 C 2y2
y1 C 2y2
3y1 C 5y2
y1 ; y2 ; y3

C 36y3
C
C
C

4y3
y3
2y3

(29
...
87)
(29
...
89)
(29
...
10 that the optimal value of the dual linear program is always equal to the optimal value of the primal linear program
...

We begin by demonstrating weak duality, which states that any feasible solution to the primal linear program has a value no greater than that of any feasible
solution to the dual linear program
...
8 (Weak linear-programming duality)
Let x be any feasible solution to the primal linear program in (29
...
18) and
N
let y be any feasible solution to the dual linear program in (29
...
85)
...
4 Duality

Proof
n
X

881

We have

cj xj
N

Ä

j D1

n
m
X X
j D1

D

m
n
X X
i D1

Ä

i D1

m
X

!
aij yi xj
N N

(by inequalities (29
...
17))
...
9
Let x be a feasible solution to a primal linear program
...
If
n
m
X
X
cj xj D
N
bi y i ;
N
j D1

i D1

then x and y are optimal solutions to the primal and dual linear programs, respecN
N
tively
...
8, the objective value of a feasible solution to the primal
cannot exceed that of a feasible solution to the dual
...
Thus, if feasible
solutions x and y have the same objective value, neither can be improved
...
When
we ran the simplex algorithm on the linear program in (29
...
57), the final
iteration yielded the slack form (29
...
75) with objective ´ D 28 x3 =6
x5 =6 2x6 =3, B D f1; 2; 4g, and N D f3; 5; 6g
...
53)–(29
...
x1 ; x2 ; x3 / D
...
3 8/ C
...
2 0/ D 28
...

More precisely, suppose that the last slack form of the primal is
X
0
cj xj
´ D 0C
j 2N

xi

D

bi0

X

j 2N

0
aij xj for i 2 B :

882

Chapter 29 Linear Programming

Then, to produce an optimal dual solution, we set
(
0
cnCi if
...
91)

Thus, an optimal solution to the dual linear program defined in (29
...
90)
0
0
N
N
is y1 D 0 (since n C 1 D 4 2 B), y2 D c5 D 1=6, and y3 D c6 D 2=3
...
86), we obtain an objective value of

...
24
...
36
...
Combining these
calculations with Lemma 29
...
We now show that this approach applies in general:
we can find an optimal solution to the dual and simultaneously prove that a solution
to the primal is optimal
...
10 (Linear-programming duality)
N
N N
N
Suppose that S IMPLEX returns values x D
...
A; b; c/
...
y1 ; y2 ; : : : ; ym / be defined by equation (29
...
Then x is an optimal soN
N N
lution to the primal linear program, y is an optimal solution to the dual linear
N
program, and
n
X
j D1

cj xj D
N

m
X

bi y i :
N

(29
...
9, if we can find feasible solutions x and y that satisfy
N
N
equation (29
...
We
N
N
shall now show that the solutions x and y described in the statement of the theorem
N
N
satisfy equation (29
...

Suppose that we run S IMPLEX on a primal linear program, as given in lines
(29
...
18)
...
93)
´D 0C
j 2N

Since S IMPLEX terminated with a solution, by the condition in line 3 we know that
0
cj Ä 0 for all j 2 N :

(29
...
4 Duality

883

If we define
0
cj D 0 for all j 2 B ;

(29
...
93) as
X
0
cj xj
´ D 0C
j 2N

D

0

X

C

0
cj xj C

j 2N

D

0

C

nCm
X

X

0
0
cj xj (because cj D 0 if j 2 B)

j 2B
0
cj xj

(because N [ B D f1; 2; : : : ; n C mg)
...
96)

j D1

For the basic solution x associated with this final slack form, xj D 0 for all j 2 N ,
N
N
and ´ D 0
...
97)

j 2N

X

0
cj xj
N

j 2B
0

...
0 xj /
N

(29
...
91), is feasible for the dual
N
Pn
Pm
N
N
linear program and that its objective value i D1 bi yi equals j D1 cj xj
...
97) says that the first and last slack forms, evaluated at x, are equal
...
x1 ; x2 ; : : : ; xn /, we have
n
X
j D1

cj xj D

0

C

nCm
X

0
cj xj :

j D1

Therefore, for any particular set of values x D
...
32))


...
aij yi / xj
N N

j D1 i D1
0
cj

C

j D1

m
X

aij xj
N

i D1 j D1

n
X

so that

!

j D1

i D1

i D1

n
X


...
91) and (29
...
yi / xnCi
N N

i D1

n
X
j D1

0

0
cnCi xnCi
N

i D1

j D1

D

0
cj xj
N

C

m
X

!

aij yi xj ;
N N

i D1

n
X
j D1

0
cj

C

m
X

!
aij yi xj :
N N

(29
...
3 to equation (29
...
100)

D cj for j D 1; 2; : : : ; n :

(29
...
100), we have that i D1 bi yi D 0 , and hence the objective value
Á
Pm
N is equal to that of the primal ( 0 )
...
4 Duality

885

that the solution y is feasible for the dual problem
...
94) and
N
0
equations (29
...
Hence, for any
j D 1; 2; : : : ; n, equations (29
...
84) of the dual
...
91), we have that each yi 0,
N
N
and so the nonnegativity constraints are satisfied as well
...
We have also
shown how to construct an optimal solution to the dual linear program
...
4-1
Formulate the dual of the linear program given in Exercise 29
...

29
...
We could
produce the dual by first converting it to standard form, and then taking the dual
...

Explain how we can directly take the dual of an arbitrary linear program
...
4-3
Write down the dual of the maximum-flow linear program, as given in lines
(29
...
50) on page 860
...

29
...
51)–(29
...
Explain how to interpret this problem in terms of
graphs and flows
...
4-5
Show that the dual of the dual of a linear program is the primal linear program
...
4-6
Which result from Chapter 26 can be interpreted as weak duality for the maximumflow problem?

29
...

We conclude by proving the fundamental theorem of linear programming, which
says that the S IMPLEX procedure always produces the correct result
...
3, we assumed that we had a procedure I NITIALIZE -S IMPLEX that
determines whether a linear program has any feasible solutions, and if it does, gives
a slack form for which the basic solution is feasible
...

A linear program can be feasible, yet the initial basic solution might not be
feasible
...
102)
Ä
Ä

2
4
0 :

(29
...
104)
(29
...
This solution violates constraint (29
...
Thus, I NITIALIZE -S IMPLEX cannot just return the obvious slack
form
...
For this auxiliary linear program,
we can find (with a little work) a slack form for which the basic solution is feasible
...

Lemma 29
...
16)–(29
...
Let x0 be
a new variable, and let Laux be the following linear program with n C 1 variables:

29
...
106)

x0 Ä bi

for i D 1; 2; : : : ; m ;

(29
...
108)

j D1

xj

Then L is feasible if and only if the optimal objective value of Laux is 0
...
x1 ; x2 ; : : : ; xn /
...
Since x0
maximize x0 , this solution must be optimal for Laux
...
Then x0 D 0,
and the remaining solution values of x satisfy the constraints of L
...
A; b; c/
1 let k be the index of the minimum bi
2 if bk 0
/ is the initial basic solution feasible?
/
3
return
...
N; B; A; b; c; / be the resulting slack form for Laux
6 l D nCk
7 / Laux has n C 1 nonbasic variables and m basic variables
...
N; B; A; b; c; / D P IVOT
...

/
10 iterate the while loop of lines 3–12 of S IMPLEX until an optimal solution
to Laux is found
N
11 if the optimal solution to Laux sets x0 to 0
12
if x0 is basic
N
13
perform one (degenerate) pivot to make it nonbasic
14
from the final slack form of Laux , remove x0 from the constraints and
restore the original objective function of L, but replace each basic
variable in this objective function by the right-hand side of its
associated constraint
15
return the modified final slack form
16 else return “infeasible”

888

Chapter 29 Linear Programming

I NITIALIZE -S IMPLEX works as follows
...

(Creating the slack form requires no explicit effort, as the values of A, b, and c are
the same in both slack and standard forms
...

feasible—that is, xi
N
Otherwise, in line 4, we form the auxiliary linear program Laux as in Lemma 29
...

Since the initial basic solution to L is not feasible, the initial basic solution to the
slack form for Laux cannot be feasible either
...
Line 6 selects l D n C k as the index of the
basic variable that will be the leaving variable in the upcoming pivot operation
...
Line 8 performs that call of P IVOT, with
x0 entering and xl leaving
...
Now that we have a slack form for which
the basic solution is feasible, we can, in line 10, repeatedly call P IVOT to fully
solve the auxiliary linear program
...
To do so, we first,
in lines 12–13, handle the degenerate case in which x0 may still be basic with
value x0 D 0
...
The new basic
solution remains feasible; the degenerate pivot does not change the value of any
variable
...
The original objective function may contain both basic
and nonbasic variables
...
Line 15 then returns
this modified slack form
...

We now demonstrate the operation of I NITIALIZE -S IMPLEX on the linear program (29
...
105)
...
103) and (29
...
Using
Lemma 29
...
109)
Ä
Ä

2
4
0 :

(29
...
111)

By Lemma 29
...
If the optimal objective

29
...

We write this linear program in slack form, obtaining
´ D
x3 D
x4 D

2
4

2x1
x1

C x2
C 5x2

x0
C x0
C x0 :

We are not out of the woods yet, because the basic solution, which would set
x4 D 4, is not feasible for this auxiliary linear program
...
As line 8 indicates, we choose x0 to be the entering variable
...
After pivoting, we have the slack form
´ D
x0 D
x3 D

4
4
6

x1
C x1
x1

C 5x2
5x2
4x2

x4
C x4
C x4 :

The associated basic solution is
...
4; 0; 0; 6; 0/, which is feasiN N N N N
ble
...
In
this case, one call to P IVOT with x2 entering and x0 leaving yields
´ D

x0
x0
x1
x4
4
C
C
x2 D
5
5
5
5
4x0
9x1
x4
14
C
C
:
x3 D
5
5
5
5
This slack form is the final solution to the auxiliary problem
...
Furthermore, since
x0 D 0, we can just remove it from the set of constraints
...
In our example, we get the objective function
Ã
Â
x1
x4
4 x0
C
C
:
2x1 x2 D 2x1
5
5
5
5
Setting x0 D 0 and simplifying, we get the objective function
4 9x1
C
5
5

x4
;
5

and the slack form

890

Chapter 29 Linear Programming

9x1
x4
4
C
5
5
5
x1
x4
4
C
C
x2 D
5
5
5
9x1
x4
14
C
:
x3 D
5
5
5
This slack form has a feasible basic solution, and we can return it to procedure
S IMPLEX
...

´ D

Lemma 29
...
” Otherwise, it returns a valid slack form for which the basic solution
is feasible
...
Then by
Lemma 29
...
106)–(29
...
Furthermore, this objective value must be finite, since setting
xi D 0, for i D 1; 2; : : : ; n, and x0 D jminm fbi gj is feasible, and this solution
i D1
has objective value jminm fbi gj
...
Let x be the basic solution
N
associated with the final slack form
...

Thus the test in line 11 results in line 16 returning “infeasible
...
From
Exercise 29
...
In this case, lines 2–3 return the
slack form associated with the input
...
)
In the remainder of the proof, we handle the case in which the linear program is
feasible but we do not return in line 3
...
First, by lines 1–2, we must have
bk < 0 ;
and
bk Ä bi for each i 2 B :

(29
...
We now show

29
...
Letting x be the basic solution after the call to P IVOT, and
y and B be values returned by P IVOT, Lemma 29
...
113)
xi D
N
if i D e :
bl =ale
The call to P IVOT in line 8 has e D 0
...
107), to
include coefficients ai 0 ,
n
X

aij xj Ä bi for i D 1; 2; : : : ; m ;

(29
...
115)

(Note that ai 0 is the coefficient of x0 as it appears in inequalities (29
...
)
N
Since l 2 B, we also have that ale D 1
...
For
the remaining basic variables, we have
xi
N

D bi
D bi
D bi
0

y
ai e be
(by equation (29
...
bl =ale / (by line 3 of P IVOT)
bl
(by equation (29
...
112)) ,

1)

which implies that each basic variable is now nonnegative
...
We next execute line 10, which
solves Laux
...
11
implies that Laux has an optimal solution with objective value 0
...

Line 15 then returns this slack form
...
In particular, any linear program either is infeasible, is unbounded, or has an optimal
solution with a finite objective value
...


892

Chapter 29 Linear Programming

Theorem 29
...
has an optimal solution with a finite objective value,
2
...
is unbounded
...
” If L is unbounded, S IMPLEX
returns “unbounded
...

Proof By Lemma 29
...
” Now suppose that the linear program L is feasible
...
12,
I NITIALIZE -S IMPLEX returns a slack form for which the basic solution is feasible
...
7, therefore, S IMPLEX either returns “unbounded” or terminates
with a feasible solution
...
10
tells us that this solution is optimal
...
2 tells us the linear program L is indeed unbounded
...

Exercises
29
...

29
...

29
...
Show that the optimal objective value of L is 0
...
5-4
Suppose that we allow strict inequalities in a linear program
...


29
...
5-5
Solve the following linear program using S IMPLEX:
maximize
subject to

x1 C 3x2
x2
x1
x2
x1
x1 C 4x2
x1 ; x2

Ä
Ä
Ä

8
3
2
0 :

29
...
5-7
Solve the following linear program using S IMPLEX:
maximize
subject to

x1 C 3x2
x1 C x2
x2
x1
x1 C 4x2
x1 ; x2

Ä
Ä
Ä

1
3
2
0 :

29
...
6)–(29
...

29
...
Let D be the dual of P
...
Both P and D have optimal solutions with finite objective values
...
P is feasible, but D is infeasible
...
D is feasible, but P is infeasible
...
Neither P nor D is feasible
...

a
...
The number of variables and constraints that you use in the linear-programming problem should be polynomial
in n and m
...
Show that if we have an algorithm for the linear-inequality feasibility problem,
we can use it to solve a linear-programming problem
...

29-2 Complementary slackness
Complementary slackness describes a relationship between the values of primal
variables and dual constraints and between the values of dual variables and primal constraints
...
16)–(29
...
83)–(29
...
Complementary slackness states that the following conditions
are necessary and sufficient for x and y to be optimal:
N
N
m
X

aij yi D cj or xj D 0 for j D 1; 2; : : : ; n
N
N

i D1

and
n
X
j D1

aij xj D bi or yi D 0 for i D 1; 2; : : : ; m :
N
N

Problems for Chapter 29

895

a
...
53)–(29
...

b
...

c
...
16)–(29
...
y1 ; y2 ; : : : ; ym /
N
N N
such that
1
...
83)–(29
...
i D1 aij yi D cj for all j such that xj > 0, and
Pn
N
3
...

N
29-3 Integer linear programming
An integer linear-programming problem is a linear-programming problem with
the additional constraint that the variables x must take on integral values
...
5-3 shows that just determining whether an integer linear program has a
feasible solution is NP-hard, which means that there is no known polynomial-time
algorithm for this problem
...
Show that weak duality (Lemma 29
...

b
...
10) does not always hold for an integer linear
program
...
Given a primal linear program in standard form, let us define P to be the optimal objective value for the primal linear program, D to be the optimal objective
value for its dual, IP to be the optimal objective value for the integer version of
the primal (that is, the primal with the added constraint that the variables take
on integer values), and ID to be the optimal objective value for the integer version of the dual
...
Then Farkas’s lemma states that
exactly one of the systems

896

Chapter 29 Linear Programming

Ax Ä 0 ;
c Tx > 0
and
AT y D c ;
y
0
is solvable, where x is an n-vector and y is an m-vector
...

29-5 Minimum-cost circulation
In this problem, we consider a variant of the minimum-cost-flow problem from
Section 29
...
Instead,
we are given, as before, a flow network and edge costs a
...
A flow is feasible
if it satisfies the capacity constraint on every edge and flow conservation at every
vertex
...
We
call this problem the minimum-cost-circulation problem
...
Formulate the minimum-cost-circulation problem as a linear program
...
Suppose that for all edges
...
u; / > 0
...

c
...
That is given a maximum-flow problem instance G D
...
V 0 ; E 0 / with edge
capacities c 0 and edge costs a0 such that you can discern a solution to the
maximum-flow problem from a solution to the minimum-cost-circulation problem
...
Formulate the single-source shortest-path problem as a minimum-cost-circulation problem linear program
...
A number of books are devoted exclusively to linear programming, including those by
Chv´ tal [69], Gass [130], Karloff [197], Schrijver [303], and Vanderbei [344]
...
The
coverage in this chapter draws on the approach taken by Chv´ tal
...
Dantzig
in 1947
...
As a result, applications of linear programming flourished, along with
several algorithms
...
This history appears in a number of places, including the notes in [69] and [197]
...
G
...
Z
...
B
...
S
...
Gr¨ tschel, Lov´ sz, and Schrijver
o
a
[154] describe how to use the ellipsoid algorithm to solve a variety of problems in
combinatorial optimization
...

Karmarkar’s paper [198] includes a description of the first interior-point algorithm
...
Good surveys appear in the article of Goldfarb and Todd [141] and the book by Ye [361]
...
V
...
J
...
The simplex algorithm usually performs very well in
practice and many researchers have tried to give theoretical justification for this
empirical observation
...
H
...
Spielman and
Teng [322] made progress in this area, introducing the “smoothed analysis of algorithms” and applying it to the simplex algorithm
...
Particularly noteworthy is the network-simplex algorithm, which is the simplex algorithm, specialized to network-flow problems
...
See, for example, the article by Orlin [268] and the citations therein
...
n/
time, but the straightforward method of multiplying them takes ‚
...
In this
chapter, we shall show how the fast Fourier transform, or FFT, can reduce the time
to multiply polynomials to ‚
...

The most common use for Fourier transforms, and hence the FFT, is in signal
processing
...
Fourier analysis allows us to express the signal as a weighted sum of
phase-shifted sinusoids of varying frequencies
...
Among the
many everyday applications of FFT’s are compression techniques used to encode
digital video and audio information, including MP3 files
...

Polynomials
A polynomial in the variable x over an algebraic field F represents a function A
...
x/ D

n 1
X

aj x j :

j D0

We call the values a0 ; a1 ; : : : ; an 1 the coefficients of the polynomial
...
A
polynomial A
...
A/ D k
...
Therefore, the degree of a polynomial of
degree-bound n may be any integer between 0 and n 1, inclusive
...
For polynomial addition, if A
...
x/ are polynomials of degree-bound n, their sum is a polyno-

Chapter 30

Polynomials and the FFT

899

mial C
...
x/ D A
...
x/ for all x in the
underlying field
...
x/ D

n 1
X

aj x j

j D0

and
B
...
x/ D

n 1
X

cj x j ;

j D0

where cj D aj C bj for j D 0; 1; : : : ; n 1
...
x/ D 6x 3 C 7x 2 10x C 9 and B
...
x/ D 4x 3 C 7x 2 6x C 4
...
x/ and B
...
x/ is a polynomial of degree-bound 2n 1 such that
C
...
x/B
...
You probably have multiplied polynomials before, by multiplying each term in A
...
x/
and then combining terms with equal powers
...
x/ D 6x 3 C 7x 2 10x C 9 and B
...
x/ is
C
...
1)

j D0

where
cj D

j
X
kD0

ak bj

k

:

(30
...
C / D degree
...
B/, implying that if A is a polynomial of degree-bound na and B is a polynomial of degree-bound nb , then C is a
polynomial of degree-bound na C nb 1
...

Chapter outline
Section 30
...
The straightforward methods for multiplying polynomials—equations (30
...
2)—take ‚
...
n/ time when we represent them
in point-value form
...
n lg n/ time by converting between the two representations
...
2
...
2, to perform the conversions
...
3 shows how to implement
the FFT quickly in both serial and parallel models
...

the symbol i exclusively to denote

30
...
In this section, we introduce the two representations and show
how to combine them so that we can multiply two degree-bound n polynomials
in ‚
...

Coefficient representation
Pn 1
j
of degreeA coefficient representation of a polynomial A
...
a0 ; a1 ; : : : ; an 1 /
...

The coefficient representation is convenient for certain operations on polynomials
...
x/ at a given
point x0 consists of computing the value of A
...
We can evaluate a polynomial
in ‚
...
x0 / D a0 C x0
...
a2 C

C x0
...
an 1 //

// :

30
...
a0 ; a1 ; : : : ; an 1 / and b D
...
n/ time: we just produce
the coefficient vector c D
...

Now, consider multiplying two degree-bound n polynomials A
...
x/ represented in coefficient form
...
1)
and (30
...
n2 /, since we must multiply
each coefficient in the vector a by each coefficient in the vector b
...
The resulting
coefficient vector c, given by equation (30
...
Since multiplying polynomials and
computing convolutions are fundamental computational problems of considerable
practical importance, this chapter concentrates on efficient algorithms for them
...
x/ of degree-bound n is a set of
n point-value pairs
f
...
x1 ; y1 /; : : : ;
...
xk /

(30
...
A polynomial has many different point-value representations, since we can use any set of n distinct points x0 ; x1 ; : : : ; xn 1 as a basis for
the representation
...
xk / for k D 0; 1; : : : ; n 1
...
n2 /
...
n lg n/
...
The following theorem shows
that interpolation is well defined when the desired interpolating polynomial must
have a degree-bound equal to the given number of point-value pairs
...
1 (Uniqueness of an interpolating polynomial)
For any set f
...
x1 ; y1 /; : : : ;
...
x/ of degree-bound n
such that yk D A
...


902

Chapter 30 Polynomials and the FFT

Proof The proof relies on the existence of the inverse of a certain matrix
...
3) is equivalent to the matrix equation

˙1
1
:
:
:

2
x0
2
x1
:
:
:

x0
x1
:
:
:

1 xn

1

2
xn

1

n
x0
n
x1
:
::
:
:
:
n
xn

1
1

1
1

˙

˙

a0
a1
:
:
:
an

D
1

y0
y1
:
:
:
yn

:

(30
...
x0 ; x1 ; : : : ; xn 1 / and is known as a Vandermonde matrix
...
xk xj / ;
0Äj
and therefore, by Theorem D
...
Thus, we can solve for the coefficients aj uniquely given the point-value
representation:
a D V
...
1 describes an algorithm for interpolation based on
solving the set (30
...
Using the LU decomposition algorithms
of Chapter 28, we can solve these equations in time O
...

A faster algorithm for n-point interpolation is based on Lagrange’s formula:
Y

...
5)
yk Y
A
...
xk xj /
kD0
j ¤k

You may wish to verify that the right-hand side of equation (30
...
xk / D yk for all k
...
1-5 asks you
how to compute the coefficients of A using Lagrange’s formula in time ‚
...

Thus, n-point evaluation and interpolation are well-defined inverse operations
that transform between the coefficient representation of a polynomial and a pointvalue representation
...
n2 /
...
For addition, if C
...
x/ C B
...
xk / D A
...
xk / for
any point xk
...
Although
the approaches described here are mathematically correct, small differences in the inputs or round-off
errors during computation can cause large differences in the result
...
1 Representing polynomials

903

f
...
x1 ; y1 /; : : : ;
...
x0 ; y0 /;
...
xn 1 ; yn 1 /g

(note that A and B are evaluated at the same n points), then a point-value representation for C is
0
0
f
...
x1 ; y1 C y1 /; : : : ;
...
n/
...
If C
...
x/B
...
xk / D A
...
xk / for any point xk , and
we can pointwise multiply a point-value representation for A by a point-value representation for B to obtain a point-value representation for C
...
C / D degree
...
B/; if A and B are of
degree-bound n, then C is of degree-bound 2n
...
When we
multiply these together, we get n point-value pairs, but we need 2n pairs to interpolate a unique polynomial C of degree-bound 2n
...
1-4
...
Given an extended point-value representation
for A,
f
...
x1 ; y1 /; : : : ;
...
x0 ; y0 /;
...
x2n 1 ; y2n 1 /g ;

then a point-value representation for C is
0
0
0
f
...
x1 ; y1 y1 /; : : : ;
...
n/, much less than
the time required to multiply polynomials in coefficient form
...
For this problem, we know of no simpler approach than converting the
polynomial to coefficient form first, and then evaluating it at the new point
...
n2 /

Evaluation
Time ‚
...
!2n /; B
...
!2n /; B
...
!2n 1 /; B
...
n lg n/

Pointwise multiplication
Time ‚
...
!2n /
1
C
...
!2n 1 /

Figure 30
...
Representations
on the top are in coefficient form, while those on the bottom are in point-value form
...
The !2n terms are complex
...


on whether we can convert a polynomial quickly from coefficient form to pointvalue form (evaluate) and vice versa (interpolate)
...
n lg n/
time
...
2, if we choose “complex roots of unity” as
the evaluation points, we can produce a point-value representation by taking the
discrete Fourier transform (or DFT) of a coefficient vector
...
Section 30
...
n lg n/ time
...
1 shows this strategy graphically
...
The product of two polynomials of degree-bound n is a polynomial of
degree-bound 2n
...

Because the vectors have 2n elements, we use “complex
...
1
...
n lg n/-time procedure for multiplying
two polynomials A
...
x/ of degree-bound n, where the input and output
representations are in coefficient form
...

1
...
x/ and B
...


30
...
Evaluate: Compute point-value representations of A
...
x/ of length 2n
by applying the FFT of order 2n on each polynomial
...
2n/th roots of unity
...
Pointwise multiply: Compute a point-value representation for the polynomial
C
...
x/B
...
This representation contains the value of C
...
2n/th root of unity
...
Interpolate: Create the coefficient representation of the polynomial C
...

Steps (1) and (3) take time ‚
...
n lg n/
...

Theorem 30
...
n lg n/, with both
the input and output representations in coefficient form
...
1-1
Multiply the polynomials A
...
1) and (30
...


x2 C x

10 and B
...
1-2
Another way to evaluate a polynomial A
...
x/ by the polynomial
...
x/
of degree-bound n 1 and a remainder r, such that
A
...
x/
...
x0 / D r
...
x/ in time ‚
...

30
...
x/ D j D0 an 1 j x j from a pointPn 1
value representation for A
...

30
...
(Hint: Using Theorem 30
...
1-5
Show how to use equation (30
...
n2 /
...
x xj / and then divide by

...
1-2
...
n/
...
1-6
Explain what is wrong with the “obvious” approach to polynomial division using
a point-value representation, i
...
, dividing the corresponding y values
...

30
...
We
wish to compute the Cartesian sum of A and B, defined by
C D fx C y W x 2 A and y 2 Bg :
Note that the integers in C are in the range from 0 to 20n
...
Show how to solve the problem in O
...
(Hint:
Represent A and B as polynomials of degree at most 10n
...
2 The DFT and FFT
In Section 30
...
n lg n/ time
...
n lg n/ time
...

To interpret this formula, we use the definition of the exponential of a complex
number:
e i u D cos
...
u/ :
Figure 30
...
The value

30
...
2 The values of !8 ; !8 ; : : : ; !8 in the complex plane, where !8 D e 2
cipal 8th root of unity
...
6)

is the principal nth root of unity;2 all other complex nth roots of unity are powers
of !n
...
3)
...
Zn ; C/ modulo n, since !n D !n D 1 implies that
j k
j

...
Similarly, !n 1 D !n 1
...

Lemma 30
...
7)

The lemma follows directly from equation (30
...
This alternative definition tends to be
n
n
used for signal-processing applications
...


908

Chapter 30 Polynomials and the FFT

Corollary 30
...
2-1
...
5 (Halving lemma)
If n > 0 is even, then the squares of the n complex nth roots of unity are the n=2
complex
...

k
k
Proof By the cancellation lemma, we have
...
Note that if we square all of the complex nth roots of unity, then we
obtain each
...
!n
D
D
D

2kCn
!n
2k n
!n !n
2k
!n
k

...
We could also have used CorolThus, !n and !n
n=2
kCn=2
k
D !n , and
lary 30
...
!n /
...
!n

As we shall see, the halving lemma is essential to our divide-and-conquer approach for converting between coefficient and point-value representations of polynomials, since it guarantees that the recursive subproblems are only half as large
...
6 (Summation lemma)
For any integer n 1 and nonzero integer k not divisible by n,
n 1
X

k
!n

j

D0:

j D0

Proof
have

Equation (A
...
2 The DFT and FFT
n 1
X

k
!n

j

D

j D0

909

k

...
!n /k 1
k
!n 1
k
1

...


The DFT
Recall that we wish to evaluate a polynomial
A
...
We assume that A is given in coefficient form: a D
...
Let
us define the results yk , for k D 0; 1; : : : ; n 1, by
k
yk D A
...
8)

j D0

The vector y D
...
a0 ; a1 ; : : : ; an 1 /
...
a/
...
a/ in time ‚
...
n2 / time of the straightforward
method
...
Although strategies

3 The length n is actually what we referred to as 2n in Section 30
...
In the context of polynomial multiplication, therefore,
we are actually working with complex
...


910

Chapter 30 Polynomials and the FFT

for dealing with non-power-of-2 sizes are known, they are beyond the scope of this
book
...
x/ separately to define the two new polynomials
AŒ0
...
x/ of degree-bound n=2:
AŒ0
...
x/ D a1 C a3 x C a5 x 2 C

C an 2 x n=2
C an 1 x n=2

1
1

;
:

Note that AŒ0 contains all the even-indexed coefficients of A (the binary representation of the index ends in 0) and AŒ1 contains all the odd-indexed coefficients (the
binary representation of the index ends in 1)
...
x/ D AŒ0
...
x 2 / ;
0
1
n
so that the problem of evaluating A
...
9)
1

reduces to

1
...
x/ and AŒ1
...
!n /2 ;
...
!n 1 /2 ;

(30
...
combining the results according to equation (30
...

By the halving lemma, the list of values (30
...
n=2/th roots of unity, with each root occurring
exactly twice
...
n=2/th roots of unity
...

We have now successfully divided an n-element DFTn computation into two n=2element DFTn=2 computations
...
a0 ; a1 ; : : : ; an 1 /, where n is a power of 2
...
2 The DFT and FFT

911

R ECURSIVE -FFT
...
a0 ; a2 ; : : : ; an 2 /
7 aŒ1 D
...
aŒ0 /
9 y Œ1 D R ECURSIVE -FFT
...
n=2/ D yk
! yk
13
! D ! !n
14 return y
/ y is assumed to be a column vector
/
The R ECURSIVE -FFT procedure works as follows
...
Lines
4, 5, and 13 guarantee that ! is updated properly so that whenever lines 11–12
k
are executed, we have ! D !n
...
) Lines 8–9 perform the recursive DFTn=2 computations, setting, for k D
0; 1; : : : ; n=2 1,
Œ0
k
yk D AŒ0
...
!n=2 / ;
k
2k
or, since !n=2 D !n by the cancellation lemma,
Œ0
2k
yk D AŒ0
...
!n / :

912

Chapter 30 Polynomials and the FFT

Lines 11–12 combine the results of the recursive DFTn=2 calculations
...
!n / C !n AŒ1
...
9))
...
!n /

For yn=2 ; yn=2C1 ; : : : ; yn 1 , letting k D 0; 1; : : : ; n=2
Œ0
ykC
...
n=2/ Œ1
D yk C !n
yk

kC
...
n=2/ Œ1
2k
A
...
!n / C !n
2kCn
kC
...
!n / (since !n
D !n )
D AŒ0
...
n=2/
/
(by equation (30
...

D A
...

Œ1
k
Lines 11 and 12 multiply each value yk by !n , for k D 0; 1; : : : ; n=2 1
...
Because we use each
k
k
factor !n in both its positive and negative forms, we call the factors !n twiddle
factors
...
n/, where n is the
length of the input vector
...
n/ D 2T
...
n/
D ‚
...
n lg n/ using the fast Fourier transform
...
We interpolate by writing the DFT
as a matrix equation and then looking at the form of the matrix inverse
...
4), we can write the DFT as the matrix product y D Vn a,
where Vn is a Vandermonde matrix containing the appropriate powers of !n :

30
...
n
!n

1
3
!n
6
!n
9
!n
:
:
:
1/

3
...
n
!n 1/
3
...
n
!n

1/
...
k; j / entry of Vn is !n , for j; k D 0; 1; : : : ; n 1
...

For the inverse operation, which we write as a D DFTn 1
...


Theorem 30
...
j; k/ entry of Vn 1 is !n kj =n
...
Consider the
...
!n kj =n/
...
j
!n

0

j/

=n :

kD0

This summation equals 1 if j 0 D j , and it is 0 otherwise by the summation lemma
(Lemma 30
...
Note that we rely on
...

Given the inverse matrix Vn 1 , we have that DFTn 1
...
11)

kD0

for j D 0; 1; : : : ; n 1
...
8) and (30
...
2-4)
...
n lg n/ time as well
...
n lg n/
...


914

Chapter 30 Polynomials and the FFT

Theorem 30
...
DFT2n
...
b// ;

where the vectors a and b are padded with 0s to length 2n and denotes the componentwise product of two 2n-element vectors
...
2-1
Prove Corollary 30
...

30
...
0; 1; 2; 3/
...
2-3
Do Exercise 30
...
n lg n/-time scheme
...
2-4
Write pseudocode to compute DFTn 1 in ‚
...

30
...
Give a recurrence for the running time, and solve the recurrence
...
2-6 ?
Suppose that instead of performing an n-element FFT over the field of complex
numbers (where n is even), we use the ring Zm of integers modulo m, where
m D 2t n=2 C 1 and t is an arbitrary positive integer
...
Prove that the DFT and the inverse DFT
are well defined in this system
...
2-7
Given a list of values ´0 ; ´1 ; : : : ; ´n 1 (possibly with repetitions), show how to find
the coefficients of a polynomial P
...
Your procedure should run in time
O
...
(Hint: The polynomial P
...
x/ is a
multiple of
...
)
30
...
a0 ; a1 ; : : : ; an 1 / is the vector y D
Pn 1

...
The

30
...

Show how to evaluate the chirp transform in time O
...
(Hint: Use the equation
yk D ´k

2 =2

n 1
X

aj ´j

2 =2

Á
´


...
)

30
...
First, we
shall examine an iterative version of the FFT algorithm that runs in ‚
...
2
...
) Then, we shall use the insights that
led us to the iterative implementation to design an efficient parallel FFT circuit
...
In compiler terminology, we call such a value a
common subexpression
...

for k D 0 to n=2 1
Œ1
t D ! yk
Œ0
yk D yk C t
Œ0
t
ykC
...
3
...
In Figure 30
...
The tree has one node for each call of the procedure, labeled

916

Chapter 30 Polynomials and the FFT

Œ0
yk

Œ0
k Œ1
yk C !n yk

+

k
!n
Œ1
yk

Œ0
k Œ1
yk C !n yk

Œ0
yk
k
!n

Œ0
yk





k Œ1
!n yk

Œ0
yk

Œ1
yk

(a)

k Œ1
!n yk

(b)

Figure 30
...
(a) The two input values enter from the left, the twiddle facŒ1
k
tor !n is multiplied by yk , and the sum and difference are output on the right
...
We will use this representation in a parallel FFT circuit
...
4 The tree of input vectors to the recursive calls of the R ECURSIVE -FFT procedure
...


by the corresponding input vector
...
The first call appears in
the left child, and the second call appears in the right child
...
First, we take the elements in pairs, compute the DFT of each pair using
one butterfly operation, and replace the pair with its DFT
...
Next, we take these n=2 DFTs in pairs and compute the
DFT of the four vector elements they come from by executing two butterfly operations, replacing two 2-element DFTs with one 4-element DFT
...
We continue in this manner until the vector holds two

...

To turn this bottom-up approach into code, we use an array AŒ0 : : n 1 that
initially holds the elements of the input vector a in the order in which they appear

30
...
4
...
) Because we have to combine
DFTs on each level of the tree, we introduce a variable s to count the levels, ranging
from 1 (at the bottom, when we are combining pairs to form 2-element DFTs)
to lg n (at the top, when we are combining two
...
The algorithm therefore has the following structure:
1 for s D 1 to lg n
2
for k D 0 to n 1 by 2s
3
combine the two 2s 1 -element DFTs in
AŒk : : k C 2s 1 1 and AŒk C 2s 1 : : k C 2s
into one 2s -element DFT in AŒk : : k C 2s 1



We can express the body of the loop (line 3) as more precise pseudocode
...
The twiddle factor used in each butterfly operation depends on the value of s; it is a power of !m ,
where m D 2s
...
)
We introduce another temporary variable u that allows us to perform the butterfly
operation in place
...
The code first calls the auxiliary procedure
B IT-R EVERSE -C OPY
...

I TERATIVE -FFT
...
a; A/
2 n D a:length
/ n is a power of 2
/
3 for s D 1 to lg n
4
m D 2s
5
!m D e 2 i=m
6
for k D 0 to n 1 by m
7
! D1
8
for j D 0 to m=2 1
9
t D ! AŒk C j C m=2
10
u D AŒk C j 
11
AŒk C j  D u C t
12
AŒk C j C m=2 D u t
13
! D ! !m
14 return A
How does B IT-R EVERSE -C OPY get the elements of the input vector a into the
desired order in the array A? The order in which the leaves appear in Figure 30
...
That is, if we let rev
...
k/
...
4, for example, the leaves appear in the order 0; 4; 2; 6; 1; 5; 3; 7; this sequence in binary is
000; 100; 010; 110; 001; 101; 011; 111, and when we reverse the bits of each value
we get the sequence 000; 001; 010; 011; 100; 101; 110; 111
...
Stripping off the low-order bit at each level, we continue this process down the tree, until we get the order given by the bit-reversal
permutation at the leaves
...
k/, the B IT-R EVERSE -C OPY procedure is simple:
B IT-R EVERSE -C OPY
...
k/ D ak
The iterative FFT implementation runs in time ‚
...
The call to B ITR EVERSE -C OPY
...
n lg n/ time, since we iterate n times
and can reverse an integer between 0 and n 1, with lg n bits, in O
...

(In practice, because we usually know the initial value of n in advance, we would
probably code a table mapping k to rev
...
n/ time with a low hidden constant
...
) To complete the
proof that I TERATIVE -FFT runs in time ‚
...
n/, the number
of times the body of the innermost loop (lines 8–13) executes, is ‚
...
The
for loop of lines 6–13 iterates n=m D n=2s times for each value of s, and the
innermost loop of lines 8–13 iterates m=2 D 2s 1 times
...
n/ D

D

lg n
X n
2s
2s
sD1
lg n
Xn
sD1

2

D ‚
...
3 Efficient FFT implementations

919

a0

y0
0
!2

a1

y1
0
!4

a2

y2
1
!4

0
!2

a3

y3
0
!8

a4

y4
0
!2

1
!8

a5

y5
0
!4

2
!8

1
!4

3
!8

a6

y6
0
!2

a7

y7
stage s D 1

stage s D 2

stage s D 3

Figure 30
...
Each
butterfly operation takes as input the values on two wires, along with a twiddle factor, and it produces
as outputs the values on two wires
...
Only the top and bottom wires passing
through a butterfly interact with it; wires that pass through the middle of a butterfly do not affect
that butterfly, nor are their values changed by that butterfly
...
This circuit has depth ‚
...
n lg n/ butterfly operations altogether
...
We
will express the parallel FFT algorithm as a circuit
...
5 shows a parallel
FFT circuit, which computes the FFT on n inputs, for n D 8
...
The depth of the circuit—the
maximum number of computational elements between any output and any input
that can reach it—is therefore ‚
...

The leftmost part of the parallel FFT circuit performs the bit-reverse permutation, and the remainder mimics the iterative I TERATIVE -FFT procedure
...
The value of s in each iteration within

920

Chapter 30 Polynomials and the FFT

I TERATIVE -FFT corresponds to a stage of butterflies shown in Figure 30
...
For
s D 1; 2; : : : ; lg n, stage s consists of n=2s groups of butterflies (corresponding to
each value of k in I TERATIVE -FFT), with 2s 1 butterflies per group (corresponding
to each value of j in I TERATIVE -FFT)
...
5 correspond to the butterfly operations of the innermost loop (lines 9–12 of I TERATIVE FFT)
...

Exercises
30
...
0; 2; 3; 1; 4;
5; 7; 9/
...
3-2
Show how to implement an FFT algorithm with the bit-reversal permutation occurring at the end, rather than at the beginning, of the computation
...
)
30
...

30
...
Suppose that exactly one adder has failed, but that you don’t know
which one
...
How efficient is your method?

Problems
30-1 Divide-and-conquer multiplication
a
...
(Hint: One of the multiplications is
...
c C d /
...
Give two divide-and-conquer algorithms for multiplying two polynomials of
degree-bound n in ‚
...
The first algorithm should divide the input
polynomial coefficients into a high half and a low half, and the second algorithm
should divide them according to whether their index is odd or even
...
Show how to multiply two n-bit integers in O
...

30-2 Toeplitz matrices
A Toeplitz matrix is an n n matrix A D
...


1;j 1

for

a
...
Describe how to represent a Toeplitz matrix so that you can add two n
Toeplitz matrices in O
...


n

c
...
n lg n/-time algorithm for multiplying an n n Toeplitz matrix by a
vector of length n
...

d
...
Analyze
its running time
...
8) to d dimensions
...
aj1 ;j2 ;:::;jd /
whose dimensions are n1 ; n2 ; : : : ; nd , where n1 n2 nd D n
...
, 0 Ä kd < nd
...
Show that we can compute a d -dimensional DFT by computing 1-dimensional
DFTs on each dimension in turn
...
Then, using the result of the DFTs
along dimension 1 as the input, we compute n=n2 separate 1-dimensional DFTs
along dimension 2
...

b
...


922

Chapter 30 Polynomials and the FFT

c
...
n lg n/,
independent of d
...
x/ of degree-bound n, we define its tth derivative by

„ A
...
t 1/
...
t /
...
a0 ; a1 ; : : : ; an 1 / of A
...
t /
...

a
...
x/ D

n 1
X

bj
...
t /
...
Explain how to find b0 ; b1 ; : : : ; bn
k D 0; 1; : : : ; n 1
...
n/ time
...
n lg n/ time, given A
...
Prove that
A
...
j /g
...
j / D aj j Š and
(
g
...
l/Š if
...
Explain how to evaluate A
...
n lg n/
time
...
x/ at x0 in
O
...


Problems for Chapter 30

923

30-5 Polynomial evaluation at multiple points
We have seen how to evaluate a polynomial of degree-bound n at a single point in
O
...
We have also discovered how to evaluate such a
polynomial at all n complex roots of unity in O
...
We
shall now show how to evaluate a polynomial of degree-bound n at n arbitrary
points in O
...

To do so, we shall assume that we can compute the polynomial remainder when
one such polynomial is divided by another in O
...
For example, the remainder of 3x 3 C x 2 3x C 1 when divided by
x 2 C x C 2 is

...
x 2 C x C 2/ D

7x C 5 :

Pn 1
Given the coefficient representation of a polynomial A
...
xQ A
...
xn 1 /
...
x/ D kDi
...
x/ D A
...
x/
...
x/ has degree at most j i
...
Prove that A
...
x

´/ D A
...


b
...
x/ D A
...
x/ D A
...

c
...
x/ D Qij
...
x/ and
Qkj
...
x/ mod Pkj
...

d
...
n lg2 n/-time algorithm to evaluate A
...
x1 /; : : : ; A
...

30-6 FFT using modular arithmetic
As defined, the discrete Fourier transform requires us to compute with complex
numbers, which can result in a loss of precision due to round-off errors
...
An example of such a problem is that of multiplying two polynomials
with integer coefficients
...
2-6 gives one approach, using a modulus of
length
...
This problem gives another approach, which uses a modulus of the more reasonable length O
...
Let n be a power of 2
...
Suppose that we search for the smallest k such that p D k n C 1 is prime
...

(The value of k might be much larger or smaller, but we can reasonably expect
to examine O
...
) How does the expected
length of p compare to the length of n?

924

Chapter 30 Polynomials and the FFT

Let g be a generator of Zp , and let w D g k mod p
...
Argue that the DFT and the inverse DFT are well-defined inverse operations
modulo p, where w is used as a principal nth root of unity
...
Show how to make the FFT and its inverse work modulo p in time O
...
lg n/ bits take unit time
...

d
...
0; 5; 3; 7; 7; 2; 1; 6/
...


Chapter notes
Van Loan’s book [343] provides an outstanding treatment of the fast Fourier transform
...
For an excellent introduction
to signal processing, a popular FFT application area, see the texts by Oppenheim
and Schafer [266] and Oppenheim and Willsky [267]
...

Fourier analysis is not limited to 1-dimensional data
...
The books by Gonzalez and
Woods [146] and Pratt [281] discuss multidimensional Fourier transforms and their
use in image processing, and books by Tolimieri, An, and Lu [338] and Van Loan
[343] discuss the mathematics of multidimensional fast Fourier transforms
...

The FFT had in fact been discovered many times previously, but its importance was
not fully realized before the advent of modern digital computers
...
F
...

Frigo and Johnson [117] developed a fast and flexible implementation of the
FFT, called FFTW (“fastest Fourier transform in the West”)
...
Before
actually computing the DFTs, FFTW executes a “planner,” which, by a series of
trial runs, determines how best to decompose the FFT computation for the given
problem size on the host machine
...
Furthermore, FFTW has the unusual advantage of taking

...


Notes for Chapter 30

925

Although the standard Fourier transform assumes that the input represents points
that are uniformly spaced in the time domain, other techniques can approximate the
FFT on “nonequispaced” data
...


31

Number-Theoretic Algorithms

Number theory was once viewed as a beautiful but largely useless subject in pure
mathematics
...
These
schemes are feasible because we can find large primes easily, and they are secure
because we do not know how to factor the product of large primes (or solve related
problems, such as computing discrete logarithms) efficiently
...

Section 31
...
Section 31
...
Section 31
...
Section 31
...
mod n/ by using Euclid’s algorithm
...
5
...
6
considers powers of a given number a, modulo n, and presents a repeated-squaring
algorithm for efficiently computing ab mod n, given a, b, and n
...
Section 31
...
Section 31
...
We can use this test to find large primes efficiently,
which we need to do in order to create keys for the RSA cryptosystem
...
9 reviews a simple but effective heuristic for factoring small integers
...

Size of inputs and cost of arithmetic computations
Because we shall be working with large integers, we need to adjust how we think
about the size of an input and about the cost of elementary arithmetic operations
...
Thus,

31
...
An algorithm
with integer inputs a1 ; a2 ; : : : ; ak is a polynomial-time algorithm if it runs in time
polynomial in lg a1 ; lg a2 ; : : : ; lg ak , that is, polynomial in the lengths of its binaryencoded inputs
...
By counting the number of such
arithmetic operations that an algorithm performs, we have a basis for making a
reasonable estimate of the algorithm’s actual running time on a computer
...
It
thus becomes convenient to measure how many bit operations a number-theoretic
algorithm requires
...
ˇ 2 / bit operations
...
ˇ 2 / by simple algorithms
...
1-12
...
For example, a simple divide-and-conquer method for multiplying two
ˇ-bit integers has a running time of ‚
...
ˇ lg ˇ lg lg ˇ/
...
ˇ 2 /
algorithm is often best, and we shall use this bound as a basis for our analyses
...


31
...

Divisibility and divisors
The notion of one integer being divisible by another is key to the theory of numbers
...

Every integer divides 0
...
If d j a, then we also
say that a is a multiple of d
...

If d j a and d
0, we say that d is a divisor of a
...
A

928

Chapter 31 Number-Theoretic Algorithms

divisor of a nonzero integer a is at least 1 but not greater than jaj
...

Every positive integer a is divisible by the trivial divisors 1 and a
...
For example, the factors of 20 are 2, 4, 5, and 10
...
Primes have many special properties and play a
critical role in number theory
...
1-2 asks you to prove that there are infinitely many primes
...
For
example, 39 is composite because 3 j 39
...
Similarly, the integer 0 and all negative integers are
neither prime nor composite
...
Much number theory is based upon refining
this partition by classifying the nonmultiples of n according to their remainders
when divided by n
...

We omit the proof (but see, for example, Niven and Zuckerman [265])
...
1 (Division theorem)
For any integer a and any positive integer n, there exist unique integers q and r
such that 0 Ä r < n and a D q n C r
...
The value r D a mod n
is the remainder (or residue) of the division
...

We can partition the integers into n equivalence classes according to their remainders modulo n
...
Using the notation defined on page 54, we can say that writing
a 2 Œbn is the same as writing a Á b
...
The set of all such equivalence
classes is

31
...
1)

When you see the definition
Zn D f0; 1; : : : ; n

1g ;

(31
...
1) with the understanding that 0
represents Œ0n , 1 represents Œ1n , and so on; each class is represented by its smallest
nonnegative element
...
For example, if we refer to 1 as a member of Zn , we are really referring
to Œn 1n , since 1 Á n 1
...

Common divisors and greatest common divisors
If d is a divisor of a and d is also a divisor of b, then d is a common divisor of a
and b
...
Note that 1 is a common divisor
of any two integers
...
a C b/ and d j
...
3)

More generally, we have that
d j a and d j b implies d j
...
4)

for any integers x and y
...
5)

The greatest common divisor of two integers a and b, not both zero, is the
largest of the common divisors of a and b; we denote it by gcd
...
For example,
gcd
...
5; 7/ D 1, and gcd
...
If a and b are both nonzero,
then gcd
...
jaj ; jbj/
...
0; 0/ to
be 0; this definition is necessary to make standard properties of the gcd function
(such as equation (31
...

The following are elementary properties of the gcd function:
gcd
...
a; b/
gcd
...
a; 0/
gcd
...
b; a/ ;
gcd
...
jaj ; jbj/ ;
jaj ;
for any k 2 Z :
jaj

(31
...
7)
(31
...
9)
(31
...
a; b/
...
2
If a and b are any integers, not both zero, then gcd
...

Proof Let s be the smallest positive such linear combination of a and b, and let
s D ax C by for some x; y 2 Z
...
Equation (3
...
ax C by/
D a
...
qy/ ;
and so a mod s is a linear combination of a and b as well
...
Therefore, we have that s j a and, by analogous reasoning, s j b
...
Equation (31
...
a; b/
implies that gcd
...
a; b/ divides both a and b and s is a linear
combination of a and b
...
a; b/ j s and s > 0 imply that gcd
...

Combining gcd
...
a; b/ Ä s yields gcd
...
We conclude
that s is the greatest common divisor of a and b
...
3
For any integers a and b, if d j a and d j b, then d j gcd
...

Proof This corollary follows from equation (31
...
a; b/ is a linear
combination of a and b by Theorem 31
...

Corollary 31
...
an; bn/ D n gcd
...
If n > 0, then gcd
...

Corollary 31
...
a; n/ D 1, then n j b
...
1-5
...
1 Elementary number-theoretic notions

931

Relatively prime integers
Two integers a and b are relatively prime if their only common divisor is 1, that
is, if gcd
...
For example, 8 and 15 are relatively prime, since the divisors
of 8 are 1, 2, 4, and 8, and the divisors of 15 are 1, 3, 5, and 15
...

Theorem 31
...
a; p/ D 1 and gcd
...
ab; p/ D 1
...
2 that there exist integers x, y, x 0 , and y 0 such

ax C py D 1 ;
bx 0 C py 0 D 1 :
Multiplying these equations and rearranging, we have
ab
...
ybx 0 C y 0 ax C pyy 0 / D 1 :
Since 1 is thus a positive linear combination of ab and p, an appeal to Theorem 31
...

Integers n1 , n2 ,
...
ni ; nj / D 1
...

Theorem 31
...

Proof Assume for the purpose of contradiction that p j ab, but that p − a and
p − b
...
a; p/ D 1 and gcd
...
Theorem 31
...
ab; p/ D 1, contradicting our assumption that p j ab, since p j ab
implies gcd
...
This contradiction completes the proof
...
7 is that we can uniquely factor any composite
integer into a product of primes
...
8 (Unique factorization)
There is exactly one way to write any composite integer a as a product of the form
e
e
a D p11 p22

e
pr r ;

where the pi are prime, p1 < p2 <
Proof

< pr , and the ei are positive integers
...
1-11
...

Exercises
31
...

31
...
(Hint: Show that none of the primes
p1 ; p2 ; : : : ; pk divide
...
)
31
...

31
...
k; p/ D 1
...
1-5
Prove Corollary 31
...

31
...
Conclude that for all integers


...
mod p/ :
31
...
x mod b/ mod a D x mod a
for any x
...
mod b/ implies x Á y
...


31
...
1-8
For any integer k > 0, an integer n is a kth power if there exists an integer a such
that ak D n
...
Show how to determine whether a given ˇ-bit integer n is a
nontrivial power in time polynomial in ˇ
...
1-9
Prove equations (31
...
10)
...
1-10
Show that the gcd operator is associative
...
a; gcd
...
gcd
...
1-11 ?
Prove Theorem 31
...

31
...
Your algorithms should run in time ‚
...

31
...
Argue that if multiplication or division of integers whose length
is at most ˇ takes time M
...
M
...
(Hint: Use a divide-and-conquer approach, obtaining the top and
bottom halves of the result with separate recursions
...
2 Greatest common divisor
In this section, we describe Euclid’s algorithm for efficiently computing the greatest common divisor of two integers
...

We restrict ourselves in this section to nonnegative integers
...
8), which states that gcd
...
jaj ; jbj/
...
a; b/ for positive integers a and b from the
prime factorizations of a and b
...
11)

f
f
b D p1 1 p2 2

f
pr r ;

(31
...
2-1 asks you to show,
min
...
e
gcd
...
e
pr r ;fr / :

(31
...
9, however, the best algorithms to date for factoring
do not run in polynomial time
...

Euclid’s algorithm for computing greatest common divisors relies on the following theorem
...
9 (GCD recursion theorem)
For any nonnegative integer a and any positive integer b,
gcd
...
b; a mod b/ :
Proof We shall show that gcd
...
b; a mod b/ divide each other, so
that by equation (31
...

We first show that gcd
...
b; a mod b/
...
a; b/, then
d j a and d j b
...
8), a mod b D a qb, where q D ba=bc
...
4) implies that
d j
...
Therefore, since d j b and d j
...
3 implies
that d j gcd
...
a; b/ j gcd
...
14)

Showing that gcd
...
a; b/ is almost the same
...
b; a mod b/, then d j b and d j
...
Since a D qb C
...
a mod b/
...
4), we conclude that d j a
...
a; b/ by Corollary 31
...
b; a mod b/ j gcd
...
15)

Using equation (31
...
14) and (31
...


31
...
C
...
We express Euclid’s algorithm as a
recursive program based directly on Theorem 31
...
The inputs a and b are arbitrary
nonnegative integers
...
a; b/
1 if b == 0
2
return a
3 else return E UCLID
...
30; 21/:
E UCLID
...
21; 9/
E UCLID
...
3; 0/
3:

This computation calls E UCLID recursively three times
...
9 and the property that if
the algorithm returns a in line 2, then b D 0, so that equation (31
...
a; b/ D gcd
...
The algorithm cannot recurse indefinitely, since the
second argument strictly decreases in each recursive call and is always nonnegative
...

The running time of Euclid’s algorithm
We analyze the worst-case running time of E UCLID as a function of the size of
a and b
...
To justify this
assumption, observe that if b > a 0, then E UCLID
...
b; a/
...
Similarly, if b D a > 0, the procedure terminates after one recursive call,
since a mod b D 0
...
Our analysis makes use of the Fibonacci numbers Fk , defined by
the recurrence (3
...

Lemma 31
...
a; b/ performs k
a FkC2 and b FkC1
...
For the basis of the induction, let
2 D F3
...
Then, b
1 D F2 , and since a > b, we must have a
b >
...

Assume inductively that the lemma holds if k 1 recursive calls are made; we
shall then prove that the lemma holds for k recursive calls
...
a; b/ calls E UCLID
...
The inductive hypothesis then implies that b FkC1
(thus proving part of the lemma), and a mod b Fk
...
a mod b/ D b C
...
Thus,

b C
...


Theorem 31
...
a; b/
makes fewer than k recursive calls
...
11 is the best possible by
1 recursive calls
showing that the call E UCLID
...
We use induction on k
...
F3 ; F2 / makes exactly one recursive call, to E UCLID
...
(We have to
start at k D 2, because when k D 1 we do not have F2 > F1
...
Fk ; Fk 1 / makes exactly k 2 recursive calls
...
1-1,
we have FkC1 mod Fk D Fk 1
...
FkC1 ; Fk / D gcd
...
Fk ; Fk 1 / :
Therefore, the call E UCLID
...
Fk ; Fk 1 /, or exactly k 1 times, meeting the upper bound of Theorem 31
...

p
p
Since Fk is approximately k = 5, where is the golden ratio
...
24), the number of recursive calls in E UCLID is O
...
(See

31
...
1 How E XTENDED -E UCLID computes gcd
...
Each line shows one level of the
recursion: the values of the inputs a and b, the computed value ba=bc, and the values d , x, and y
returned
...
d; x; y/ returned becomes the triple
...
The call E XTENDED -E UCLID
...
3; 11; 14/, so that gcd
...
11/ C 78 14
...
2-5 for a tighter bound
...
ˇ/ arithmetic operations and O
...
ˇ 2 / bit operations)
...
ˇ 2 / bound on the number of bit
operations
...

Specifically, we extend the algorithm to compute the integer coefficients x and y
such that
d D gcd
...
16)

Note that x and y may be zero or negative
...
The procedure E XTENDED E UCLID takes as input a pair of nonnegative integers and returns a triple of the
form
...
16)
...
a; b/
1 if b == 0
2
return
...
d 0 ; x 0 ; y 0 / D E XTENDED -E UCLID
...
d; x; y/ D
...
d; x; y/
Figure 31
...
99; 78/
...

Line 1 is equivalent to the test “b == 0” in line 1 of E UCLID
...
If b ¤ 0, E XTENDED -E UCLID first
computes
...
b; a mod b/ and
d 0 D bx 0 C
...
17)

As for E UCLID, we have in this case d D gcd
...
b; a mod b/
...
17)
using the equation d D d 0 and equation (3
...
a b ba=bc/y 0
D ay 0 C b
...

Since the number of recursive calls made in E UCLID is equal to the number
of recursive calls made in E XTENDED -E UCLID, the running times of E UCLID
and E XTENDED -E UCLID are the same, to within a constant factor
...
lg b/
...
2-1
Prove that equations (31
...
12) imply equation (31
...

31
...
d; x; y/ that the call E XTENDED -E UCLID
...

31
...
a; n/ D gcd
...
2-4
Rewrite E UCLID in an iterative form that uses only a constant amount of memory
(that is, stores only a constant number of integer values)
...
2-5
If a > b 0, show that the call E UCLID
...
Improve this bound to 1 C log
...
a; b//
...
2-6
What does E XTENDED -E UCLID
...


31
...
2-7
Define the gcd function for more than two arguments by the recursive equation
gcd
...
a0 ; gcd
...
Show that the gcd function
returns the same answer independent of the order in which its arguments are specified
...
a0 ; a1 ; : : : ; an / D
a0 x0 C a1 x1 C C an xn
...
n C lg
...

31
...
a1 ; a2 ; : : : ; an / to be the least common multiple of the n integers
a1 ; a2 ; : : : ; an , that is, the smallest nonnegative integer that is a multiple of each ai
...
a1 ; a2 ; : : : ; an / efficiently using the (two-argument) gcd
operation as a subroutine
...
2-9
Prove that n1 , n2 , n3 , and n4 are pairwise relatively prime if and only if
gcd
...
n1 n3 ; n2 n4 / D 1 :
More generally, show that n1 ; n2 ; : : : ; nk are pairwise relatively prime if and only
if a set of dlg ke pairs of numbers derived from the ni are relatively prime
...
3 Modular arithmetic
Informally, we can think of modular arithmetic as arithmetic as usual over the
integers, except that if we are working modulo n, then every result x is replaced
by the element of f0; 1; : : : ; n 1g that is equivalent to x, modulo n (that is, x is
replaced by x mod n)
...
A more formal model for modular
arithmetic, which we now give, is best described within the framework of group
theory
...
S; ˚/ is a set S together with a binary operation ˚ defined on S for
which the following properties hold:
1
...

2
...

3
...
a ˚ b/ ˚ c D a ˚
...


940

Chapter 31 Number-Theoretic Algorithms

4
...

As an example, consider the familiar group
...
If a group
...
If a group
...

The groups defined by modular addition and multiplication
We can form two finite abelian groups by using addition and multiplication modulo n, where n is a positive integer
...
1
...

We can easily define addition and multiplication operations for Zn , because the
equivalence class of two integers uniquely determines the equivalence class of their
sum or product
...
mod n/ and b Á b 0
...
mod n/ ;
ab
Á a0 b 0

...
18)

(We can define subtraction similarly on Zn by Œan n Œbn D Œa bn , but division is more complicated, as we shall see
...
We add, subtract,
and multiply as usual on the representatives, but we replace each result x by the
representative of its class, that is, by x mod n
...
Zn ; Cn /
...

Figure 31
...
Z6 ; C6 /
...
12
The system
...

Proof Equation (31
...
Zn ; Cn / is closed
...
3 Modular arithmetic

941

+6

0

1

2

3

4

5

·15

1

0
1
2

0
1
2
3
4
5

1
2
3
4
5
0

2
3
4
5
0
1

3
4
5
0
1
2

4
5
0
1
2
3

5
0
1
2
3
4

1
2
4
7
8
11
13
14

1 2 4 7 8 11 13 14
2 4 8 14 1 7 11 13
4 8 1 13 2 14 7 11
7 14 13 4 11 2 1 8
8 1 2 11 4 13 14 7
11 7 14 2 13 1 8 4
13 11 7 1 14 8 4 2
14 13 11 8 7 4 2 1

3
4
5

2

4

(a)

7

8

11 13 14

(b)

Figure 31
...
Equivalence classes are denoted by their representative elements
...
Z6 ; C6 /
...
Z15 ; 15 /
...
Œan Cn Œbn / Cn Œcn D
D
D
D
D

Œa C bn Cn Œcn
Œ
...
b C c/n
Œan Cn Œb C cn
Œan Cn
...
Zn ; Cn / is 0 (that is, Œ0n )
...

Using the definition of multiplication modulo n, we define the multiplicative
group modulo n as
...
The elements of this group are the set Zn of elements
in Zn that are relatively prime to n, so that each one has a unique inverse, modulo n:
Zn D fŒan 2 Zn W gcd
...
a C k n/

...
By Exercise 31
...
a; n/ D 1 implies
gcd
...
Since Œan D fa C k n W k 2 Zg, the set Zn
is well defined
...
(Here we denote an element Œa15 as a; for example, we denote Œ715 as 7
...
2(b) shows the
group
...
For example, 8 11 Á 13
...
The identity for this group is 1
...
13
The system
...

Proof Theorem 31
...
Zn ; n / is closed
...
12
...
To show the existence of inverses, let a be an element
of Zn and let
...
a; n/
...
19)

or, equivalently,
ax Á 1
...
Furthermore, we claim
that Œxn 2 Zn
...
19) demonstrates that the smallest positive linear combination of x and n must be 1
...
2 implies
that gcd
...
We defer the proof that inverses are uniquely defined until
Corollary 31
...

As an example of computing multiplicative inverses, suppose that a D 5 and
n D 11
...
a; n/ returns
...
1; 2; 1/, so that
1 D 5
...
Thus, Œ 211 (i
...
, Œ911 ) is the multiplicative inverse of Œ511
...
Zn ; Cn / and
...
Also,
equivalences modulo n may also be interpreted as equations in Zn
...
mod n/ ;
Œan n Œxn D Œbn :
As a further convenience, we sometimes refer to a group
...
We may thus refer to the groups

...
Zn ; n / as Zn and Zn , respectively
...
a 1 mod n/
...
mod n/
...
3 Modular arithmetic

943

we have that 7 1 Á 13
...
mod 15/, so that
4=7 Á 4 13 Á 7
...

The size of Zn is denoted
...
This function, known as Euler’s phi function,
satisfies the equation
Ã
Y Â
1
;
(31
...
n/ D n
p
p W p is prime and p j n

so that p runs over all the primes dividing n (including n itself, if n is prime)
...
Intuitively, we begin with a list of the n
remainders f0; 1; : : : ; n 1g and then, for each prime p that divides n, cross out
every multiple of p in the list
...
45/ D 45 1
3
5
 Ã Ã
4
2
D 45
3
5
D 24 :
If p is prime, then Zp D f1; 2; : : : ; p
Ã
Â
1

...
n/ < n
n

...
21)

1, although it can be shown that
(31
...
A somewhat simpler
(but looser) lower bound for n > 5 is
n
:
(31
...
n/ >
6 ln ln n
The lower bound (31
...
n/
De :
(31
...
S; ˚/ is a group, S 0 Â S, and
...
S 0 ; ˚/ is a subgroup
of
...
For example, the even integers form a subgroup of the integers under the
operation of addition
...


944

Chapter 31 Number-Theoretic Algorithms

Theorem 31
...
S; ˚/ is a finite group and S 0 is any nonempty subset of S such that a ˚ b 2 S 0
for all a; b 2 S 0 , then
...
S; ˚/
...
3-3
...

The following theorem provides an extremely useful constraint on the size of a
subgroup; we omit the proof
...
15 (Lagrange’s theorem)
If
...
S 0 ; ˚/ is a subgroup of
...

A subgroup S 0 of a group S is a proper subgroup if S 0 ¤ S
...
8 of the Miller-Rabin primality
test procedure
...
16
If S 0 is a proper subgroup of a finite group S, then jS 0 j Ä jSj =2
...
14 gives us an easy way to produce a subgroup of a finite group
...
Specifically, define a
...
k/ D

k
M

˚a
œ:

a D a˚a˚

i D1

k

For example, if we take a D 2 in the group Z6 , the sequence a
...
2/ ; a
...
k/ D ka mod n, and in the group Zn , we have a
...
We define the subgroup generated by a, denoted hai or
...
k/ W k

1g :

We say that a generates the subgroup hai or that a is a generator of hai
...
Since the associativity
of ˚ implies

31
...
i / ˚ a
...
i Cj / ;
hai is closed and therefore, by Theorem 31
...
For example,
in Z6 , we have
h0i D f0g ;
h1i D f0; 1; 2; 3; 4; 5g ;
h2i D f0; 2; 4g :
Similarly, in Z7 , we have
h1i D f1g ;
h2i D f1; 2; 4g ;
h3i D f1; 2; 3; 4; 5; 6g :
The order of a (in the group S), denoted ord
...
t / D e
...
17
For any finite group
...
a/ D jhaij
...
a/
...
t / D e and a
...
t / ˚ a
...
k/ for
k
1, if i > t, then a
...
j / for some j < i
...
t /
...
1/ ; a
...
t / g,
and so jhaij Ä t
...
1/ ; a
...
t / is distinct
...
i / D a
...
Then, a
...
j Ck/
for k 0
...
i C
...
j C
...
t j / < t but t is the least positive value such that a
...
Therefore, each element of the sequence a
...
2/ ; : : : ; a
...
We
conclude that ord
...

Corollary 31
...
1/ ; a
...
a/; that is, a
...
j /
if and only if i Á j
...

Consistent with the above corollary, we define a
...
i / as a
...
a/, for all integers i
...
19
If
...
jSj/ D e :

946

Chapter 31 Number-Theoretic Algorithms

Proof Lagrange’s theorem (Theorem 31
...
a/ j jSj, and so
jSj Á 0
...
a/
...
jSj/ D a
...

Exercises
31
...
Z4 ; C4 / and
...
Show that
these groups are isomorphic by exhibiting a one-to-one correspondence ˛ between
their elements such that a C b Á c
...
a/ ˛
...
c/

...

31
...

31
...
14
...
3-4
Show that if p is prime and e is a positive integer, then

...
p

1/ :

31
...
x/ D ax mod n is a permutation of Zn
...
4 Solving modular linear equations
We now consider the problem of finding solutions to the equation
ax Á b
...
25)

where a > 0 and n > 0
...
7
...
25)
...

Let hai denote the subgroup of Zn generated by a
...
x/ W x > 0g D
fax mod n W x > 0g, equation (31
...
Lagrange’s theorem (Theorem 31
...
The
following theorem gives us a precise characterization of hai
...
4 Solving modular linear equations

947

Theorem 31
...
a; n/, then
hai D hd i D f0; d; 2d; : : : ;
...
26)

in Zn , and thus
jhaij D n=d :
Proof We begin by showing that d 2 hai
...
a; n/
produces integers x 0 and y 0 such that ax 0 C ny 0 D d
...
mod n/, so
that d 2 hai
...

Since d 2 hai, it follows that every multiple of d belongs to hai, because any
multiple of a multiple of a is itself a multiple of a
...
n=d / 1/d g
...

We now show that hai  hd i
...
However, d j a and d j n, and
so d j m by equation (31
...
Therefore, m 2 hd i
...
To see that jhaij D n=d ,
observe that there are exactly n=d multiples of d between 0 and n 1, inclusive
...
21
The equation ax Á b
...
a; n/
...
mod n/ is solvable if and only if Œb 2 hai, which
is the same as saying

...
n=d /

1/d g ;

by Theorem 31
...
If 0 Ä b < n, then b 2 hai if and only if d j b, since the
members of hai are precisely the multiples of d
...
b mod n/, since b
and b mod n differ by a multiple of n, which is itself a multiple of d
...
22
The equation ax Á b
...
a; n/, or it has no solutions
...
mod n/ has a solution, then b 2 hai
...
17,
ord
...
18 and Theorem 31
...
If b 2 hai, then b
appears exactly d times in the sequence ai mod n, for i D 0; 1; : : : ; n 1, since

948

Chapter 31 Number-Theoretic Algorithms

the length-
...
The indices x of the d positions for which ax mod n D b are the solutions
of the equation ax Á b
...

Theorem 31
...
a; n/, and suppose that d D ax 0 C ny 0 for some integers x 0 and y 0
(for example, as computed by E XTENDED -E UCLID)
...
mod n/ has as one of its solutions the value x0 , where
x0 D x 0
...
b=d /
...
b=d /
...
mod n/ ;

(because ax 0 Á d
...
mod n/
...
24
Suppose that the equation ax Á b
...
a; n/) and that x0 is any solution to this equation
...
n=d / for
i D 0; 1; : : : ; d 1
...
n=d / < n for i D 0; 1; : : : ; d 1, the
values x0 ; x1 ; : : : ; xd 1 are all distinct, modulo n
...
mod n/, we have ax0 mod n Á b
...
Thus, for i D 0; 1; : : : ; d 1, we
have
axi mod n D
D
D
Á

a
...
ax0 C ai n=d / mod n
ax0 mod n (because d j a implies that ai n=d is a multiple of n)
b
...
mod n/, making xi a solution, too
...
22, the
equation ax Á b
...

We have now developed the mathematics needed to solve the equation ax Á b

...
The inputs
a and n are arbitrary positive integers, and b is an arbitrary integer
...
4 Solving modular linear equations

949

M ODULAR -L INEAR -E QUATION -S OLVER
...
d; x 0 ; y 0 / D E XTENDED -E UCLID
...
b=d / mod n
4
for i D 0 to d 1
5
print
...
n=d // mod n
6 else print “no solutions”
As an example of the operation of this procedure, consider the equation 14x Á
30
...
Calling E XTENDED E UCLID in line 1, we obtain
...
2; 7; 1/
...
Line 3 computes x0 D
...
15/ mod 100 D 95
...

The procedure M ODULAR -L INEAR -E QUATION -S OLVER works as follows
...
a; n/, along with two values x 0 and y 0 such that d D
ax 0 C ny 0 , demonstrating that x 0 is a solution to the equation ax 0 Á d
...

If d does not divide b, then the equation ax Á b
...
21
...
Otherwise, line 3 computes a solution x0 to ax Á b
...
23
...
24 states that
adding multiples of
...
The for
loop of lines 4–5 prints out all d solutions, beginning with x0 and spaced n=d
apart, modulo n
...
lg n C gcd
...
lg n/ arithmetic operations, and each iteration of the for loop of lines 4–5 performs a constant number of
arithmetic operations
...
24 give specializations of particular
interest
...
25
For any n > 1, if gcd
...
mod n/ has a unique
solution, modulo n
...

Corollary 31
...
a; n/ D 1, then the equation ax Á 1
...
Otherwise, it has no solution
...
26, we can use the notation a 1 mod n to refer to the
multiplicative inverse of a, modulo n, when a and n are relatively prime
...
a; n/ D 1, then the unique solution to the equation ax Á 1
...
a; n/ D 1 D ax C ny
implies ax Á 1
...
Thus, we can compute a
E XTENDED -E UCLID
...
4-1
Find all solutions to the equation 35x Á 10
...

31
...
mod n/ implies x Á y
...
a; n/ D 1
...
a; n/ D 1 is necessary by supplying a
counterexample with gcd
...

31
...
b=d / mod
...

31
...
mod p/ be a polynoLet p be prime and f
...
We say that a 2 Zp
is a zero of f if f
...
mod p/
...
x/ Á
...
x/
...
x/ of degree t 1
...
x/ of degree t can have
at most t distinct zeros modulo p
...
5 The Chinese remainder theorem
Around A
...
100, the Chinese mathematician Sun-Ts˘ solved the problem of findu
ing those integers x that leave remainders 2, 3, and 2 when divided by 3, 5, and 7
respectively
...
5 The Chinese remainder theorem

951

for arbitrary integers k
...

The Chinese remainder theorem has two major applications
...
First, the Chinese remainder theorem is a descriptive “structure theorem”
that describes the structure of Zn as identical to that of the Cartesian product
Znk with componentwise addition and multiplication modulo ni
Zn1 Zn2
in the ith component
...

Theorem 31
...
Consider the
correspondence
a $
...
27)

where a 2 Zn , ai 2 Zni , and
ai D a mod ni
for i D 1; 2; : : : ; k
...
27) is a one-to-one correspondence (bijecZnk
...
That is, if
a $
...
b1 ; b2 ; : : : ; bk / ;
then

...
a1 C b1 / mod n1 ; : : : ;
...
a b/ mod n $
...
ak bk / mod nk / ;

...
a1 b1 mod n1 ; : : : ; ak bk mod nk / :

(31
...
29)
(31
...

Going from a to
...

Computing a from inputs
...
We begin
by defining mi D n=ni for i D 1; 2; : : : ; k; thus mi is the product of all of the nj ’s
other than ni : mi D n1 n2 ni 1 ni C1 nk
...
mi 1 mod ni /

(31
...
Equation (31
...
6), Corollary 31
...
Finally, we can compute a as a function of a1 , a2 ,
...
a1 c1 C a2 c2 C

C ak ck /
...
32)

We now show that equation (31
...
mod ni / for i D
1; 2; : : : ; k
...
mod ni /, which implies that cj Á
mj Á 0
...
Note also that ci Á 1
...
31)
...
0; 0; : : : ; 0; 1; 0; : : : ; 0/ ;
a vector that has 0s everywhere except in the ith coordinate, where it has a 1; the ci
thus form a “basis” for the representation, in a certain sense
...
mod ni /
a Á ai ci
1
Á ai mi
...
mod ni /

...
mod ni / for i D 1; 2; : : : ; k
...

Finally, equations (31
...
30) follow directly from Exercise 31
...
x mod n/ mod ni for any x and i D 1; 2; : : : ; k
...

Corollary 31
...
mod ni / ;
for i D 1; 2; : : : ; k, has a unique solution modulo n for the unknown x
...
29
If n1 ; n2 ; : : : ; nk are pairwise relatively prime and n D n1 n2
integers x and a,
x Á a
...
mod n/ :

nk , then for all

31
...
3 An illustration of the Chinese remainder theorem for n1 D 5 and n2 D 13
...
In row i, column j is shown the value of a, modulo 65, such
that a mod 5 D i and a mod 13 D j
...
Similarly, row 4,
column 12 contains a 64 (equivalent to 1)
...

Similarly, c2 D 40 means that moving right by a column increases a by 40
...


As an example of the application of the Chinese remainder theorem, suppose we
are given the two equations
a Á 2
...
mod 13/ ;
so that a1 D 2, n1 D m2 D 5, a2 D 3, and n2 D m1 D 13, and we wish
to compute a mod 65, since n D n1 n2 D 65
...
mod 5/ and
5 1 Á 8
...
2 mod 5/ D 26 ;
c2 D 5
...
mod 65/
Á 52 C 120

...
mod 65/ :
See Figure 31
...

Thus, we can work modulo n by working modulo n directly or by working in the
transformed representation using separate modulo ni computations, as convenient
...

Exercises
31
...
mod 5/ and x Á 5
...


954

Chapter 31 Number-Theoretic Algorithms

31
...

31
...
27, if gcd
...
a

1

mod n/ $
...
a2 1 mod n2 /; : : : ;
...
5-4
Under the definitions of Theorem 31
...
x/ Á 0
...
x/ Á 0
...
x/ Á 0
...
,
f
...
mod nk /
...
6 Powers of an element
Just as we often consider the multiples of a given element a, modulo n, we consider
the sequence of powers of a, modulo n, where a 2 Zn :
a0 ; a1 ; a2 ; a3 ; : : : ;

(31
...
Indexing from 0, the 0th value in this sequence is a mod n D 1, and
the ith value is ai mod n
...
a/ (the “order of a, modulo n”) denote the order of a
in Zn
...
2/ D 3
...
n/ as the size of Zn (see Section 31
...
19 into the notation of Zn to obtain Euler’s theorem and specialize it
to Zp , where p is prime, to obtain Fermat’s theorem
...
30 (Euler’s theorem)
For any integer n > 1,
a


...
mod n/ for all a 2 Zn :

31
...
31 (Fermat’s theorem)
If p is prime, then
ap

1

Á 1
...
21),
...


Fermat’s theorem applies to every element in Zp except 0, since 0 62 Zp
...
mod p/ if p is prime
...
g/ D jZn j, then every element in Zn is a power of g, modulo n, and
g is a primitive root or a generator of Zn
...
If Zn possesses a primitive
root, the group Zn is cyclic
...

Theorem 31
...

If g is a primitive root of Zn and a is any element of Zn , then there exists a ´ such
that g ´ Á a
...
This ´ is a discrete logarithm or an index of a, modulo n,
to the base g; we denote this value as indn;g
...

Theorem 31
...
mod n/ holds if and
only if the equation x Á y
...
n// holds
...
mod
...
Then, x D y C k
...
Therefore,
gx Á
Á
Á
Á

g yCk
...
g
...
mod

...
mod

...
mod n/
...
n/, Corollary 31
...
n/
...
mod n/, then we must have x Á y
...
n//
...
The
following theorem will be useful in our development of a primality-testing algorithm in Section 31
...


956

Chapter 31 Number-Theoretic Algorithms

Theorem 31
...
mod p e /
has only two solutions, namely x D 1 and x D
Proof
e

p j
...
34)
1
...
34) is equivalent to
1/
...
x 1/ or p j
...
(Otherwise,
by property (31
...
x C 1/
...
)
If p −
...
p e ; x 1/ D 1, and by Corollary 31
...
x C 1/
...
mod p e /
...
x C 1/,
then gcd
...
5 implies that p e j
...
mod p e /
...
mod p e / or x Á 1
...

A number x is a nontrivial square root of 1, modulo n, if it satisfies the equation
x 2 Á 1
...
For example, 6 is a nontrivial square root of 1, modulo 35
...
34 in the correctness proof in
Section 31
...

Corollary 31
...

Proof By the contrapositive of Theorem 31
...

If x 2 Á 1
...
mod 2/, and so all square roots of 1, modulo 2,
are trivial
...
Finally, we must have n > 1 for a nontrivial
square root of 1 to exist
...

Raising to powers with repeated squaring
A frequently occurring operation in number-theoretic computations is raising one
number to a power modulo another number, also known as modular exponentiation
...
Modular exponentiation is an essential operation in many primality-testing routines and in the RSA
public-key cryptosystem
...

Let hbk ; bk 1 ; : : : ; b1 ; b0 i be the binary representation of b
...
6 Powers of an element

i
bi
c
d

9
1
1
7

8
0
2
49

7
0
4
157

6
0
8
526

957

5
1
17
160

4
1
35
241

3
0
70
298

2
0
140
166

1
0
280
67

0
0
560
1

Figure 31
...
mod n/, where
a D 7, b D 560 D h1000110000i, and n D 561
...
The final result is 1
...
) The following procedure computes ac mod n as c is increased by
doublings and incrementations from 0 to b
...
a; b; n/
1 c D0
2 d D1
3 let hbk ; bk 1 ; : : : ; b0 i be the binary representation of b
4 for i D k downto 0
5
c D 2c
6
d D
...
d a/ mod n
10 return d
The essential use of squaring in line 6 of each iteration explains the name “repeated
squaring
...
4; the sequence
of exponents used appears in the row of the table labeled by c
...
The value of c is the same as the prefix hbk ; bk 1 ; : : : ; bi C1 i of the binary
representation of b, and
2
...

We use this loop invariant as follows:
Initialization: Initially, i D k, so that the prefix hbk ; bk 1 ; : : : ; bi C1 i is empty,
which corresponds to c D 0
...


958

Chapter 31 Number-Theoretic Algorithms

Maintenance: Let c 0 and d 0 denote the values of c and d at the end of an iteration
of the for loop, and thus the values prior to the next iteration
...
If bi D 0, then d 0 D d 2 mod n D
...
If bi D 1, then d 0 D d 2 a mod n D
...
In either case, d D ac mod n prior to the next
iteration
...
Thus, c D b, since c has the value of the
prefix hbk ; bk 1 ; : : : ; b0 i of b’s binary representation
...

If the inputs a, b, and n are ˇ-bit numbers, then the total number of arithmetic operations required is O
...
ˇ 3 /
...
6-1
Draw a table showing the order of every element in Z11
...
x/ for all x 2 Z11
...
6-2
Give a modular exponentiation algorithm that examines the bits of b from right to
left instead of left to right
...
6-3
Assuming that you know
...


1

mod n for any a 2 Zn

31
...
A public-key cryptosystem also enables a party
to append an unforgeable “digital signature” to the end of an electronic message
...
It can be easily checked by anyone, forged by no one, yet loses its validity
if any bit of the message is altered
...
It is the perfect tool

31
...

The RSA public-key cryptosystem relies on the dramatic difference between the
ease of finding large prime numbers and the difficulty of factoring the product of
two large prime numbers
...
8 describes an efficient procedure for finding
large prime numbers, and Section 31
...

Public-key cryptosystems
In a public-key cryptosystem, each participant has both a public key and a secret
key
...
For example, in the RSA cryptosystem,
each key consists of a pair of integers
...

Each participant creates his or her own public and secret keys
...
In fact,
it is often convenient to assume that everyone’s public key is available in a public directory, so that any participant can easily obtain the public key of any other
participant
...

Let D denote the set of permissible messages
...
In the simplest, and original, formulation of publickey cryptography, we require that the public and secret keys specify one-to-one
functions from D to itself
...
/ and the function corresponding to her secret key SA by SA
...
/ and SA
...
We assume that the functions
PA
...
/ are efficiently computable given the corresponding key PA or SA
...
That is,
M
M

D SA
...
M // ;
D PA
...
M //

(31
...
36)

for any message M 2 D
...

In a public-key cryptosystem, we require that no one but Alice be able to compute the function SA
...
This assumption is crucial
to keeping encrypted mail sent to Alice private and to knowing that Alice’s digital signatures are authentic
...

The assumption that only Alice can compute SA
...
M /

SA

M

eavesdropper

C

Figure 31
...
Bob encrypts the message M using Alice’s public
key PA and transmits the resulting ciphertext C D PA
...
An eavesdropper who captures the transmitted ciphertext gains no information about M
...
C /
...
/, the inverse function to SA
...
In order
to design a workable public-key cryptosystem, we must figure out how to create
a system in which we can reveal a transformation PA
...
This task appears
formidable, but we shall see how to accomplish it
...
5
...
The scenario for sending the message
goes as follows
...

Bob computes the ciphertext C D PA
...

When Alice receives the ciphertext C , she applies her secret key SA to retrieve
the original message: SA
...
PA
...

Because SA
...
/ are inverse functions, Alice can compute M from C
...
/, Alice is the only one who can compute M
from C
...
/, only Alice can understand the transmitted message
...
(There are other ways of approaching the problem of
constructing digital signatures, but we shall not go into them here
...
Figure 31
...

Alice computes her digital signature
key SA and the equation D SA
...


for the message M 0 using her secret

31
...
M 0 /

PA
=?

M0

0


...
6 Digital signatures in a public-key system
...
M 0 / to it
...
M 0 ; / to Bob,
who verifies it by checking the equation M 0 D PA
...
M 0 ; / as
a message that Alice has signed
...
M 0 ; / to Bob
...
M 0 ; /, he can verify that it originated from Alice by using Alice’s public key to verify the equation M 0 D PA
...
) If the equation
holds, then Bob concludes that the message M 0 was actually signed by Alice
...
M 0 ; /
is an attempted forgery
...

A digital signature must be verifiable by anyone who has access to the signer’s
public key
...
For example, the message might
be an electronic check from Alice to Bob
...

A signed message is not necessarily encrypted; the message can be “in the clear”
and not protected from disclosure
...

The signer first appends his or her digital signature to the message and then encrypts the resulting message/signature pair with the public key of the intended recipient
...
The recipient can then
verify the signature using the public key of the signer
...

The RSA cryptosystem
In the RSA public-key cryptosystem, a participant creates his or her public and
secret keys with the following procedure:
1
...
The primes
p and q might be, say, 1024 bits each
...
Compute n D pq
...
Select a small odd integer e that is relatively prime to
...
20), equals
...
q 1/
...
Compute d as the multiplicative inverse of e, modulo
...
(Corollary 31
...
We can use the technique of
Section 31
...
n/
...
Publish the pair P D
...

6
...
d; n/ as the participant’s RSA secret key
...
To transform a message M associated with a public key P D
...
M / D M e mod n :

(31
...
d; n/, compute
S
...
38)

These equations apply to both encryption and signatures
...
To verify a signature, the public key of the signer is applied to it, rather
than to a message to be encrypted
...
6
...
e; n/ and secret key
...
1/, lg d Ä ˇ, and lg n Ä ˇ
...
1/ modular multiplications and uses O
...
Applying a secret
key requires O
...
ˇ 3 / bit operations
...
36 (Correctness of RSA)
The RSA equations (31
...
38) define inverse transformations of Zn satisfying equations (31
...
36)
...
7 The RSA public-key cryptosystem

Proof

963

From equations (31
...
38), we have that for any M 2 Zn ,

P
...
M // D S
...
M // D M ed
...
n/ D
...
p

1/
...
q

1/,

1/

for some integer k
...
mod p/, we have
M ed

Á
Á
Á
Á

M
...
q 1/
M
...
q
M
...
q 1/
M

1/


...
mod

...
mod

p/
p/
p/
p/ :

(by Theorem 31
...
mod p/ if M Á 0
...
Thus,
M ed Á M
...
Similarly,
M ed Á M
...
Thus, by Corollary 31
...
mod n/
for all M
...
If an adversary can factor the modulus n in a public key, then
the adversary can derive the secret key from the public key, using the knowledge
of the factors p and q in the same way that the creator of the public key used them
...
The converse statement, that if factoring large integers is hard, then breaking RSA is hard, is unproven
...
And as we shall see in Section 31
...
By randomly selecting and multiplying together two 1024-bit
primes, we can create a public key that cannot be “broken” in any feasible amount
of time with current technology
...

In order to achieve security with the RSA cryptosystem, however, we should
use integers that are quite long—hundreds or even more than one thousand bits

964

Chapter 31 Number-Theoretic Algorithms

long—to resist possible advances in the art of factoring
...

To create moduli of such sizes, we must be able to find large primes efficiently
...
8 addresses this problem
...
With such a system, the encryption and
decryption keys are identical
...
Here, C is as long as M , but K
is quite short
...
Since K is
short, computing PB
...
M /)
...
C; PB
...
K/ to obtain K and then uses K
to decrypt C , obtaining M
...

This approach combines RSA with a public collision-resistant hash function h—a
function that is easy to compute but for which it is computationally infeasible to
find two messages M and M 0 such that h
...
M 0 /
...
M / is
a short (say, 256-bit) “fingerprint” of the message M
...
M /, which she
then encrypts with her secret key
...
M; SA
...
M /// to Bob as her signed
version of M
...
M / and verifying
that PA applied to SA
...
M // as received equals h
...
Because no one can create
two messages with the same fingerprint, it is computationally infeasible to alter a
signed message and preserve the validity of the signature
...
For example, assume there is a “trusted authority” T whose public key
is known by everyone
...
” This certificate is “self-authenticating” since
everyone knows PT
...
Because her key was signed by T , the recipient knows that Alice’s
key is really Alice’s
...
7-1
Consider an RSA key set with p D 11, q D 29, n D 319, and e D 3
...
8 Primality testing

965

31
...
n/, then the adversary can factor Alice’s modulus n
in time polynomial in the number of bits in n
...
See Miller [255]
...
7-3 ?
Prove that RSA is multiplicative in the sense that
PA
...
M2 / Á PA
...
mod n/ :
Use this fact to prove that if an adversary had a procedure that could efficiently
decrypt 1 percent of messages from Zn encrypted with PA , then he could employ
a probabilistic algorithm to decrypt every message encrypted with PA with high
probability
...
8 Primality testing
In this section, we consider the problem of finding large primes
...

The density of prime numbers
For many applications, such as cryptography, we need to find large “random”
primes
...
The prime distribution function
...

For example,
...
The prime number theorem gives a useful approximation
to
...

Theorem 31
...
n/
D1:
lim
n!1 n= ln n
The approximation n= ln n gives reasonably accurate estimates of
...
For example, it is off by less than 6% at n D 109 , where
...
(To a number theorist, 109 is a small number
...
4)
...
The geometric distribution tells us how many trials we need
to obtain a success, and by equation (C
...
Thus, we would expect to examine approximately ln n integers
chosen randomly near n in order to find a prime that is of the same length as n
...
(Of
proximately ln 21024
course, we can cut this figure in half by choosing only odd integers
...
For notational convenience, we assume that n
has the prime factorization
e
e
n D p11 p22

e
pr r ;

(31
...
The integer n is prime if and only if r D 1 and e1 D 1
...
We
try dividing n by each integer 2; 3; : : : ; b nc
...
) It is easy to see that n is prime if and only if none of the trial divisors divides n
...
n/, which is exponential in the length of n
...
n C 1/e, and so n D ‚
...
)
Thus, trial division works well only if n is very small or happens to have a small
prime factor
...

In this section, we are interested only in finding out whether a given number n
is prime; if n is composite, we are not concerned with finding its prime factorization
...
9, computing the prime factorization of a
number is computationally expensive
...

Pseudoprimality testing
We now consider a method for primality testing that “almost works” and in fact
is good enough for many practical applications
...
8 Primality testing

967

finement of this method that removes the small defect
...

n
We say that n is a base-a pseudoprime if n is composite and
an

1

Á 1
...
40)

Fermat’s theorem (Theorem 31
...
40) for every a in ZC
...
40), then n is certainly composite
...

We test to see whether n satisfies equation (31
...
If not, we declare n
to be composite by returning COMPOSITE
...

The following procedure pretends in this manner to be checking the primality
of n
...
6
...

P SEUDOPRIME
...
2; n 1; n/ 6Á 1
...
That is, if it says that n
is composite, then it is always correct
...

How often does this procedure err? Surprisingly rarely
...
We won’t prove it, but the probability that this program makes an
error on a randomly chosen ˇ-bit number goes to zero as ˇ ! 1
...
So if you are merely
trying to find a large prime for some application, for all practical purposes you
almost never go wrong by choosing large numbers at random until one of them
causes P SEUDOPRIME to return PRIME
...


968

Chapter 31 Number-Theoretic Algorithms

As we shall see, a little more cleverness, and some randomization, will yield a
primality-testing routine that works well on all inputs
...
40) for a second base number, say a D 3, because there exist composite integers n, known as Carmichael numbers, that satisfy equation (31
...
(We note that equation (31
...
a; n/ > 1—that
is, when a 62 Zn —but hoping to demonstrate that n is composite by finding such
an a can be difficult if n has only large prime factors
...
Carmichael numbers are extremely rare; there
are, for example, only 255 of them less than 100,000,000
...
8-2 helps
explain why they are so rare
...

The Miller-Rabin randomized primality test
The Miller-Rabin primality test overcomes the problems of the simple test P SEU DOPRIME with two modifications:
It tries several randomly chosen base values a instead of just one base value
...
If it finds one, it stops
and returns COMPOSITE
...
35 from Section 31
...

The pseudocode for the Miller-Rabin primality test follows
...
The code uses the random-number generator
n
R ANDOM described on page 117: R ANDOM
...
The code uses an auxiliary procedure W ITNESS
such that W ITNESS
...
The test W ITNESS
...
mod n/

that formed the basis (using a D 2) for P SEUDOPRIME
...
e
...
Let n 1 D 2t u where t
the binary representation of n 1 is the binary representation of the odd integer u
t
followed by exactly t zeros
...
au /2
...
8 Primality testing

969

compute an 1 mod n by first computing au mod n and then squaring the result t
times successively
...
a; n/
1 let t and u be such that t 1, u is odd, and n 1 D 2t u
2 x0 D M ODULAR -E XPONENTIATION
...
By induction on i, the sequence x0 , x1 ,
...
mod n/ for i D 0; 1; : : : ; t, so that in
particular x t Á an 1
...
After line 4 performs a squaring step, however,
the loop may terminate early if lines 5–6 detect that a nontrivial square root of 1
has just been discovered
...
) If so, the algorithm stops and returns TRUE
...
mod n/ is not equal to 1, just as the P SEUDOPRIME procedure returns
COMPOSITE in this case
...

We now argue that if W ITNESS
...

If W ITNESS returns TRUE from line 8, then it has discovered that x t D
n 1
a
mod n ¤ 1
...
31) that an 1 Á 1
...
Therefore, n cannot be prime,
n
and the equation an 1 mod n ¤ 1 proves this fact
...
mod n/ yet
xi Á xi2 1 Á 1
...
Corollary 31
...

This completes our proof of the correctness of W ITNESS
...
a; n/ returns TRUE, then n is surely composite, and the witness a, along
with the reason that the procedure returns TRUE (did it return from line 6 or from
line 8?), provides a proof that n is composite
...

Note that if xi D 1 for some 0 Ä i < t, W ITNESS might not compute the rest
of the sequence
...
We have
four cases:
1
...
Return TRUE
in line 8; a is a witness to the compositeness of n (by Fermat’s Theorem)
...
X D h1; 1; : : : ; 1i: the sequence X is all 1s
...

3
...
Return FALSE; a is not a witness to the compositeness of n
...
X D h: : : ; d; 1; : : : ; 1i, where d ¤ ˙1: the sequence X ends in 1, but the last
non-1 is not 1
...

We now examine the Miller-Rabin primality test based on the use of W ITNESS
...

M ILLER -R ABIN
...
1; n 1/
3
if W ITNESS
...
The main loop (beginning on line 1) picks up to s random values of a
from ZC (line 2)
...
Such a result is always correct, by the correctness of W ITNESS
...
We shall see that this result is likely to be correct
if s is large enough, but that there is still a tiny chance that the procedure may be
unlucky in its choice of a’s and that witnesses do exist even though none has been
found
...
If the procedure chooses a D 7 as a base, Figure 31
...
6 shows that W ITNESS computes x0 Á a 35 Á 241
...
8 Primality testing

971

X D h241; 298; 166; 67; 1i
...
mod n/ and a560 Á 1
...

Therefore, a D 7 is a witness to the compositeness of n, W ITNESS
...

If n is a ˇ-bit number, M ILLER -R ABIN requires O
...
sˇ 3 / bit operations, since it requires asymptotically no more work than s
modular exponentiations
...
Unlike P SEUDOPRIME, however, the chance of error does not depend
on n; there are no bad inputs for this procedure
...
Moreover, since each test is
more stringent than a simple check of equation (31
...
The
following theorem presents a more precise argument
...
38
If n is an odd composite number, then the number of witnesses to the compositeness of n is at least
...

Proof The proof shows that the number of nonwitnesses is at most
...

We start by claiming that any nonwitness must be a member of Zn
...
It must satisfy an 1 Á 1
...
mod n/
...
mod n/ has a solution,
namely an 2
...
21, gcd
...
a; n/ D 1
...

To complete the proof, we show that not only are all nonwitnesses contained
in Zn , they are all contained in a proper subgroup B of Zn (recall that we say B
is a proper subgroup of Zn when B is subgroup of Zn but B is not equal to Zn )
...
16, we then have jBj Ä jZn j =2
...
n 1/=2
...
n 1/=2, so
that the number of witnesses must be at least
...

We now show how to find a proper subgroup B of Zn containing all of the
nonwitnesses
...

Case 1: There exists an x 2 Zn such that
xn

1

6Á 1
...
Because, as we noted earlier,
Carmichael numbers are extremely rare, case 1 is the main case that arises “in
practice” (e
...
, when n has been chosen randomly and is being tested for primality)
...
mod n/g
...

Since B is closed under multiplication modulo n, we have that B is a subgroup
of Zn by Theorem 31
...
Note that every nonwitness belongs to B, since a nonwitness a satisfies an 1 Á 1
...
Since x 2 Zn B, we have that B is a
proper subgroup of Zn
...
mod n/ :

(31
...
This case is extremely rare in practice
...

In this case, n cannot be a prime power
...
We derive a contradiction
as follows
...
Theorem 31
...
g/ D
jZn j D
...
1 1=p/ D
...
(The formula for
...
20)
...
41), we have g n 1 Á 1
...
Then the
discrete logarithm theorem (Theorem 31
...
mod
...
p

1/p e

1

j pe

1:

This is a contradiction for e > 1, since
...
Thus, n is not a prime power
...
(There may be several ways to decompose n, and it does not
e
e
e
matter which one we choose
...
)
choose n1 D p1 and n2 D p2 p3
Recall that we define t and u so that n 1 D 2t u, where t 1 and u is odd, and
that for an input a, the procedure W ITNESS computes the sequence
2

t

X D hau ; a2u ; a2 u ; : : : ; a2 u i
(all computations are performed modulo n)
...
; j / of integers acceptable if
2j u

Á

1
...
8 Primality testing

973

Acceptable pairs certainly exist since u is odd; we can choose D n 1 and
j D 0, so that
...
Now pick the largest possible j such
that there exists an acceptable pair
...
; j / is an acceptable
pair
...
mod n/g :

Since B is closed under multiplication modulo n, it is a subgroup of Zn
...
15, therefore, jBj divides jZn j
...
(If
...
)
We now use the existence of to demonstrate that there exists a w 2 Zn B,
j
and hence that B is a proper subgroup of Zn
...
mod n/, we have
2j u
Á 1
...
29 to the Chinese remainder theorem
...
28, there exists a w simultaneously satisfying the equations
w Á

...
mod n2 / :
Therefore,
ju

w2
w

2j u

Á

1
...
mod n2 / :
j

j

By Corollary 31
...
mod n1 / implies w 2 u 6Á 1
...
mod n2 / implies w 2 u 6Á 1
...
Hence, we conclude that
ju
w 2 6Á ˙1
...

It remains to show that w 2 Zn , which we do by first working separately modulo n1 and modulo n2
...
; n/ D 1, and so also gcd
...

Since w Á
...
w; n1 / D 1
...
mod n2 / implies gcd
...
To combine these results,
we use Theorem 31
...
w; n1 n2 / D gcd
...
That is,
w 2 Zn
...

In either case, we see that the number of witnesses to the compositeness of n is
at least
...

Theorem 31
...
n; s/ errs is at most 2 s
...
38, we see that if n is composite, then each execution of
the for loop of lines 1–4 has a probability of at least 1=2 of discovering a witness x
to the compositeness of n
...
The probability of such a sequence of misses is at most 2 s
...

When applying M ILLER -R ABIN to a large randomly chosen integer n, however,
we need to consider as well the prior probability that n is prime, in order to correctly interpret M ILLER -R ABIN’s result
...
Let A
denote the event that n is prime
...
37),
the probability that n is prime is approximately
1= ln n
1:443=ˇ :

Pr fAg

Now let B denote the event that M ILLER -R ABIN returns P RIME
...

But what is Pr fA j Bg, the probability that n is prime, given that M ILLER R ABIN has returned P RIME? By the alternate form of Bayes’s theorem (equation (C
...
ln n
1C2
1/

This probability does not exceed 1=2 until s exceeds lg
...
Intuitively, that
many initial trials are needed just for the confidence derived from failing to find a
witness to the compositeness of n to overcome the prior bias in favor of n being
composite
...
ln n

1/

lg
...
In any case, choosing s D 50 should suffice for almost any imaginable
application
...
If we are trying to find large primes by
applying M ILLER -R ABIN to large randomly chosen odd integers, then choosing
a small value of s (say 3) is very unlikely to lead to erroneous results, though

31
...
The reason is that for a randomly chosen odd composite
integer n, the expected number of nonwitnesses to the compositeness of n is likely
to be very much smaller than
...

If the integer n is not chosen randomly, however, the best that can be proven is
that the number of nonwitnesses is at most
...
38
...
n 1/=4
...
8-1
Prove that if an odd integer n > 1 is not a prime or a prime power, then there exists
a nontrivial square root of 1 modulo n
...
8-2 ?
It is possible to strengthen Euler’s theorem slightly to the form
a


...
mod n/ for all a 2 Zn ;

e
where n D p11

e
pr r and
...
n/ D lcm
...
pr r // :

(31
...
n/ j
...
A composite number n is a Carmichael number if

...
The smallest Carmichael number is 561 D 3 11 17; here,

...
2; 10; 16/ D 80, which divides 560
...
(For this reason, they are not very common
...
8-3
Prove that if x is a nontrivial square root of 1, modulo n, then gcd
...
x C 1; n/ are both nontrivial divisors of n
...
9 Integer factorization
Suppose we have an integer n that we wish to factor, that is, to decompose into a
product of primes
...
Factoring a large integer n
seems to be much more difficult than simply determining whether n is prime or
composite
...


976

Chapter 31 Number-Theoretic Algorithms

Pollard’s rho heuristic
Trial division by all integers up to R is guaranteed to factor completely any number
up to R2
...
Since the procedure is only
a heuristic, neither its running time nor its success is guaranteed, although the
procedure is highly effective in practice
...
(If you
wanted to, you could easily implement P OLLARD -R HO on a programmable pocket
calculator to find factors of small numbers
...
n/
1 i D1
2 x1 D R ANDOM
...
xi2 1 1/ mod n
8
d D gcd
...
Lines 1–2 initialize i to 1 and x1 to a randomly
chosen value in Zn
...
During each iteration of the while loop, line 7 uses the recurrence
xi D
...
43)

to produce the next value of xi in the infinite sequence
x1 ; x2 ; x3 ; x4 ; : : : ;

(31
...
The pseudocode is written using subscripted variables xi for clarity, but the program works the same if all of the subscripts are dropped, since only the most recent value of xi needs to be maintained
...

Every so often, the program saves the most recently generated xi value in the
variable y
...
9 Integer factorization

977

x1 ; x2 ; x4 ; x8 ; x16 ; : : : :
Line 3 saves the value x1 , and line 12 saves xk whenever i is equal to k
...
Therefore, k follows the sequence 1; 2; 4; 8; : : : and always gives the
subscript of the next value xk to be saved in y
...
Specifically, line 8 computes the greatest common divisor
d D gcd
...
If line 9 finds d to be a nontrivial divisor of n, then line 10
prints d
...

Note, however, that P OLLARD -R HO never prints an incorrect answer; any number it prints is a nontrivial divisor of n
...
We shall
see, however, that we have good reason to expect P OLLARD -R HO to print a facp
tor p of n after ‚
...
Thus, if n is composite, we
can expect this procedure to discover enough divisors to factor n completely after
approximately n1=4 updates, since every prime factor p of n except possibly the
p
largest one is less than n
...
Since Zn is finite, and
since each value in the sequence (31
...
44) eventually repeats itself
...

The reason for the name “rho heuristic” is that, as Figure 31
...

Let us consider the question of how long it takes for the sequence of xi to repeat
...
For the purpose of this estimation, let us assume that the function
fn
...
x 2

1/ mod n

behaves like a “random” function
...

We can then consider each xi to have been independently drawn from Zn according
to a uniform distribution on Zn
...
4
...
n/ steps to be taken before the sequence cycles
...
Let p be a nontrivial factor of n such that
e
e
e
gcd
...
For example, if n has the factorization n D p11 p22 pr r , then
e1
we may take p to be p1
...
)

978

Chapter 31 Number-Theoretic Algorithms

996

310

814

396

x7 177

x6 1186

00
x7
31

84

120

x5 1194

339

00
x5

529
595

x4

00
x6

18
26

11
47

1053

63

6

0
x4

00
x4

63

0
x7

x3
x2
x1

0
x3

8

0
x2

3
2

mod 1387
(a)

0
x1

8

00
x3

0
x6

16
0
x5

3
2

mod 19
(b)

00
x2

00
x1

8
3

2

mod 73
(c)

Figure 31
...
(a) The values produced by the recurrence xi C1 D
2

...
The prime factorization of 1387 is 19 73
...
The light
arrows point to unreached values in the iteration, to illustrate the “rho” shape
...
The factor 19 is discovered upon reaching x7 D 177, when
gcd
...
The first x value that would be repeated is 1186, but the
factor 19 is discovered before this value is repeated
...
Every value xi given in part (a) is equivalent, modulo 19, to the value xi shown here
...
(c) The values produced
by the same recurrence, modulo 73
...
By the Chinese remainder theorem, each node in part (a) corresponds to a pair
of nodes, one from part (b) and one from part (c)
...

Furthermore, because fn is defined using only arithmetic operations (squaring
and subtraction) modulo n, we can compute xi0 C1 from xi0 ; the “modulo p” view of

31
...
xi / mod p

...
1-7)

...
xi mod p/
0 2
1/ mod p

...
xi / :

Thus, although we are not explicitly computing the sequence hxi0 i, this sequence is
well defined and obeys the same recurrence as the sequence hxi i
...
p/
...
Indeed, as parts (b) and (c) of
Figure 31
...

Let t denote the index of the first repeated value in the hxi0 i sequence, and let
u > 0 denote the length of the cycle that has been thereby produced
...
By the
and u > 0 are the smallest values such that x t0 Ci D x t0 CuCi for all i
p
above arguments, the expected values of t and u are both ‚
...
Note that if
x t0 Ci D x t0 CuCi , then p j
...
Thus, gcd
...

t,
Therefore, once P OLLARD -R HO has saved as y any value xk such that k
then y mod p is always on the cycle modulo p
...
) Eventually, k is set to a value that
is greater than u, and the procedure then makes an entire loop around the cycle
modulo p without changing the value of y
...
mod p/
...
Since the expected values of both t and u are
p
p

...
p/
...
First, the
heuristic analysis of the running time is not rigorous, and it is possible that the cycle
p
of values, modulo p, could be much larger than p
...
In practice, this issue seems
to be moot
...
For example, suppose that n D pq, where p
and q are prime
...
Since both factors are revealed at the same

980

Chapter 31 Number-Theoretic Algorithms

time, the trivial factor pq D n is revealed, which is useless
...
If necessary, we can restart the heuristic with
a different recurrence of the form xi C1 D
...
(We should avoid the
values c D 0 and c D 2 for reasons we will not go into here, but other values are
fine
...
” Nonetheless, the procedure performs well in practice, and
it seems to be as efficient as this heuristic analysis indicates
...
To factor a ˇ-bit composite number n completely, we only need to find all prime factors less than bn1=2 c,
and so we expect P OLLARD -R HO to require at most n1=4 D 2ˇ=4 arithmetic operations and at most n1=4 ˇ 2 D 2ˇ=4 ˇ 2 bit operations
...
p/ of arithmetic operations is
often its most appealing feature
...
9-1
Referring to the execution history shown in Figure 31
...
9-2
Suppose that we are given a function f W Zn ! Zn and an initial value x0 2 Zn
...
xi 1 / for i D 1; 2; : : :
...
In the terminology of Pollard’s rho algorithm,
t is the length of the tail and u is the length of the cycle of the rho
...

31
...
9-4 ?
One disadvantage of P OLLARD -R HO as written is that it requires one gcd computation for each step of the recurrence
...
Describe carefully how you would
implement this idea, why it works, and what batch size you would pick as the most
effective when working on a ˇ-bit number n
...

This problem investigates the binary gcd algorithm, which avoids the remainder
computations used in Euclid’s algorithm
...
Prove that if a and b are both even, then gcd
...
a=2; b=2/
...
Prove that if a is odd and b is even, then gcd
...
a; b=2/
...
Prove that if a and b are both odd, then gcd
...
a

b/=2; b/
...
Design an efficient binary gcd algorithm for input integers a and b, where
a b, that runs in O
...
Assume that each subtraction, parity test,
and halving takes unit time
...
Consider the ordinary “paper and pencil” algorithm for long division: dividing
a by b, which yields a quotient q and remainder r
...
1 C lg q/ lg b/ bit operations
...
Define
...
1 C lg a/
...
Show that the number of bit operations
performed by E UCLID in reducing the problem of computing gcd
...
b; a mod b/ is at most c
...
b; a mod b// for some
sufficiently large constant c > 0
...
Show that E UCLID
...
a; b// bit operations in general and
O
...

31-3 Three algorithms for Fibonacci numbers
This problem compares the efficiency of three methods for computing the nth Fibonacci number Fn , given n
...
1/, independent of the size of the numbers
...
Show that the running time of the straightforward recursive method for computing Fn based on recurrence (3
...
(See, for example, the
F IB procedure on page 775
...
Show how to compute Fn in O
...


982

Chapter 31 Number-Theoretic Algorithms

c
...
lg n/ time using only integer addition and multiplication
...
)
d
...
ˇ/ time and that multiplying two ˇ-bit numbers takes ‚
...
What is the running time of these
three methods under this more reasonable cost measure for the elementary arithmetic operations?
31-4 Quadratic residues
Let p be an odd prime
...
mod p/ has a solution for the unknown x
...
Show that there are exactly
...


a
b
...
p /, for a 2 Zp , to be 1 if a is a
quadratic residue modulo p and 1 otherwise
...
p
p

1/=2


...
Analyze the efficiency of your algorithm
...
Prove that if p is a prime of the form 4k C 3 and a is a quadratic residue in Zp ,
then akC1 mod p is a square root of a, modulo p
...
Describe an efficient randomized algorithm for finding a nonquadratic residue,
modulo an arbitrary prime p, that is, a member of Zp that is not a quadratic
residue
...
Knuth [210] contains a good discussion of algorithms for finding the

Notes for Chapter 31

983

greatest common divisor, as well as other basic number-theoretic algorithms
...
Dixon [91] gives an overview of factorization and primality testing
...
More recently, Bach and Shallit [31] have provided an exceptional
overview of the basics of computational number theory
...
It appears in Book 7,
Propositions 1 and 2, of the Greek mathematician Euclid’s Elements, which was
written around 300 B
...
Euclid’s description may have been derived from an algorithm due to Eudoxus around 375 B
...
Euclid’s algorithm may hold the honor
of being the oldest nontrivial algorithm; it is rivaled only by an algorithm for multiplication known to the ancient Egyptians
...

Knuth attributes a special case of the Chinese remainder theorem (Theorem 31
...
C
...
D
...
The same special case was
given by the Greek mathematician Nichomachus around A
...
100
...
The Chinese remainder theorem was finally
stated and proved in its full generality by L
...

The randomized primality-testing algorithm presented here is due to Miller [255]
and Rabin [289]; it is the fastest randomized primality-testing algorithm known,
to within constant factors
...
39 is a slight adaptation of
one suggested by Bach [29]
...
For many years primality-testing was the classic
example of a problem where randomization appeared to be necessary to obtain
an efficient (polynomial-time) algorithm
...
Until then, the fastest deterministic primality testing algorithm
known, due to Cohen and Lenstra [73], ran in time
...
lg lg lg n/ on input n, which
is just slightly superpolynomial
...

The problem of finding large “random” primes is nicely discussed in an article
by Beauchemin, Brassard, Cr´ peau, Goutier, and Pomerance [36]
...

The RSA cryptosystem was proposed in 1977 by Rivest, Shamir, and Adleman
[296]
...
Our understanding
of the RSA cryptosystem has deepened, and modern implementations use significant refinements of the basic techniques presented here
...
For example, Goldwasser and Micali [142] show that randomization can be an effective
tool in the design of secure public-key encryption schemes
...
Menezes,
van Oorschot, and Vanstone [254] provide an overview of applied cryptography
...
The
version presented here is a variant proposed by Brent [56]
...
The general number-field sieve factoring algorithm (as developed by Buhler, Lenstra, and Pomerance [57] as an extension of the ideas in the number-field
sieve factoring algorithm by Pollard [278] and Lenstra et al
...
Although it is difficult to give a rigorous analysis of this
algorithm, under reasonable assumptions we can derive a running-time estimate of
˛
1 ˛
L
...
1/ , where L
...
ln n/
...

The elliptic-curve method due to Lenstra [233] may be more effective for some
inputs than the number-field sieve method, since, like Pollard’s rho method, it can
find a small prime factor p quite quickly
...
1=2; p/ 2Co
...


32

String Matching

Text-editing programs frequently need to find all occurrences of a pattern in the
text
...
Efficient algorithms for this problem—called
“string matching”—can greatly aid the responsiveness of the text-editing program
...
Internet search engines also use them to find
Web pages relevant to queries
...
We assume that the
text is an array T Œ1 : : n of length n and that the pattern is an array P Œ1 : : m
of length m Ä n
...
For example, we may have † D f0,1g
or † D fa; b; : : : ; zg
...

Referring to Figure 32
...
If P occurs with shift s in T , then we call s a valid shift; otherwise,
we call s an invalid shift
...


text T
pattern P

a b c a b a a b c a b a c
s=3

a b a a

Figure 32
...
The pattern occurs only once in the text,
at shift s D 3, which we call a valid shift
...


986

Chapter 32 String Matching

Algorithm
Naive
Rabin-Karp
Finite automaton
Knuth-Morris-Pratt

Preprocessing time
0

...
m j†j/

...
n
O
...
n/

...
2 The string-matching algorithms in this chapter and their preprocessing and matching
times
...
1,
each string-matching algorithm in this chapter performs some preprocessing based
on the pattern and then finds all valid shifts; we call this latter phase “matching
...
2 shows the preprocessing and matching times for each of the algorithms
in this chapter
...
Section 32
...
Although the ‚
...
It also generalizes nicely to other patternmatching problems
...
3 then describes a string-matching algorithm that
begins by constructing a finite automaton specifically designed to search for occurrences of the given pattern P in a text
...
m j†j/ preprocessing time, but only ‚
...
Section 32
...
n/ matching
time, and it reduces the preprocessing time to only ‚
...

Notation and terminology
We denote by † (read “sigma-star”) the set of all finite-length strings formed
using characters from the alphabet †
...
The zero-length empty string, denoted ", also belongs to †
...
The concatenation of two strings x and y,
denoted xy, has length jxj C jyj and consists of the characters from x followed by
the characters from y
...
Note that if w < x, then jwj Ä jxj
...
As
with a prefix, w = x implies jwj Ä jxj
...
The empty string " is both a suffix and a prefix of every string
...


Chapter 32

String Matching

987

x
z

x

x

z

z

y

y

x
y

x

x
y

(a)

y

(b)

y
(c)

Figure 32
...
1
...
The three parts
of the figure illustrate the three cases of the lemma
...
(a) If jxj Ä jyj, then x = y
...
(c) If jxj D jyj,
then x D y
...
The following lemma will be useful
later
...
1 (Overlapping-suffix lemma)
Suppose that x, y, and ´ are strings such that x = ´ and y = ´
...
If jxj jyj, then y = x
...

Proof

See Figure 32
...


For brevity of notation, we denote the k-character prefix P Œ1 : : k of the pattern
P Œ1 : : m by Pk
...
Similarly, we denote
the k-character prefix of the text T by Tk
...

In our pseudocode, we allow two equal-length strings to be compared for equality as a primitive operation
...

To be precise, the test “x == y” is assumed to take time ‚
...
(We write ‚
...
t/ to handle the case in which t D 0; the first characters compared
do not match, but it takes a positive amount of time to perform this comparison
...
1 The naive string-matching algorithm
The naive algorithm finds all valid shifts using a loop that checks the condition
P Œ1 : : m D T Œs C 1 : : s C m for each of the n m C 1 possible values of s
...
T; P /
1 n D T:length
2 m D P:length
3 for s D 0 to n m
4
if P Œ1 : : m == T Œs C 1 : : s C m
5
print “Pattern occurs with shift” s
Figure 32
...
The for loop of
lines 3–5 considers each possible shift explicitly
...

Line 5 prints out each valid shift s
...
n m C 1/m/, and this
bound is tight in the worst case
...
For each of the n mC1 possible values of the shift s,
the implicit loop on line 4 to compare corresponding characters must execute m
times to validate the shift
...
n m C 1/m/,
which is ‚
...
Because it requires no preprocessing, NAIVE S TRING -M ATCHER’s running time equals its matching time
...
4 The operation of the naive string matcher for the pattern P D aab and the text
T D acaabc
...
(a)–(d) The
four successive alignments tried by the naive string matcher
...
The algorithm finds one occurrence of the pattern, at shift s D 2, shown in
part (c)
...
1 The naive string-matching algorithm

989

As we shall see, NAIVE -S TRING -M ATCHER is not an optimal procedure for this
problem
...
The naive string-matcher is inefficient because
it entirely ignores information gained about the text for one value of s when it
considers other values of s
...
For
example, if P D aaab and we find that s D 0 is valid, then none of the shifts 1, 2,
or 3 are valid, since T Œ4 D b
...

Exercises
32
...

32
...
Show how to accelerate
NAIVE -S TRING -M ATCHER to run in time O
...

32
...
Show
that the expected number of character-to-character comparisons made by the implicit loop in line 4 of the naive algorithm is
1 d m
Ä 2
...
(Assume that the naive algorithm stops comparing
characters for a given shift once it finds a mismatch or matches the entire pattern
...



...
1-4
Suppose we allow the pattern P to contain occurrences of a gap character } that
can match an arbitrary string of characters (even one of zero length)
...
Give a polynomial-time algorithm to determine whether
such a pattern P occurs in a given text T , and analyze the running time of your
algorithm
...
2 The Rabin-Karp algorithm
Rabin and Karp proposed a string-matching algorithm that performs well in practice and that also generalizes to other algorithms for related problems, such as
two-dimensional pattern matching
...
m/ preprocessing time, and its worst-case running time is ‚
...
Based on certain
assumptions, however, its average-case running time is better
...
You might want to refer to
Section 31
...

For expository purposes, let us assume that † D f0; 1; 2; : : : ; 9g, so that each
character is a decimal digit
...
) We can then view a string of k
consecutive characters as representing a length-k decimal number
...
Because we interpret the input characters as both graphical symbols and digits, we find it convenient
in this section to denote them as we would digits, in our standard text font
...
In a similar manner, given a text T Œ1 : : n, let ts denote the decimal value of the length-m
substring T Œs C 1 : : s C m, for s D 0; 1; : : : ; n m
...
If we
could compute p in time ‚
...
n mC1/ time,1
then we could determine all valid shifts s in time ‚
...
n m C 1/ D ‚
...
(For the moment, let’s not worry about
the possibility that p and the ts values might be very large numbers
...
m/ using Horner’s rule (see Section 30
...
P Œm

1 C 10
...
P Œ2 C 10P Œ1/

// :

Similarly, we can compute t0 from T Œ1 : : m in time ‚
...


1 We write ‚
...
n m/ because s takes on n m C 1 different values
...
1/ time, not ‚
...


32
...
n
that we can compute tsC1 from ts in constant time, since
tsC1 D 10
...
1)

Subtracting 10m 1 T Œs C 1 removes the high-order digit from ts , multiplying the
result by 10 shifts the number left by one digit position, and adding T Œs C m C 1
brings in the appropriate low-order digit
...
31415
D 14152 :

10000 3/ C 2

If we precompute the constant 10m 1 (which we can do in time O
...
6, although for this application a straightforward O
...
1) takes a constant number of arithmetic operations
...
m/, and we can
compute all of t0 ; t1 ; : : : ; tn m in time ‚
...
Therefore, we can find all
occurrences of the pattern P Œ1 : : m in the text T Œ1 : : n with ‚
...
n m C 1/ matching time
...
If P contains m characters, then we cannot
reasonably assume that each arithmetic operation on p (which is m digits long)
takes “constant time
...
5
shows: compute p and the ts values modulo a suitable modulus q
...
m/ time and all the ts values modulo q in ‚
...

If we choose the modulus q as a prime such that 10q just fits within one computer
word, then we can perform all the necessary computations with single-precision
arithmetic
...
1) to
work modulo q, so that it becomes
tsC1 D
...
ts

T Œs C 1h/ C T Œs C m C 1/ mod q ;

(32
...
mod q/ is the value of the digit “1” in the high-order position
of an m-digit text window
...
mod q/
does not imply that ts D p
...
mod q/, then we
definitely have that ts ¤ p, so that shift s is invalid
...
mod q/ as a fast heuristic test to rule out invalid shifts s
...
mod q/ must be tested further to see whether s is really valid or
we just have a spurious hit
...
5 The Rabin-Karp algorithm
...
(a) A text string
...
The numerical value of the
shaded number, computed modulo 13, yields the value 7
...
Assuming the pattern P D 31415,
we look for windows whose value modulo 13 is 7, since 31415 Á 7
...
The algorithm finds
two such windows, shown shaded in the figure
...
(c) How
to compute the value for a window in constant time, given the value for the previous window
...
Dropping the high-order digit 3, shifting left (multiplying by 10), and
then adding in the low-order digit 2 gives us the new value 14152
...


32
...
If q is large enough, then we hope that spurious
hits occur infrequently enough that the cost of the extra checking is low
...
The inputs to the procedure
are the text T , the pattern P , the radix d to use (which is typically taken to be j†j),
and the prime q to use
...
T; P; d; q/
1 n D T:length
2 m D P:length
3 h D d m 1 mod q
4 p D0
5 t0 D 0
6 for i D 1 to m
/ preprocessing
/
7
p D
...
dt0 C T Œi/ mod q
9 for s D 0 to n m
/ matching
/
10
if p == ts
11
if P Œ1 : : m == T Œs C 1 : : s C m
12
print “Pattern occurs with shift” s
13
if s < n m
14
tsC1 D
...
ts T Œs C 1h/ C T Œs C m C 1/ mod q
The procedure R ABIN -K ARP -M ATCHER works as follows
...
The subscripts on t are provided only for clarity; the
program works correctly if all the subscripts are dropped
...
Lines 4–8 compute p
as the value of P Œ1 : : m mod q and t0 as the value of T Œ1 : : m mod q
...

If p D ts in line 10 (a “hit”), then line 11 checks to see whether P Œ1 : : m D
T Œs C 1 : : s C m in order to rule out the possibility of a spurious hit
...
If s < n m (checked in line 13), then the for
loop will execute at least one more time, and so line 14 first executes to ensure that
the loop invariant holds when we get back to line 10
...
2)
directly
...
m/ preprocessing time, and its matching time
is ‚
...
If P D am

994

Chapter 32 String Matching

and T D an , then verifying takes time ‚
...

In many applications, we expect few valid shifts—perhaps some constant c of
them
...
n m C 1/ C cm/ D O
...
We can base a heuristic analysis on the assumption that reducing values modulo q acts like a random mapping from † to Zq
...
3
...
It is difficult to formalize and prove such an
assumption, although one viable approach is to assume that q is chosen randomly
from integers of the appropriate size
...
)
We can then expect that the number of spurious hits is O
...

Since there are O
...
m/
time for each hit, the expected matching time taken by the Rabin-Karp algorithm
is
O
...
m
...
This running time is O
...
1/ and
we choose q m
...
1/)
and we choose the prime q to be larger than the length of the pattern, then we
can expect the Rabin-Karp procedure to use only O
...
Since
m Ä n, this expected matching time is O
...

Exercises
32
...
2-2
How would you extend the Rabin-Karp method to the problem of searching a text
string for an occurrence of any one of a given set of k patterns? Start by assuming
that all k patterns have the same length
...

32
...
(The pattern may be shifted
vertically and horizontally, but it may not be rotated
...
3 String matching with finite automata

995

32
...
Alice and Bob wish to know if their
files are identical
...
Together, they select a prime q > 1000n and randomly select
an integer x from f0; 1; : : : ; q 1g
...
x/ D
i D0

and Bob similarly evaluates B
...
Prove that if A ¤ B, there is at most one
chance in 1000 that A
...
x/, whereas if the two files are the same, A
...
x/
...
4-4
...
3 String matching with finite automata
Many string-matching algorithms build a finite automaton—a simple machine for
processing information—that scans the text string T for all occurrences of the pattern P
...
These
string-matching automata are very efficient: they examine each text character exactly once, taking constant time per text character
...
n/
...
Section 32
...

We begin this section with the definition of a finite automaton
...
Finally, we shall show how to construct the string-matching
automaton for a given input pattern
...
6, is a 5-tuple
...


996

Chapter 32 String Matching

state
0
1

input
a b
1
0

a
b

0

0
0

(a)

1
a
b
(b)

Figure 32
...
(a) A tabular representation of the transition function ı
...
State 1, shown blackend, is the only accepting state
...
For example, the edge from state 1 to state 0 labeled b indicates that
ı
...
This automaton accepts those strings that end in an odd number of a’s
...
For example, on input abaaa, including the start state, this automaton enters the sequence of
states h0; 1; 0; 1; 0; 1i, and so it accepts this input
...


The finite automaton begins in state q0 and reads the characters of its input string
one at a time
...
q; a/
...
An input that
is not accepted is rejected
...
w/ is the state M ends up in after scanning the string w
...
w/ 2 A
...
"/ D q0 ;

...
w/; a/ for w 2 † ; a 2 †
...
Figure 32
...
From now on, we shall
assume that P is a given fixed pattern string; for brevity, we shall not indicate the
dependence upon P in our notation
...
The function maps † to f0; 1; : : : ; mg such that
...
x/ D max fk W Pk = xg :

(32
...
3 String matching with finite automata

a
0

a

1

997

b

2

a

a

a

a
3

b

4

a

5

c

6

a

7

b
b
(a)
state
0
1
2
3
4
5
6
7

a
1
1
3
1
5
1
7
1

input
b c
0 0
2 0
0 0
4 0
0 0
4 6
0 0
2 0
(b)

P
a
b
a
b
a
c
a

i
T Œi
state
...
7 (a) A state-transition diagram for the string-matching automaton that accepts all
strings ending in the string ababaca
...
A directed edge from state i to state j labeled a represents ı
...
The
right-going edges forming the “spine” of the automaton, shown heavy in the figure, correspond to
successful matches between pattern and input characters
...
Some edges corresponding to failing matches are omitted; by convention, if a state i has
no outgoing edge labeled a for some a 2 †, then ı
...
(b) The corresponding transition
function ı, and the pattern string P D ababaca
...
(c) The operation of the automaton on the
text T D abababacaba
...
Ti / that the automaton is in after processing the prefix Ti
...


The suffix function is well defined since the empty string P0 D " is a suffix of every string
...
"/ D 0,

...
ccab/ D 2
...
x/ D m if and only if P = x
...
x/ Ä
...

We define the string-matching automaton that corresponds to a given pattern
P Œ1 : : m as follows:

998

Chapter 32 String Matching

The state set Q is f0; 1; : : : ; mg
...

The transition function ı is defined by the following equation, for any state q
and character a:
ı
...
Pq a/ :

(32
...
q; a/ D
...
We consider the
most recently read characters of T
...
Suppose that q D
...
We design the transition function ı so that this state number, q, tells us the
length of the longest prefix of P that matches a suffix of Ti
...
Ti /
...
) Thus, since
...
Ti / both equal q,
we shall see (in Theorem 32
...
Ti / D
...
5)

If the automaton is in state q and reads the next character T Œi C 1 D a, then we
want the transition to lead to the state corresponding to the longest prefix of P that
is a suffix of Ti a, and that state is
...
Because Pq is the longest prefix of P
that is a suffix of Ti , the longest prefix of P that is a suffix of Ti a is not only
...
Pq a/
...
3, on page 1000, proves that
...
Pq a/
...
Pq a/
...
In the first case, a D P Œq C 1, so that the
character a continues to match the pattern; in this case, because ı
...
7)
...
Here, we must find a smaller prefix of P that is also a suffix of Ti
...

Let’s look at an example
...
7 has
ı
...
To illustrate the second case, observe that the automaton of Figure 32
...
5; b/ D 4
...


32
...
8 An illustration for the proof of Lemma 32
...
The figure shows that r Ä
where r D
...



...
As for any string-matching automaton for a pattern of length m,
the state set Q is f0; 1; : : : ; mg, the start state is 0, and the only accepting state is
state m
...
T; ı; m/
1 n D T:length
2 q D0
3 for i D 1 to n
4
q D ı
...
n/
...
We address this problem later, after first proving that the
procedure F INITE -AUTOMATON -M ATCHER operates correctly
...
We shall prove
that the automaton is in state
...
Since
...
To prove this result, we make use of the following two
lemmas about the suffix function
...
2 (Suffix-function inequality)
For any string x and character a, we have
...
x/ C 1
...
8, let r D
...
If r D 0, then the conclusion

...
x/ C 1 is trivially satisfied, by the nonnegativity of
...
Now
assume that r > 0
...
Thus, Pr 1 = x, by

1000

Chapter 32 String Matching

x
a
Pq

a
Pr

Figure 32
...
3
...
x/ and r D
...



...
Therefore, r 1 Ä
...
x/ is the largest k such that Pk = x, and thus
...
x/ C 1
...
3 (Suffix-function recursion lemma)
For any string x and character a, if q D
...
xa/ D
...

Proof From the definition of , we have Pq = x
...
9 shows, we
also have Pq a = xa
...
xa/, then Pr = xa and, by Lemma 32
...
Thus, we have jPr j D r Ä q C 1 D jPq aj
...
1 implies that Pr = Pq a
...
Pq a/,
that is,
...
Pq a/
...
Pq a/ Ä
...

Thus,
...
Pq a/
...
As noted above, this theorem
shows that the automaton is merely keeping track, at each step, of the longest
prefix of the pattern that is a suffix of what has been read so far
...
5)
...
4
If is the final-state function of a string-matching automaton for a given pattern P
and T Œ1 : : n is an input text for the automaton, then

...
Ti /
for i D 0; 1; : : : ; n
...
For i D 0, the theorem is trivially true,
since T0 D "
...
T0 / D 0 D
...


32
...
Ti / D
...
Ti C1 / D
...
Let q
denote
...
Then,

...
Ti C1 / D
D ı
...
q; a/
D

...
Ti a/
D

...
4) of ı)
(by Lemma 32
...


By Theorem 32
...
Thus, we have q D m on line 5 if and only if the machine has just scanned an occurrence of the pattern P
...

Computing the transition function
The following procedure computes the transition function ı from a given pattern
P Œ1 : : m
...
P; †/
1 m D P:length
2 for q D 0 to m
3
for each character a 2 †
4
k D min
...
q; a/ D k
9 return ı
This procedure computes ı
...
4)
...
q; a/ to be the largest k such
that Pk = Pq a
...
m; q C 1/
...

The running time of C OMPUTE -T RANSITION -F UNCTION is O
...
Much faster procedures exist; by utilizing some cleverly computed information about the pattern P (see Exercise 32
...
m j†j/
...
m j†j/ preprocessing time and ‚
...

Exercises
32
...

32
...

32
...
Describe the state-transition diagram of the string-matching automaton for a nonoverlappable pattern
...
3-4 ?
Given two patterns P and P 0 , describe how to construct a finite automaton that
determines all occurrences of either pattern
...

32
...
1-4), show how to
build a finite automaton that can find an occurrence of P in a text T in O
...


? 32
...
This algorithm avoids computing the transition function ı altogether, and its
matching time is ‚
...
m/ and store in an array Œ1 : : m
...
Loosely speaking, for any state q D 0; 1; : : : ; m and any character

32
...
q; a/ but
that does not depend on a
...
m j†j/ entries, we save a factor of j†j in the preprocessing time by computing
rather than ı
...
We can take advantage of this information to
avoid testing useless shifts in the naive pattern-matching algorithm and to avoid
precomputing the full transition function ı for a string-matching automaton
...
Figure 32
...
For this example, q D 5 of the characters have matched successfully, but
the 6th pattern character fails to match the corresponding text character
...
Knowing these q text characters allows us to determine immediately that certain shifts are invalid
...
The shift s 0 D s C 2 shown in part (b) of the figure, however, aligns the first three pattern characters with three text characters that
must necessarily match
...
6)

where s 0 C k D s C q?
In other words, knowing that Pq = TsCq , we want the longest proper prefix Pk
of Pq that is also a suffix of TsCq
...
) We add the difference q k in the lengths of these prefixes of P to the
shift s to arrive at our new shift s 0 , so that s 0 D s C
...
In the best case, k D 0,
so that s 0 D s C q, and we immediately rule out shifts s C 1; s C 2; : : : ; s C q 1
...
6) guarantees that they
match
...
10(c) demonstrates
...
10 The prefix function
...
Matching characters, shown shaded, are connected by vertical lines
...
(c) We can precompute useful information for such deductions by comparing the
pattern with itself
...

We represent this precomputed information in the array , so that Œ5 D 3
...
q
in part (b)
...
Therefore, we can interpret
equation (32
...
Then, the new
shift s 0 D s C
...
We will find it convenient to
store, for each value of q, the number k of matching characters at the new shift s 0 ,
rather than storing, say, s 0 s
...
Given a pattern
P Œ1 : : m, the prefix function for the pattern P is the function W f1; 2; : : : ; mg !
f0; 1; : : : ; m 1g such that
Œq D max fk W k < q and Pk = Pq g :
That is, Œq is the length of the longest prefix of P that is a proper suffix of Pq
...
11(a) gives the complete prefix function for the pattern ababaca
...
4 The Knuth-Morris-Pratt algorithm

P5
P3

i
P Œi
Œi

1 2 3 4 5 6 7
a b a b a c a
0 0 1 2 3 0 1
(a)

1005

a b a b a c a
a b a b a c a

Œ5 D 3

P1

a b a b a c a

Œ3 D 1

P0

" a b a b a c a

Œ1 D 0

(b)

Figure 32
...
5 for the pattern P D ababaca and q D 5
...
Since Œ5 D 3, Œ3 D 1, and Œ1 D 0, by iterating we obtain
Œ5 D f3; 1; 0g
...
In
the figure, the first row gives P , and the dotted vertical line is drawn just after P5
...
Successfully
matched characters are shown shaded
...
Thus,
Œq D fk W k < q and Pk = Pq g
fk W k < 5 and Pk = P5 g D f3; 1; 0g
...
5 claims that
for all q
...
For the most part, the procedure follows from
F INITE -AUTOMATON -M ATCHER, as we shall see
...

KMP-M ATCHER
...
P /
4 q D0
/ number of characters matched
/
5 for i D 1 to n
/ scan the text from left to right
/
6
while q > 0 and P Œq C 1 ¤ T Œi
7
q D Œq
/ next character does not match
/
8
if P Œq C 1 == T Œi
9
q D qC1
/ next character matches
/
/ is all of P matched?
/
10
if q == m
11
print “Pattern occurs with shift” i m
12
q D Œq
/ look for the next match
/

1006

Chapter 32 String Matching

C OMPUTE -P REFIX -F UNCTION
...

We begin with an analysis of the running times of these procedures
...

Running-time analysis
The running time of C OMPUTE -P REFIX -F UNCTION is ‚
...
1)
...
m/ times altogether
...
We start by making
some observations about k
...
Thus, the total increase in k is at most m 1
...
Therefore, the assignments in lines 3 and 10
ensure that Œq < q for all q D 1; 2; : : : ; m, which means that each iteration of
the while loop decreases k
...
Putting these facts
together, we see that the total decrease in k from the while loop is bounded from
above by the total increase in k over all iterations of the for loop, which is m 1
...
m/
...
4-4 asks you to show, by a similar aggregate analysis, that the matching time of KMP-M ATCHER is ‚
...

Compared with F INITE -AUTOMATON -M ATCHER, by using rather than ı, we
have reduced the time for preprocessing the pattern from O
...
m/, while
keeping the actual matching time bounded by ‚
...


32
...
But first, we need to prove that the
procedure C OMPUTE -P REFIX -F UNCTION does indeed compute the prefix function correctly
...
The value of Œq gives us the longest such prefix, but
the following lemma, illustrated in Figure 32
...
Let
Œq D f Œq;


...
3/

Œq; : : : ;


...
t /

Œqg ;

Œq is defined in terms of functional iteration, so that
where

...
i 1/ Œq for i
1, and where the sequence in
reaching
...



...
5 (Prefix-function iteration lemma)
Let P be a pattern of length m with prefix function
...

Proof
i2

Œq  fk W k < q and Pk = Pq g or, equivalently,

We first prove that
Œq implies Pi = Pq :

(32
...
u/

Œq, then i D
Œq for some u > 0
...
7) by
If i 2
induction on u
...
Using the relations Œi < i and P Œi  = Pi
and the transitivity of < and = establishes the claim for all i in Œq
...

Œq by contradiction
...
Because Œq is the largest value in
Œq, we must have j < Œq, and so we
fk W k < q and Pk = Pq g and Œq 2
0
let j denote the smallest integer in Œq that is greater than j
...
) We have Pj = Pq because
j 2 fk W k < q and Pk = Pq g, and from j 0 2 Œq and equation (32
...
Thus, Pj = Pj 0 by Lemma 32
...
Therefore, we must have Œj 0  D j and, since j 0 2 Œq, we
must have j 2 Œq as well
...

The algorithm C OMPUTE -P REFIX -F UNCTION computes Œq, in order, for q D
1; 2; : : : ; m
...
We shall use the following lemma and

1008

Chapter 32 String Matching

its corollary to prove that C OMPUTE -P REFIX -F UNCTION computes Œq correctly
for q > 1
...
6
Let P be a pattern of length m, and let be the prefix function for P
...

Proof Let r D Œq > 0, so that r < q and Pr = Pq ; thus, r 1 < q 1 and
Pr 1 = Pq 1 (by dropping the last character from Pr and Pq , which we can do
Œq 1
...
By Lemma 32
...

For q D 2; 3; : : : ; m, define the subset Eq
Eq

1

D fk 2 Œq
D fk W k < q
D fk W k < q

1

Â

Œq

1 by

1 W P Œk C 1 D P Œqg
1 and Pk = Pq 1 and P Œk C 1 D P Œqg (by Lemma 32
...
Thus, Eq 1 consists of those
values k 2 Œq 1 such that we can extend Pk to PkC1 and get a proper suffix
of Pq
...
7
Let P be a pattern of length m, and let be the prefix function for P
...
Therefore Œq D 0
...

Therefore, from the definition of Œq, we have
Œq

1 C max fk 2 Eq 1 g :

(32
...
Let r D Œq 1, so that r C 1 D Œq and therefore PrC1 = Pq
...
Furthermore,
Œq 1
...
6, we have r 2
max fk 2 Eq 1 g or, equivalently,
Œq Ä 1 C max fk 2 Eq 1 g :
Combining equations (32
...
9) completes the proof
...
9)

32
...
In the procedure C OMPUTE -P REFIX -F UNCTION, at the start of each iteration of the for loop of lines 5–10, we have that k D Œq 1
...
Lines 6–9 adjust k so that it becomes
the correct value of Œq
...
7, we can set Œq
to k C 1
...
If P Œ1 D P Œq, then we should set both k and Œq to 1;
otherwise we should leave k alone and set Œq to 0
...
This completes our proof of the correctness of C OMPUTE P REFIX -F UNCTION
...
Specifically, we shall prove that in the ith iteration of
the for loops of both KMP-M ATCHER and F INITE -AUTOMATON -M ATCHER, the
state q has the same value when we test for equality with m (at line 10 in KMPM ATCHER and at line 5 in F INITE -AUTOMATON -M ATCHER)
...

Before we formally prove that KMP-M ATCHER correctly simulates F INITE AUTOMATON -M ATCHER, let’s take a moment to understand how the prefix function
replaces the ı transition function
...
q; a/
...
q; a/ D q C 1
...
q; a/ Ä q
...

The function comes into play when the character a does not continue to match
the pattern, so that the new state ı
...
The while loop of lines 6–7 in KMP-M ATCHER iterates through
the states in Œq, stopping either when it arrives in a state, say q 0 , such that a
matches P Œq 0 C 1 or q 0 has gone all the way down to 0
...
q; a/ for the simulation
to work correctly
...
q; a/ should be either state 0 or
one greater than some state in Œq
...
7 and 32
...
Suppose that the automaton is in state q D 5; the states in
Œ5 are, in descending order, 3, 1, and 0
...
5; c/ D 6 in both F INITE AUTOMATON -M ATCHER and KMP-M ATCHER
...
5; b/ D 4
...
Since P Œq 0 C 1 D P Œ4 D b, the test in line 8
comes up true, and KMP-M ATCHER moves to the new state q 0 C 1 D 4 D ı
...

Finally, suppose that the next character scanned is instead a, so that the automaton should move to state ı
...
The first three times that the test in line 6
executes, the test comes up true
...
The second
time, we find that P Œ4 D b ¤ a and move to state Œ3 D 1 (the second state
in Œ5)
...
The while loop exits once it arrives in state q 0 D 0
...
5; a/
...

Although that might seem like a lot of work just to simulate computing ı
...

We are now ready to formally prove the correctness of the Knuth-Morris-Pratt
algorithm
...
4, we have that q D
...
Therefore, it suffices to show that the
same property holds with regard to the for loop in KMP-M ATCHER
...
Initially, both procedures
set q to 0 as they enter their respective for loops for the first time
...
By the inductive hypothesis, we have q 0 D
...
We need to show
that q D
...
(Again, we shall handle line 12 separately
...
We consider separately the three cases in which
...
Ti / D q 0 C 1, and 0 <
...


32
...
Ti / D 0, then P0 D " is the only prefix of P that is a suffix of Ti
...
The loop
terminates when q reaches 0, and of course line 9 does not execute
...
Ti /
...
Ti / D q 0 C 1, then P Œq 0 C 1 D T Œi, and the while loop test in line 6
fails the first time through
...
Ti /
...
Ti / Ä q 0 , then the while loop of lines 6–7 iterates at least once,
checking in decreasing order each value q 2 Œq 0  until it stops at some q < q 0
...
Pq0 T Œi/
...
Ti 1 /, Lemma 32
...
Ti 1 T Œi/ D
...
Thus, we have
qC1 D
D
D


...
Ti 1 T Œi/

...
After line 9 increments q, we have q D
...

Line 12 is necessary in KMP-M ATCHER, because otherwise, we might reference P Œm C 1 on line 6 after finding an occurrence of P
...
Ti 1 / upon the next execution of line 6 remains valid by the hint given in
Exercise 32
...
m; a/ D ı
...
P a/ D
...
) The remaining argument for the correctness of the Knuth-MorrisPratt algorithm follows from the correctness of F INITE -AUTOMATON -M ATCHER,
since we have shown that KMP-M ATCHER simulates the behavior of F INITE AUTOMATON -M ATCHER
...
4-1
Compute the prefix function

for the pattern ababbabbabbababbabb
...
4-2
Give an upper bound on the size of
show that your bound is tight
...
Give an example to

32
...


1012

Chapter 32 String Matching

32
...
n/
...
4-5
Use a potential function to show that the running time of KMP-M ATCHER is ‚
...

32
...

32
...
For example, arc and car are cyclic rotations of each other
...
4-8 ?
Give an O
...
(Hint: Prove that
ı
...
Œq; a/ if q D m or P Œq C 1 ¤ a
...
For example,

...
We say that a string x 2 † has repetition factor r if x D y r
for some string y 2 † and some r > 0
...
x/ denote the largest r such that x
has repetition factor r
...
Give an efficient algorithm that takes as input a pattern P Œ1 : : m and computes
the value
...
What is the running time of your algorithm?

Notes for Chapter 32

1013

b
...
P / be defined as max1Äi Äm
...
Prove that if
the pattern P is chosen randomly from the set of all binary strings of length m,
then the expected value of
...
1/
...
Argue that the following string-matching algorithm correctly finds all occurrences of pattern P in a text T Œ1 : : n in time O
...
P; T /
1 m D P:length
2 n D T:length
3 k D 1 C
...
1; dq=ke/
13
q D0
This algorithm is due to Galil and Seiferas
...
1/ storage beyond what is required for P and T
...
The Knuth-Morris-Pratt algorithm [214] was
invented independently by Knuth and Pratt and by Morris; they published their
work jointly
...
The Rabin-Karp algorithm was proposed by Karp
and Rabin [201]
...
1/ space beyond that required to
store the pattern and text
...
In modern engineering and mathematics, computational geometry has applications in such diverse fields as computer graphics,
robotics, VLSI design, computer-aided design, molecular modeling, metallurgy,
manufacturing, textile layout, forestry, and statistics
...
The output is often a response to a query about the objects, such as
whether any of the lines intersect, or perhaps a new geometric object, such as the
convex hull (smallest enclosing convex polygon) of the set of points
...
We represent each input object by a set of
points fp1 ; p2 ; p3 ; : : :g, where each pi D
...
For example, we represent an n-vertex polygon P by a sequence hp0 ; p1 ; p2 ; : : : ; pn 1 i
of its vertices in order of their appearance on the boundary of P
...
Even in
two dimensions, however, we can see a good sample of computational-geometry
techniques
...
1 shows how to answer basic questions about line segments efficiently and accurately: whether one segment is clockwise or counterclockwise
from another that shares an endpoint, which way we turn when traversing two
adjoining line segments, and whether two line segments intersect
...
2
presents a technique called “sweeping” that we use to develop an O
...
Section 33
...
n lg n/, and Jarvis’s march, which takes O
...
Finally, Section 33
...
1 Line-segment properties

1015

an O
...


33
...
A convex combination of two
distinct points p1 D
...
x2 ; y2 / is any point p3 D
...
1 ˛/x2 and
y3 D ˛y1 C
...
We also write that p3 D ˛p1 C
...
Intuitively, p3
is any point that is on the line passing through p1 and p2 and is on or between p1
and p2 on the line
...
We call p1 and p2 the endpoints
of segment p1 p2
...
If p1 is the origin
...

segment p1 p2
2
In this section, we shall explore the following questions:
!
!
!
!
1
...
Given two line segments p0 p1 and p1 p2 , if we traverse p0 p1 and then p1 p2 ,
do we make a left turn at point p1 ?
3
...

We can answer each question in O
...
1/
...
We need neither division
nor trigonometric functions, both of which can be computationally expensive and
prone to problems with round-off error
...
When the segments are
nearly parallel, this method is very sensitive to the precision of the division operation on real computers
...


1016

Chapter 33 Computational Geometry

y

p1 + p2

y
p

p2
(0,0)

x
p1
(0,0)

x
(a)

(b)

Figure 33
...

(b) The lightly shaded region contains vectors that are clockwise from p
...


Cross products
Computing cross products lies at the heart of our line-segment methods
...
1(a)
...
0; 0/, p1 , p2 ,
and p1 C p2 D
...
An equivalent, but more useful, definition gives
the cross product as the determinant of a matrix:1
Â
Ã
x1 x2
p1 p2 D det
y1 y2
D x1 y 2 x2 y 1
D
p2 p1 :
If p1 p2 is positive, then p1 is clockwise from p2 with respect to the origin
...
(See Exercise 33
...
) Figure 33
...
A boundary condition arises if the cross product is 0; in this
case, the vectors are colinear, pointing in either the same or opposite directions
...
That
0
0
0
0
is, we let p1 p0 denote the vector p1 D
...
We then compute the cross product
1 Actually, the cross product is a three-dimensional concept
...
In this
chapter, however, we find it convenient to treat the cross product simply as the value x1 y2 x2 y1
...
1 Line-segment properties

1017

p2

p2
p1

counterclockwise

p1

clockwise

p0

p0
(a)

(b)

Figure 33
...
We check whether the directed segment p0 p2 is clockwise or counterclockwise
!
relative to the directed segment p0 p1
...
(b) If
clockwise, they make a right turn
...
p1

p0 /


...
x1

x0 /
...
x2

x0 /
...

Determining whether consecutive segments turn left or right
Our next question is whether two consecutive line segments p0 p1 and p1 p2 turn
left or right at point p1
...
Cross products allow us to answer this question without computing the angle
...
2 shows, we simply check whether directed
!
p2
segment p0 ! is clockwise or counterclockwise relative to directed segment p0 p1
...
p2 p0 /
...
If the sign of
!
p2
this cross product is negative, then p0 ! is counterclockwise with respect to p0 p1 ,
and thus we make a left turn at p1
...
A cross product of 0 means that points p0 , p1 , and p2
are colinear
...
A segment p1 p2 straddles a line if point p1
lies on one side of the line and point p2 lies on the other side
...
Two line segments intersect if and only
if either (or both) of the following conditions holds:
1
...

2
...
(This condition comes
from the boundary case
...
S EGMENTS -I NTERSECT returns
if segments p1 p2 and p3 p4 intersect and FALSE if they do not
...

TRUE

S EGMENTS -I NTERSECT
...
p3 ; p4 ; p1 /
2 d2 D D IRECTION
...
p1 ; p2 ; p3 /
4 d4 D D IRECTION
...
d1 > 0 and d2 < 0/ or
...
d3 > 0 and d4 < 0/ or
...
p3 ; p4 ; p1 /
8
return TRUE
9 elseif d2 == 0 and O N -S EGMENT
...
p1 ; p2 ; p3 /
12
return TRUE
13 elseif d4 == 0 and O N -S EGMENT
...
pi ; pj ; pk /
1 return
...
pj

pi /

O N -S EGMENT
...
xi ; xj / Ä xk Ä max
...
yi ; yj / Ä yk Ä max
...
Lines 1–4 compute the relative orientation di of each endpoint pi with respect to the other segment
...
Segment p1 p2 straddles the line containing seg!
!
ment p3 p4 if directed segments p3 p1 and p3 p2 have opposite orientations relative
!
...
Similarly, p p straddles
to p3 p4
1
2
3 4
the line containing p1 p2 if the signs of d3 and d4 differ
...
Figure 33
...
Otherwise, the segments do not straddle

33
...
3 Cases in the procedure S EGMENTS -I NTERSECT
...
Because p3 p4 straddles the line containing p1 p2 , the signs of the cross
products
...
p2 p1 / and
...
p2 p1 / differ
...
p1 p3 /
...
p2 p3 /
...
(b) Segment p3 p4 straddles the line containing p1 p2 , but p1 p2 does not straddle the line
containing p3 p4
...
p1 p3 /
...
p2 p3 /
...
(c) Point p3 is colinear with p1 p2 and is between p1 and p2
...
The segments do not intersect
...
If all the relative orientations are nonzero, no boundary case applies
...
Figure 33
...

A boundary case occurs if any relative orientation dk is 0
...
It is directly on the other segment if and only
if it is between the endpoints of the other segment
...
Figures 33
...
In
Figure 33
...
No endpoints are on other segments in Figure 33
...


1020

Chapter 33 Computational Geometry

Other applications of cross products
Later sections of this chapter introduce additional uses for cross products
...
3, we shall need to sort a set of points according to their polar angles with
respect to a given origin
...
1-3 asks you to show, we can use cross
products to perform the comparisons in the sorting procedure
...
2, we
shall use red-black trees to maintain the vertical ordering of a set of line segments
...

Exercises
33
...
0; 0/ and that if this cross product is negative, then p1 is
counterclockwise from p2
...
1-2
Professor van Pelt proposes that only the x-dimension needs to be tested in line 1
of O N -S EGMENT
...

33
...
For example, the polar angle
of
...
2; 4/ is the angle of the vector
...
The polar angle of
...
2; 4/ is the angle of the
vector
...
Write pseudocode to sort a
sequence hp1 ; p2 ; : : : ; pn i of n points according to their polar angles with respect
to a given origin point p0
...
n lg n/ time and use cross
products to compare angles
...
1-4
Show how to determine in O
...

33
...
That is, it is a curve
ending on itself that is formed by a sequence of straight-line segments, called the
sides of the polygon
...
If the polygon is simple, as we shall generally assume, it does not cross itself
...
2 Determining whether any pair of segments intersects

1021

the polygon, the set of points on the polygon itself forms its boundary, and the set
of points surrounding the polygon forms its exterior
...

A vertex of a convex polygon cannot be expressed as a convex combination of any
two distinct points on the boundary or in the interior of the polygon
...
Output “yes” if the set f†pi pi C1 pi C2 W i D 0; 1; : : : ; n 1g, where subscript addition is performed modulo n, does not contain both left turns and right
turns; otherwise, output “no
...
Modify the professor’s method so
that it always produces the correct answer in linear time
...
1-6
Given a point p0 D
...
xi ; yi / W xi x0 and yi D y0 g, that is, it is the set of points due right of p0
along with p0 itself
...
1/ time by reducing the problem to
that of determining whether two line segments intersect
...
1-7
One way to determine whether a point p0 is in the interior of a simple, but not
necessarily convex, polygon P is to look at any ray from p0 and check that the ray
intersects the boundary of P an odd number of times but that p0 itself is not on
the boundary of P
...
n/ time whether a point p0 is in
the interior of an n-vertex polygon P
...
1-6
...
)
33
...
n/ time
...
1-5 for definitions pertaining to polygons
...
2 Determining whether any pair of segments intersects
This section presents an algorithm for determining whether any two line segments
in a set of segments intersect
...
Moreover, as

1022

Chapter 33 Computational Geometry

the exercises at the end of this section show, this algorithm, or simple variations of
it, can help solve other computational-geometry problems
...
n lg n/ time, where n is the number of segments we are
given
...
(By Exercise 33
...
n2 / time in the worst case to
find all the intersections in a set of n line segments
...
We treat the spatial dimension that
the sweep line moves across, in this case the x-dimension, as a dimension of
time
...
The line-segment-intersection algorithm in this section considers all
the line-segment endpoints in left-to-right order and checks for an intersection each
time it encounters an endpoint
...
First, we
assume that no input segment is vertical
...
Exercises 33
...
2-9 ask you to show
that the algorithm is robust enough that it needs only a slight modification to work
even when these assumptions do not hold
...

Ordering segments
Because we assume that there are no vertical segments, we know that any input
segment intersecting a given vertical sweep line intersects it at a single point
...

To be more precise, consider two segments s1 and s2
...
We say that s1 is above s2 at x, written s1 at x and the intersection of s1 with the sweep line at x is higher than the intersection
of s2 with the same sweep line, or if s1 and s2 intersect at the sweep line
...
4(a), for example, we have the relationships a a < t c, and b ...

For any given x, the relation “ ...
That is, the relation is transitive, and
if segments s1 and s2 each intersect the sweep line at x, then either s1 or s2 ...
2 Determining whether any pair of segments intersects

1023

e
d
a

b

g

i

h

c

f
r

t

u
(a)

v

z
(b)

w

Figure 33
...
(a) We have a a < t b, b < t c, a < t c, and b ...

(b) When segments e and f intersect, they reverse their orders: we have e < f but f ...


also reflexive, but neither symmetric nor antisymmetric
...

A segment enters the ordering when its left endpoint is encountered by the sweep,
and it leaves the ordering when its right endpoint is encountered
...
4(b) shows, the segments reverse their positions in the total
preorder
...
Note
that because we assume that no three segments intersect at the same point, there
must be some vertical sweep line x for which intersecting segments e and f are
consecutive in the total preorder ...
4(b), such as ´, has e and f consecutive in its total preorder
...
The sweep-line status gives the relationships among the objects that the sweep
line intersects
...
The event-point schedule is a sequence of points, called event points, which
we order from left to right according to their x-coordinates
...

Changes to the sweep-line status occur only at event points
...
2-7, for example),
the event-point schedule develops dynamically as the algorithm progresses
...
In particular, each segment endpoint
is an event point
...
(If two or more endpoints are covertical, i
...
, they have
the same x-coordinate, we break the tie by putting all the covertical left endpoints
before the covertical right endpoints
...
) When we encounter a segment’s left endpoint, we insert the
segment into the sweep-line status, and we delete the segment from the sweep-line
status upon encountering its right endpoint
...

The sweep-line status is a total preorder T , for which we require the following
operations:
I NSERT
...

D ELETE
...

A BOVE
...

B ELOW
...

It is possible for segments s1 and s2 to be mutually above each other in the total
preorder T ; this situation can occur if s1 and s2 intersect at the sweep line whose
total preorder is given by T
...

If the input contains n segments, we can perform each of the operations I NSERT,
D ELETE, A BOVE, and B ELOW in O
...
Recall that
the red-black-tree operations in Chapter 13 involve comparing keys
...
2-2)
...

A red-black tree maintains the total preorder T
...
2 Determining whether any pair of segments intersects

1025

A NY-S EGMENTS -I NTERSECT
...
T; s/
6
if (A BOVE
...
T; s/ exists and intersects s)
7
return TRUE
8
if p is the right endpoint of a segment s
9
if both A BOVE
...
T; s/ exist
and A BOVE
...
T; s/
10
return TRUE
11
D ELETE
...
5 illustrates how the algorithm works
...
Line 2 determines the event-point schedule by sorting the 2n segment
endpoints from left to right, breaking ties as described above
...
x; e; y/, where x and y are
the usual coordinates, e D 0 for a left endpoint, and e D 1 for a right endpoint
...
If p is
the left endpoint of a segment s, line 5 adds s to the total preorder, and lines 6–7
return TRUE if s intersects either of the segments it is consecutive with in the total
preorder defined by the sweep line passing through p
...
In this case, we require only that s and s 0
be placed consecutively into T
...
But first, lines 9–10 return TRUE if
there is an intersection between the segments surrounding s in the total preorder
defined by the sweep line passing through p
...
If the segments surrounding
segment s intersect, they would have become consecutive after deleting s had the
return statement in line 10 not prevented line 11 from executing
...
Finally, if we never find any intersections after having processed
all 2n event points, line 12 returns FALSE
...
5 The execution of A NY-S EGMENTS -I NTERSECT
...
Except for the rightmost sweep line, the ordering of segment names below each sweep
line corresponds to the total preorder T at the end of the for loop processing the corresponding event
point
...


Correctness
To show that A NY-S EGMENTS -I NTERSECT is correct, we will prove that the call
A NY-S EGMENTS -I NTERSECT
...

It is easy to see that A NY-S EGMENTS -I NTERSECT returns TRUE (on lines 7
and 10) only if it finds an intersection between two of the input segments
...

We also need to show the converse: that if there is an intersection, then A NYS EGMENTS -I NTERSECT returns TRUE
...
Let p be the leftmost intersection point, breaking ties by choosing the
point with the lowest y-coordinate, and let a and b be the segments that intersect
at p
...
Because no three segments intersect at the same point, a
and b become consecutive in the total preorder at some sweep line ´
...
Some segment endpoint q on sweep line ´

2 If we allow three segments to intersect at the same point, there may be an intervening segment

c that
intersects both a and b at point p
...
Exercise 33
...


33
...
If p
is on sweep line ´, then q D p
...
In either case, the order given by T is correct just before encountering q
...
Because p is the lowest of the leftmost intersection points, even if p
is on sweep line ´ and some other intersection point p 0 is on ´, event point q D p
is processed before the other intersection p 0 can interfere with the total preorder T
...
) Either event
point q is processed by A NY-S EGMENTS -I NTERSECT or it is not processed
...
Either a or b is inserted into T , and the other segment is above or below it in
the total preorder
...

2
...
Lines 8–11 detect this
case
...

If event point q is not processed by A NY-S EGMENTS -I NTERSECT, the procedure must have returned before processing all event points
...

Thus, if there is an intersection, A NY-S EGMENTS -I NTERSECT returns TRUE
...
Therefore, A NY-S EGMENTS -I NTERSECT always returns a correct
answer
...
n lg n/
...
1/ time
...
n lg n/ time, using merge
sort or heapsort
...
Each iteration takes
O
...
lg n/ time and, using the
method of Section 33
...
1/ time
...
n lg n/
...
2-1
Show that a set of n line segments may contain ‚
...

33
...
1/ time which of a ...
(Hint: If a and b do not intersect, you can just use cross products
...
Of course, in the application of the intersect, we can just stop and declare that we have found an intersection
...
2-3
Professor Mason suggests that we modify A NY-S EGMENTS -I NTERSECT so that
instead of returning upon finding an intersection, it prints the segments that intersect and continues on to the next iteration of the for loop
...
Professor Dixon disagrees, claiming that Professor Mason’s idea is incorrect
...
2-4
Give an O
...

33
...
n lg n/-time algorithm to determine whether two simple polygons with
a total of n vertices intersect
...
2-6
A disk consists of a circle plus its interior and is represented by its center point and
radius
...
Give an O
...

33
...
n C k/ lg n/ time
...
3 Finding the convex hull

1029

33
...

33
...
How does your
answer to Exercise 33
...
3 Finding the convex hull
The convex hull of a set Q of points, denoted by CH
...
(See Exercise 33
...
) We
implicitly assume that all points in the set Q are unique and that Q contains at
least three points which are not colinear
...
The convex hull is then the shape
formed by a tight rubber band that surrounds all the nails
...
6 shows a set
of points and its convex hull
...
Both algorithms output the vertices of the convex hull in
counterclockwise order
...
n lg n/ time
...
nh/ time, where h is the number of
vertices of the convex hull
...
6 illustrates, every vertex of CH
...
6 A set of points Q D fp0 ; p1 ; : : : ; p12 g with its convex hull CH
...


1030

Chapter 33 Computational Geometry

point in Q
...

We can compute convex hulls in O
...

Both Graham’s scan and Jarvis’s march use a technique called “rotational sweep,”
processing vertices in the order of the polar angles they form with a reference
vertex
...
At the ith stage, we update the convex hull of the
i 1 leftmost points, CH
...
fp1 ; p2 ; : : : ; pi g/
...
3-6 asks you how to
implement this method to take a total of O
...

In the divide-and-conquer method, we divide the set of n points in ‚
...
n/ time
...
n/ D 2T
...
n/,
and so the divide-and-conquer method runs in O
...

The prune-and-search method is similar to the worst-case linear-time median
algorithm of Section 9
...
With this method, we find the upper portion (or “upper
chain”) of the convex hull by repeatedly throwing out a constant fraction of the
remaining points until only the upper chain of the convex hull remains
...
This method is asymptotically the fastest: if
the convex hull contains h vertices, it runs in only O
...

Computing the convex hull of a set of points is an interesting problem in its own
right
...
Consider, for example, the two-dimensional farthestpair problem: we are given a set of n points in the plane and wish to find the
two points whose distance from each other is maximum
...
3-3 asks
you to prove, these two points must be vertices of the convex hull
...
n/ time
...
n lg n/ time and then finding the farthest pair of the resulting convex-polygon
vertices, we can find the farthest pair of points in any set of n points in O
...

Graham’s scan
Graham’s scan solves the convex-hull problem by maintaining a stack S of candidate points
...
3 Finding the convex hull

1031

and it eventually pops from the stack each point that is not a vertex of CH
...

When the algorithm terminates, stack S contains exactly the vertices of CH
...

The procedure G RAHAM -S CAN takes as input a set Q of points, where jQj 3
...
S/, which returns the point on top of stack S without
changing S, and N EXT-T O -T OP
...
As we shall prove in a moment, the stack S
returned by G RAHAM -S CAN contains, from bottom to top, exactly the vertices
of CH
...

G RAHAM -S CAN
...
p0 ; S/
5 P USH
...
p2 ; S/
7 for i D 3 to m
8
while the angle formed by points N EXT-T O -T OP
...
S/,
and pi makes a nonleft turn
9
P OP
...
pi ; S/
11 return S
Figure 33
...
Line 1 chooses point p0
as the point with the lowest y-coordinate, picking the leftmost such point in case
of a tie
...
Q/
...
1-3
...
We let m denote the number of points other than p0 that remain
...
Since the points are sorted according to polar angles,
they are sorted in counterclockwise order relative to p0
...
Note that points p1 and pm are vertices

1032

Chapter 33 Computational Geometry
p10

p10

p9

p11

p7

p6

p8

p9

p11

p12

p3

p2

p6

p8

p5
p4

p7

p5
p4

p12

p2

p1
p0

p1
p0

(a)

p10

(b)

p10

p9

p11

p7

p6

p8

p9

p11
p3

p4
p2

p7

p6
p5

p8

p5

p12

p0

p1
p0

(c)

p10

(d)

p10

p9

p7

p6

p8

p11
p5
p4

p12

p3

p2

p9

p7
p8

(e)

p6
p5
p4

p12

p3

p2

p1
p0

p3

p4
p2

p12

p1

p11

p3

p1
p0

(f)

Figure 33
...
6
...
(a) The sequence hp1 ; p2 ; : : : ; p12 i of points
numbered in order of increasing polar angle relative to p0 , and the initial stack S containing p0 , p1 ,
and p2
...
Dashed lines show nonleft
turns, which cause points to be popped from the stack
...


33
...
7, continued (l) The convex hull returned by the procedure, which matches that of
Figure 33
...


1034

Chapter 33 Computational Geometry

of CH
...
3-1)
...
7(a) shows the points of Figure 33
...

The remainder of the procedure uses the stack S
...
Figure 33
...
The for loop of lines 7–10 iterates once for each point
in the subsequence hp3 ; p4 ; : : : ; pm i
...
fp0 ; p1 ; : : : ; pi g/ in counterclockwise order
...
When we traverse the convex
hull counterclockwise, we should make a left turn at each vertex
...
(By checking for a nonleft turn, rather than just a right turn, this
test precludes the possibility of a straight angle at a vertex of the resulting convex
hull
...
) After we pop all vertices
that have nonleft turns when heading toward point pi , we push pi onto the stack
...
7(b)–(k) show the state of the stack S after each iteration of the for
loop
...
Figure 33
...

The following theorem formally proves the correctness of G RAHAM -S CAN
...
1 (Correctness of Graham’s scan)
If G RAHAM -S CAN executes on a set Q of points, where jQj 3, then at termination, the stack S consists of, from bottom to top, exactly the vertices of CH
...

Proof After line 2, we have the sequence of points hp1 ; p2 ; : : : ; pm i
...
The
points in Q Qm are those that were removed because they had the same polar
angle relative to p0 as some point in Qm ; these points are not in CH
...
Qm / D CH
...
Thus, it suffices to show that when G RAHAM -S CAN
terminates, the stack S consists of the vertices of CH
...
Note that just as p0 , p1 , and pm are vertices
of CH
...
Qi /
...
Qi 1 / in counterclockwise
order
...
3 Finding the convex hull

pj

1035

pj

pk

pi

pi

pr
pt

Qj

p2
p1

p0

p1
p0

(a)

(b)

Figure 33
...
(a) Because pi ’s polar angle relative
to p0 is greater than pj ’s polar angle, and because the angle †pk pj pi makes a left turn, adding pi
to CH
...
Qj [ fpi g/
...
Qi /
...
Moreover, they appear in counterclockwise
order from bottom to top
...
Let pj be the top point on S after executing the
while loop of lines 8–9 but before line 10 pushes pi , and let pk be the point
just below pj on S
...
By the loop invariant, therefore, S contains exactly
the vertices of CH
...

Let us continue to focus on this moment just before pushing pi
...

Therefore, because S contains exactly the vertices of CH
...
8(a) that once we push pi , stack S will contain exactly the vertices
of CH
...

We now show that CH
...
Qi /
...

The angle †pr p t pi makes a nonleft turn, and the polar angle of p t relative
to p0 is greater than the polar angle of pr
...
8(b) shows, p t must

1036

Chapter 33 Computational Geometry

be either in the interior of the triangle formed by p0 , pr , and pi or on a side of
this triangle (but it is not a vertex of the triangle)
...
Qi /
...
Qi /, we have that
CH
...
Qi / :

(33
...

Since the equality (33
...
Qi Pi / D CH
...
But Qi Pi D Qj [ fpi g, and so we
conclude that CH
...
Qi Pi / D CH
...

We have shown that once we push pi , stack S contains exactly the vertices
of CH
...
Incrementing i will
then cause the loop invariant to hold for the next iteration
...
Qm /, which
is CH
...
This completes the
proof
...
n lg n/, where
n D jQj
...
n/ time
...
n lg n/ time, using merge sort
or heapsort to sort the polar angles and the cross-product method of Section 33
...
(We can remove all but the farthest point with the same polar
angle in total of O
...
) Lines 3–6 take O
...
Because
m Ä n 1, the for loop of lines 7–10 executes at most n 3 times
...
1/ time, each iteration takes O
...
n/ time exclusive of
the nested while loop
...
n/ time overall
...
As in the
analysis of the M ULTIPOP procedure of Section 17
...
At least three points—p0 , p1 , and pm —are
never popped from the stack, so that in fact at most m 2 P OP operations are
performed in total
...
Since the test in
line 8 takes O
...
1/ time, and m Ä n 1, the total
time taken by the while loop is O
...
Thus, the running time of G RAHAM -S CAN
is O
...


33
...
9 The operation of Jarvis’s march
...

The next vertex, p1 , has the smallest polar angle of any point with respect to p0
...
The right chain goes as high as the highest point p3
...


Jarvis’s march
Jarvis’s march computes the convex hull of a set Q of points by a technique known
as package wrapping (or gift wrapping)
...
nh/,
where h is the number of vertices of CH
...
When h is o
...

Intuitively, Jarvis’s march simulates wrapping a taut piece of paper around the
set Q
...
We know that this point
must be a vertex of the convex hull
...
This point must also be a vertex
of the convex hull
...

More formally, Jarvis’s march builds a sequence H D hp0 ; p1 ; : : : ; ph 1 i of the
vertices of CH
...
We start with p0
...
9 shows, the next vertex p1
in the convex hull has the smallest polar angle with respect to p0
...
) Similarly, p2 has the smallest polar angle

1038

Chapter 33 Computational Geometry

with respect to p1 , and so on
...
9
shows, the right chain of CH
...
To construct the left chain, we start at pk and
choose pkC1 as the point with the smallest polar angle with respect to pk , but from
the negative x-axis
...

We could implement Jarvis’s march in one conceptual sweep around the convex
hull, that is, without separately constructing the right and left chains
...
The advantage of constructing separate chains is that we need
not explicitly compute angles; the techniques of Section 33
...

If implemented properly, Jarvis’s march has a running time of O
...
For each
of the h vertices of CH
...
Each
comparison between polar angles takes O
...
1
...
1 shows, we can compute the minimum of n values in O
...
1/ time
...
nh/ time
...
3-1
Prove that in the procedure G RAHAM -S CAN, points p1 and pm must be vertices
of CH
...

33
...
n lg n/ to sort n numbers
...
n lg n/ is a lower bound for computing, in order, the vertices of the convex
hull of a set of n points in such a model
...
3-3
Given a set of points Q, prove that the pair of points farthest from each other must
be vertices of CH
...

33
...
As Figure 33
...
The set of all such points p is called the kernel of P
...
4 Finding the closest pair of points

1039

q′

p
q
(a)

(b)

Figure 33
...
3-4
...
The segment from point p to any point q on the boundary intersects the boundary only at q
...
The shaded region on the left is the shadow of q, and the shaded
region on the right is the shadow of q 0
...


star-shaped polygon P specified by its vertices in counterclockwise order, show
how to compute CH
...
n/ time
...
3-5
In the on-line convex-hull problem, we are given the set Q of n points one point at
a time
...
Obviously, we could run Graham’s scan once for each point, with a total
running time of O
...
Show how to solve the on-line convex-hull problem in
a total of O
...

33
...
n lg n/ time
...
4 Finding the closest pair of points
We now consider the problem of finding the closest pair of points in a set Q of
n 2 points
...
x1 ; y1 / and p2 D
...
x1 x2 /2 C
...
Two points
in set Q may be coincident, in which case the distance between them is zero
...
A system for
controlling air or sea traffic might need to identify the two closest vehicles in order
to detect potential collisions
...
n2 / pairs
2
of points
...
n/ D
2T
...
n/
...
n lg n/ time
...

The points in array X are sorted so that their x-coordinates are monotonically
increasing
...

Note that in order to attain the O
...
n/ D 2T
...
n lg n/, whose solution is T
...
n lg2 n/
...
6-2
...

A given recursive invocation with inputs P , X , and Y first checks whether
jP j Ä 3
...
If jP j > 3, the
2
recursive invocation carries out the divide-and-conquer paradigm as follows
...
Divide the array X
into arrays XL and XR , which contain the points of PL and PR respectively,
sorted by monotonically increasing x-coordinate
...

Conquer: Having divided P into PL and PR , make two recursive calls, one to find
the closest pair of points in PL and the other to find the closest pair of points
in PR
...
Let the closest-pair distances
returned for PL and PR be ıL and ıR , respectively, and let ı D min
...

Combine: The closest pair is either the pair with distance ı found by one of the
recursive calls, or it is a pair of points with one point in PL and the other in PR
...
Observe that if a pair of
points has distance less than ı, both points of the pair must be within ı units
of line l
...
11(a) shows, they both must reside in the 2ı-wide
vertical strip centered at line l
...
4 Finding the closest pair of points

1041

1
...
The array Y 0 is sorted by y-coordinate, just as Y is
...
For each point p in the array Y 0 , try to find points in Y 0 that are within ı
units of p
...
Compute the distance from p to each of these 7 points, and
keep track of the closest-pair distance ı 0 found over all pairs of points in Y 0
...
If ı 0 < ı, then the vertical strip does indeed contain a closer pair than the
recursive calls found
...
Otherwise, return
the closest pair and its distance ı found by the recursive calls
...
n lg n/ running time
...

Correctness
The correctness of this closest-pair algorithm is obvious, except for two aspects
...
The second aspect is that we need
only check the 7 points following each point p in array Y 0 ; we shall now prove this
property
...
Thus, the distance ı 0 between pL and pR is strictly less than ı
...
Similarly, pR
is on or to the right of l and less than ı units away
...
Thus, as Figure 33
...
(There may be other points within
this rectangle as well
...

Consider the ı ı square forming the left half of this rectangle
...
11(b) shows how
...
Thus, at most 8 points of P
can reside within the ı 2ı rectangle
...
This limit is achieved if there are
two pairs of coincident points such that each pair consists of one point from PL and
one point from PR , one pair is at the intersection of l and the top of the rectangle,
and the other pair is where l intersects the bottom of the rectangle
...
Still assuming that the closest pair is pL and pR , let us assume without

1042

Chapter 33 Computational Geometry

PR
PL

PR



δ

PL

δ

pR

pL

δ

δ

l

l

(a)

coincident points,
one in PL,
one in PR
coincident points,
one in PL,
one in PR

(b)

Figure 33
...
(a) If pL 2 PL and pR 2 PR are less than ı units apart, they
must reside within a ı 2ı rectangle centered at line l
...
On the left are 4 points in PL , and on the right are 4
points in PR
...


loss of generality that pL precedes pR in array Y 0
...
Thus, we have shown the correctness of the closest-pair algorithm
...
n/ D
2T
...
n/, where T
...
The main
difficulty comes from ensuring that the arrays XL , XR , YL , and YR , which are
passed to recursive calls, are sorted by the proper coordinate and also that the
array Y 0 is sorted by y-coordinate
...
)
The key observation is that in each call, we wish to form a sorted subset of a
sorted array
...
Having partitioned P into PL and PR , it needs to
form the arrays YL and YR , which are sorted by y-coordinate, in linear time
...
4 Finding the closest pair of points

1043

Section 2
...
1: we are splitting a sorted array into two sorted arrays
...

1 let YL Œ1 : : Y:length and YR Œ1 : : Y:length be new arrays
2 YL :length D YR :length D 0
3 for i D 1 to Y:length
4
if Y Œi 2 PL
5
YL :length D YL :length C 1
6
YL ŒYL :length D Y Œi
7
else YR :length D YR :length C 1
8
YR ŒYR :length D Y Œi
We simply examine the points in array Y in order
...

Similar pseudocode works for forming arrays XL , XR , and Y 0
...
We
presort them; that is, we sort them once and for all before the first recursive call
...
Presorting adds an additional
O
...
Thus, if we let T
...
n/ be the running time of the entire algorithm, we get
T 0
...
n/ C O
...
n=2/ C O
...
n/ D
O
...
n/ D O
...
n/ D O
...

Exercises
33
...
The idea is always to place
points on line l into set PL
...
Thus, at most 6 points can reside in
the ı 2ı rectangle
...
4-2
Show that it actually suffices to check only the points in the 5 array positions following each point in the array Y 0
...
4-3
We can define the distance between two points in ways other than euclidean
...
jx1 x2 jm C jy1 y2 jm /
...

Modify the closest-pair algorithm to use the L1 -distance, which is also known as
the Manhattan distance
...
4-4
Given two points p1 and p2 in the plane, the L1 -distance between them is
given by max
...
Modify the closest-pair algorithm to use the
L1 -distance
...
4-5
Suppose that
...

Show how to determine the sets PL and PR and how to determine whether each
point of Y is in PL or PR so that the running time for the closest-pair algorithm
remains O
...

33
...
n lg n/
...
)

Problems
33-1 Convex layers
Given a set Q of points in the plane, we define the convex layers of Q inductively
...
Q/
...
Then, the ith convex layer of Q is CH
...

a
...
n2 /-time algorithm to find the convex layers of a set of n points
...
Prove that
...
n lg n/ time to sort n real
numbers
...
We say that point
...
A point in Q that is dominated by no other
point
...
Note that Q may contain many maximal points,
which can be organized into maximal layers as follows
...
For i > 1, the ith maximal layer Li is the set of
Si 1
maximal points in Q
j D1 Lj
...
For now, assume that no two points
in Q have the same x- or y-coordinate
...
Show that y1 > y2 >

> yk
...
x; y/ that is to the left of any point in Q and for which y is
distinct from the y-coordinate of any point in Q
...
x; y/g
...
Let j be the minimum index such that yj < y, unless y < yk , in which case
we let j D k C 1
...
x; y/ as its new leftmost point
...
k C 1/st maximal layer: LkC1 D f
...

c
...
n lg n/-time algorithm to compute the maximal layers of a set Q
of n points
...
)
d
...

33-3 Ghostbusters and ghosts
A group of n Ghostbusters is battling n ghosts
...
A stream goes in a straight
line and terminates when it hits the ghost
...
They will pair off with the ghosts, forming n Ghostbuster-ghost
pairs, and then simultaneously each Ghostbuster will shoot a stream at his chosen ghost
...

Assume that the position of each Ghostbuster and each ghost is a fixed point in
the plane and that no three positions are colinear
...
Argue that there exists a line passing through one Ghostbuster and one ghost
such that the number of Ghostbusters on one side of the line equals the number
of ghosts on the same side
...
n lg n/ time
...
Give an O
...

33-4 Picking up sticks
Professor Charon has a set of n sticks, which are piled up in some configuration
...
x; y; ´/ coordinates
...
He wishes to pick up all the
sticks, one at a time, subject to the condition that he may pick up a stick only if
there is no other stick on top of it
...
Give a procedure that takes two sticks a and b and reports whether a is above,
below, or unrelated to b
...
Describe an efficient algorithm that determines whether it is possible to pick up
all the sticks, and if so, provides a legal order in which to pick them up
...
Sometimes,
the number of points, or size, of the convex hull of n points drawn from such a
distribution has expectation O
...
We call such a
distribution sparse-hulled
...
The convex hull has expected
size ‚
...

Points drawn uniformly from the interior of a convex polygon with k sides, for
any constant k
...
lg n/
...
The convex
p
hull has expected size ‚
...

a
...
n1 Cn2 / time
...
)
b
...
n/ average-case time
...
)

Notes for Chapter 33

1047

Chapter notes
This chapter barely scratches the surface of computational-geometry algorithms
and techniques
...

Although geometry has been studied since antiquity, the development of algorithms for geometric problems is relatively new
...
Lemoine in 1902
...
Lemoine was interested in the
number of primitives needed to effect a given construction; he called this amount
the “simplicity” of the construction
...
2, which determines whether any segments intersect, is due to Shamos and Hoey [313]
...
The packagewrapping algorithm is due to Jarvis [189]
...
n lg n/ for the running
time of any convex-hull algorithm
...
n lg h/ time, is asymptotically optimal
...
n lg n/-time divide-and-conquer algorithm for finding the closest pair of
points is by Shamos and appears in Preparata and Shamos [282]
...


34

NP-Completeness

Almost all the algorithms we have studied thus far have been polynomial-time algorithms: on inputs of size n, their worst-case running time is O
...
You might wonder whether all problems can be solved in polynomial time
...
For example, there are problems, such as Turing’s famous “Halting Problem,” that cannot be solved by any computer, no matter how much time we
allow
...
nk / for any
constant k
...

The subject of this chapter, however, is an interesting class of problems, called
the “NP-complete” problems, whose status is unknown
...

This so-called P ¤ NP question has been one of the deepest, most perplexing open
research problems in theoretical computer science since it was first posed in 1971
...
In each of the following pairs of problems, one is solvable in polynomial
time and the other is NP-complete, but the difference between problems appears to
be slight:
Shortest vs
...
V; E/ in O
...
Finding a longest simple path between two
vertices is difficult, however
...

Euler tour vs
...
V; E/ is a cycle that traverses each edge of G exactly once, although
it is allowed to visit each vertex more than once
...
E/ time and, in fact,

Chapter 34

NP-Completeness

1049

we can find the edges of the Euler tour in O
...
A hamiltonian cycle of
a directed graph G D
...

Determining whether a directed graph has a hamiltonian cycle is NP-complete
...
)
2-CNF satisfiability vs
...
A boolean formula is satisfiable if there exists
some assignment of the values 0 and 1 to its variables that causes it to evaluate
to 1
...
For example, the
boolean formula
...
:x1 _ x3 / ^
...
(It has
the satisfying assignment x1 D 1; x2 D 0; x3 D 1
...

NP-completeness and the classes P and NP
Throughout this chapter, we shall refer to three classes of problems: P, NP, and
NPC, the latter class being the NP-complete problems
...

The class P consists of those problems that are solvable in polynomial time
...
nk / for some
constant k, where n is the size of the input to the problem
...

The class NP consists of those problems that are “verifiable” in polynomial time
...
For example, in the hamiltoniancycle problem, given a directed graph G D
...
We could easily check in polynomial
time that
...
jV j ; 1 / 2 E as well
...
We could check in polynomial time that this assignment
satisfies the boolean formula
...
We shall formalize
this notion later in this chapter, but for now we can believe that P Â NP
...


1050

Chapter 34 NP-Completeness

Informally, a problem is in the class NPC—and we refer to it as being NPcomplete—if it is in NP and is as “hard” as any problem in NP
...

In the meantime, we will state without proof that if any NP-complete problem
can be solved in polynomial time, then every problem in NP has a polynomialtime algorithm
...
Yet, given the effort devoted thus far to proving
that NP-complete problems are intractable—without a conclusive outcome—we
cannot rule out the possibility that the NP-complete problems are in fact solvable
in polynomial time
...
If you can establish a problem as NP-complete, you
provide good evidence for its intractability
...
Moreover, many natural and interesting problems that
on the surface seem no harder than sorting, graph searching, or network flow are
in fact NP-complete
...

Overview of showing problems to be NP-complete
The techniques we use to show that a particular problem is NP-complete differ
fundamentally from the techniques used throughout most of this book to design
and analyze algorithms
...
We are not trying to prove the existence of
an efficient algorithm, but instead that no efficient algorithm is likely to exist
...
1
of an
...
1, however
...
optimization problems
Many problems of interest are optimization problems, in which each feasible (i
...
,
“legal”) solution has an associated value, and we wish to find a feasible solution
with the best value
...
In other words, SHORTEST-PATH
is the single-pair shortest-path problem in an unweighted, undirected graph
...

Although NP-complete problems are confined to the realm of decision problems,
we can take advantage of a convenient relationship between optimization problems
and decision problems
...
For
example, a decision problem related to SHORTEST-PATH is PATH: given a directed graph G, vertices u and , and an integer k, does a path exist from u to
consisting of at most k edges?
The relationship between an optimization problem and its related decision problem works in our favor when we try to show that the optimization problem is
“hard
...
” As a specific example, we can solve PATH by solving SHORTEST-PATH
and then comparing the number of edges in the shortest path found to the value
of the decision-problem parameter k
...
Stated in a way that has
more relevance to NP-completeness, if we can provide evidence that a decision
problem is hard, we also provide evidence that its related optimization problem is
hard
...

Reductions
The above notion of showing that one problem is no harder or no easier than another applies even when both problems are decision problems
...
Let us consider a
decision problem A, which we would like to solve in polynomial time
...
Now suppose that we already know how to solve a different
decision problem B in polynomial time
...

The answers are the same
...


1052

Chapter 34 NP-Completeness

instance α
of A

instance β
polynomial-time
algorithm to decide B
of B
polynomial-time algorithm to decide A

polynomial-time
reduction algorithm

yes

yes

no

no

Figure 34
...
In polynomial
time, we transform an instance ˛ of A into an instance ˇ of B, we solve B in polynomial time, and
we use the answer for ˇ as the answer for ˛
...
1 shows, it provides us a way to solve problem A in polynomial time:
1
...

2
...

3
...

As long as each of these steps takes polynomial time, all three together do also, and
so we have a way to decide on ˛ in polynomial time
...

Recalling that NP-completeness is about showing how hard a problem is rather
than how easy it is, we use polynomial-time reductions in the opposite way to show
that a problem is NP-complete
...
Suppose we have a decision problem A for
which we already know that no polynomial-time algorithm can exist
...
) Suppose further
that we have a polynomial-time reduction transforming instances of A to instances
of B
...
Suppose otherwise; i
...
, suppose that B has a
polynomial-time algorithm
...
1, we
would have a way to solve problem A in polynomial time, which contradicts our
assumption that there is no polynomial-time algorithm for A
...
The proof methodology is similar, however, in that
we prove that problem B is NP-complete on the assumption that problem A is also
NP-complete
...
1 Polynomial time

1053

A first NP-complete problem
Because the technique of reduction relies on having a problem already known to
be NP-complete in order to prove a different problem NP-complete, we need a
“first” NP-complete problem
...
We shall prove that this first
problem is NP-complete in Section 34
...

Chapter outline
This chapter studies the aspects of NP-completeness that bear most directly on the
analysis of algorithms
...
1, we formalize our notion of “problem” and
define the complexity class P of polynomial-time solvable decision problems
...
Section 34
...
It also formally poses the P ¤ NP question
...
3 shows we can relate problems via polynomial-time “reductions
...
Having found one NP-complete problem, we show
in Section 34
...
We illustrate this methodology by showing that
two formula-satisfiability problems are NP-complete
...
5 a variety of other problems to be NP-complete
...
1 Polynomial time
We begin our study of NP-completeness by formalizing our notion of polynomialtime solvable problems
...
We can offer three supporting arguments
...
n100 /
to be intractable, very few practical problems require time on the order of such a
high-degree polynomial
...
Experience has shown that once the
first polynomial-time algorithm for a problem has been discovered, more efficient
algorithms often follow
...
n100 /, an algorithm with a much better running time will likely
soon be discovered
...
For example, the class of problems solvable in polynomial time by the serial
random-access machine used throughout most of this book is the same as the class
of problems solvable in polynomial time on abstract Turing machines
...

Third, the class of polynomial-time solvable problems has nice closure properties, since polynomials are closed under addition, multiplication, and composition
...
Exercise 34
...

Abstract problems
To understand the class of polynomial-time solvable problems, we must first have
a formal notion of what a “problem” is
...

For example, an instance for SHORTEST-PATH is a triple consisting of a graph
and two vertices
...
The problem SHORTEST-PATH
itself is the relation that associates each instance of a graph and two vertices with
a shortest path in the graph that connects the two vertices
...

This formulation of an abstract problem is more general than we need for our
purposes
...
In this case, we can view an
abstract decision problem as a function that maps the instance set I to the solution
set f0; 1g
...
If i D hG; u; ; ki is an instance of the decision
problem PATH, then PATH
...
i/ D 0 (no) otherwise
...
As we saw above, however, we can usually recast an
optimization problem as a decision problem that is no harder
...


34
...
An encoding of a set S
of abstract objects is a mapping e from S to the set of binary strings
...
Using this encoding, e
...
If you
have looked at computer representations of keyboard characters, you probably have
seen the ASCII code, where, for example, the encoding of A is 1000001
...
Polygons, graphs, functions, ordered pairs, programs—all can
be encoded as binary strings
...
We call a problem whose
instance set is the set of binary strings a concrete problem
...
T
...
T
...
3 A concrete problem is polynomial-time solvable, therefore, if there exists
an algorithm to solve it in time O
...

We can now formally define the complexity class P as the set of concrete decision problems that are polynomial-time solvable
...
Given
an abstract decision problem Q mapping an instance set I to f0; 1g, an encoding
e W I ! f0; 1g can induce a related concrete decision problem, which we denote
by e
...
4 If the solution to an abstract-problem instance i 2 I is Q
...
i/ 2 f0; 1g is also Q
...
As a
technicality, some binary strings might represent no meaningful abstract-problem
instance
...
Thus, the concrete problem produces the same solutions as the abstract problem on binary-string instances that represent the encodings of abstract-problem
instances
...


3 We

assume that the algorithm’s output is separate from its input
...
T
...
T
...

4 We

denote by f0; 1g the set of all strings composed of symbols from the set f0; 1g
...
That is, the efficiency of solving a problem should not depend on how the problem is encoded
...
For example, suppose that
an integer k is to be provided as the sole input to an algorithm, and suppose that
the running time of the algorithm is ‚
...
If the integer k is provided in unary—a
string of k 1s—then the running time of the algorithm is O
...
If we use the more natural binary representation of the
integer k, however, then the input length is n D blg kc C 1
...
k/ D ‚
...
Thus, depending on the encoding, the algorithm runs in either polynomial
or superpolynomial time
...
We cannot really talk about solving an abstract problem without
first specifying an encoding
...
For example,
representing integers in base 3 instead of binary has no effect on whether a problem is solvable in polynomial time, since we can convert an integer represented in
base 3 to an integer represented in base 2 in polynomial time
...
x/
...
e1
...
i/
and f21
...
i// D e1
...
5 That is, a polynomial-time algorithm can compute the encoding e2
...
i/, and vice versa
...

Lemma 34
...
Then, e1
...
Q/ 2 P
...

A noninstance of an encoding e is a string x 2 f0; 1g such that there is no instance i for which
e
...
We require that f12
...
x 0 / D y 0 for every noninstance x 0 of e2 , where y 0 is some noninstance
of e1
...
1 Polynomial time

1057

Proof We need only prove the forward direction, since the backward direction is
symmetric
...
Q/ can be solved in time O
...
Further, suppose that for any problem instance i, the encoding e1
...
i/ in time O
...
i/j
...
Q/, on input e2
...
i/ and
then run the algorithm for e1
...
i/
...
nc /, and therefore je1
...
nc /, since the output of
a serial computer cannot be longer than its running time
...
i/ takes time O
...
i/jk / D O
...

Thus, whether an abstract problem has its instances encoded in binary or base 3
does not affect its “complexity,” that is, whether it is polynomial-time solvable or
not; but if instances are encoded in unary, its complexity may change
...
To be precise, we shall assume that the encoding of an
integer is polynomially related to its binary representation, and that the encoding of
a finite set is polynomially related to its encoding as a list of its elements, enclosed
in braces and separated by commas
...
) With
such a “standard” encoding in hand, we can derive reasonable encodings of other
mathematical objects, such as tuples, graphs, and formulas
...
Thus, hGi
denotes the standard encoding of a graph G
...
Henceforth, we shall
generally assume that all problem instances are binary strings encoded using the
standard encoding, unless we explicitly specify the contrary
...
You should watch
out for problems that arise in practice, however, in which a standard encoding is
not obvious and the encoding does make a difference
...
Let’s review some definitions from that theory
...
A language L over † is any set of
strings made up of symbols from †
...
We denote the empty string by ", the empty language
by ;, and the language of all strings over † by †
...
Every
language L over † is a subset of †
...
Set-theoretic operations,
such as union and intersection, follow directly from the set-theoretic definitions
...
The concatenation L1 L2 of two
We define the complement of L by L D †
languages L1 and L2 is the language
L D fx1 x2 W x1 2 L1 and x2 2 L2 g :
The closure or Kleene star of a language L is the language
L D f"g [ L [ L2 [ L3 [

;

where Lk is the language obtained by concatenating L to itself k times
...
Since Q is entirely characterized by those problem instances that produce a 1 (yes) answer, we can view Q as
a language L over † D f0; 1g, where
L D fx 2 † W Q
...
V; E/ is an undirected graph,
u; 2 V;
k 0 is an integer, and
there exists a path from u to in G
consisting of at most k edgesg :
(Where convenient, we shall sometimes use the same name—PATH in this case—
to refer to both a decision problem and its corresponding language
...
We say that an algorithm A accepts a string x 2 f0; 1g if, given input x, the algorithm’s output A
...
The language accepted by an algorithm A is the set of strings
L D fx 2 f0; 1g W A
...

An algorithm A rejects a string x if A
...

Even if language L is accepted by an algorithm A, the algorithm will not necessarily reject a string x 62 L provided as input to it
...
A language L is decided by an algorithm A if every binary string
in L is accepted by A and every binary string not in L is rejected by A
...
1 Polynomial time

1059

algorithm A accepts x in time O
...
A language L is decided in polynomial
time by an algorithm A if there exists a constant k such that for any length-n string
x 2 f0; 1g , the algorithm correctly decides whether x 2 L in time O
...
Thus,
to accept a language, an algorithm need only produce an answer when provided a
string in L, but to decide a language, it must correctly accept or reject every string
in f0; 1g
...
One
polynomial-time accepting algorithm verifies that G encodes an undirected graph,
verifies that u and are vertices in G, uses breadth-first search to compute a shortest path from u to in G, and then compares the number of edges on the shortest
path obtained with k
...
Otherwise, the algorithm runs forever
...

A decision algorithm for PATH must explicitly reject binary strings that do not belong to PATH
...
(It must also output 0 and halt if the input
encoding is faulty
...

We can informally define a complexity class as a set of languages, membership
in which is determined by a complexity measure, such as running time, of an
algorithm that determines whether a given string x belongs to language L
...
6
Using this language-theoretic framework, we can provide an alternative definition of the complexity class P:
P D fL Â f0; 1g W there exists an algorithm A that decides L
in polynomial timeg :
In fact, P is also the class of languages that can be accepted in polynomial time
...
2
P D fL W L is accepted by a polynomial-time algorithmg :
Proof Because the class of languages decided by polynomial-time algorithms is
a subset of the class of languages accepted by polynomial-time algorithms, we
need only show that if L is accepted by a polynomial-time algorithm, it is decided by a polynomial-time algorithm
...


1060

Chapter 34 NP-Completeness

polynomial-time algorithm A
...
Because A accepts L in time O
...
For any input string x, the algorithm A0
simulates cnk steps of A
...
If A has accepted x, then A0 accepts x by outputting a 1
...
The overhead of A0 simulating A
does not increase the running time by more than a polynomial factor, and thus A0
is a polynomial-time algorithm that decides L
...
2 is nonconstructive
...
Nevertheless, we know that such a bound exists, and therefore, that
an algorithm A0 exists that can check the bound, even though we may not be able
to find the algorithm A0 easily
...
1-1
Define the optimization problem LONGEST-PATH-LENGTH as the relation that
associates each instance of an undirected graph and two vertices with the number of edges in a longest simple path between the two vertices
...
V; E/ is an undirected graph, u; 2 V , k
0 is an integer, and there exists a simple path
from u to in G consisting of at least k edgesg
...

34
...
Give a related decision problem
...

34
...
Do the same using an adjacency-list representation
...

34
...
2-2 a polynomial-time algorithm? Explain your answer
...
2 Polynomial-time verification

1061

34
...
Also show that a polynomial number of
calls to polynomial-time subroutines may result in an exponential-time algorithm
...
1-6
Show that the class P, viewed as a set of languages, is closed under union, intersection, concatenation, complement, and Kleene star
...


34
...
For example,
suppose that for a given instance hG; u; ; ki of the decision problem PATH, we
are also given a path p from u to
...
For the decision problem PATH, this
certificate doesn’t seem to buy us much
...
We shall now examine
a problem for which we know of no polynomial-time decision algorithm and yet,
given a certificate, verification is easy
...
Formally, a hamiltonian cycle of an undirected graph
G D
...
A graph that contains a hamiltonian cycle is said to be hamiltonian; otherwise, it is nonhamiltonian
...
R
...
2(a)) in which one player sticks five pins in any five
consecutive vertices and the other player must complete the path to form a cycle

1062

Chapter 34 NP-Completeness

(a)

(b)

Figure 34
...
(b) A bipartite graph with an odd number of vertices
...


containing all the vertices
...
2(a)
shows one hamiltonian cycle
...
For example, Figure 34
...

Exercise 34
...

We can define the hamiltonian-cycle problem, “Does a graph G have a hamiltonian cycle?” as a formal language:
HAM-CYCLE D fhGi W G is a hamiltonian graphg :
How might an algorithm decide the language HAM-CYCLE? Given a problem
instance hGi, one possible decision algorithm lists all permutations of the vertices
of G and then checks each permutation to see if it is a hamiltonian path
...
n/, where
n D jhGij is the length of the encoding of G
...
Graves, Hamilton [157, p
...
and the
other player then aiming to insert, which by the theory in this letter can always be done, fifteen other
pins, in cyclical succession, so as to cover all the other points, and to end in immediate proximity to
the pin wherewith his antagonist had begun
...
2 Polynomial-time verification

1063

p
p
of the vertices, and therefore the running time is
...
n Š/ D
...
nk / for any constant k
...
In fact, the hamiltonian-cycle problem is NP-complete, as we
shall prove in Section 34
...


Verification algorithms
Consider a slightly easier problem
...
It would certainly be easy enough to verify the
proof: simply verify that the provided cycle is hamiltonian by checking whether
it is a permutation of the vertices of V and whether each of the consecutive edges
along the cycle actually exists in the graph
...
n2 / time, where n is the length of the encoding
of G
...

We define a verification algorithm as being a two-argument algorithm A, where
one argument is an ordinary input string x and the other is a binary string y called
a certificate
...
x; y/ D 1
...
x; y/ D 1g :
Intuitively, an algorithm A verifies a language L if for any string x 2 L, there
exists a certificate y that A can use to prove that x 2 L
...
For example, in the
hamiltonian-cycle problem, the certificate is the list of vertices in some hamiltonian cycle
...
Conversely, if a graph is not hamiltonian, there
can be no list of vertices that fools the verification algorithm into believing that the
graph is hamiltonian, since the verification algorithm carefully checks the proposed
“cycle” to be sure
...
8 More precisely, a language L belongs to NP if and only if
there exist a two-input polynomial-time algorithm A and a constant c such that
L D fx 2 f0; 1g W there exists a certificate y with jyj D O
...
x; y/ D 1g :
We say that algorithm A verifies language L in polynomial time
...
(It is always nice to know that an important set is nonempty
...
Thus, P Â NP
...
Intuitively, the class P consists of problems that can be solved
quickly
...
You may have learned from experience that it is often more difficult to
solve a problem from scratch than to verify a clearly presented solution, especially
when working under time constraints
...

There is more compelling, though not conclusive, evidence that P ¤ NP—the
existence of languages that are “NP-complete
...
3
...
Figure 34
...
Despite much work by many
researchers, no one even knows whether the class NP is closed under complement
...
We can restate the question
of whether NP is closed under complement as whether NP D co-NP
...
1-6), it follows from Exercise 34
...
Once again, however, no one knows whether P D NP \ co-NP
or whether there is some language in NP \ co-NP P
...


The class NP was originally studied
in the context of nondeterminism, but this book uses the somewhat simpler yet equivalent notion of
verification
...


34
...
3 Four possibilities for relationships among complexity classes
...
(a) P D NP D co-NP
...
(b) If NP is closed under complement, then NP D co-NP,
but it need not be the case that P D NP
...

(d) NP ¤ co-NP and P ¤ NP \ co-NP
...


Thus, our understanding of the precise relationship between P and NP is woefully incomplete
...

Exercises
34
...
Prove that GRAPH-ISOMORPHISM 2 NP by describing a
polynomial-time algorithm to verify the language
...
2-2
Prove that if G is an undirected bipartite graph with an odd number of vertices,
then G is nonhamiltonian
...
2-3
Show that if HAM-CYCLE 2 P, then the problem of listing the vertices of a
hamiltonian cycle, in order, is polynomial-time solvable
...
2-4
Prove that the class NP of languages is closed under union, intersection, concatenation, and Kleene star
...

34
...
n / for some constant k
...
2-6
A hamiltonian path in a graph is a simple path that visits every vertex exactly
once
...

34
...
2-6 can be solved in
polynomial time on directed acyclic graphs
...

34
...
The formula is a
tautology if it evaluates to 1 for every assignment of 1 and 0 to the input variables
...

Show that TAUTOLOGY 2 co-NP
...
2-9
Prove that P Â co-NP
...
2-10
Prove that if NP ¤ co-NP, then P ¤ NP
...
2-11
Let G be a connected, undirected graph with at least 3 vertices, and let G 3 be the
graph obtained by connecting all pairs of vertices that are connected by a path in G
of length at most 3
...
(Hint: Construct a spanning tree
for G, and use an inductive argument
...
3 NP-completeness and reducibility

1067

34
...

This class has the intriguing property that if any NP-complete problem can be
solved in polynomial time, then every problem in NP has a polynomial-time solution, that is, P D NP
...

The language HAM-CYCLE is one NP-complete problem
...
In fact, if NP P should turn out to be nonempty, we could say
with certainty that HAM-CYCLE 2 NP P
...
In
this section, we shall show how to compare the relative “hardness” of languages
using a precise notion called “polynomial-time reducibility
...
In Sections 34
...
5,
we shall use the notion of reducibility to show that many other problems are NPcomplete
...
For example, the problem of solving linear equations
in an indeterminate x reduces to the problem of solving quadratic equations
...
Thus, if a problem Q reduces to another
problem Q0 , then Q is, in a sense, “no harder to solve” than Q0
...
x/ 2 L2 :

(34
...

Figure 34
...
Each language is a subset of f0; 1g
...
4 An illustration of a polynomial-time reduction from a language L1 to a language L2
via a reduction function f
...
x/ 2 L2
...
x/ 2 L2
...
x/ 62 L2
...
x/ of the problem represented by L2
...
x/ 2 L2 directly provides the answer to whether x 2 L1
...

Lemma 34
...

Proof Let A2 be a polynomial-time algorithm that decides L2 , and let F be a
polynomial-time reduction algorithm that computes the reduction function f
...

Figure 34
...
For a given input x 2 f0; 1g ,
algorithm A1 uses F to transform x into f
...
x/ 2 L2
...

The correctness of A1 follows from condition (34
...
The algorithm runs in polynomial time, since both F and A2 run in polynomial time (see Exercise 34
...

NP-completeness
Polynomial-time reductions provide a formal means for showing that one problem is at least as hard as another, to within a polynomial-time factor
...
3 NP-completeness and reducibility

1069

F

yes, f
...
x/

yes, x 2 L1

no, f
...
5 The proof of Lemma 34
...
The algorithm F is a reduction algorithm that computes the
reduction function f from L1 to L2 in polynomial time, and A2 is a polynomial-time algorithm that
decides L2
...
x/
and then using A2 to decide whether f
...


why the “less than or equal to” notation for reduction is mnemonic
...

A language L Â f0; 1g is NP-complete if
1
...
L0 ÄP L for every L0 2 NP
...
We also define NPC to be the class of NP-complete languages
...

Theorem 34
...
Equivalently, if any problem in NP is not polynomial-time solvable, then no NP-complete
problem is polynomial-time solvable
...
For any L0 2 NP, we
have L0 ÄP L by property 2 of the definition of NP-completeness
...
3, we also have that L0 2 P, which proves the first statement of the
theorem
...

It is for this reason that research into the P ¤ NP question centers around the
NP-complete problems
...
6
...
Nevertheless, since
no polynomial-time algorithm for any NP-complete problem has yet been discov-

1070

Chapter 34 NP-Completeness

NP

NPC

P

Figure 34
...
Both P and NPC are wholly contained within NP, and P \ NPC D ;
...

Circuit satisfiability
We have defined the notion of an NP-complete problem, but up to this point, we
have not actually proved that any problem is NP-complete
...
Thus, we now focus on demonstrating the existence of an NP-complete problem: the circuit-satisfiability problem
...
Instead, we shall
informally describe a proof that relies on a basic understanding of boolean combinational circuits
...
A boolean combinational element is any circuit
element that has a constant number of boolean inputs and outputs and that performs
a well-defined function
...

The boolean combinational elements that we use in the circuit-satisfiability problem compute simple boolean functions, and they are known as logic gates
...
7 shows the three basic logic gates that we use in the circuit-satisfiability
problem: the NOT gate (or inverter), the AND gate, and the OR gate
...
Each of the other
two gates takes two binary inputs x and y and produces a single binary output ´
...
7
...
For

34
...
7 Three basic logic gates, with binary inputs and outputs
...
(a) The NOT gate
...
(c) The OR gate
...
We use the symbols : to denote the NOT
function, ^ to denote the AND function, and _ to denote the OR function
...

We can generalize AND and OR gates to take more than two inputs
...
An OR gate’s
output is 1 if any of its inputs are 1, and its output is 0 otherwise
...
A wire can connect the output of one element
to the input of another, thereby providing the output value of the first element as an
input value of the second
...
8 shows two similar boolean combinational
circuits, differing in only one gate
...
Although a single
wire may have no more than one combinational-element output connected to it, it
can feed several element inputs
...
If no element output is connected to a wire, the wire
is a circuit input, accepting input values from an external source
...
(An internal wire can also fan out
to a circuit output
...

Boolean combinational circuits contain no cycles
...
V; E/ with one vertex for each combinational element
and with k directed edges for each wire whose fan-out is k; the graph contains
a directed edge
...
Then G must be acyclic
...
8 Two instances of the circuit-satisfiability problem
...
The circuit
is therefore satisfiable
...
The circuit is therefore unsatisfiable
...
We say that a one-output boolean combinational circuit is satisfiable if it
has a satisfying assignment: a truth assignment that causes the output of the circuit
to be 1
...
8(a) has the satisfying assignment
hx1 D 1; x2 D 1; x3 D 0i, and so it is satisfiable
...
3-1 asks you to
show, no assignment of values to x1 , x2 , and x3 causes the circuit in Figure 34
...

The circuit-satisfiability problem is, “Given a boolean combinational circuit
composed of AND, OR, and NOT gates, is it satisfiable?” In order to pose this
question formally, however, we must agree on a standard encoding for circuits
...
We could devise a graphlike
encoding that maps any given circuit C into a binary string hC i whose length is
polynomial in the size of the circuit itself
...
If a subcircuit always produces 0, that subcircuit is unnecessary;
the designer can replace it by a simpler subcircuit that omits all logic gates and
provides the constant 0 value as its output
...

Given a circuit C , we might attempt to determine whether it is satisfiable by
simply checking all possible assignments to the inputs
...
When

34
...
2k / time, which is
superpolynomial in the size of the circuit
...
We break the
proof of this fact into two parts, based on the two parts of the definition of NPcompleteness
...
5
The circuit-satisfiability problem belongs to the class NP
...
One of the inputs to A is (a standard encoding of) a boolean combinational circuit C
...
(See Exercise 34
...
)
We construct the algorithm A as follows
...
Then, if the output of the
entire circuit is 1, the algorithm outputs 1, since the values assigned to the inputs
of C provide a satisfying assignment
...

Whenever a satisfiable circuit C is input to algorithm A, there exists a certificate
whose length is polynomial in the size of C and that causes A to output a 1
...
Algorithm A runs in polynomial time: with a good implementation, linear time suffices
...

The second part of proving that CIRCUIT-SAT is NP-complete is to show that
the language is NP-hard
...
The actual proof of this fact is full
of technical intricacies, and so we shall settle for a sketch of the proof based on
some understanding of the workings of computer hardware
...
A typical instruction encodes an operation to be performed, addresses
of operands in memory, and an address where the result is to be stored
...
2k /, then an algorithm whose running time
is O
...
Even if P ¤ NP, this situation would not contradict the NP-completeness of the problem; the existence of a polynomial-time
algorithm for a special case does not imply that there is a polynomial-time algorithm for all cases
...
The program counter automatically increments upon
fetching each instruction, thereby causing the computer to execute instructions sequentially
...

At any point during the execution of a program, the computer’s memory holds
the entire state of the computation
...
) We call any particular state of computer memory a configuration
...
The computer hardware that accomplishes
this mapping can be implemented as a boolean combinational circuit, which we
denote by M in the proof of the following lemma
...
6
The circuit-satisfiability problem is NP-hard
...
We shall describe a polynomial-time algorithm F computing a reduction function f that maps every binary string x to a
circuit C D f
...

Since L 2 NP, there must exist an algorithm A that verifies L in polynomial
time
...

Let T
...
n/ D O
...
nk /
...
)
The basic idea of the proof is to represent the computation of A as a sequence
of configurations
...
9 illustrates, we can break each configuration into
parts consisting of the program for A, the program counter and auxiliary machine
state, the input x, the certificate y, and working storage
...
Algorithm A
writes its output—0 or 1—to some designated location by the time it finishes executing, and if we assume that thereafter A halts, the value never changes
...
n/ steps, the output appears as one of the bits
in cT
...

The reduction algorithm F constructs a single combinational circuit that computes all configurations produced by a given initial configuration
...
3 NP-completeness and reducibility

c0

A

PC

1075

aux machine state

x

y

working storage

x

y

working storage

x

y

working storage

x

y

working storage

M

c1

A

PC

aux machine state

M

c2

A

PC

aux machine state

M


M

cT(n)

A

PC

aux machine state

0/1 output

Figure 34
...
Each configuration represents the state of the computer for one step of the computation
and, besides A, x, and y, includes the program counter (PC), auxiliary machine state, and working
storage
...
A boolean combinational
circuit M maps each configuration to the next configuration
...


1076

Chapter 34 NP-Completeness

paste together T
...
The output of the ith circuit, which
produces configuration ci , feeds directly into the input of the
...
Thus,
the configurations, rather than being stored in the computer’s memory, simply reside as values on the wires connecting copies of M
...
Given an input x, it must compute a circuit C D f
...
x; y/ D 1
...
n/
copies of M
...
x; y/, and the output is the configuration cT
...

Algorithm F modifies circuit C 0 slightly to construct the circuit C D f
...

First, it wires the inputs to C 0 corresponding to the program for A, the initial program counter, the input x, and the initial state of memory directly to these known
values
...
Second, it ignores all outputs from C 0 , except for the one bit of cT
...
This circuit C , so constructed, computes
C
...
x; y/ for any input y of length O
...
The reduction algorithm F ,
when provided an input string x, computes such a circuit C and outputs it
...
First, we must show that F correctly computes
a reduction function f
...
x; y/ D 1
...

To show that F correctly computes a reduction function, let us suppose that there
exists a certificate y of length O
...
x; y/ D 1
...
y/ D A
...
Thus, if a
certificate exists, then C is satisfiable
...
Hence, there exists an input y to C such that C
...
x; y/ D 1
...

To complete the proof sketch, we need only show that F runs in time polynomial
in n D jxj
...
The program for A itself has constant
size, independent of the length of its input x
...
nk /
...
nk /
steps, the amount of working storage required by A is polynomial in n as well
...
3-5 asks you to extend
the argument to the situation in which the locations accessed by A are scattered
across a much larger region of memory and the particular pattern of scattering can
differ for each input x
...
nk /; hence, the size of M
is polynomial in n
...
3 NP-completeness and reducibility

1077

system
...
nk / copies of M , and hence it
has size polynomial in n
...

The language CIRCUIT-SAT is therefore at least as hard as any language in NP,
and since it belongs to NP, it is NP-complete
...
7
The circuit-satisfiability problem is NP-complete
...
5 and 34
...

Exercises
34
...
8(b) is unsatisfiable
...
3-2
Show that the ÄP relation is a transitive relation on languages
...

34
...

34
...
5
...
3-5
The proof of Lemma 34
...
Where in the proof do we exploit this
assumption? Argue that this assumption does not involve any loss of generality
...
3-6
A language L is complete for a language class C with respect to polynomial-time
reductions if L 2 C and L0 ÄP L for all L0 2 C
...


1078

Chapter 34 NP-Completeness

34
...
3-6), L is
complete for NP if and only if L is complete for co-NP
...
3-8
The reduction algorithm F in the proof of Lemma 34
...
x/ based on knowledge of x, A, and k
...
nk / running time is known to F (since the language L belongs
to NP), not their actual values
...
Explain the flaw in the professor’s reasoning
...
4 NP-completeness proofs
We proved that the circuit-satisfiability problem is NP-complete by a direct proof
that L ÄP CIRCUIT-SAT for every language L 2 NP
...
We shall illustrate this methodology by
proving that various formula-satisfiability problems are NP-complete
...
5
provides many more examples of the methodology
...

Lemma 34
...
If, in
addition, L 2 NP, then L 2 NPC
...
By supposition, L0 ÄP L, and thus by transitivity (Exercise 34
...
If L 2 NP, we also have L 2 NPC
...
Thus, Lemma 34
...
Prove L 2 NP
...
Select a known NP-complete language L0
...
4 NP-completeness proofs

1079

3
...
x/ of L
...
Prove that the function f satisfies x 2 L0 if and only if f
...

5
...

(Steps 2–5 show that L is NP-hard
...
Proving
CIRCUIT-SAT 2 NPC has given us a “foot in the door
...
Moreover, as we develop a catalog of known
NP-complete problems, we will have more and more choices for languages from
which to reduce
...

This problem has the historical honor of being the first problem ever shown to be
NP-complete
...
An instance of SAT is a boolean formula composed of
1
...
m boolean connectives: any boolean function with one or two inputs and one
output, such as ^ (AND), _ (OR), : (NOT), ! (implication), $ (if and only
if); and
3
...
(Without loss of generality, we assume that there are no redundant
parentheses, i
...
, a formula contains at most one pair of parentheses per boolean
connective
...

As in boolean combinational circuits, a truth assignment for a boolean formula
is a set of values for the variables of , and a satisfying assignment is a truth
assignment that causes it to evaluate to 1
...
The satisfiability problem asks whether a given boolean
formula is satisfiable; in formal-language terms,
SAT D fh i W

is a satisfiable boolean formulag :

As an example, the formula

1080

Chapter 34 NP-Completeness

D
...
:x1 $ x3 / _ x4 // ^ :x2
has the satisfying assignment hx1 D 0; x2 D 0; x3 D 1; x4 D 1i, since
D
D
D
D


...
:0 $ 1/ _ 1// ^ :0

...
1 _ 1// ^ 1

...
2)

and thus this formula belongs to SAT
...
A formula with n variables has 2n possible
assignments
...
2n / time, which is superpolynomial in the length of h i
...

Theorem 34
...

Proof We start by arguing that SAT 2 NP
...
8, this will prove the theorem
...

The verifying algorithm simply replaces each variable in the formula with its corresponding value and then evaluates the expression, much as we did in equation (34
...
This task is easy to do in polynomial time
...
Thus,
the first condition of Lemma 34
...

To prove that SAT is NP-hard, we show that CIRCUIT-SAT ÄP SAT
...
We can use induction to
express any boolean combinational circuit as a boolean formula
...
We then obtain the formula for the circuit by writing an
expression that applies the gate’s function to its inputs’ formulas
...
As Exercise 34
...
Thus, the reduction algorithm
must be somewhat more clever
...
10 illustrates how we overcome this problem, using as an example
the circuit from Figure 34
...
For each wire xi in the circuit C , the formula

34
...
10 Reducing circuit satisfiability to formula satisfiability
...


has a variable xi
...
For example, the operation of the
output AND gate is x10 $
...
We call each of these small formulas a
clause
...
For the circuit in the figure, the formula is
D x10 ^
^
^
^
^
^
^


...
x5 $
...
x6 $ :x4 /

...
x1 ^ x2 ^ x4 //

...
x5 _ x6 //

...
x6 _ x7 //

...
x7 ^ x8 ^ x9 // :

Given a circuit C , it is straightforward to produce such a formula in polynomial
time
...
Therefore, when we assign wire values to
variables in , each clause of evaluates to 1, and thus the conjunction of all
evaluates to 1
...
Thus, we have shown that
CIRCUIT-SAT ÄP SAT, which completes the proof
...

The reduction algorithm must handle any input formula, though, and this requirement can lead to a huge number of cases that we must consider
...
Of course, we must not restrict the language so much that it
becomes polynomial-time solvable
...

We define 3-CNF satisfiability using the following terms
...
A boolean formula is in
conjunctive normal form, or CNF, if it is expressed as an AND of clauses, each
of which is the OR of one or more literals
...

For example, the boolean formula

...
x3 _ x2 _ x4 / ^
...
The first of its three clauses is
...

In 3-CNF-SAT, we are asked whether a given boolean formula in 3-CNF is
satisfiable
...

Theorem 34
...

Proof The argument we used in the proof of Theorem 34
...
By Lemma 34
...

We break the reduction algorithm into three basic steps
...

The first step is similar to the one used to prove CIRCUIT-SAT ÄP SAT in
Theorem 34
...
First, we construct a binary “parse” tree for the input formula ,
with literals as leaves and connectives as internal nodes
...
11 shows such
a parse tree for the formula
D
...
:x1 $ x3 / _ x4 // ^ :x2 :

(34
...
We can now think of the binary parse tree as a
circuit for computing the function
...
4 NP-completeness proofs

1083

y1
^
y2
_
y3

:x2

y4

!

:
y5

x1

x2

_
y6
x4

$
:x1

Figure 34
...
x1 ! x2 /_:
...
9, we introduce a variable yi for the output of each internal node
...
For the formula (34
...
y1

...
y3

...
y5

...
y2 ^ :x2 //
$
...
x1 ! x2 //
$ :y5 /
$
...
:x1 $ x3 // :

Observe that the formula 0 thus obtained is a conjunction of clauses i0 , each of
which has at most 3 literals
...

The second step of the reduction converts each clause i0 into conjunctive normal
form
...
Each row of the truth table consists of a possible assignment of the
variables of the clause, together with the value of the clause under that assignment
...
We then
negate this formula and convert it into a CNF formula i00 by using DeMorgan’s

1084

Chapter 34 NP-Completeness

y1
1
1
1
1
0
0
0
0

y2
1
1
0
0
1
1
0
0

x2
1
0
1
0
1
0
1
0

Figure 34
...
y1 $
...
y1 $
...


laws for propositional logic,
:
...
a _ b/ D :a ^ :b ;
to complement all literals, change ORs into ANDs, and change ANDs into ORs
...
y1 $
...
The truth table for 1 appears in Figure 34
...
The DNF formula
0
equivalent to : 1 is

...
y1 ^ :y2 ^ x2 / _
...
:y1 ^ y2 ^ :x2 / :
Negating and applying DeMorgan’s laws, we get the CNF formula
00
1

D
...
:y1 _ y2 _ :x2 /
^
...
y1 _ :y2 _ x2 / ;

0
which is equivalent to the original clause 1
...
Moreover, each clause of 00 has at most 3 literals
...
We construct the final 3-CNF formula 000
from the clauses of the CNF formula 00
...
For each clause Ci of 00 , we include the
following clauses in 000 :

If Ci has 3 distinct literals, then simply include Ci as a clause of

000


...
l1 _ l2 /, where l1 and l2 are literals,
then include
...
l1 _ l2 _ :p/ as clauses of 000
...
4 NP-completeness proofs

1085

exactly 3 distinct literals
...

If Ci has just 1 distinct literal l, then include
...
l _ p _ :q/ ^

...
l _ :p _ :q/ as clauses of 000
...

We can see that the 3-CNF formula 000 is satisfiable if and only if is satisfiable
by inspecting each of the three steps
...
The
second step produces a CNF formula 00 that is algebraically equivalent to 0
...

We must also show that the reduction can be computed in polynomial time
...

Constructing 00 from 0 can introduce at most 8 clauses into 00 for each clause
from 0 , since each clause of 0 has at most 3 variables, and the truth table for
each clause has at most 23 D 8 rows
...
Thus, the size of the resulting
formula 000 is polynomial in the length of the original formula
...

Exercises
34
...
9
...

34
...
10
on the formula (34
...

34
...
10, and not the other steps
...
Show that this strategy does not yield a polynomial-time reduction
...
4-4
Show that the problem of determining whether a boolean formula is a tautology is
complete for co-NP
...
3-7
...
4-5
Show that the problem of determining the satisfiability of boolean formulas in disjunctive normal form is polynomial-time solvable
...
4-6
Suppose that someone gives you a polynomial-time algorithm to decide formula
satisfiability
...

34
...
Show that 2-CNF-SAT 2 P
...
(Hint: Observe that x _ y is equivalent to :x ! y
...
)

34
...
In this section, we shall use the reduction methodology to provide NPcompleteness proofs for a variety of problems drawn from graph theory and set
partitioning
...
13 outlines the structure of the NP-completeness proofs in this section
and Section 34
...
We prove each language in the figure to be NP-complete by
reduction from the language that points to it
...
7
...
5
...
V; E/ is a subset V 0 Â V of vertices, each
pair of which is connected by an edge in E
...
The size of a clique is the number of vertices it contains
...
5 NP-complete problems

1087

CIRCUIT-SAT
SAT
3-CNF-SAT
CLIQUE

SUBSET-SUM

VERTEX-COVER
HAM-CYCLE
TSP

Figure 34
...
4 and 34
...
All proofs ultimately follow by reduction from the NP-completeness of CIRCUIT-SAT
...
As a decision problem, we ask simply whether a clique of a given size k
exists in the graph
...
V; E/ with jV j vertices has a clique of size k is to list all k-subsets of V , and check each one to
see whether it forms a clique
...
k 2 jV j /,
k
which is polynomial if k is a constant
...
Indeed, an efficient
algorithm for the clique problem is unlikely to exist
...
11
The clique problem is NP-complete
...
V; E/, we use the
set V 0 Â V of vertices in the clique as a certificate for G
...
u; / belongs to E
...
You might be surprised that we should be able to prove such a
result, since on the surface logical formulas seem to have little to do with graphs
...
Let
D
C1 ^ C2 ^
^ Ck be a boolean formula in 3-CNF with k clauses
...
14 The graph G derived from the 3-CNF formula D C1 ^ C2 ^ C3 , where C1 D

...
:x1 _ x2 _ x3 /, and C3 D
...
A satisfying assignment of the formula has x2 D 0, x3 D 1, and x1 either 0 or 1
...

r
r
r
1; 2; : : : ; k, each clause Cr has exactly three distinct literals l1 , l2 , and l3
...

We construct the graph G D
...
For each clause Cr D
r
r
r
r
r
r

...
We put
s
an edge between two vertices ir and j if both of the following hold:
r
i

and

s
j

are in different triples, that is, r ¤ s, and

their corresponding literals are consistent, that is, lir is not the negation of ljs
...
As an example of this

D
...
:x1 _ x2 _ x3 / ^
...
14
...
First, suppose
that has a satisfying assignment
...
Picking
one such “true” literal from each clause yields a set V 0 of k vertices
...
For any two vertices ir ; j 2 V 0 , where r ¤ s, both corresponding
literals lir and ljs map to 1 by the given satisfying assignment, and thus the literals

34
...
Thus, by the construction of G, the edge
...

Conversely, suppose that G has a clique V 0 of size k
...
We can
assign 1 to each literal lir such that ir 2 V 0 without fear of assigning 1 to both a
literal and its complement, since G contains no edges between inconsistent literals
...
(Any variables that do not correspond
to a vertex in the clique may be set arbitrarily
...
14, a satisfying assignment of has x2 D 0 and
x3 D 1
...
Because the clique contains no vertices corresponding to either x1 or :x1 ,
we can set x1 to either 0 or 1 in this satisfying assignment
...
11, we reduced an arbitrary instance
of 3-CNF-SAT to an instance of CLIQUE with a particular structure
...
Indeed, we have shown that CLIQUE is NP-hard only
in this restricted case, but this proof suffices to show that CLIQUE is NP-hard in
general graphs
...

The opposite approach—reducing instances of 3-CNF-SAT with a special structure to general instances of CLIQUE—would not have sufficed, however
...

Observe also that the reduction used the instance of 3-CNF-SAT, but not the
solution
...

34
...
2

The vertex-cover problem

A vertex cover of an undirected graph G D
...
u; / 2 E, then u 2 V 0 or 2 V 0 (or both)
...
The size of a vertex cover is the number of vertices in it
...
15(b) has a vertex cover fw; ´g of size 2
...
Restating this optimization problem as a decision problem, we wish to

1090

Chapter 34 NP-Completeness

u

v

z

u

w

y

x
(a)

v

z

w

y

x
(b)

Figure 34
...
(a) An undirected graph G D
...
(b) The graph G produced by the reduction algorithm that has vertex cover
V V 0 D fw; ´g
...
As a language, we
define
VERTEX-COVER D fhG; ki W graph G has a vertex cover of size kg :
The following theorem shows that this problem is NP-complete
...
12
The vertex-cover problem is NP-complete
...
Suppose we are given a graph
G D
...
The certificate we choose is the vertex cover V 0 Â V
itself
...
u; / 2 E, that u 2 V 0 or 2 V 0
...

We prove that the vertex-cover problem is NP-hard by showing that CLIQUE ÄP
VERTEX-COVER
...
Given an undirected graph G D
...
V; E/, where E D f
...
u; / 62 Eg
...
Figure 34
...

The reduction algorithm takes as input an instance hG; ki of the clique problem
...
The
output of the reduction algorithm is the instance hG; jV j ki of the vertex-cover
problem
...
5 NP-complete problems

1091

reduction: the graph G has a clique of size k if and only if the graph G has a vertex
cover of size jV j k
...
We claim that V V 0 is a
vertex cover in G
...
u; / be any edge in E
...
u; / 62 E, which implies
that at least one of u or does not belong to V 0 , since every pair of vertices in V 0 is
connected by an edge of E
...
u; / is covered by V V 0
...
u; / was chosen arbitrarily
from E, every edge of E is covered by a vertex in V V 0
...

Conversely, suppose that G has a vertex cover V 0 Â V , where jV 0 j D jV j k
...
u; / 2 E, then u 2 V 0 or 2 V 0 or both
...
u; / 2 E
...

Since VERTEX-COVER is NP-complete, we don’t expect to find a polynomialtime algorithm for finding a minimum-size vertex cover
...
1 presents a
polynomial-time “approximation algorithm,” however, which produces “approximate” solutions for the vertex-cover problem
...

Thus, we shouldn’t give up hope just because a problem is NP-complete
...

Chapter 35 gives several approximation algorithms for NP-complete problems
...
5
...
2
...
13
The hamiltonian cycle problem is NP-complete
...
Given a graph G D

...
The verification algorithm checks that this sequence contains each vertex
in V exactly once and that with the first vertex repeated at the end, it forms a cycle
in G
...
We can verify the certificate in
polynomial time
...
Given an undirected graph G D
...
16 The widget used in reducing the vertex-cover problem to the hamiltonian-cycle problem
...
u; / of graph G corresponds to widget Wu in the graph G 0 created in the reduction
...
(b)–(d) The shaded paths are the only possible ones
through the widget that include all vertices, assuming that the only connections from the widget to
the remainder of G 0 are through vertices Œu; ; 1, Œu; ; 6, Œ ; u; 1, and Œ ; u; 6
...
V 0 ; E 0 / that has a hamiltonian
cycle if and only if G has a vertex cover of size k
...
Figure 34
...
For each edge
...
We denote each vertex in Wu by Œu; ; i or Œ ; u; i, where 1 Ä i Ä 6, so
that each widget Wu contains 12 vertices
...
16(a)
...
In particular, only vertices Œu; ; 1, Œu; ; 6, Œ ; u; 1,
and Œ ; u; 6 will have edges incident from outside Wu
...
16(b)–(d)
...
16(b))
or the six vertices Œu; ; 1 through Œu; ; 6 (Figure 34
...
In the latter case,
the cycle will have to reenter the widget to visit vertices Œ ; u; 1 through Œ ; u; 6
...
16(d)) or
the six vertices Œ ; u; 1 through Œ ; u; 6 (Figure 34
...
No other paths through
the widget that visit all 12 vertices are possible
...


34
...
17 Reducing an instance of the vertex-cover problem to an instance of the hamiltoniancycle problem
...
(b) The undirected graph G 0 produced by the reduction, with the hamiltonian path corresponding to the vertex cover shaded
...
s1 ; Œw; x; 1/ and
...


The only other vertices in V 0 other than those of widgets are selector vertices
s1 ; s2 ; : : : ; sk
...

In addition to the edges in widgets, E 0 contains two other types of edges, which
Figure 34
...
First, for each vertex u 2 V , we add edges to join pairs
of widgets in order to form a path containing all widgets corresponding to edges
incident on u in G
...
1/ ; u
...
degree
...
u/ is the number of vertices
adjacent to u
...
Œu; u
...
i C1/ ; 1/ W
1 Ä i Ä degree
...
In Figure 34
...
Œw; x; 6; Œw; y; 1/ and
...
For each vertex u 2 V , these edges
in G 0 fill in a path containing all widgets corresponding to edges incident on u
in G
...
1/ ; 1 to Œu; u
...
u// ; 6 in G 0 that
“covers” all widgets corresponding to edges incident on u
...
i / , the path either includes all 12 vertices (if u is in the vertex
cover but u
...
i / ; 1; Œu; u
...
i / ; 6 (if
both u and u
...

The final type of edge in E 0 joins the first vertex Œu; u
...
degree
...
That is, we
include the edges
f
...
1/ ; 1/ W u 2 V and 1 Ä j Ä kg
[ f
...
degree
...
The vertices of G 0 are those
in the widgets, plus the selector vertices
...
The edges of G 0 are those in the widgets, those that go between widgets,
and those connecting selector vertices to widgets
...
For each vertex u 2 V , graph G 0 has degree
...
degree
...
Finally, G 0 has two edges for each pair consisting of a
selector vertex and a vertex of V , totaling 2k jV j such edges
...
14 jEj/ C
...
2k jV j/
D 16 jEj C
...
2 jV j 1/ jV j :
Now we show that the transformation from graph G to G 0 is a reduction
...


34
...
V; E/ has a vertex cover V Â V of size k
...
As Figure 34
...
Include
«
˚

...
i
edges
...
uj / 1 , which connect all
widgets corresponding to edges incident on uj
...
16(b)–(d) show, depending on whether the edge is covered by one or two vertices in V
...
1/
f
...
degree
...
sj C1 ; Œuj ; uj

; 6/ W 1 Ä j Ä k

1g

[ f
...
degree
...
17, you can verify that these edges form a cycle
...
The cycle visits each widget either once or twice, depending on whether one
or two vertices of V cover its corresponding edge
...
Because the cycle also visits every selector vertex, it
is hamiltonian
...
V 0 ; E 0 / has a hamiltonian cycle C Â E 0
...
sj ; Œu; u
...
4)

is a vertex cover for G
...
si ; Œu; u
...
Let us call
each such path a “cover path
...
si ; Œu; u
...
We refer to this cover path as pu , and by equation (34
...
Each widget visited by pu must be Wu or W u for some 2 V
...
If they are visited by one cover path, then edge
...
If two cover paths visit the widget, then the other cover path must
be p , which implies that 2 V , and edge
...


10 Technically, we define a cycle in terms of vertices rather than edges (see Section B
...
In the
interest of clarity, we abuse notation here and define the hamiltonian cycle in terms of edges
...
18 An instance of the traveling-salesman problem
...


Because each vertex in each widget is visited by some cover path, we see that each
edge in E is covered by some vertex in V
...
5
...
Modeling the problem as a complete
graph with n vertices, we can say that the salesman wishes to make a tour, or
hamiltonian cycle, visiting each city exactly once and finishing at the city he starts
from
...
i; j / to travel from city i
to city j , and the salesman wishes to make the tour whose total cost is minimum,
where the total cost is the sum of the individual costs along the edges of the tour
...
18, a minimum-cost tour is hu; w; ; x; ui, with cost 7
...
V; E/ is a complete graph;
c is a function from V V ! Z;
k 2 Z, and
G has a traveling-salesman tour with cost at most kg :
The following theorem shows that a fast algorithm for the traveling-salesman
problem is unlikely to exist
...
14
The traveling-salesman problem is NP-complete
...
Given an instance of the problem,
we use as a certificate the sequence of n vertices in the tour
...
This process can certainly be
done in polynomial time
...
5 NP-complete problems

1097

To prove that TSP is NP-hard, we show that HAM-CYCLE ÄP TSP
...
V; E/ be an instance of HAM-CYCLE
...
We form the complete graph G 0 D
...
i; j / W i; j 2 V
and i ¤ j g, and we define the cost function c by
(
0 if
...
i; j / D
1 if
...
; / D 1 for all
vertices 2 V
...

We now show that graph G has a hamiltonian cycle if and only if graph G 0 has a
tour of cost at most 0
...
Each edge
in h belongs to E and thus has cost 0 in G 0
...

Conversely, suppose that graph G 0 has a tour h0 of cost at most 0
...
Therefore, h0 contains only edges in E
...

34
...
5

The subset-sum problem

We next consider an arithmetic NP-complete problem
...
We ask
whether there exists a subset S 0 Â S whose elements sum to t
...

As usual, we define the problem as a language:
P
SUBSET-SUM D fhS; ti W there exists a subset S 0 Â S such that t D s2S 0 sg :
As with any arithmetic problem, it is important to recall that our standard encoding
assumes that the input integers are coded in binary
...

Theorem 34
...

Proof To show that SUBSET-SUM is in NP, for an instance hS; ti of the problem,
we let the subset S 0 be the certificate
...

We now show that 3-CNF-SAT ÄP SUBSET-SUM
...
Without loss of generality, we make two simplifying
assumptions about the formula
...
Second, each variable appears in at least one clause, because it
does not matter what value is assigned to a variable that appears in no clauses
...
We shall create numbers in base 10, where each number
contains nCk digits and each digit corresponds to either one variable or one clause
...

As Figure 34
...
We label
each digit position by either a variable or a clause
...

The target t has a 1 in each digit labeled by a variable and a 4 in each digit
labeled by a clause
...
Each of i and i0
has a 1 in the digit labeled by xi and 0s in the other variable digits
...
If literal :xi appears in clause Cj , then the digit labeled by Cj in i0 contains a 1
...

All i and i0 values in set S are unique
...
Furthermore, by our simplifying
assumptions above, no i and i0 can be equal in all k least significant digits
...
But we assume that no clause contains both xi and :xi
and that either xi or :xi appears in some clause, and so there must be some
clause Cj for which i and i0 differ
...
Each of sj and sj has
0s in all digits other than the one labeled by Cj
...
These integers are “slack variables,” which we
use to get each clause-labeled digit position to add to the target value of 4
...
19 demonstrates that all sj and sj values in S
are unique in set S
...
5 NP-complete problems

1099

x1

x2

x3

C1

C2

C3

C4

s1
0
s1
s2
0
s2
s3
0
s3
s4
0
s4

=
=
=
=
=
=
=
=
=
=
=
=
=
=

1
1
0
0
0
0
0
0
0
0
0
0
0
0

0
0
1
1
0
0
0
0
0
0
0
0
0
0

0
0
0
0
1
1
0
0
0
0
0
0
0
0

1
0
0
1
0
1
1
2
0
0
0
0
0
0

0
1
0
1
0
1
0
0
1
2
0
0
0
0

0
1
0
1
1
0
0
0
0
0
1
2
0
0

1
0
1
0
1
0
0
0
0
0
0
0
1
2

t

=

1

1

1

4

4

4

4

1
0
1
2
0
2
3
0
3

Figure 34
...
The formula in 3-CNF is
D
C1 ^C2 ^C3 ^C4 , where C1 D
...
:x1 _:x2 _:x3 /, C3 D
...
x1 _ x2 _ x3 /
...
The set S
produced by the reduction consists of the base-10 numbers shown; reading from top to bottom, S D
f1001001; 1000110; 100001; 101110; 10011; 11100; 1000; 2000; 100; 200; 10; 20; 1; 2g
...
The subset S 0 Â S is lightly shaded, and it contains 1 , 2 , and 3 , corresponding to the
0
0
0
satisfying assignment
...

0
the sj and sj values)
...
11
We can perform the reduction in polynomial time
...
The target t has n C k digits, and the reduction produces each in
constant time
...
First, suppose that has a satisfying assignment
...
Otherwise,
include i0
...
The instance at the beginning of this subsection is
the set S and target t in Figure 34
...


1100

Chapter 34 NP-Completeness

respond to literals with the value 1 in the satisfying assignment
...
Because each
clause is satisfied, the clause contains some literal with the value 1
...
In fact, 1, 2, or 3 literals may be 1 in each clause, and so each clauselabeled digit has a sum of 1, 2, or 3 from the i and i0 values in S 0
...
19
for example, literals :x1 , :x2 , and x3 have the value 1 in a satisfying assignment
...
Clause C2 contains
0
0
two of these literals, and 1 , 2 , and 3 contribute 2 to the sum in the digit for C2
...
We achieve the target of 4 in each digit labeled by clause Cj
0
by including in S 0 the appropriate nonempty subset of slack variables fsj ; sj g
...
19, S 0 includes s1 , s1 , s2 , s3 , s4 , and s4
...

Now, suppose that there is a subset S 0 Â S that sums to t
...
If i 2 S 0 , we set xi D 1
...
We claim that every clause Cj , for j D 1; 2; : : : ; k, is
satisfied by this assignment
...
If S 0 includes a i that has a 1 in Cj ’s position,
then the literal xi appears in clause Cj
...
If S 0 includes a i0 that has a 1 in that position, then the
literal :xi appears in Cj
...
Thus, all clauses of are satisfied, which completes the proof
...
5-1
The subgraph-isomorphism problem takes two undirected graphs G1 and G2 , and
it asks whether G1 is isomorphic to a subgraph of G2
...

34
...
Prove that 0-1 integer programming is
NP-complete
...
)
34
...
5-2, except that the values of the vector x may be
any integers rather than just 0 or 1
...

34
...

34
...
The question is
whether the numbers can be partitioned into two sets A and A D S A such
P
P
that x2A x D x2A x
...

34
...

34
...
Formulate a related decision
problem, and show that the decision problem is NP-complete
...
5-8
In the half 3-CNF satisfiability problem, we are given a 3-CNF formula with n
variables and m clauses, where m is even
...
Prove that the half 3-CNF
satisfiability problem is NP-complete
...
V; E/ is a subset V 0 Â V of vertices such
that each edge in E is incident on at most one vertex in V 0
...


1102

Chapter 34 NP-Completeness

a
...
(Hint: Reduce from the clique problem
...
Suppose that you are given a “black-box” subroutine to solve the decision problem you defined in part (a)
...
The running time of your algorithm should be polynomial in jV j
and jEj, counting queries to the black box as a single step
...

c
...
Analyze the running time, and prove that your algorithm
works correctly
...
Give an efficient algorithm to solve the independent-set problem when G is
bipartite
...
(Hint: Use the results of Section 26
...
)
34-2 Bonnie and Clyde
Bonnie and Clyde have just robbed a bank
...
For each of the following scenarios, either give a polynomial-time
algorithm, or prove that the problem is NP-complete
...

a
...
Bonnie and Clyde wish to divide
the money exactly evenly
...
The bag contains n coins, with an arbitrary number of different denominations,
but each denomination is a nonnegative integer power of 2, i
...
, the possible
denominations are 1 dollar, 2 dollars, 4 dollars, etc
...

c
...
” They wish to divide the checks so that they each get the
exact same amount of money
...
The bag contains n checks as in part (c), but this time Bonnie and Clyde are
willing to accept a split in which the difference is no larger than 100 dollars
...
We can model
this problem with an undirected graph G D
...

Then, a k-coloring is a function c W V ! f1; 2; : : : ; kg such that c
...
/ for
every edge
...
In other words, the numbers 1; 2; : : : ; k represent the k colors, and adjacent vertices must have different colors
...

a
...

b
...
Show that your decision problem is solvable in polynomial time if and only if the graph-coloring
problem is solvable in polynomial time
...
Let the language 3-COLOR be the set of graphs that can be 3-colored
...

To prove that 3-COLOR is NP-complete, we use a reduction from 3-CNF-SAT
...
, xn , we construct a graph
G D
...
The set V consists of a vertex for each variable, a vertex
for the negation of each variable, 5 vertices for each clause, and 3 special vertices:
TRUE, FALSE, and RED
...

The literal edges form a triangle on the special vertices and also form a triangle on
xi , :xi , and RED for i D 1; 2; : : : ; n
...
Argue that in any 3-coloring c of a graph containing the literal edges, exactly
one of a variable and its negation is colored c
...
FALSE /
...

The widget shown in Figure 34
...
x _ y _ ´/
...

e
...
TRUE / or c
...
TRUE/
...
Complete the proof that 3-COLOR is NP-complete
...
20

The widget corresponding to a clause
...


34-4 Scheduling with profits and deadlines
Suppose that we have one machine and a set of n tasks a1 ; a2 ; : : : ; an , each of
which requires time on the machine
...
The
machine can process only one task at a time, and task aj must run without interruption for tj consecutive time units
...
As
an optimization problem, we are given the processing times, profits, and deadlines
for a set of n tasks, and we wish to find a schedule that completes all the tasks and
returns the greatest amount of profit
...

a
...

b
...

c
...
(Hint: Use dynamic programming
...
Give a polynomial-time algorithm for the optimization problem, assuming that
all processing times are integers from 1 to n
...
The proof of Theorem 34
...
5 is drawn from their table of contents
...
Hopcroft, Motwani, and Ullman [177], Lewis
and Papadimitriou [236], Papadimitriou [270], and Sipser [317] have good treatments of NP-completeness in the context of complexity theory
...

The class P was introduced in 1964 by Cobham [72] and, independently, in 1965
by Edmonds [100], who also introduced the class NP and conjectured that P ¤ NP
...
Levin [234] independently discovered the notion, giving an NP-completeness
proof for a tiling problem
...
Karp’s paper included the original NP-completeness proofs of the clique, vertex-cover, and
hamiltonian-cycle problems
...
In a talk at a meeting celebrating Karp’s
60th birthday in 1995, Papadimitriou remarked, “about 6000 papers each year have
the term ‘NP-complete’ on their title, abstract, or list of keywords
...
’ ”
Recent work in complexity theory has shed light on the complexity of computing
approximate solutions
...
” This new definition implies that for problems such as
clique, vertex cover, the traveling-salesman problem with the triangle inequality,
and many others, computing good approximate solutions is NP-hard and hence no
easier than computing optimal solutions
...


35

Approximation Algorithms

Many problems of practical significance are NP-complete, yet they are too important to abandon merely because we don’t know how to find an optimal solution in
polynomial time
...
We have at
least three ways to get around NP-completeness
...
Second,
we may be able to isolate important special cases that we can solve in polynomial
time
...
In practice, nearoptimality is often good enough
...
This chapter presents polynomial-time approximation algorithms for several NP-complete problems
...
Depending
on the problem, we may define an optimal solution as one with maximum possible cost or one with minimum possible cost; that is, the problem may be either a
maximization or a minimization problem
...
n/ if,
for any input of size n, the cost C of the solution produced by the algorithm is
within a factor of
...
n/ :
(35
...
n/, we call it a
...
The definitions of the approximation ratio and of a
...

For a maximization problem, 0 < C Ä C , and the ratio C =C gives the factor
by which the cost of an optimal solution is larger than the cost of the approximate

Chapter 35

Approximation Algorithms

1107

solution
...
Because we assume that all solutions have positive
cost, these ratios are always well defined
...

imation algorithm is never less than 1, since C =C Ä 1 implies C =C
Therefore, a 1-approximation algorithm1 produces an optimal solution, and an approximation algorithm with a large approximation ratio may return a solution that
is much worse than optimal
...
An example of such a problem is the set-cover
problem presented in Section 35
...

Some NP-complete problems allow polynomial-time approximation algorithms
that can achieve increasingly better approximation ratios by using more and more
computation time
...
An example is the subset-sum problem studied in Section 35
...

This situation is important enough to deserve a name of its own
...
1 C /-approximation algorithm
...

The running time of a polynomial-time approximation scheme can increase very
rapidly as decreases
...
n2= /
...

We say that an approximation scheme is a fully polynomial-time approximation
scheme if it is an approximation scheme and its running time is polynomial in
both 1= and the size n of the input instance
...
1= /2 n3 /
...


1 When

the approximation ratio is independent of n, we use the terms “approximation ratio of ” and
“ -approximation algorithm,” indicating no dependence on n
...
Section 35
...
Section 35
...
It also shows that without the triangle inequality, for any constant
1,
a -approximation algorithm cannot exist unless P D NP
...
3, we
show how to use a greedy method as an effective approximation algorithm for the
set-covering problem, obtaining a covering whose cost is at worst a logarithmic
factor larger than the optimal cost
...
4 presents two more approximation
algorithms
...
Then we examine a weighted variant of the vertex-cover
problem and show how to use linear programming to develop a 2-approximation
algorithm
...
5 presents a fully polynomial-time approximation
scheme for the subset-sum problem
...
1 The vertex-cover problem
Section 34
...
2 defined the vertex-cover problem and proved it NP-complete
...
V; E/ is a subset V 0 Â V such
that if
...
The size of a
vertex cover is the number of vertices in it
...
We call such a vertex cover an optimal vertex cover
...

Even though we don’t know how to find an optimal vertex cover in a graph G
in polynomial time, we can efficiently find a vertex cover that is near-optimal
...


35
...
1 The operation of A PPROX -V ERTEX -C OVER
...
(b) The edge
...
Vertices b and c, shown lightly shaded, are added to the set C containing the vertex cover
being created
...
a; b/,
...
c; d /, shown dashed, are removed since they are now covered
by some vertex in C
...
e; f / is chosen; vertices e and f are added to C
...
d; g/
is chosen; vertices d and g are added to C
...
(f) The optimal vertex cover for
this problem contains only three vertices: b, d , and e
...
G/
1 C D;
2 E 0 D G:E
3 while E 0 ¤ ;
4
let
...
1 illustrates how A PPROX -V ERTEX -C OVER operates on an example
graph
...
Line 1 initializes C to the empty set
...
The loop of lines 3–6 repeatedly picks an edge
...
Finally, line 7 returns the vertex cover C
...
V C E/, using adjacency lists to represent E 0
...
1
A PPROX -V ERTEX -C OVER is a polynomial-time 2-approximation algorithm
...

The set C of vertices that is returned by A PPROX -V ERTEX -C OVER is a vertex
cover, since the algorithm loops until every edge in G:E has been covered by some
vertex in C
...
In order to cover the edges in A, any vertex cover—in
particular, an optimal cover C —must include at least one endpoint of each edge
in A
...
Thus,
no two edges in A are covered by the same vertex from C , and we have the lower
bound
jC j

jAj

(35
...
Each execution of line 4 picks an edge for
which neither of its endpoints is already in C , yielding an upper bound (an exact
upper bound, in fact) on the size of the vertex cover returned:
jC j D 2 jAj :

(35
...
2) and (35
...

Let us reflect on this proof
...
Instead of requiring that we know the exact size of an
optimal vertex cover, we rely on a lower bound on the size
...
1-2 asks
you to show, the set A of edges that line 4 of A PPROX -V ERTEX -C OVER selects is
actually a maximal matching in the graph G
...
) The size of a maximal matching

35
...
1, a lower bound on the size of an
optimal vertex cover
...
By relating the size of the solution
returned to the lower bound, we obtain our approximation ratio
...

Exercises
35
...

35
...

35
...
Repeatedly select a vertex of highest degree, and remove all of its incident edges
...
(Hint: Try a bipartite graph with vertices of uniform
degree on the left and vertices of varying degree on the right
...
1-4
Give an efficient greedy algorithm that finds an optimal vertex cover for a tree in
linear time
...
1-5
From the proof of Theorem 34
...
Does
this relationship imply that there is a polynomial-time approximation algorithm
with a constant approximation ratio for the clique problem? Justify your answer
...
2 The traveling-salesman problem
In the traveling-salesman problem introduced in Section 34
...
4, we are given a
complete undirected graph G D
...
u; /
associated with each edge
...
As an extension of our notation, let c
...
A/ D

X

c
...
u; /2A

In many practical situations, the least costly way to go from a place u to a place w
is to go directly, with no intermediate steps
...
We formalize this notion by saying that the
cost function c satisfies the triangle inequality if, for all vertices u; ; w 2 V ,
c
...
u; / C c
...
For example, if the vertices of the
graph are points in the plane and the cost of traveling between two vertices is the
ordinary euclidean distance between them, then the triangle inequality is satisfied
...

As Exercise 35
...
Thus, we should
not expect to find a polynomial-time algorithm for solving this problem exactly
...

In Section 35
...
1, we examine a 2-approximation algorithm for the travelingsalesman problem with the triangle inequality
...
2
...

35
...
1

The traveling-salesman problem with the triangle inequality

Applying the methodology of the previous section, we shall first compute a structure—a minimum spanning tree—whose weight gives a lower bound on the length
of an optimal traveling-salesman tour
...
The following algorithm implements this approach, calling the minimum-spanning-tree
algorithm MST-P RIM from Section 23
...
The parameter G is a
complete undirected graph, and the cost function c satisfies the triangle inequality
...
G; c/
1 select a vertex r 2 G:V to be a “root” vertex
2 compute a minimum spanning tree T for G from root r
using MST-P RIM
...
2 The traveling-salesman problem

a

d

1113

a

d

e
b

f

a
e

g

c

b

f

e
g

c
h

f

c

g

a

(c)

d

e
b

f

h
(b)

(a)

d

b
c

h

a

d

e
g

b

f

g

c
h

h
(d)

(e)

Figure 35
...
(a) A complete undirected graph
...
For example, f is one unit to the right and two units up from h
...
(b) A minimum spanning
tree T of the complete graph, as computed by MST-P RIM
...
Only edges
in the minimum spanning tree are shown
...
(c) A walk of T , starting at a
...
A preorder
walk of T lists a vertex just when it is first encountered, as indicated by the dot next to each vertex,
yielding the ordering a; b; c; h; d; e; f; g
...
Its total cost
is approximately 19:074
...
Its total cost is
approximately 14:715
...
1 that a preorder tree walk recursively visits every vertex
in the tree, listing a vertex when it is first encountered, before visiting any of its
children
...
2 illustrates the operation of A PPROX -TSP-T OUR
...
Part (c) shows how a preorder
walk of T visits the vertices, and part (d) displays the corresponding tour, which is
the tour returned by A PPROX -TSP-T OUR
...


1114

Chapter 35 Approximation Algorithms

By Exercise 23
...
V 2 /
...

Theorem 35
...

Proof We have already seen that A PPROX -TSP-T OUR runs in polynomial time
...
We obtain a spanning
tree by deleting any edge from a tour, and each edge cost is nonnegative
...
T / Ä c
...
4)

A full walk of T lists the vertices when they are first visited and also whenever
they are returned to after a visit to a subtree
...
The full
walk of our example gives the order
a; b; c; b; h; b; a; d; e; f; e; g; e; d; a :
Since the full walk traverses every edge of T exactly twice, we have (extending
our definition of the cost c in the natural manner to handle multisets of edges)
c
...
T / :

(35
...
4) and equation (35
...
W / Ä 2c
...
6)

and so the cost of W is within a factor of 2 of the cost of an optimal tour
...
By the triangle inequality, however, we can delete a visit to
any vertex from W and the cost does not increase
...
) By repeatedly applying this operation, we can remove from W all but the
first visit to each vertex
...
Let H
be the cycle corresponding to this preorder walk
...
2 The traveling-salesman problem

1115

ery vertex is visited exactly once, and in fact it is the cycle computed by A PPROX TSP-T OUR
...
H / Ä c
...
7)

Combining inequalities (35
...
7) gives c
...
H /, which completes
the proof
...
2, A PPROX TSP-T OUR is usually not the best practical choice for this problem
...
(See the
references at the end of this chapter
...
2
...

Theorem 35
...

Proof The proof is by contradiction
...
Without loss of generality, we assume that is an integer, by
rounding it up if necessary
...
2) in polynomial time
...
13 tells us that the hamiltonian-cycle problem is NP-complete,
Theorem 34
...

Let G D
...
We wish to
determine efficiently whether G contains a hamiltonian cycle by making use of
the hypothesized approximation algorithm A
...
Let G 0 D
...
u; / W u; 2 V and u ¤ g :
Assign an integer cost to each edge in E 0 as follows:
(
1
if
...
u; / D
jV j C 1 otherwise :
We can create representations of G 0 and c from a representation of G in time polynomial in jV j and jEj
...
G 0 ; c/
...
G 0 ; c/ contains a tour of cost jV j
...

But any tour that uses an edge not in E has a cost of at least

...
jV j

1/ D
>

jV j C jV j
jV j :

Because edges not in G are so costly, there is a gap of at least jV j between the cost
of a tour that is a hamiltonian cycle in G (cost jV j) and the cost of any other tour
(cost at least jV j C jV j)
...

Now, suppose that we apply the approximation algorithm A to the travelingsalesman problem
...
Because A is guaranteed to return a tour of cost no
more than times the cost of an optimal tour, if G contains a hamiltonian cycle,
then A must return it
...
Therefore, we can use A to solve the hamiltonian-cycle problem
in polynomial time
...
3 serves as an example of a general technique for
proving that we cannot approximate a problem very well
...
Then, we have shown that, unless P D NP, there is no
polynomial-time -approximation algorithm for problem Y
...
2-1
Suppose that a complete undirected graph G D
...
Prove that c
...

35
...
The two instances must have the same set of optimal tours
...
3, assuming that P ¤ NP
...
3 The set-covering problem

1117

35
...
Begin
with a trivial cycle consisting of a single arbitrarily chosen vertex
...
Suppose that the vertex on the cycle that is nearest u is vertex
...
Repeat until all vertices
are on the cycle
...

35
...
Assuming that the
cost function satisfies the triangle inequality, show that there exists a polynomialtime approximation algorithm with approximation ratio 3 for this problem
...

Show that the costliest edge in a bottleneck spanning tree has a cost that is at most
the cost of the costliest edge in a bottleneck hamiltonian cycle
...
2-5
Suppose that the vertices for an instance of the traveling-salesman problem are
points in the plane and that the cost c
...
Show that an optimal tour never crosses itself
...
3 The set-covering problem
The set-covering problem is an optimization problem that models many problems
that require resources to be allocated
...
The approximation algorithm developed to handle the vertex-cover problem doesn’t apply
here, however, and so we need to try other approaches
...
That is, as the size of the
instance gets larger, the size of the approximate solution may grow, relative to the
size of an optimal solution
...


1118

Chapter 35 Approximation Algorithms

S1

S2
S6
S3

S4

S5

Figure 35
...
X; F / of the set-covering problem, where X consists of the 12 black
points and F D fS1 ; S2 ; S3 ; S4 ; S5 ; S6 g
...
The greedy algorithm produces a cover of size 4 by selecting either the sets S1 , S4 , S5 ,
and S3 or the sets S1 , S4 , S5 , and S6 , in order
...
X; F / of the set-covering problem consists of a finite set X and
a family F of subsets of X , such that every element of X belongs to at least one
subset in F :
[
S:
XD
S2F

We say that a subset S 2 F covers its elements
...
8)
XD
S2C

We say that any C satisfying equation (35
...
Figure 35
...
The size of C is the number of sets it contains, rather than
the number of individual elements in these sets, since every subset C that covers X
must contain all jX j individual elements
...
3, the minimum set cover
has size 3
...
As a simple example, suppose that X represents a set of skills that are needed
to solve a problem and that we have a given set of people available to work on the
problem
...
In the decision version of the set-covering problem, we ask whether a
covering exists with size at most k, where k is an additional parameter specified
in the problem instance
...
3-2 asks you to show
...
3 The set-covering problem

1119

A greedy approximation algorithm
The greedy method works by picking, at each stage, the set S that covers the greatest number of remaining elements that are uncovered
...
X; F /
1 U DX
2 C D;
3 while U ¤ ;
4
select an S 2 F that maximizes jS \ U j
5
U DU S
6
C D C [ fSg
7 return C
In the example of Figure 35
...

The algorithm works as follows
...
The set C contains the cover being constructed
...
After S is selected,
line 5 removes its elements from U , and line 6 places S into C
...

We can easily implement G REEDY-S ET-C OVER to run in time polynomial in jX j
and jF j
...
jX j ; jF j/, and we can implement the loop body to run in time
O
...
jX j jF j min
...
Exercise 35
...

Analysis
We now show that the greedy algorithm returns a set cover that is not too much
larger than an optimal set cover
...
1) by H
...
As a boundary
condition, we define H
...

Theorem 35
...
n/-approximation algorithm, where

...
max fjSj W S 2 F g/ :
Proof
time
...
n/-approximation algorithm, we assign a cost of 1 to each set selected by the algorithm, distribute this cost over
the elements covered for the first time, and then use these costs to derive the desired relationship between the size of an optimal set cover C and the size of the
set cover C returned by the algorithm
...
We
spread this cost of selecting Si evenly among the elements covered for the first time
by Si
...
Each element
is assigned a cost only once, when it is covered for the first time
...
S1 [ S2 [ [ Si 1 /j
Each step of the algorithm assigns 1 unit of cost, and so
X
cx :
(35
...
10)
S2C

x2S

x2X

Combining equation (35
...
10), we have that
X X
cx :
jC j Ä
S2C

(35
...
For any set S belonging to the family F ,
X
cx Ä H
...
12)
x2S

From inequalities (35
...
12), it follows that
X
H
...
max fjSj W S 2 F g/ ;
thus proving the theorem
...
12)
...
S1 [ S2 [

[ Si /j

be the number of elements in S that remain uncovered after the algorithm has
selected sets S1 ; S2 ; : : : ; Si
...
3 The set-covering problem

1121

of S, which are all initially uncovered
...
Then, ui 1
ui , and
some element in S is uncovered by S1 [ S2 [
ui 1 ui elements of S are covered for the first time by Si , for i D 1; 2; : : : ; k
...
ui

1

ui /

1


...
S1 [ S2 [

[ Si 1 /j


...

Consequently, we obtain
X

cx Ä

k
X


...
ui

1

ui /

1

ui

i D1

x2S

D

k
X ui 1
X
i D1 j Dui C1

1

1
ui

1

k
X ui 1 1
X
Ä
j
i D1 j Du C1

(because j Ä ui 1 )

i

k
X ui 1 1
X
D
j
i D1
j D1

D

k
X


...
ui 1 /

ui
X1
j
j D1

!

H
...
u0 / H
...
u0 / H
...
u0 /
H
...
0/ D 0)

which completes the proof of inequality (35
...


1122

Chapter 35 Approximation Algorithms

Corollary 35
...
ln jX j C 1/-approximation algorithm
...
14) and Theorem 35
...


In some applications, max fjSj W S 2 F g is a small constant, and so the solution
returned by G REEDY-S ET-C OVER is at most a small constant times larger than
optimal
...
In this case, the
solution found by G REEDY-S ET-C OVER is not more than H
...

Exercises
35
...
Show which set cover
G REEDY-S ET-C OVER produces when we break ties in favor of the word that appears first in the dictionary
...
3-2
Show that the decision version of the set-covering problem is NP-complete by
reducing it from the vertex-cover problem
...
3-3
Show how to Á
implement G REEDY-S ET-C OVER in such a way that it runs in time
P
O
S2F jSj
...
3-4
Show that the following weaker form of Theorem 35
...
3-5
G REEDY-S ET-C OVER can return a number of different solutions, depending on
how we break ties in line 4
...
n/
that returns an n-element instance of the set-covering problem for which, depending on how we break ties in line 4, G REEDY-S ET-C OVER can return a number of
different solutions that is exponential in n
...
4 Randomization and linear programming

1123

35
...
We shall give a simple randomized
algorithm for an optimization version of 3-CNF satisfiability, and then we shall use
linear programming to help design an approximation algorithm for a weighted version of the vertex-cover problem
...
The chapter notes give references for further study of
these areas
...
We say that a randomized algorithm
for a problem has an approximation ratio of
...
n/ of the cost C of an optimal solution:
Ã
Â
C C
Ä
...
13)
;
max
C C
We call a randomized algorithm that achieves an approximation ratio of
...
n/-approximation algorithm
...

A particular instance of 3-CNF satisfiability, as defined in Section 34
...
In order to be satisfiable, there must exist an assignment of
the variables so that every clause evaluates to 1
...
We call the
resulting maximization problem MAX-3-CNF satisfiability
...
We now show that randomly setting each variable to 1 with probability 1=2
and to 0 with probability 1=2 yields a randomized 8=7-approximation algorithm
...
4, we require
each clause to consist of exactly three distinct literals
...
(Exercise 35
...
)

1124

Chapter 35 Approximation Algorithms

Theorem 35
...

Proof Suppose that we have independently set each variable to 1 with probability 1=2 and to 0 with probability 1=2
...
Since no literal appears more than once in the same clause, and since we have
assumed that no variable and its negation appear in the same clause, the settings of
the three literals in each clause are independent
...
1=2/3 D 1=8
...
1,
we have E ŒYi  D 7=8
...
Then, we have
" m #
X
Yi
E ŒY  D E
i D1

D

m
X

E ŒYi 

(by linearity of expectation)

i D1

D

m
X

7=8

i D1

D 7m=8 :
Clearly, m is an upper bound on the number of satisfied clauses, and hence the
approximation ratio is at most m=
...

Approximating weighted vertex cover using linear programming
In the minimum-weight vertex-cover problem, we are given an undirected graph
G D
...

For any vertex cover V 0 Â V , we define the weight of the vertex cover w
...
The goal is to find a vertex cover of minimum weight
...

We shall, however, compute a lower bound on the weight of the minimum-weight

35
...
We shall then “round” this solution and
use it to obtain a vertex cover
...
/ with each vertex 2 V , and let us
require that x
...
We put into the vertex cover
if and only if x
...
Then, we can write the constraint that for any edge
...
u/ C x
...
This view
gives rise to the following 0-1 integer program for finding a minimum-weight
vertex cover:
X
minimize
w
...
/
(35
...
u/ C x
...
u; / 2 E
x
...
15)
(35
...
/ are equal to 1, this formulation is the optimization version of the NP-hard vertex-cover problem
...
/ 2 f0; 1g and replace it
by 0 Ä x
...
We then obtain the following linear program, which is known as
the linear-programming relaxation:
X
w
...
/
(35
...
u/ C x
...
u; / 2 E
x
...
/
0 for each 2 V :

(35
...
19)
(35
...
14)–(35
...
17)–(35
...
Therefore, the
value of an optimal solution to the linear program gives a lower bound on the value
of an optimal solution to the 0-1 integer program, and hence a lower bound on the
optimal weight in the minimum-weight vertex-cover problem
...
G; w/
1 C D;
2 compute x, an optimal solution to the linear program in lines (35
...
20)
N
3 for each 2 V
4
if x
...
Line 1 initializes the vertex cover to be empty
...
17)–(35
...
An optimal solution
gives each vertex an associated value x
...
/ Ä 1
...

If x
...
In effect, we are “rounding”
N
each fractional variable in the solution to the linear program to 0 or 1 in order to
obtain a solution to the 0-1 integer program in lines (35
...
16)
...

Theorem 35
...

Proof Because there is a polynomial-time algorithm to solve the linear program
in line 2, and because the for loop of lines 3–5 runs in polynomial time, A PPROX M IN -W EIGHT-VC is a polynomial-time algorithm
...
Let C be an optimal solution to the minimum-weight vertex-cover problem, and let ´ be the value of an optimal solution to the linear program in
lines (35
...
20)
...
C /, that is,
´ Ä w
...
21)

Next, we claim that by rounding the fractional values of the variables x
...
C / Ä 2´
...
u; / 2 E
...
18), we know that
x
...
/ 1, which implies that at least one of x
...
/ is at least 1=2
...

Now, we consider the weight of the cover
...
4 Randomization and linear programming

´

D

X
2V

1127

w
...
/
N
X

w
...
/
N

2V Wx
...
/

2V Wx
...
/

1
2

1
2

1X
w
...
C / :
2
Combining inequalities (35
...
22) gives
D

(35
...
C / Ä 2´ Ä 2w
...

Exercises
35
...

35
...
Give a
randomized 2-approximation algorithm for the MAX-CNF satisfiability problem
...
4-3
In the MAX-CUT problem, we are given an unweighted undirected graph G D

...
We define a cut
...
The goal is to find a cut of maximum weight
...
Show that this algorithm is a
randomized 2-approximation algorithm
...
4-4
Show that the constraints in line (35
...
17)–(35
...
/ Ä 1 for each 2 V
...
5 The subset-sum problem
Recall from Section 34
...
5 that an instance of the subset-sum problem is a
pair
...
This decision problem asks whether there exists a subset of S that
adds up exactly to the target value t
...
5
...

The optimization problem associated with this decision problem arises in practical applications
...
For example, we may have a truck that can carry no more than t pounds, and n different
boxes to ship, the ith of which weighs xi pounds
...

In this section, we present an exponential-time algorithm that computes the optimal value for this optimization problem, and then we show how to modify the
algorithm so that it becomes a fully polynomial-time approximation scheme
...
)
An exponential-time exact algorithm
Suppose that we computed, for each subset S 0 of S, the sum of the elements
in S 0 , and then we selected, among the subsets whose sum does not exceed t,
the one whose sum was closest to t
...
To implement this algorithm,
we could use an iterative procedure that, in iteration i, computes the sums of
all subsets of fx1 ; x2 ; : : : ; xi g, using as a starting point the sums of all subsets
of fx1 ; x2 ; : : : ; xi 1 g
...
We now give an implementation of this
strategy
...
This procedure it-

35
...

If L is a list of positive integers and x is another positive integer, then we let
L C x denote the list of integers derived from L by increasing each element of L
by x
...
We also use
this notation for sets, so that
S C x D fs C x W s 2 Sg :
We also use an auxiliary procedure M ERGE -L ISTS
...
Like the M ERGE procedure we used in merge sort (Section 2
...
1),
M ERGE -L ISTS runs in time O
...
We omit the pseudocode for M ERGE L ISTS
...
S; t/
1 n D jSj
2 L0 D h0i
3 for i D 1 to n
4
Li D M ERGE -L ISTS
...
For example, if S D f1; 4; 5g, then
P1 D f0; 1g ;
P2 D f0; 1; 4; 5g ;
P3 D f0; 1; 4; 5; 6; 9; 10g :
Given the identity
Pi D Pi

1

[
...
23)

we can prove by induction on i (see Exercise 35
...
Since the length
of Li can be as much as 2i , E XACT-S UBSET-S UM is an exponential-time algorithm
in general, although it is a polynomial-time algorithm in the special cases in which t
is polynomial in jSj or all the numbers in S are bounded by a polynomial in jSj
...
The idea behind trimming is

1130

Chapter 35 Approximation Algorithms

that if two values in L are close to each other, then since we want just an approximate solution, we do not need to maintain both of them explicitly
...
When we trim a list L by ı,
we remove as many elements from L as possible, in such a way that if L0 is the
result of trimming L, then for every element y that was removed from L, there is
an element ´ still in L0 that approximates y, that is,
y
Ä´Äy:
(35
...
Each removed
element y is represented by a remaining element ´ satisfying inequality (35
...

For example, if ı D 0:1 and
L D h10; 11; 12; 15; 20; 21; 22; 23; 24; 29i ;
then we can trim L to obtain
L0 D h10; 12; 15; 20; 23; 29i ;
where the deleted value 11 is represented by 10, the deleted values 21 and 22
are represented by 20, and the deleted value 24 is represented by 23
...

The following procedure trims list L D hy1 ; y2 ; : : : ; ym i in time ‚
...
The
output of the procedure is a trimmed, sorted list
...
L; ı/
1 let m be the length of L
2 L0 D hy1 i
3 last D y1
4 for i D 2 to m
/ yi last because L is sorted
/
5
if yi > last
...
A number is appended onto the returned list L0 only if it is the first element of L or if it
cannot be represented by the most recent number placed into L0
...
This procedure takes as input a set S D fx1 ; x2 ; : : : ; xn g of n integers (in
arbitrary order), a target integer t, and an “approximation parameter” , where

35
...
25)

It returns a value ´ whose value is within a 1 C factor of the optimal solution
...
S; t; /
1 n D jSj
2 L0 D h0i
3 for i D 1 to n
4
Li D M ERGE -L ISTS
...
Li ; =2n/
6
remove from Li every element that is greater than t
7 let ´ be the largest value in Ln
8 return ´
Line 2 initializes the list L0 to be the list containing just the element 0
...
Since we create Li
from Li 1 , we must ensure that the repeated trimming doesn’t introduce too much
compounded inaccuracy
...

As an example, suppose we have the instance
S D h104; 102; 201; 101i
with t D 308 and D 0:40
...
A PPROX S UBSET-S UM computes the following values on the indicated lines:
line 2:

L0 D h0i ;

line 4:
line 5:
line 6:

L1 D h0; 104i ;
L1 D h0; 104i ;
L1 D h0; 104i ;

line 4:
line 5:
line 6:

L2 D h0; 102; 104; 206i ;
L2 D h0; 102; 206i ;
L2 D h0; 102; 206i ;

line 4:
line 5:
line 6:

L3 D h0; 102; 201; 206; 303; 407i ;
L3 D h0; 102; 201; 303; 407i ;
L3 D h0; 102; 201; 303i ;

line 4:
line 5:
line 6:

L4 D h0; 101; 102; 201; 203; 302; 303; 404i ;
L4 D h0; 101; 201; 302; 404i ;
L4 D h0; 101; 201; 302i :

1132

Chapter 35 Approximation Algorithms

The algorithm returns ´ D 302 as its answer, which is well within
the optimal answer 307 D 104 C 102 C 101; in fact, it is within 2%
...
8
A PPROX -S UBSET-S UM is a fully polynomial-time approximation scheme for the
subset-sum problem
...
Therefore, the value ´ returned in line 8 is indeed the sum of some
subset of S
...

Then, from line 6, we know that ´ Ä y
...
1), we need to show
that y =´ Ä 1 C
...

As Exercise 35
...
26)

...
26) must hold for y 2 Pn , and therefore there exists an element
´ 2 Ln such that
y

...
27)

Since there exists an element ´ 2 Ln fulfilling inequality (35
...
28)
´
2n
Now, we show that y =´ Ä 1 C
...
1 C =2n/n Ä
1 C
...
14), we have limn!1
...
Exercise 35
...
29)
dn
2n
Therefore, the function
...
5 The subset-sum problem

Án
1C

Ä e

2n

1133

=2

Ä 1 C =2 C
...
13))
Ä 1C
(by inequality (35
...


(35
...
28) and (35
...

To show that A PPROX -S UBSET-S UM is a fully polynomial-time approximation
scheme, we derive a bound on the length of Li
...
That is, they must
differ by a factor of at least 1 C =2n
...
The number of
elements in each list Li is at most
log1C

=2n

t C2 D
Ä
<

ln t
C2
ln
...
1 C =2n/ ln t
3n ln t

C 2 (by inequality (3
...
25))
...
Since the running time of A PPROX -S UBSETS UM is polynomial in the lengths of the Li , we conclude that A PPROX -S UBSETS UM is a fully polynomial-time approximation scheme
...
5-1
Prove equation (35
...
Then show that after executing line 5 of E XACT-S UBSETS UM, Li is a sorted list containing every element of Pi whose value is not more
than t
...
5-2
Using induction on i, prove inequality (35
...

35
...
29)
...
5-4
How would you modify the approximation scheme presented in this section to find
a good approximation to the smallest value not less than t that is a sum of some
subset of the given input list?
35
...


Problems
35-1 Bin packing
Suppose that we are given a set of n objects, where the size si of the ith object
satisfies 0 < si < 1
...
Each bin can hold any subset of the objects whose total size does
not exceed 1
...
Prove that the problem of determining the minimum number of bins required is
NP-hard
...
)
The first-fit heuristic takes each object in turn and places it into the first bin that
Pn
can accommodate it
...

b
...

c
...

d
...

e
...

f
...

35-2 Approximating the size of a maximum clique
Let G D
...
For any k 1, define G
...
V
...
k/ /, where V
...
k/ is defined so that
...
w1 ; w2 ; : : : ; wk /
if and only if for i D 1; 2; : : : ; k, either vertex i is adjacent to wi in G, or else
i D wi
...
Prove that the size of the maximum clique in G
...

b
...

35-3 Weighted set-covering problem
Suppose that we generalize the set-covering problem so that each set Si in the
P
family F has an associated weight wi and the weight of a cover C is Si 2C wi
...
(Section 35
...
)
Show how to generalize the greedy set-covering heuristic in a natural manner
to provide an approximate solution for any instance of the weighted set-covering
problem
...
d /, where d is
the maximum size of any set Si
...
In Section 26
...
In this problem, we will look at
matchings in undirected graphs in general (i
...
, the graphs are not required to be
bipartite)
...
A maximal matching is a matching that is not a proper subset of any other
matching
...
(Hint: You can find such a graph with only four vertices
...
Consider an undirected graph G D
...
Give an O
...

In this problem, we shall concentrate on a polynomial-time approximation algorithm for maximum matching
...
You will show that the linear-time greedy algorithm
for maximal matching in part (b) is a 2-approximation algorithm for maximum
matching
...
Show that the size of a maximum matching in G is a lower bound on the size
of any vertex cover for G
...
Consider a maximal matching M in G D
...
Let
T D f 2 V W some edge in M is incident on g :
What can you say about the subgraph of G induced by the vertices of G that
are not in T ?
e
...

f
...

35-5 Parallel machine scheduling
In the parallel-machine-scheduling problem, we are given n jobs, J1 ; J2 ; : : : ; Jn ,
where each job Jk has an associated nonnegative processing time of pk
...
Any job can run on any machine
...
Each job Jk must run on some machine Mi
for pk consecutive time units, and during that time period no other job may run
on Mi
...
Given a schedule, we define Cmax D max1Äj Än Cj to
be the makespan of the schedule
...

For example, suppose that we have two machines M1 and M2 and that we have
four jobs J1 ; J2 ; J3 ; J4 , with p1 D 2, p2 D 12, p3 D 4, and p4 D 5
...
For this schedule, C1 D 2, C2 D 14,
C3 D 9, C4 D 5, and Cmax D 14
...
For this schedule, C1 D 2, C2 D 12,
C3 D 6, C4 D 11, and Cmax D 12
...

a
...
Show that the optimal makespan is at least as large as the average machine load,
that is,
1 X
pk :
Cmax
m
1ÄkÄn

Problems for Chapter 35

1137

Suppose that we use the following greedy algorithm for parallel machine scheduling: whenever a machine is idle, schedule any job that has not yet been scheduled
...
Write pseudocode to implement this greedy algorithm
...
For the schedule returned by the greedy algorithm, show that
Cmax Ä

1 X
pk C max pk :
1ÄkÄn
m
1ÄkÄn

Conclude that this algorithm is a polynomial-time 2-approximation algorithm
...
V; E/ be an undirected graph with distinct edge weights w
...
u; / 2 E
...
/ D max
...
u; /g be
the maximum-weight edge incident on that vertex
...
/ W 2 V g
be the set of maximum-weight edges incident on each vertex, and let TG be the
maximum-weight spanning tree of G, that is, the spanning tree of maximum total
P
weight
...
E 0 / D
...
u; /
...
Give an example of a graph with at least 4 vertices for which SG D TG
...
Give an example of a graph with at least 4 vertices for which SG ¤ TG
...
Prove that SG Â TG for any graph G
...
Prove that w
...
SG /=2 for any graph G
...
Give an O
...

35-7 An approximation algorithm for the 0-1 knapsack problem
Recall the knapsack problem from Section 16
...
There are n items, where the ith
item is worth i dollars and weighs wi pounds
...
Here, we add the further assumptions that each
weight wi is at most W and that the items are indexed in monotonically decreasing
order of their values: 1
2
n
...
The fractional knapsack
problem is like the 0-1 knapsack problem, except that we are allowed to take a
fraction of each item, rather than being restricted to taking either all or none of

1138

Chapter 35 Approximation Algorithms

each item
...
Our goal is to develop
a polynomial-time 2-approximation algorithm for the 0-1 knapsack problem
...
Given an instance I of the knapsack problem, we
form restricted instances Ij , for j D 1; 2; : : : ; n, by removing items 1; 2; : : : ; j 1
and requiring the solution to include item j (all of item j in both the fractional
and 0-1 knapsack problems)
...
For instance Ij ,
let Pj denote an optimal solution to the 0-1 problem and Qj denote an optimal
solution to the fractional problem
...
Argue that an optimal solution to instance I of the 0-1 knapsack problem is one
of fP1 ; P2 ; : : : ; Pn g
...
Prove that we can find an optimal solution Qj to the fractional problem for instance Ij by including item j and then using the greedy algorithm in which
at each step, we take as much as possible of the unchosen item in the set
fj C 1; j C 2; : : : ; ng with maximum value per pound i =wi
...
Prove that we can always construct an optimal solution Qj to the fractional
problem for instance Ij that includes at most one item fractionally
...

d
...
Let
...
Qj /=2
the total value of items taken in a solution S
...
Rj /

...

e
...


Chapter notes
Although methods that do not necessarily compute exact solutions have been
known for thousands of years (for example, methods to approximate the value
of ), the notion of an approximation algorithm is much more recent
...
The first such
algorithm is often credited to Graham [149]
...
Recent texts by Ausiello et al
...
Several other texts, such as Garey and Johnson [129]
and Papadimitriou and Steiglitz [271], have significant coverage of approximation
algorithms as well
...

Papadimitriou and Steiglitz attribute the algorithm A PPROX -V ERTEX -C OVER
to F
...
Yannakakis
...
1/
...
Christofides improved on this algorithm and gave a 3=2-approximation algorithm for the traveling-salesman problem with the triangle inequality
...
Theorem 35
...

The analysis of the greedy heuristic for the set-covering problem is modeled
after the proof published by Chv´ tal [68] of a more general result; the basic result
a
as presented here is due to Johnson [190] and Lov´ sz [238]
...

Problem 35-7 is a combinatorial version of a more general result on approximating knapsack-type integer programs by Bienstock and McClosky [45]
...
The weighted vertex-cover algorithm is by Hochbaum [171]
...
4 only touches on the power of randomization and linear programming in the design of approximation algorithms
...
These probabilities then help guide
the solution of the original problem
...
(See Motwani, Naor,
and Raghavan [261] for a survey
...


1140

Chapter 35 Approximation Algorithms

As mentioned in the chapter notes for Chapter 34, recent results in probabilistically checkable proofs have led to lower bounds on the approximability of many
problems, including several in this chapter
...


VIII

Appendix: Mathematical Background

Introduction
When we analyze algorithms, we often need to draw upon a body of mathematical
tools
...
In Part I, we saw how to manipulate asymptotic notations and solve
recurrences
...
As noted in the introduction to Part I, you
may have seen much of the material in this appendix before having read this book
(although the specific notational conventions we use might occasionally differ from
those you have seen elsewhere)
...
As in the rest of this book, however, we have included exercises and
problems, in order for you to improve your skills in these areas
...
Many of the formulas here appear
in any calculus text, but you will find it convenient to have these methods compiled
in one place
...
It also gives some basic properties of these mathematical objects
...
The remainder contains definitions and properties of basic
probability
...
Later, when you encounter a probabilistic
analysis that you want to understand better, you will find Appendix C well organized for reference purposes
...
You have probably seen most of this material already if you have taken a
course in linear algebra, but you might find it helpful to have one place to look for
our notation and definitions
...
For example, we found in Section 2
...
By adding
up the time spent on each iteration, we obtained the summation (or series)
n
X
j :
j D2

When we evaluated this summation, we attained a bound of ‚
...
This example illustrates why you should know
how to manipulate and bound summations
...
1 lists several basic formulas involving summations
...
2 offers useful techniques for bounding summations
...
1 without proof, though proofs for some of them appear in Section A
...
You can find most of the other proofs in any
calculus text
...
1 Summation formulas and properties
Given a sequence a1 ; a2 ; : : : ; an of numbers, where n is a nonnegative integer, we
can write the finite sum a1 C a2 C C an as
n
X
ak :
kD1

If n D 0, the value of the summation is defined to be 0
...

Given an infinite sequence a1 ; a2 ; : : : of numbers, we can write the infinite sum
as
a1 C a2 C

1146

Appendix A
1
X

Summations

ak ;

kD1

which we interpret to mean
lim

n
X

n!1

ak :

kD1

If the limit does not exist, the series diverges; otherwise, it converges
...
We can, P
however,
1
rearrange the terms P an absolutely convergent series, that is, a series kD1 ak
of
1
for which the series kD1 jak j also converges
...
cak C bk / D c

kD1

n
X

ak C

kD1

n
X

bk :

kD1

The linearity property also applies to infinite convergent series
...
For example,
!
n
n
X
X

...
k// D ‚
f
...
We can also apply such manipulations to
infinite convergent series
...
n C 1/
k D
2

(A
...
n2 / :

(A
...
1 Summation formulas and properties

1147

Sums of squares and cubes
We have the following summations of squares and cubes:
n
X
n
...
2n C 1/
;
k2 D
6
kD0
n
X
kD0

k3 D

n2
...
3)
(A
...
5)

kD0

When the summation is infinite and jxj < 1, we have the infinite decreasing geometric series
1
X
1
:
(A
...
1/ :

(A
...
2
...
For
example, by differentiating both sides of the infinite geometric series (A
...
1

(A
...

Telescoping series
For any sequence a0 ; a1 ; : : : ; an ,
n
X

...
9)

kD1

since each of the terms a1 ; a2 ; : : : ; an 1 is added in exactly once and subtracted out
exactly once
...
Similarly,
n 1
X


...
k C 1/

Since we can rewrite each term as
1
1
1
D
;
k
...
k C 1/
k kC1
kD1

kD1

D 1

1
:
n

Products
We can write the finite product a1 a2
n
Y
ak :

an as

kD1

If n D 0, the value of the product is defined to be 1
...
2 Bounding summations

1149

Exercises
A
...
2k
A
...
2k
kD1
series
...


p
1/ D ln
...
1/ by manipulating the harmonic

A
...
1 C x/=
...
1-4 ? P
1
Show that kD0
...


1/=2k D 0
...
1-5 ?
P1
Evaluate the sum kD1
...

A
...
fk
...
i/ by using the linearity property of
summations
...
1-7
Qn
Evaluate the product kD1 2 4k
...
1-8 ?
Qn
Evaluate the product kD2
...


A
...
Here are some of the most frequently used
methods
...
As an
Pn
example, let us prove that the arithmetic series kD1 k evaluates to 1 n
...
We
2
can easily verify this assertion for n D 1
...
We have
nC1
X

k D

kD1

n
X

k C
...
n C 1/ C
...
n C 1/
...
Instead, you can use induction to provePbound on a suma
mation
...
3n /
...
For the
P0
initial condition n D 0, we have kD0 3k D 1 Ä c 1 as long as c 1
...
We have
D

nC1
X

k

3

D

kD0

n
X

3k C 3nC1

kD0
n

Ä c3 C 3nC1
(by the inductive hypothesis)
Ã
Â
1
1
C
c3nC1
D
3
c
Ä c3nC1

Pn
as long as
...
Thus, kD0 3k D O
...

We have to be careful when we use asymptotic notation to prove bounds by inPn
duction
...
n/
...
1/
...
n C 1/

kD1

D O
...
n C 1/
D O
...
We have not shown that the same constant works for all n
...
For

A
...
1) is
n
X

k

Ä

kD1

n
X

n

kD1

D n2 :
In general, for a series
n
X

Pn
kD1

ak , if we let amax D max1ÄkÄn ak , then

ak Ä n amax :

kD1

The technique of bounding each term in a series by the largest term is a weak
method when the series can in fact be bounded by a geometric series
...
We can bound the sum by an infinite decreasing geometric series, since
ak Ä a0 r k , and thus
n
X

ak

Ä

kD0

1
X

a0 r k

kD0

D a0

1
X

rk

kD0

D a0

1
1

r

:

P1
We can apply this method to bound the summation kD1
...
In order to
P1
start the summation at k D 0, we rewrite it as kD0
...
The first
term (a0 ) is 1=3, and the ratio (r) of consecutive terms is

...
k C 1/=3kC1

D
Ä

for all k
1
X k
3k

1 kC2
3 kC1
2
3

0
...
An example is the infinite harmonic series, which diverges since
1
X1
k
kD1

n
X1
D lim
n!1
k
kD1

D lim ‚
...
k C1/st and kth terms in this series is k=
...
To bound a series by a geometric
series, we must show that there is an r < 1, which is a constant, such that the ratio
of all pairs of consecutive terms never exceeds r
...

Splitting summations
One way to obtain bounds on a difficult summation is to express the series as the
sum of two or more series by partitioning the range of the index and then to bound
each of the resulting series
...
We might attempt to bound each term in the summation by the smallest term,
but since that term is 1, we get a lower bound of n for the summation—far off from
our upper bound of n2
...
Assume for
convenience that n is even
...
n=2/

kDn=2C1

D
...
n2 / ;
Pn
which is an asymptotically tight bound, since kD1 k D O
...

For a summation arising from the analysis of an algorithm, we can often split
the summation and ignore a constant number of the P
initial terms
...


A
...
1/ C

n
X

ak ;

kDk0

since the initial terms of the summation are all constant and there are a constant
Pn
number of them
...
This technique applies to infinite summations as well
...
k C 1/2 =2kC1
k 2 =2k

D
Ä

if k

3
...
k C 1/2
2k 2
8
9

2k

D

2
X k2

C

2k
2
1 Â Ã
X k2
9X 8 k
Ä
C
2k
8 kD0 9
kD0
kD0

2k

1
X k2
kD3

D O
...

The technique of splitting summations can help us determine asymptotic bounds
in much more difficult situations
...
lg n/
on the harmonic series (A
...
For i D 0; 1; : : : ; blg nc, the ith piece consists

1154

Appendix A

Summations

of the terms starting at 1=2i and going up to but not including 1=2i C1
...
10)

Approximation by integrals
Pn
When a summation has the form kDm f
...
k/ is a monotonically increasing function, we can approximate it by integrals:
Z nC1
Z n
n
X
f
...
k/ Ä
f
...
11)
m 1

kDm

m

Figure A
...
The summation is represented as the area
of the rectangles in the figure, and the integral is the shaded region under the curve
...
k/ is a monotonically decreasing function, we can use a similar method
to provide the bounds
Z n
Z nC1
n
X
f
...
k/ Ä
f
...
12)
m

kDm

m 1

The integral approximation (A
...
For a lower bound, we obtain
Z nC1
n
X1
dx
kD1

k

1

x

D ln
...
13)

A
...
1 Approximation of n
kDm f
...
The area of each rectangle is shown
within the rectangle, and the total rectangle area represents the value of the summation
...
By comparing areas in (a), we get
Rn
Pn
f
...
k/, and then by shifting the rectangles one unit to the right, we get
m
R nC1kDm
Pn 1
f
...

kDm f
...
14)

kD1

Exercises
A
...

A
...
2-3
Show that the nth harmonic number is


...


A
...

A
...
12) directly on kD1 1=k to
obtain an upper bound on the nth harmonic number?

Problems
A-1 Bounding summations
Give asymptotically tight bounds on the following summations
...

a
...


kD1

b
...


0

Notes for Appendix A

c
...


kD1

Appendix notes
Knuth [209] provides an excellent reference for the material presented here
...
[334]
...


Many chapters of this book touch on the elements of discrete mathematics
...
If you are already well versed
in this material, you can probably just skim this chapter
...
1

Sets
A set is a collection of distinguishable objects, called its members or elements
...
If x is not a member of S, we write x 62 S
...
For
example, we can define a set S to contain precisely the numbers 1, 2, and 3 by
writing S D f1; 2; 3g
...
A set cannot contain the same object more
than once,1 and its elements are not ordered
...
For example, f1; 2; 3; 1g D f1; 2; 3g D
f3; 2; 1g
...

Z denotes the set of integers, that is, the set f: : : ; 2; 1; 0; 1; 2; : : :g
...

N denotes the set of natural numbers, that is, the set f0; 1; 2; : : :g
...


2 Some

with 0
...
The modern trend seems to be to start

B
...
A set A is a
proper subset of B, written A B, if A Â B but A ¤ B
...
) For any set A, we have A Â A
...
For any three sets A, B, and C , if A Â B
and B Â C , then A Â C
...

We sometimes define sets in terms of other sets
...
For example,
we can define the set of even integers by fx W x 2 Z and x=2 is an integerg
...
” (Some authors use a vertical bar in place
of the colon
...


A

B

A

B

D
C

C

A


...
B \ C /

D


...
A

C/

Figure B
...
2)
...


Associative laws:
A \
...
A \ B/ \ C ;
A [
...
A [ B/ [ C :
Distributive laws:
A \
...
A \ B/ [
...
B \ C / D
...
A [ C / :

(B
...
A [ B/ D A ;
A [
...
B \ C / D
...
A

C/ ;

A


...
A

B/ \
...
2)

Figure B
...

Often, all the sets under consideration are subsets of some larger set U called the
universe
...
Given a universe U , we define the
complement of a set A as A D U A D fx W x 2 U and x 62 Ag
...
1 Sets

1161

We can rewrite DeMorgan’s laws (B
...
For any two sets
B; C Â U , we have
B \C
B [C

D B [C ;
D B \C :

Two sets A and B are disjoint if they have no elements in common, that is, if
A \ B D ;
...

The number of elements in a set is the cardinality (or size) of the set, denoted jSj
...
The cardinality of the empty set is j;j D 0
...
An infinite
set that can be put into a one-to-one correspondence with the natural numbers N is
countably infinite; otherwise, it is uncountable
...

For any two finite sets A and B, we have the identity
jA [ Bj D jAj C jBj

jA \ Bj ;

(B
...
If
A Â B, then jAj Ä jBj
...
A 1-set is called a
singleton
...

We denote the set of all subsets of a set S, including the empty set and S itself,
by 2S ; we call 2S the power set of S
...

The power set of a finite set S has cardinality 2jSj (see Exercise B
...

We sometimes care about setlike structures in which the elements are ordered
...
a; b/ and is defined formally
as the set
...
Thus, the ordered pair
...
b; a/
...


The Cartesian product of two sets A and B, denoted A B, is the set of all
ordered pairs such that the first element of the pair is an element of A and the
second is an element of B
...
a; b/ W a 2 A and b 2 Bg :

For example, fa; bg fa; b; cg D f
...
a; b/;
...
b; a/;
...
b; c/g
...
4)

The Cartesian product of n sets A1 ; A2 ; : : : ; An is the set of n-tuples
A1

A2

An D f
...
We denote an n-fold Cartesian product over a single set A by
the set
An D A

A

A;

whose cardinality is jAn j D jAjn if A is finite
...

Exercises
B
...
1)
...
1-2
Prove the generalization of DeMorgan’s laws to any finite collection of sets:
A1 \ A2 \
A1 [ A2 [

\ An D A1 [ A2 [
[ An D A1 \ A2 \

[ An ;
\ An :

B
...
1-3 ?
Prove the generalization of equation (B
...
1/
jA1 \ A2 \ \ An j :
B
...

B
...

B
...


B
...

If
...
When we say that R is a binary relation
on a set A, we mean that R is a subset of A A
...
a; b/ W a; b 2 N and a < bg
...

relation on sets A1 ; A2 ; : : : ; An is a subset of A1 A2
A binary relation R Â A A is reflexive if
aRa
for all a 2 A
...
The relation R is symmetric if
a R b implies b R a
for all a; b 2 A
...
The
relation R is transitive if
a R b and b R c imply a R c

1164

Appendix B

Sets, Etc
...
For example, the relations “<,” “Ä,” and “D” are transitive, but
the relation R D f
...

A relation that is reflexive, symmetric, and transitive is an equivalence relation
...

If R is an equivalence relation on a set A, then for a 2 A, the equivalence class
of a is the set Œa D fb 2 A W a R bg, that is, the set of all elements equivalent to a
...
a; b/ W a; b 2 N and a C b is an even numberg,
then R is an equivalence relation, since a C a is even (reflexive), a C b is even
implies b C a is even (symmetric), and a C b is even and b C c is even imply
a C c is even (transitive)
...
A basic theorem of equivalence
classes is the following
...
1 (An equivalence relation is the same as a partition)
The equivalence classes of any equivalence relation R on a set A form a partition
of A, and any partition of A determines an equivalence relation on A for which the
sets in the partition are the equivalence classes
...
Because R is reflexive, a 2 Œa, and so the equivalence classes are nonempty; moreover, since every
element a 2 A belongs to the equivalence class Œa, the union of the equivalence
classes is A
...
Suppose that a R c and b R c
...
Thus, for any arbitrary element x 2 Œa, we have x R a
and, by transitivity, x R b, and thus Œa  Œb
...

For the second part of the proof, let A D fAi g be a partition of A, and define
R D f
...
We claim that R is an
equivalence relation on A
...
Symmetry holds, because if a R b, then a and b are in the same set Ai , and hence b R a
...
To see that the sets in the partition are the equivalence
classes of R, observe that if a 2 Ai , then x 2 Œa implies x 2 Ai , and x 2 Ai
implies x 2 Œa
...
2 Relations

1165

For example, the “Ä” relation on the natural numbers is antisymmetric, since a Ä b
and b Ä a imply a D b
...
For example, the relation “is a descendant of” is a partial order on the
set of all people (if we view individuals as being their own descendants)
...
Instead, the set may contain several maximal elements a
such that for no b 2 A, where b ¤ a, is it the case that a R b
...
3
A relation R on a set A is a total relation if for all a; b 2 A, we have a R b
or b R a (or both), that is, if every pairing of elements of A is related by R
...
For example,
the relation “Ä” is a total order on the natural numbers, but the “is a descendant
of” relation is not a total order on the set of all people, since there are individuals
neither of whom is descended from the other
...

Exercises
B
...

B
...
(We say that a Á b
...
) Into what equivalence classes does this relation
partition the integers?
B
...
reflexive and symmetric but not transitive,
b
...
symmetric and transitive but not reflexive
...


1166

Appendix B

Sets, Etc
...
2-4
Let S be a finite set, and let R be an equivalence relation on S S
...

B
...
He offers the following proof
...

Transitivity, therefore, implies a R a
...
3

Functions
Given two sets A and B, a function f is a binary relation on A and B such that
for all a 2 A, there exists precisely one b 2 B such that
...
The set A is
called the domain of f , and the set B is called the codomain of f
...
a; b/ 2 f , we write b D f
...

Intuitively, the function f assigns an element of B to each element of A
...
For example, the binary relation
f D f
...
For this example, 0 D f
...
1/,
0 D f
...
In contrast, the binary relation
g D f
...
1; 3/ and
...
a; b/ 2 g
...
a/, we say that a is the argument of f
and that b is the value of f at a
...
For example, we might define f
...
n; 2n/ W n 2 Ng
...
a/ D g
...

A finite sequence of length n is a function f whose domain is the set of n
integers f0; 1; : : : ; n 1g
...
0/; f
...
n 1/i
...
For example, the Fibonacci sequence, defined by
recurrence (3
...


B
...
For example, if we had a function
An ! B, we would write b D f
...
a1 ; a2 ; : : : ; an //
...
a1 ; a2 ; : : : ; an /
...
a/, then we sometimes say that b is the
image of a under f
...
A0 / D fb 2 B W b D f
...
A/
...
n/ D 2n is f
...

A function is a surjection if its range is its codomain
...
n/ D bn=2c is a surjective function from N to N, since every element in N
appears as the value of f for some argument
...
n/ D 2n
is not a surjective function from N to N, since no argument to f can produce 3 as a
value
...
n/ D 2n is, however, a surjective function from the natural
numbers to the even numbers
...
When we say that f is onto, we mean that it is surjective
...
a/ ¤ f
...
For example, the function
f
...
The function
f
...
An injection is sometimes called a one-to-one function
...
For example,
the function f
...
1/n dn=2e is a bijection from N to Z:
0
1
2
3
4

!
!
!
!
!
:
:
:

0;
1;
1;
2;
2;

The function is injective, since no element of Z is the image of more than one
element of N
...
Hence, the function is bijective
...
A bijection from a set A to itself is sometimes called a permutation
...
b/ D a if and only if f
...


For example, the inverse of the function f
...
1/n dn=2e is
(
2m
if m 0 ;
f 1
...
3-1
Let A and B be finite sets, and let f W A ! B be a function
...
if f is injective, then jAj Ä jBj;
b
...


B
...
x/ D x C 1 bijective when the domain and the codomain are N?
Is it bijective when the domain and the codomain are Z?
B
...

B
...
4

Z
...
Certain definitions in the literature differ from those given here, but for the most part, the
differences are slight
...
1 shows how we can represent graphs in computer memory
...
V; E/, where V is a finite set and E
is a binary relation on V
...
The set E is called the edge set of G, and its
elements are called edges
...
2(a) is a pictorial representation of a directed
graph on the vertex set f1; 2; 3; 4; 5; 6g
...
Note that self-loops—edges from a
vertex to itself—are possible
...
V; E/, the edge set E consists of unordered
pairs of vertices, rather than ordered pairs
...
4 Graphs

1169

1

2

3

1

2

3

4

5

6

4

5

6

(a)

(b)

1

2

3

6
(c)

Figure B
...
(a) A directed graph G D
...
1; 2/;
...
2; 4/;
...
4; 1/;
...
5; 4/;
...
The edge
...
(b) An undirected graph G D
...
1; 2/;
...
2; 5/;
...
The vertex 4 is isolated
...


u; 2 V and u ¤
...
u; / for an edge, rather
than the set notation fu; g, and we consider
...
; u/ to be the same edge
...
Figure B
...

Many definitions for directed and undirected graphs are the same, although certain terms have slightly different meanings in the two contexts
...
u; / is an edge
in a directed graph G D
...
u; / is incident from or leaves
vertex u and is incident to or enters vertex
...
2(a) are
...
2; 4/, and
...
The edges entering vertex 2 are

...
2; 2/
...
u; / is an edge in an undirected graph G D
...
u; / is incident on vertices u and
...
2(b), the edges incident on
vertex 2 are
...
2; 5/
...
u; / is an edge in a graph G D
...
When the graph is undirected, the adjacency relation is symmetric
...
If is
adjacent to u in a directed graph, we sometimes write u !
...
2, vertex 2 is adjacent to vertex 1, since the edge
...
Vertex 1 is not adjacent to vertex 2 in Figure B
...
2; 1/
does not belong to the graph
...
For example, vertex 2 in Figure B
...
A vertex whose degree is 0,
such as vertex 4 in Figure B
...
In a directed graph, the out-degree
of a vertex is the number of edges leaving it, and the in-degree of a vertex is the
number of edges entering it
...


degree plus its out-degree
...
2(a) has in-degree 2, out-degree 3,
and degree 5
...
V; E/
is a sequence h 0 ; 1 ; 2 ; : : : ; k i of vertices such that u D 0 , u0 D k , and

...
The length of the path is the number of
edges in the path
...
0 ; 1 /;
...
k 1 ; k /
...
) If
there is a path p from u to u0 , we say that u0 is reachable from u via p, which we
p
sometimes write as u ; u0 if G is directed
...
In Figure B
...

The path h2; 5; 4; 5i is not simple
...
That is, for any 0 Ä i Ä j Ä k, the subsequence of vertices h i ; i C1 ; : : : ; j i
is a subpath of p
...
The cycle is simple if, in addition, 1 ; 2 ; : : : ; k
are distinct
...
Two paths h 0 ; 1 ; 2 ; : : : ; k 1 ; 0 i
0
0
0
0
0
and h 0 ; 1 ; 2 ; : : : ; k 1 ; 0 i form the same cycle if there exists an integer j such
that i0 D
...
In Figure B
...
This cycle is simple,
but the cycle h1; 2; 4; 5; 4; 1i is not
...
2; 2/ is
a self-loop
...
In an undirected graph,
3 and 0 D k ; the cycle is simple if
a path h 0 ; 1 ; : : : ; k i forms a cycle if k
1 ; 2 ; : : : ; k are distinct
...
2(b), the path h1; 2; 5; 1i is a
simple cycle
...

An undirected graph is connected if every vertex is reachable from all other
vertices
...
The graph in Figure B
...
Every vertex in f1; 2; 5g is
reachable from every other vertex in f1; 2; 5g
...
The edges of a connected component
are those that are incident on only the vertices of the component; in other words,
edge
...

A directed graph is strongly connected if every two vertices are reachable from
each other
...
” We use the terms “path” and “simple path” throughout this book in a manner consistent with
their definitions
...
4 Graphs

1171

1

2

6

G

3
5

G′

1

u

v

5

3

4

w

x

2

4

y

(a)

z

u

v

w

x

y

(b)

Figure B
...
The vertices of the top graph are mapped to the
vertices of the bottom graph by f
...
2/ D ; f
...
4/ D x; f
...
6/ D ´
...


alence classes of vertices under the “are mutually reachable” relation
...
The
graph in Figure B
...
All pairs of vertices in f1; 2; 4; 5g are mutually reachable
...

Two graphs G D
...
V 0 ; E 0 / are isomorphic if there exists a
bijection f W V ! V 0 such that
...
f
...
// 2 E 0
...
Figure B
...
The mapping from V to V 0 given by f
...
2/ D ;
f
...
4/ D x; f
...
6/ D ´ provides the required bijective function
...
3(b) are not isomorphic
...

We say that a graph G 0 D
...
V; E/ if V 0 Â V
and E 0 Â E
...
V 0 ; E 0 /, where
E 0 D f
...


The subgraph induced by the vertex set f1; 2; 3; 6g in Figure B
...
2(c) and has the edge set f
...
2; 2/;
...

Given an undirected graph G D
...
V; E 0 /, where
...
u; / 2 E
...
u; / in G by the two directed edges
...
; u/
in the directed version
...
V; E/, the undirected version
of G is the undirected graph G 0 D
...
u; / 2 E 0 if and only if u ¤
and
...
That is, the undirected version contains the edges of G “with
their directions removed” and with self-loops eliminated
...
u; / and
...
u; /
and
...
) In a directed graph G D
...
That is, is a neighbor of u if
u ¤ and either
...
; u/ 2 E
...

Several kinds of graphs have special names
...
A bipartite graph is an undirected
graph G D
...
u; / 2 E implies either u 2 V1 and 2 V2 or u 2 V2 and 2 V1
...
An acyclic, undirected graph is a forest,
and a connected, acyclic, undirected graph is a (free) tree (see Section B
...
We
often take the first letters of “directed acyclic graph” and call such a graph a dag
...
A multigraph is like an undirected graph, but it can have both multiple edges between vertices and self-loops
...
Many
algorithms written for ordinary directed and undirected graphs can be adapted to
run on these graphlike structures
...
V; E/ by an edge e D
...
V 0 ; E 0 /, where V 0 D V fu; g [ fxg and x is a new vertex
...
u; / and, for each vertex w
incident on u or , deleting whichever of
...
; w/ is in E and adding the
new edge
...
In effect, u and are “contracted” into a single vertex
...
4-1
Attendees of a faculty party shake hands to greet each other, and each professor
remembers how many times he or she shook hands
...


B
...
V; E/ is
an undirected graph, then
X
degree
...
4-2
Show that if a directed or undirected graph contains a path between two vertices u
and , then it contains a simple path between u and
...

B
...
V; E/ satisfies jEj

jV j

1
...
4-4
Verify that in an undirected graph, the “is reachable from” relation is an equivalence relation on the vertices of the graph
...
4-5
What is the undirected version of the directed graph in Figure B
...
2(b)?
B
...
(Hint: Let one set
of vertices in the bipartite graph correspond to vertices of the hypergraph, and let
the other set of vertices of the bipartite graph correspond to hyperedges
...
5

Trees
As with graphs, there are many related, but slightly different, notions of trees
...

Sections 10
...
1 describe how we can represent trees in computer memory
...
5
...
4, a free tree is a connected, acyclic, undirected graph
...
If an undirected
graph is acyclic but possibly disconnected, it is a forest
...


(a)

(b)

(c)

Figure B
...
(b) A forest
...


for trees also work for forests
...
4(a) shows a free tree, and Figure B
...
The forest in Figure B
...

The graph in Figure B
...

The following theorem captures many important facts about free trees
...
2 (Properties of free trees)
Let G D
...
The following statements are equivalent
...
G is a free tree
...
Any two vertices in G are connected by a unique simple path
...
G is connected, but if any edge is removed from E, the resulting graph is disconnected
...
G is connected, and jEj D jV j
5
...


1
...
G is acyclic, but if any edge is added to E, the resulting graph contains a cycle
...
Suppose, for the sake of contradiction, that vertices u
and are connected by two distinct simple paths p1 and p2 , as shown in Figure B
...

Let w be the vertex at which the paths first diverge; that is, w is the first vertex
on both p1 and p2 whose successor on p1 is x and whose successor on p2 is y,
where x ¤ y
...
Let p 0 be the subpath of p1
from w through x to ´, and let p 00 be the subpath of p2 from w through y to ´
...
Thus, the path obtained by
concatenating p 0 and the reverse of p 00 is a cycle, which contradicts our assumption

B
...
5 A step in the proof of Theorem B
...
Assume for the sake of contradiction that vertices u
and are connected by two distinct simple paths p1 and p2
...
The path p 0 concatenated with the reverse of the path p 00 forms
a cycle, which yields the contradiction
...
Thus, if G is a tree, there can be at most one simple path between
two vertices
...
Let
...
This edge is a path from u to ,
and so it must be the unique path from u to
...
u; / from G, there
is no path from u to , and hence its removal disconnects G
...
4-3, we
have jEj
jV j 1
...
A connected
graph with n D 1 or n D 2 vertices has n 1 edges
...
Removing an arbitrary edge from G separates the graph into k 2
connected components (actually k D 2)
...
If we view each connected component Vi , with edge set Ei ,
as its own free tree, then because each component has fewer than jV j vertices, by
the inductive hypothesis we have jEi j Ä jVi j 1
...
Adding in the removed edge
yields jEj Ä jV j 1
...
We must show
that G is acyclic
...
Let Gk D
...
Note that jVk j D jEk j D k
...
Define GkC1 D
...
i ; kC1 /g
...
If k C 1 < jV j, we can continue, defining GkC2 in
the same manner, and so forth, until we obtain Gn D
...


Vn D V , and jEn j D jVn j D jV j
...
Thus,
G is acyclic
...
Let k be the
number of connected components of G
...
Consequently, we must have k D 1, and G is in fact a
tree
...
Thus, adding any edge to G creates a cycle
...
We must show that G is connected
...

If u and are not already adjacent, adding the edge
...
u; / belong to G
...
u; / must contain a
path from u to , and since u and were chosen arbitrarily, G is connected
...
5
...
We call the distinguished vertex the root of the tree
...
Figure B
...

Consider a node x in a rooted tree T with root r
...
If y is an ancestor of x, then x is
a descendant of y
...
) If y
is an ancestor of x and x ¤ y, then y is a proper ancestor of x and x is a proper
descendant of y
...
For example, the subtree rooted at node 8 in Figure B
...

If the last edge on the simple path from the root r of a tree T to a node x is
...
The root is the only node in T with
no parent
...
A node with no
children is a leaf or external node
...


5 The term “node” is often used in the graph theory literature as a synonym for “vertex
...


B
...
6 Rooted and ordered trees
...
The tree is drawn in a
standard way: the root (node 7) is at the top, its children (nodes with depth 1) are beneath it, their
children (nodes with depth 2) are beneath them, and so forth
...
(b) Another rooted tree
...


The number of children of a node x in a rooted tree T equals the degree of x
...

A level of a tree consists of all nodes at the same depth
...
The height of a tree is also
equal to the largest depth of any node in the tree
...

That is, if a node has k children, then there is a first child, a second child,
...
The two trees in Figure B
...

B
...
3 Binary and positional trees
We define binary trees recursively
...

The degree of a vertex in a free tree is, as in any undirected graph, the number of adjacent vertices
...


1178

Appendix B

Sets, Etc
...
7 Binary trees
...
The left child of a node is
drawn beneath the node and to the left
...
(b) A binary
tree different from the one in (a)
...

In (b), the left child of node 7 is absent and the right child is 5
...
(c) The binary tree in (a) represented by the internal
nodes of a full binary tree: an ordered tree in which each internal node has degree 2
...


is composed of three disjoint sets of nodes: a root node, a binary tree called its
left subtree, and a binary tree called its right subtree
...
If the left subtree is nonempty, its root is called the left child of
the root of the entire tree
...
If a subtree is the null tree NIL, we say that the
child is absent or missing
...
7(a) shows a binary tree
...
For example, in a binary tree, if a node has just one child, the position
of the child—whether it is the left child or the right child—matters
...
Figure B
...
7(a) because of
the position of one node
...

We can represent the positioning information in a binary tree by the internal
nodes of an ordered tree, as shown in Figure B
...
The idea is to replace each
missing child in the binary tree with a node having no children
...
The tree that results is a full binary tree: each
node is either a leaf or has degree exactly 2
...
Consequently, the order of the children of a node preserves the position information
...
In a positional tree, the

B
...
8 A complete binary tree of height 3 with 8 leaves and 7 internal nodes
...
The ith child of a
node is absent if no child is labeled with integer i
...
Thus,
a binary tree is a k-ary tree with k D 2
...
Figure B
...
How many leaves does a complete k-ary tree of height h have? The root
has k children at depth 1, each of which has k children at depth 2, etc
...
Consequently, the height of a complete k-ary
tree with n leaves is logk n
...
5)
...


Exercises
B
...
Draw all the
rooted trees with nodes x, y, and ´ with x as the root
...
Draw all the binary trees with nodes x,
y, and ´ with x as the root
...


B
...
V; E/ be a directed acyclic graph in which there is a vertex 0 2 V
such that there exists a unique path from 0 to every vertex 2 V
...

B
...
Conclude that the number of internal nodes
in a full binary tree is 1 fewer than the number of leaves
...
5-4
Use induction to show that a nonempty binary tree with n nodes has height at
least blg nc
...
5-5 ?
The internal path length of a full binary tree is the sum, taken over all internal
nodes of the tree, of the depth of each node
...
Consider a full
binary tree with n internal nodes, internal path length i, and external path length e
...

B
...
x/ D 2 d with each leaf x of depth d in a binary
P
tree T , and let L be the set of leaves of T
...
x/ Ä 1
...
)
B
...


Problems
B-1 Graph coloring
Given an undirected graph G D
...
u/ ¤ c
...
u; / 2 E
...

a
...


Problems for Appendix B

1181

b
...
G is bipartite
...
G is 2-colorable
...
G has no cycles of odd length
...
Let d be the maximum degree of any vertex in a graph G
...

p
d
...
jV j/ edges, then we can color G with O
...

B-2 Friendly graphs
Reword each of the following statements as a theorem about undirected graphs,
and then prove it
...

a
...

b
...

c
...

d
...

B-3 Bisecting trees
Many divide-and-conquer algorithms that operate on graphs require that the graph
be bisected into two nearly equal-sized subgraphs, which are induced by a partition
of the vertices
...
We require that whenever two vertices end up in the same
subtree after removing edges, then they must be in the same partition
...
Show that we can partition the vertices of any n-vertex binary tree into two
sets A and B, such that jAj Ä 3n=4 and jBj Ä 3n=4, by removing a single
edge
...
Show that the constant 3=4 in part (a) is optimal in the worst case by giving
an example of a simple binary tree whose most evenly balanced partition upon
removal of a single edge has jAj D 3n=4
...


c
...
lg n/ edges, we can partition the vertices
of any n-vertex binary tree into two sets A and B such that jAj D bn=2c
and jBj D dn=2e
...
Boole pioneered the development of symbolic logic, and he introduced many of
the basic set notations in a book published in 1854
...
Cantor during the period 1874–1895
...
The term “function” is attributed to G
...
Leibniz, who used it
to refer to several kinds of mathematical formulas
...
Graph theory originated in 1736, when L
...

The book by Harary [160] provides a useful compendium of many definitions
and results from graph theory
...
If you
have a good background in these areas, you may want to skim the beginning of this
appendix lightly and concentrate on the later sections
...

Section C
...
The axioms of probability
and basic facts concerning probability distributions form Section C
...
Random
variables are introduced in Section C
...
Section C
...
The study of the binomial distribution
continues in Section C
...


C
...
For example, we might ask, “How many different n-bit
numbers are there?” or “How many orderings of n distinct elements are there?” In
this section, we review the elements of counting theory
...
1
...

The rule of sum says that the number of ways to choose one element from one
of two disjoint sets is the sum of the cardinalities of the sets
...
3)
...
The number of possibilities for each position is therefore
26 C 10 D 36, since there are 26 choices if it is a letter and 10 choices if it is a
digit
...
That is, if A and B are two finite sets, then jA Bj D jAj jBj,
which is simply equation (B
...
For example, if an ice-cream parlor offers 28
flavors of ice cream and 4 toppings, the number of possible sundaes with one scoop
of ice cream and one topping is 28 4 D 112
...
For example, there are 8
binary strings of length 3:
000; 001; 010; 011; 100; 101; 110; 111 :
We sometimes call a string of length k a k-string
...
A k-substring of a string
is a substring of length k
...

We can view a k-string over a set S as an element of the Cartesian product S k
of k-tuples; thus, there are jSjk strings of length k
...
Intuitively, to construct a k-string over an n-set, we have n
ways to pick the first element; for each of these choices, we have n ways to pick the
second element; and so forth k times
...

Permutations
A permutation of a finite set S is an ordered sequence of all the elements of S,
with each element appearing exactly once
...

A k-permutation of S is an ordered sequence of k elements of S, with no element appearing more than once in the sequence
...
) The twelve 2-permutations of the set fa; b; c; d g are

C
...
n

1/
...
n

k C 1/ D


;

...
1)

since we have n ways to choose the first element, n 1 ways to choose the second
element, and so on, until we have selected k elements, the last being a selection
from the remaining n k C 1 elements
...
For example, the 4-set
fa; b; c; d g has six 2-combinations:
ab; ac; ad; bc; bd; cd :
(Here we use the shorthand of denoting the 2-subset fa; bg by ab, and so on
...
The order in which we select the elements does not matter
...
Every k-combination has exactly kŠ permutations
of its elements, each of which is a distinct k-permutation of the n-set
...
1), this quantity is

:

...
2)

For k D 0, this formula tells us that the number of ways to choose 0 elements from
an n-set is 1 (not 0), since 0Š D 1
...
From equation (C
...
n k/Š
k

This formula is symmetric in k and n
!
!
n
n
D
:
k
n k

k:
(C
...
4)
xkyn k :

...

Many identities involve binomial coefficients
...

Binomial bounds
We sometimes need to bound the size of a binomial coefficient
...
n 1/
...
k 1/ 1
k
Â
à Â
Ã
n kC1
nÁ n 1
D
k
k 1
1
n Ák
:
k
Taking advantage of the inequality kŠ

...
18), we obtain the upper bounds
!
n
n
...
n k C 1/
D
k
...
5)

For all integers k such that 0 Ä k Ä n, we can use induction (see Exercise C
...
1 Counting

1187

!
n
nn
Ä k
k
...
6)

where for convenience we assume that 00 D 1
...
n/ n
...
1 /n
n
Ã1 !n
 à Â
1
1
D
1

Ä 1, we

D 2n H
...
/ D

lg


...
1

/

(C
...
0/ D H
...

Exercises
C
...
) How many substrings does an n-string have in
total?
C
...
How many n-input, 1-output boolean functions are there? How
many n-input, m-output boolean functions are there?
C
...

C
...
1-5
Prove the identity
!
!
n
n n 1
D
k k 1
k

(C
...

C
...

C
...
Use this approach to prove
that
!
!
!
n
n 1
n 1
D
C
:
k
k
k 1
C
...
1-7, make a table for n D 0; 1; : : : ; 6 and 0 Ä k Ä n
n
of the binomial coefficients k with 0 at the top, 1 and 1 on the next line, and
0
0
1
so forth
...

C
...
1-10
Show that for any integers n 0 and 0 Ä k Ä n, the expression
maximum value when k D bn=2c or k D dn=2e
...
1-11 ?
Argue that for any integers n
!
!
!
n
n n j
Ä
:
j Ck
j
k

0, j

0, k

n
k

achieves its

0, and j C k Ä n,
(C
...
2 Probability

1189

Provide both an algebraic proof and an argument based on a method for choosing
j C k items out of n
...

C
...
6),
and use equation (C
...

C
...
1 C O
...
10)

C
...
/, show that it achieves its maximum
value at D 1=2
...
1=2/?
C
...
11)

kD0

C
...
This section reviews basic probability theory
...
We can think of each elementary event as a
possible outcome of an experiment
...
For example, in the experiment of
flipping two coins, the event of obtaining one head and one tail is fHT; TH g
...
We say
that two events A and B are mutually exclusive if A \ B D ;
...
By definition, all elementary events
are mutually exclusive
...
Pr fAg

0 for any event A
...
Pr fSg D 1
...
Pr fA [ Bg D Pr fAg C Pr fBg for any two mutually exclusive events A
and B
...
We note here that axiom 2 is a
normalization requirement: there is really nothing fundamental about choosing 1
as the probability of the certain event, except that it is natural and convenient
...
1)
...
If A Â B, then
Pr fAg Ä Pr fBg
...
For any two events A and B,
Pr fA [ Bg D Pr fAg C Pr fBg Pr fA \ Bg
Ä Pr fAg C Pr fBg :

(C
...
13)

1 For a general probability distribution, there may be some subsets of the sample space S that are not
considered to be events
...

The main requirement for what subsets are events is that the set of events of a sample space be closed
under the operations of taking the complement of an event, forming the union of a finite or countable
number of events, and taking the intersection of a finite or countable number of events
...
A notable exception is the continuous
uniform probability distribution, which we shall see shortly
...
2 Probability

1191

In our coin-flipping example, suppose that each of the four elementary events
has probability 1=4
...

Discrete probability distributions
A probability distribution is discrete if it is defined over a finite or countably infinite
sample space
...
Then for any event A,
X
Pr fsg ;
Pr fAg D
s2A

since elementary events, specifically those in A, are mutually exclusive
...
In such a case the experiment is often described as “picking an element of S at random
...
If we flip the coin n times, we have the uniform probability distribution
defined on the sample space S D fH; Tgn , a set of size 2n
...
The event
A D fexactly k heads and exactly n

k tails occurg

n
k

n
, since k strings of length n over fH; Tg contain
is a subset of S of size jAj D
n
exactly k H’s
...


Continuous uniform probability distribution
The continuous uniform probability distribution is an example of a probability
distribution in which not all subsets of the sample space are considered to be
events
...
Our intuition is that each point in the interval Œa; b should be “equally likely
...
For this reason, we would like to associate a

1192

Appendix C

Counting and Probability

probability only with some of the subsets of S, in such a way that the axioms are
satisfied for these events
...
If we remove
the endpoints of an interval Œc; d , we obtain the open interval
...
Since
Œc; d  D Œc; c [
...
c; d /g
...


Pr fŒc; d g D

Conditional probability and independence
Sometimes we have some prior partial knowledge about the outcome of an experiment
...
What is the probability that both
coins are heads? The information given eliminates the possibility of two tails
...
Since only one of these elementary events shows two heads,
the answer to our question is 1=3
...
The conditional probability of an event A given
that another event B occurs is defined to be
Pr fA \ Bg
Pr fA j Bg D
(C
...
(We read “Pr fA j Bg” as “the probability of A given B
...
That is, A \ B is the set of outcomes in which both A and B occur
...
The conditional probability of A given B is, therefore, the ratio of the
probability of event A \ B to the probability of event B
...
Thus, Pr fA j Bg D
...
3=4/ D 1=3
...
15)

C
...
Then the probability of two heads is
...
1=2/ D 1=4
...
Each of these events occurs with probability 1=2, and
the probability that both events occur is 1=4; thus, according to the definition of
independence, the events are independent—even though you might think that both
events depend on the first coin
...
Then the probability that each coin comes up heads is 1=2, but the
probability that they both come up heads is 1=2 ¤
...
1=2/
...

A collection A1 ; A2 ; : : : ; An of events is said to be pairwise independent if
Pr fAi \ Aj g D Pr fAi g Pr fAj g
for all 1 Ä i < j Ä n
...
Let A1 be the event that the first coin
is heads, let A2 be the event that the second coin is heads, and let A3 be the event
that the two coins are different
...
The events are not mutually independent, however, because Pr fA1 \ A2 \ A3 g D 0 and Pr fA1 g Pr fA2 g Pr fA3 g D
1=8 ¤ 0
...
14) and the commutative law
A \ B D B \ A, it follows that for two events A and B, each with nonzero
probability,
Pr fA \ Bg D Pr fBg Pr fA j Bg
D Pr fAg Pr fB j Ag :

(C
...
17)
Pr fA j Bg D
Pr fBg
which is known as Bayes’s theorem
...
Since B D
...
B \ A/,
and since B \ A and B \ A are mutually exclusive events,
«
˚
Pr fBg D Pr fB \ Ag C Pr B \ A
«
˚ « ˚
D Pr fAg Pr fB j Ag C Pr A Pr B j A :
Substituting into equation (C
...
18)
Pr fA j Bg D
Pr fAg Pr fB j Ag C Pr A Pr B j A
Bayes’s theorem can simplify the computing of conditional probabilities
...
We run an experiment consisting of three independent events: we choose
one of the two coins at random, we flip that coin once, and then we flip it again
...
What is the
probability that it is biased?
We solve this problem using Bayes’s theorem
...
« We wish to determine « fA j Bg
...
1=2/ 1

...
1=2/
...
2-1
Professor Rosencrantz flips a fair coin once
...
What is the probability that Professor Rosencrantz obtains more heads
than Professor Guildenstern?

C
...
2-2
Prove Boole’s inequality: For any finite or countably infinite sequence of events
A1 ; A2 ; : : :,
Pr fA1 [ A2 [

g Ä Pr fA1 g C Pr fA2 g C

:

(C
...
2-3
Suppose we shuffle a deck of 10 cards, each bearing a distinct number from 1 to 10,
to mix the cards thoroughly
...
What is the probability that we select the three cards in sorted (increasing)
order?
C
...
2-5
Prove that for any collection of events A1 ; A2 ; : : : ; An ,
Pr fA1 \ A2 \

\ An g D Pr fA1 g Pr fA2 j A1 g Pr fA3 j A1 \ A2 g
Pr fAn j A1 \ A2 \ \ An 1 g :

C
...
b a/=b
...
1/
...
)
C
...

C
...

C
...
You will win the prize if you select the correct curtain
...
How would
your chances change if you switch? (This question is the celebrated Monty Hall
problem, named after a game-show host who often presented contestants with just
this dilemma
...
2-10 ?
A prison warden has randomly picked one prisoner among three to go free
...
The guard knows which one will go free but is forbidden to give any prisoner information regarding his status
...
Prisoner X asks the guard privately which of Y or Z will be executed, arguing that since he already knows that at least one of them must die, the
guard won’t be revealing any information about his own status
...
Prisoner X feels happier now, since he figures that either
he or prisoner Z will go free, which means that his probability of going free is
now 1=2
...


C
...
It associates a real number with each possible
outcome of an experiment, which allows us to work with the probability distribution induced on the resulting set of numbers
...
Henceforth, we shall assume that random
variables are discrete
...
s/ D xg; thus,
X
Pr fsg :
Pr fX D xg D
s2SWX
...
x/ D Pr fX D xg
is the probability density function of the random variable X
...

As an example, consider the experiment of rolling a pair of ordinary, 6-sided
dice
...
We assume

C
...
Define the random variable X to be the maximum of
the two values showing on the dice
...
1; 3/,
...
3; 3/,

...
3; 1/
...
If X and Y
are random variables, the function
f
...
For a fixed value y,
X
Pr fX D x and Y D yg ;
Pr fY D yg D
x

and similarly, for a fixed value x,
X
Pr fX D x and Y D yg :
Pr fX D xg D
y

Using the definition (C
...

Given a set of random variables defined over the same sample space, we can
define new random variables as sums, products, or other functions of the original
variables
...
The expected value (or, synonymously,
expectation or mean) of a discrete random variable X is
X
x Pr fX D xg ;
(C
...
Sometimes the
expectation of X is denoted by X or, when the random variable is apparent from
context, simply by
...
You earn $3 for each head but
lose $2 for each tail
...
1=4/ C 1
...
1=4/
D 1:

4 Pr f2 T’sg

The expectation of the sum of two random variables is the sum of their expectations, that is,
E ŒX C Y  D E ŒX  C E ŒY  ;

(C
...
We call this property linearity of expectation, and it holds even if X and Y are not independent
...
Linearity of expectation is the
key property that enables us to perform probabilistic analyses by using indicator
random variables (see Section 5
...

If X is any random variable, any function g
...
X /
...
X / is defined, then
X
g
...
X / D
x

Letting g
...
22)

Consequently, expectations are linear: for any two random variables X and Y and
any constant a,
E ŒaX C Y  D aE ŒX  C E ŒY  :

(C
...
24)

C
...
Pr fX
Pr fX

ig

Pr fX

i C 1g/

ig ;

(C
...

When we apply a convex function f
...
X /

f
...
26)

provided that the expectations exist and are finite
...
x/ is convex
if for all x and y and for all 0 Ä
Ä 1, we have f
...
1
/y/ Ä
f
...
1
/f
...
)
Variance and standard deviation
The expected value of a random variable does not tell us how “spread out” the
variable’s values are
...

The notion of variance mathematically expresses how far from the mean a random variable’s values are likely to be
...
X
D E X2

E ŒX /2
2X E ŒX  C E2 ŒX 

D E X2

2E ŒX E ŒX  C E2 ŒX 

D E X2

2E2 ŒX  C E2 ŒX 

D E X2

E2 ŒX  :

(C
...
The equality E ŒX E ŒX  D E2 ŒX 

1200

Appendix C

Counting and Probability

follows from equation (C
...
Rewriting equation (C
...
28)

The variance of a random variable X and the variance of aX are related (see
Exercise C
...
29)
Var
i D1

i D1

The standard deviation of a random variable X is the nonnegative square root
of the variance of X
...

With this notation, the variance of X is denoted 2
...
3-1
Suppose we roll two ordinary, 6-sided dice
...
3-2
An array AŒ1 : : n contains n distinct numbers that are randomly ordered, with each
permutation of the n numbers being equally likely
...
3-3
A carnival game consists of three dice in a cage
...
The cage is shaken, and the payoff is as follows
...
Otherwise,
if his number appears on exactly k of the three dice, for k D 1; 2; 3, he keeps his
dollar and wins k more dollars
...
4 The geometric and binomial distributions

1201

C
...
X; Y / Ä E ŒX  C E ŒY  :
C
...
Prove that f
...
Y / are
independent for any choice of functions f and g
...
3-6 ?
Let X be a nonnegative random variable, and suppose that E ŒX  is well defined
...
30)

for all t > 0
...
3-7 ?
Let S be a sample space, and let X and X 0 be random variables such that
X
...
s/ for all s 2 S
...
3-8
Which is larger: the expectation of the square of a random variable, or the square
of its expectation?
C
...

C
...
27) of variance
...
4 The geometric and binomial distributions
We can think of a coin flip as an instance of a Bernoulli trial, which is an experiment with only two possible outcomes: success, which occurs with probability p,
and failure, which occurs with probability q D 1 p
...
Two

1202

Appendix C

 Ãk
2
3

1

Counting and Probability

 Ã
1
3

0
...
30
0
...
20
0
...
10
0
...
1 A geometric distribution with probability p D 1=3 of success and a probability
q D 1 p of failure
...


important distributions arise from Bernoulli trials: the geometric distribution and
the binomial distribution
...
How many trials occur before we obtain
a success? Let us define the random variable X be the number of trials needed to
obtain a success
...
31)

since we have k 1 failures before the one success
...
31) is said to be a geometric distribution
...
1 illustrates
such a distribution
...
4 The geometric and binomial distributions

1203

Assuming that q < 1, we can calculate the expectation of a geometric distribution using identity (A
...
1 q/2
p q
q p2
1=p :
1

D
D
D
D

(C
...

The variance, which can be calculated similarly, but using Exercise A
...
33)

As an example, suppose we repeatedly roll two dice until we obtain either a
seven or an eleven
...
Thus, the probability of success is p D 8=36 D 2=9, and we must roll
1=p D 9=2 D 4:5 times on average to obtain a seven or eleven
...
Then X has values in the range
f0; 1; : : : ; ng, and for k D 0; 1; : : : ; n,
!
n k n k
;
(C
...
A probability distribution satisfying equation (C
...
For convenience, we define the
family of binomial distributions using the notation
!
n k
(C
...
kI n; p/ D
p
...
2 illustrates a binomial distribution
...
34) being the kth term of the expansion of
...

Consequently, since p C q D 1,

1204

Appendix C

Counting and Probability

b (k; 15, 1/3)
0
...
20
0
...
10
0
...
2 The binomial distribution b
...
The expectation of the distribution is np D 5
...
kI n; p/ D 1 ;

(C
...

We can compute the expectation of a random variable having a binomial distribution from equations (C
...
36)
...
kI n; p/, and let q D 1 p
...
kI n; p/

kD0

!
n k n k
k
p q
D
k
kD1
!
n
X n 1
D np
(by equation (C
...
n 1/ k
D np
k
n
X

kD0

C
...
kI n

1205

1; p/

kD0

D np

(by equation (C
...


(C
...
Let Xi be the random variable describing the number of
successes in the ith trial
...
21)), the expected number of successes for n trials is
#
" n
X
Xi
E ŒX  D E
i D1

D
D

n
X
i D1
n
X

E ŒXi 
p

i D1

D np :

(C
...
Using
equation (C
...
Since Xi only takes on the
values 0 and 1, we have Xi2 D Xi , which implies E ŒXi2  D E ŒXi  D p
...
1

p/ D pq :

(C
...
29),
" n
#
X
Var ŒX  D Var
Xi
i D1

D
D

n
X
i D1
n
X

Var ŒXi 
pq

i D1

D npq :

(C
...
2 shows, the binomial distribution b
...
We can prove that the distribution
always behaves in this manner by looking at the ratio of successive terms:

1206

Appendix C

Counting and Probability

b
...
k 1I n; p/

D

n
k

pk qn

k

n
k 1

p k 1 q n kC1

...
n k C 1/Šp
D

...
n k C 1/p
D
kq

...
41)

This ratio is greater than 1 precisely when
...
Consequently, b
...
k 1I n; p/ for k <
...
kI n; p/ < b
...
n C 1/p (the distribution decreases)
...
n C 1/p is an integer, then b
...
k 1I n; p/, and so the distribution then has two maxima: at k D
...
nC1/p 1 D np q
...
n C 1/p
...

Lemma C
...
kI n; p/ Ä
k
n k

p, and let 0 Ä k Ä n
...
6), we have
!
n k n k
b
...
4-1
Verify axiom 2 of the probability axioms for the geometric distribution
...
4-2
How many times on average must we flip 6 fair coins before we obtain 3 heads
and 3 tails?

C
...
4-3
Show that b
...
n

kI n; q/, where q D 1

1207

p
...
4-4
Show that value of the maximum of the binomial distribution b
...

C
...
Show that the probability of exactly one success
is also approximately 1=e
...
4-6 ?
Professor Rosencrantz flips a fair coin n times, and so does Professor Guildenstern
...
(Hint:
n
For Professor Rosencrantz, call a head a success; for Professor Guildenstern, call
a tail a success
...
4-7 ?
Show that for 0 Ä k Ä n,
b
...
k=n/

n

;

where H
...
7)
...
4-8 ?
Consider n Bernoulli trials, where for i D 1; 2; : : : ; n, the ith trial has probability pi of success, and let X be the random variable denoting the total number of
successes
...
Prove that for 1 Ä k Ä n,
Pr fX < kg

k 1
X

b
...
4-9 ?
Let X be the random variable for the total number of successes in a set A of n
Bernoulli trials, where the ith trial has a probability pi of success, and let X 0
be the random variable for the total number of successes in a second set A0 of n
Bernoulli trials, where the ith trial has a probability pi0 pi of success
...
3-7
...
5 The tails of the binomial distribution
The probability of having at least, or at most, k successes in n Bernoulli trials,
each with probability p of success, is often of more interest than the probability of
having exactly k successes
...
kI n; p/ that are far from the
mean np
...

We first provide a bound on the right tail of the distribution b
...
We can
determine bounds on the left tail by inverting the roles of successes and failures
...
2
Consider a sequence of n Bernoulli trials, where success occurs with probability p
...
Then for
0 Ä k Ä n, the probability of at least k successes is
Pr fX

kg D

n
X

b
...
Clearly Pr fAS g D p k if jSj D k
...
19))

C
...
In general, we shall leave it to you to adapt the proofs from one tail to
the other
...
3
Consider a sequence of n Bernoulli trials, where success occurs with probability p
...
iI n; p/

i D0

Ä
D

!

n
n

!

k

n

...
1
p/n

p/n
k

k

:

Our next bound concerns the left tail of the binomial distribution
...

Theorem C
...
Let X be the random variable denoting the
total number of successes
...
iI n; p/
Pr fX < kg D
i D0

<

kq
b
...
iI n; p/ by a geometric series using the technique from Section A
...
For i D 1; 2; : : : ; k, we have from equation (C
...
i 1I n; p/
D
b
...
n i C 1/p
iq
<

...
n k/p

1210

Appendix C

Counting and Probability

If we let
kq

...
n np/p
kq
nqp
k
np
1;

x D
<
D
D
<

it follows that
b
...
iI n; p/

for 0 < i Ä k
...
iI n; p/ < x k

i

i times, we obtain

b
...
iI n; p/ <

i D0

k 1
X

x k i b
...
kI n; p/

1
X

xi

i D0

D
D

x

b
...
kI n; p/ :
np k
1

Corollary C
...
Then for 0 < k Ä np=2, the probability of
fewer than k successes is less than one half of the probability of fewer than k C 1
successes
...
np=2/q
np
...
5 The tails of the binomial distribution


...
42)

since q Ä 1
...
4 and inequality (C
...
iI n; p/ < b
...
iI n; p/
Pk
i D0 b
...
iI n; p/
Pk 1
i D0 b
...
kI n; p/

< 1=2 ;
b
...
kI n; p/
...
Exercise C
...

Corollary C
...

Let X be the random variable denoting the total number of successes
...
iI n; p/

i DkC1

<


...
kI n; p/ :
np

Corollary C
...
Then for
...

The next theorem considers n Bernoulli trials, each with a probability pi of
success, for i D 1; 2; : : : ; n
...

Theorem C
...

Let X be the random variable describing the total number of successes, and let
D E ŒX 
...
X / e ˛r ;

Proof
Pr fX

(C
...
Using Markov’s inequality (C
...
44)
Pr e ˛
...
X / e ˛r :
The bulk of the proof consists of bounding E e ˛
...
44)
...
X /
...
2), let Xi D I fthe ith
Bernoulli trial is a successg for i D 1; 2; : : : ; n; that is, Xi is the random variable that is 1 if the ith Bernoulli trial is a success and 0 if it is a failure
...
Xi
X
D

i D1

i D1

pi / :

i D1

To evaluate E e ˛
...
X

/

/

, we substitute for X
Pn

D E e ˛ i D1
...
Xi
D E

pi /

#

pi /

i D1

D

n
Y
i D1

E e ˛
...
5 The tails of the binomial distribution

1213

which follows from (C
...
Xi pi / (see
Exercise C
...
By the definition of expectation,
E e ˛
...
1 pi / pi C e ˛
...
pi e ˛ / ;

pi /

qi
(C
...
x/ denotes the exponential function: exp
...
(Inequality (C
...
12)
...
X

/

D

n
Y

E e ˛
...
pi e ˛ /

i D1

D exp

n
X

!
pi e

˛

i D1
˛

(C
...
e / ;
Pn
since
D
i D1 pi
...
43) and inequalities (C
...
46), it follows that
Pr fX

rg Ä exp
...
47)

Choosing ˛ D ln
...
5-7), we obtain
Pr fX

rg Ä exp
...
r= / r ln
...
r r ln
...
r= /r
e Ár
D
:
r

When applied to Bernoulli trials in which each trial has the same probability of
success, Theorem C
...


1214

Appendix C

Counting and Probability

Corollary C
...
Then for r > np,
Pr fX

np

rg D
Ä

Proof

n
X
kDdnpCre
npe Ár

r

b
...
37), we have

D E ŒX  D np
...
5-1 ?
Which is less likely: obtaining no heads when you flip a fair coin n times, or
obtaining fewer than n heads when you flip the coin 4n times?
C
...
6 and C
...

C
...
a C 1/n
na
i
i D0

k
b
...
a C 1//
k
...
a C 1/
...
5-4 ?
Prove that if 0 < k < np, where 0 < p < 1 and q D 1
k 1
X
i D0

pi qn

i

<

np Ák nq Án
kq
np k k
n k

p, then

k

:

C
...
8 imply that
Ãr
Â

...
9 imply that
nqe Ár
:
Pr fnp X rg Ä
r

Problems for Appendix C

1215

C
...

Let X be the random variable describing the total number of successes, and let
D E ŒX 
...
Then follow the outline of the proof
of Theorem C
...
45)
...
5-7 ?
Show that choosing ˛ D ln
...
47)
...

Problems

C-1  Balls and bins
In this problem, we investigate the effect of various assumptions on the number of ways of placing $n$ balls into $b$ distinct bins.

a. Suppose that the balls are distinct and that their order within a bin does not matter. Argue that the number of ways of placing the balls in the bins is $b^n$.

b. Suppose that the balls are distinct and that the balls in each bin are ordered. Prove that there are exactly $(b + n - 1)!/(b - 1)!$ ways to place the balls in the bins. (Hint: Consider the number of ways of arranging $n$ distinct balls and $b - 1$ indistinguishable sticks in a row.)

c. Suppose that the balls are identical, and hence their order within a bin does not matter. Show that the number of ways of placing the balls is $\binom{b + n - 1}{n}$. (Hint: Of the arrangements in part (b), how many are repeated if the balls are made identical?)

d. Suppose that the balls are identical and that no bin may contain more than one ball, so that $n \le b$. Show that the number of ways of placing the balls is $\binom{b}{n}$.

e. Suppose that the balls are identical and that no bin may be left empty. Assuming that $n \ge b$, show that the number of ways of placing the balls is $\binom{n-1}{b-1}$.
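The counts in parts (a) through (e) can be cross-checked by brute force for one small case. A sketch, with the values $n = 3$ and $b = 2$ chosen arbitrarily:

```python
# Brute-force cross-check of Problem C-1 for n = 3 balls, b = 2 bins
# (a sketch; the parameter values are arbitrary).
from itertools import product
from math import comb, factorial

n, b = 3, 2
# (a) distinct balls, unordered bins: each ball independently picks a bin.
assert sum(1 for _ in product(range(b), repeat=n)) == b ** n == 8
# (b) distinct balls, ordered bins: (b + n - 1)!/(b - 1)! arrangements.
assert factorial(b + n - 1) // factorial(b - 1) == 24
# (c) identical balls: C(b + n - 1, n); the 4 placements are 0+3, 1+2, 2+1, 3+0.
assert comb(b + n - 1, n) == 4
# (e) identical balls, no empty bin: C(n - 1, b - 1); here 1+2 and 2+1.
assert comb(n - 1, b - 1) == 2
```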

Appendix notes

The first general methods for solving probability problems were discussed in a famous correspondence between B. Pascal and P. de Fermat, which began in 1654, and in a book by C. Huygens in 1657. Rigorous probability theory began with the work of J. Bernoulli in 1713 and A. De Moivre in 1730. Further developments of the theory were made by P.-S. Laplace, S.-D. Poisson, and C. F. Gauss.

Sums of random variables were originally studied by P. L. Chebyshev and A. A. Markov. Probability theory was axiomatized by A. N. Kolmogorov in 1933. Chernoff [66] and Hoeffding [173] provided bounds on the tails of distributions. Seminal work in random combinatorial structures was done by P. Erdős.

Standard textbooks such as Billingsley [46], Chung [67], Drake [95], Feller [104], and Rozanov [300] offer comprehensive introductions to probability.
D  Matrices

If you have seen matrices before, much of the material in this appendix will be familiar to you, but some of it might be new. Section D.1 covers basic matrix definitions and operations, and Section D.2 presents some basic matrix properties.

D.1  Matrices and matrix operations
Matrices and vectors
A matrix is a rectangular array of numbers
...
1)

is a 2 3 matrix A D
...
We use uppercase letters
to denote matrices and corresponding subscripted lowercase letters to denote their
elements
...

The transpose of a matrix A is the matrix AT obtained by exchanging the rows
and columns of A
...
1),

1218

Appendix D

Matrices

1 4
2 5
3 6

AT D

:

A vector is a one-dimensional array of numbers
...
We sometimes call a vector of length n an n-vector
...
We take the standard form of a vector to be
as a column vector equivalent to an n 1 matrix; the corresponding row vector is
obtained by taking the transpose:
xT D
...
Usually, the size of a unit vector is clear from the context
...
Such a matrix is often
denoted 0, since the ambiguity between the number 0 and a matrix of 0s is usually
easily resolved from context
...

Square matrices
Square n n matrices arise frequently
...
A diagonal matrix has aij D 0 whenever i ¤ j
...
a11 ; a22 ; : : : ; ann / D

0
2
...
1; 1; : : : ; 1/
1 0 ::: 0
0 1 ::: 0
D
: : :: :
: :
: :
: :
:
0 0 ::: 1

:

D
...
The ith
column of an identity matrix is the unit vector ei
...
A tridiagonal matrix T is one for which tij D 0 if ji j j > 1
...
An upper-triangular matrix U is one for which uij D 0 if i > j
...

5
...
All entries
above the diagonal are zero:

˙l

11

LD

l21
:
:
:
ln1

0 ::: 0
l22 : : : 0
: ::
:
:
: :
:
:
ln2 : : : lnn

:

A lower-triangular matrix is unit lower-triangular if it has all 1s along the
diagonal
...
A permutation matrix P has exactly one 1 in each row or column, and 0s
elsewhere
...
Exercise D
...

7
...
For example,
1 2 3
2 6 4
3 4 5
is a symmetric matrix
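The transpose and the symmetry condition above are easy to check mechanically. A minimal sketch (not part of the original text), using the $2 \times 3$ matrix of equation (D.1):

```python
# Transpose of the matrix A of equation (D.1), plus a symmetry check
# (a minimal sketch).
A = [[1, 2, 3], [4, 5, 6]]
At = [list(col) for col in zip(*A)]          # exchange rows and columns
print(At)                                    # [[1, 4], [2, 5], [3, 6]]

S = [[1, 2, 3], [2, 6, 4], [3, 4, 5]]
print(S == [list(col) for col in zip(*S)])   # True: S equals its transpose
```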
Basic matrix operations

The elements of a matrix or vector are numbers from a number system, such as the real numbers, the complex numbers, or integers modulo a prime. The number system defines how to add and multiply numbers. We can extend these definitions to encompass addition and multiplication of matrices.

We define matrix addition as follows. If $A = (a_{ij})$ and $B = (b_{ij})$ are $m \times n$ matrices, then their matrix sum $C = (c_{ij}) = A + B$ is the $m \times n$ matrix defined by
\[ c_{ij} = a_{ij} + b_{ij} \]
for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$. That is, matrix addition is performed componentwise. A zero matrix is the identity for matrix addition:
\[ A + 0 = A = 0 + A . \]

If $\lambda$ is a number and $A = (a_{ij})$ is a matrix, then $\lambda A = (\lambda a_{ij})$ is the scalar multiple of $A$ obtained by multiplying each of its elements by $\lambda$. As a special case, we define the negative of a matrix $A = (a_{ij})$ to be $-1 \cdot A = -A$, so that the $ij$th entry of $-A$ is $-a_{ij}$. Thus,
\[ A + (-A) = 0 = (-A) + A . \]
We use the negative of a matrix to define matrix subtraction: $A - B = A + (-B)$.

We define matrix multiplication as follows. We start with two matrices $A$ and $B$ that are compatible in the sense that the number of columns of $A$ equals the number of rows of $B$. (In general, an expression containing a matrix product $AB$ is always assumed to imply that matrices $A$ and $B$ are compatible.) If $A = (a_{ik})$ is an $m \times n$ matrix and $B = (b_{kj})$ is an $n \times p$ matrix, then their matrix product $C = AB$ is the $m \times p$ matrix $C = (c_{ij})$, where
\[ c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \tag{D.2} \]
for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, p$. The procedure SQUARE-MATRIX-MULTIPLY in Section 4.2 implements matrix multiplication in the straightforward manner based on equation (D.2), assuming that the matrices are square: $m = n = p$. To multiply $n \times n$ matrices, SQUARE-MATRIX-MULTIPLY performs $n^3$ multiplications and $n^2(n-1)$ additions, and so its running time is $\Theta(n^3)$.
Matrices have many (but not all) of the algebraic properties typical of numbers. Identity matrices are identities for matrix multiplication:
\[ I_m A = A I_n = A \]
for any $m \times n$ matrix $A$. Multiplying by a zero matrix gives a zero matrix:
\[ A 0 = 0 . \]
Matrix multiplication is associative:
\[ A(BC) = (AB)C \]
for compatible matrices $A$, $B$, and $C$. Matrix multiplication distributes over addition:
\[ A(B + C) = AB + AC , \]
\[ (B + C)D = BD + CD . \]
For $n > 1$, multiplication of $n \times n$ matrices is not commutative. For example, if
\[ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} , \]
then
\[ AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad BA = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} . \]

We define matrix-vector products or vector-vector products as if the vector were the equivalent $n \times 1$ matrix (or a $1 \times n$ matrix, in the case of a row vector). Thus, if $A$ is an $m \times n$ matrix and $x$ is an $n$-vector, then $Ax$ is an $m$-vector. If $x$ and $y$ are $n$-vectors, then
\[ x^{\mathrm T} y = \sum_{i=1}^{n} x_i y_i \]
is a number (actually a $1 \times 1$ matrix) called the inner product of $x$ and $y$. The matrix $x y^{\mathrm T}$ is an $n \times n$ matrix $Z$ called the outer product of $x$ and $y$, with $z_{ij} = x_i y_j$. The (euclidean) norm $\lVert x \rVert$ of an $n$-vector $x$ is defined by
\[ \lVert x \rVert = (x_1^2 + x_2^2 + \cdots + x_n^2)^{1/2} = (x^{\mathrm T} x)^{1/2} . \]
Thus, the norm of $x$ is its length in $n$-dimensional euclidean space.
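These three definitions are one-liners in code. A sketch (not part of the original text) mirroring the formulas above:

```python
# Inner product, outer product, and euclidean norm of small vectors
# (a sketch mirroring the definitions above).
from math import sqrt

def inner(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))    # x^T y

def outer(x, y):
    return [[xi * yj for yj in y] for xi in x]     # z_ij = x_i * y_j

def norm(x):
    return sqrt(inner(x, x))                       # ||x|| = (x^T x)^(1/2)

x = [2, 3, 5]
print(inner(x, x))            # 38
print(outer(x, [1, 0, 1]))    # [[2, 0, 2], [3, 0, 3], [5, 0, 5]]
print(norm([3, 4]))           # 5.0
```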
...
Exercises

D.1-1
Show that if $A$ and $B$ are symmetric $n \times n$ matrices, then so are $A + B$ and $A - B$.

D.1-2
Prove that $(AB)^{\mathrm T} = B^{\mathrm T} A^{\mathrm T}$ and that $A^{\mathrm T} A$ is always a symmetric matrix.

D.1-3
Prove that the product of two lower-triangular matrices is lower-triangular.

D.1-4
Prove that if $P$ is an $n \times n$ permutation matrix and $A$ is an $n \times n$ matrix, then the matrix product $PA$ is $A$ with its rows permuted, and the matrix product $AP$ is $A$ with its columns permuted. Prove that the product of two permutation matrices is a permutation matrix.
D.2  Basic matrix properties

In this section, we define some basic properties pertaining to matrices: inverses, linear dependence and independence, rank, and determinants.
Matrix inverses, ranks, and determinants

We define the inverse of an $n \times n$ matrix $A$ to be the $n \times n$ matrix, denoted $A^{-1}$ (if it exists), such that $A A^{-1} = I_n = A^{-1} A$. For example,
\[ \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} . \]
Many nonzero $n \times n$ matrices do not have inverses. A matrix without an inverse is called noninvertible, or singular. An example of a nonzero singular matrix is
\[ \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} . \]
If a matrix has an inverse, it is called invertible, or nonsingular. Matrix inverses, when they exist, are unique. (See Exercise D.2-1.) If $A$ and $B$ are nonsingular $n \times n$ matrices, then
\[ (BA)^{-1} = A^{-1} B^{-1} . \]
The inverse operation commutes with the transpose operation:
\[ (A^{-1})^{\mathrm T} = (A^{\mathrm T})^{-1} . \]

The vectors $x_1, x_2, \ldots, x_n$ are linearly dependent if there exist coefficients $c_1, c_2, \ldots, c_n$, not all of which are zero, such that $c_1 x_1 + c_2 x_2 + \cdots + c_n x_n = 0$. For example, the row vectors $x_1 = (1 \; 2 \; 3)$, $x_2 = (2 \; 6 \; 4)$, and $x_3 = (4 \; 11 \; 9)$ are linearly dependent, since $2 x_1 + 3 x_2 - 2 x_3 = 0$. If vectors are not linearly dependent, they are linearly independent. For example, the columns of an identity matrix are linearly independent.

The column rank of a nonzero $m \times n$ matrix $A$ is the size of the largest set of linearly independent columns of $A$. Similarly, the row rank of $A$ is the size of the largest set of linearly independent rows of $A$. A fundamental property of any matrix $A$ is that its row rank always equals its column rank, so that we can simply refer to the rank of $A$. The rank of an $m \times n$ matrix is an integer between 0 and $\min(m, n)$, inclusive. (The rank of a zero matrix is 0, and the rank of an $n \times n$ identity matrix is $n$.) An alternate, but equivalent and often more useful, definition is that the rank of a nonzero $m \times n$ matrix $A$ is the smallest number $r$ such that there exist matrices $B$ and $C$ of respective sizes $m \times r$ and $r \times n$ such that
\[ A = BC . \]
A square $n \times n$ matrix has full rank if its rank is $n$. An $m \times n$ matrix has full column rank if its rank is $n$. The following theorem gives a fundamental property of ranks.

Theorem D.1
A square matrix has full rank if and only if it is nonsingular.

A null vector for a matrix $A$ is a nonzero vector $x$ such that $Ax = 0$. The following theorem (whose proof is left as Exercise D.2-7) and its corollary relate the notions of column rank and singularity to null vectors.

Theorem D.2
A matrix $A$ has full column rank if and only if it does not have a null vector.

Corollary D.3
A square matrix $A$ is singular if and only if it has a null vector.
The $ij$th minor of an $n \times n$ matrix $A$, for $n > 1$, is the $(n-1) \times (n-1)$ matrix $A_{[ij]}$ obtained by deleting the $i$th row and $j$th column of $A$. The determinant of an $n \times n$ matrix $A$ can be defined recursively in terms of its minors by
\[
\det(A) =
\begin{cases}
a_{11} & \text{if } n = 1 , \\
\displaystyle\sum_{j=1}^{n} (-1)^{1+j} a_{1j} \det(A_{[1j]}) & \text{if } n > 1 .
\end{cases}
\]
The term $(-1)^{i+j} \det(A_{[ij]})$ is known as the cofactor of the element $a_{ij}$.
Theorem D.4 (Determinant properties)
The determinant of a square matrix $A$ has the following properties:

- If any row or any column of $A$ is zero, then $\det(A) = 0$.
- The determinant of $A$ is multiplied by $\lambda$ if the entries of any one row (respectively, column) of $A$ are all multiplied by $\lambda$.
- The determinant of $A$ is unchanged if the entries in one row (respectively, column) are added to those in another row (respectively, column).
- The determinant of $A$ equals the determinant of $A^{\mathrm T}$.
- The determinant of $A$ is multiplied by $-1$ if any two rows (or any two columns) are exchanged.

Also, for any square matrices $A$ and $B$, we have $\det(AB) = \det(A)\det(B)$.

Theorem D.5
An $n \times n$ matrix $A$ is singular if and only if $\det(A) = 0$.
Positive-definite matrices

Positive-definite matrices play an important role in many applications. An $n \times n$ matrix $A$ is positive-definite if $x^{\mathrm T} A x > 0$ for all $n$-vectors $x \ne 0$. For example, the identity matrix is positive-definite, since for any nonzero vector $x = ( x_1 \; x_2 \; \cdots \; x_n )^{\mathrm T}$,
\[ x^{\mathrm T} I_n x = x^{\mathrm T} x = \sum_{i=1}^{n} x_i^2 > 0 . \]
Matrices that arise in applications are often positive-definite due to the following theorem.

Theorem D.6
For any matrix $A$ with full column rank, the matrix $A^{\mathrm T} A$ is positive-definite.

Proof  We must show that $x^{\mathrm T} (A^{\mathrm T} A) x > 0$ for any nonzero vector $x$. For any vector $x$,
\[ x^{\mathrm T} (A^{\mathrm T} A) x = (Ax)^{\mathrm T} (Ax) \quad \text{(by Exercise D.1-2)} \]
\[ = \lVert Ax \rVert^2 \ge 0 . \]
Note that $\lVert Ax \rVert^2$ is just the sum of the squares of the elements of the vector $Ax$, so $\lVert Ax \rVert^2 = 0$ implies $Ax = 0$. Since $A$ has full column rank, $Ax = 0$ implies $x = 0$, by Theorem D.2. Hence, $A^{\mathrm T} A$ is positive-definite.

Section 28.3 explores other properties of positive-definite matrices.
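A spot-check of Theorem D.6 for one full-column-rank matrix and a few nonzero vectors. This sketch (not from the text) evaluates the quadratic form $x^{\mathrm T}(A^{\mathrm T}A)x$ directly; the choices of $A$ and $x$ are arbitrary:

```python
# Spot-check of Theorem D.6: with A of full column rank, x^T (A^T A) x > 0
# for a few nonzero x (a sketch, not a proof).
def matvec(M, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]

A = [[1, 0], [0, 1], [1, 1]]            # full column rank (rank 2)
AtA = [[2, 1], [1, 2]]                  # A^T A, computed by hand
for x in ([1, 0], [0, 1], [1, -1], [3, 2]):
    quad = sum(xi * yi for xi, yi in zip(x, matvec(AtA, x)))  # x^T(A^T A)x
    assert quad > 0
    print(x, quad)
```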
Exercises

D.2-1
Prove that matrix inverses are unique, that is, if $B$ and $C$ are inverses of $A$, then $B = C$.

D.2-2
Prove that the determinant of a lower-triangular or upper-triangular matrix is equal to the product of its diagonal elements. Prove that the inverse of a lower-triangular matrix, if it exists, is lower-triangular.

D.2-3
Prove that if $P$ is a permutation matrix, then $P$ is invertible, its inverse is $P^{\mathrm T}$, and $P^{\mathrm T}$ is a permutation matrix.

D.2-4
Let $A$ and $B$ be $n \times n$ matrices such that $AB = I$. Prove that if $A'$ is obtained from $A$ by adding row $j$ into row $i$, then subtracting column $i$ from column $j$ of $B$ yields the inverse $B'$ of $A'$.

D.2-5
Let $A$ be a nonsingular $n \times n$ matrix with complex entries. Show that every entry of $A^{-1}$ is real if and only if every entry of $A$ is real.

D.2-6
Show that if $A$ is a nonsingular symmetric $n \times n$ matrix, then $A^{-1}$ is symmetric. Show that if $B$ is an arbitrary $m \times n$ matrix, then the $m \times m$ matrix given by the product $B A B^{\mathrm T}$ is symmetric.

D.2-7
Prove Theorem D.2. That is, show that a matrix $A$ has full column rank if and only if $Ax = 0$ implies $x = 0$. (Hint: Express the linear dependence of one column on the others as a matrix-vector equation.)

D.2-8
Prove that for any two compatible matrices $A$ and $B$,
\[ \operatorname{rank}(AB) \le \min(\operatorname{rank}(A), \operatorname{rank}(B)) , \]
where equality holds if either $A$ or $B$ is a nonsingular square matrix. (Hint: Use the alternate definition of the rank of a matrix.)
Problems

D-1  Vandermonde matrix
Given numbers $x_0, x_1, \ldots, x_{n-1}$, prove that the determinant of the Vandermonde matrix
\[
V(x_0, x_1, \ldots, x_{n-1}) =
\begin{pmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^{n-1} \\
1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{n-1} & x_{n-1}^2 & \cdots & x_{n-1}^{n-1}
\end{pmatrix}
\]
is
\[ \det(V(x_0, x_1, \ldots, x_{n-1})) = \prod_{0 \le j < k \le n-1} (x_k - x_j) . \]
(Hint: Multiply column $i$ by $-x_0$ and add it to column $i+1$ for $i = n-1, n-2, \ldots, 1$, and then use induction.)
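The product formula is easy to verify numerically for a small instance. A sketch (not from the text) that reuses the cofactor-expansion determinant from Section D.2; the values in xs are arbitrary:

```python
# Quick numeric check of the Vandermonde determinant formula for small n
# (a sketch; exponential-time det is fine at this size).
from itertools import combinations

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([r[:j] + r[j+1:] for r in A[1:]]) for j in range(len(A)))

xs = [2, 3, 5, 7]
V = [[x ** k for k in range(len(xs))] for x in xs]
prod = 1
for xj, xk in combinations(xs, 2):      # pairs with j < k
    prod *= (xk - xj)
assert det(V) == prod
print(det(V))                           # 240
```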

D-2  Permutations defined by matrix-vector multiplication over GF(2)
One class of permutations of the integers in the set $S_n = \{0, 1, 2, \ldots, 2^n - 1\}$ is defined by matrix multiplication over GF(2). For each integer $x$ in $S_n$, we view its binary representation as an $n$-bit vector $( x_0 \; x_1 \; \cdots \; x_{n-1} )^{\mathrm T}$, where $x = \sum_{i=0}^{n-1} x_i 2^i$. If $A$ is an $n \times n$ matrix in which each entry is either 0 or 1, then we can define a permutation mapping each value $x$ in $S_n$ to the number whose binary representation is the matrix-vector product $Ax$. Here, we perform all arithmetic over GF(2): all values are either 0 or 1, and with one exception the usual rules of addition and multiplication apply. The exception is that $1 + 1 = 0$. You can think of arithmetic over GF(2) as being just like regular integer arithmetic, except that you keep only the least significant bit of each result.

As an example, for $S_2 = \{0, 1, 2, 3\}$, the matrix
\[ A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \]
defines the following permutation $\pi_A$: $\pi_A(0) = 0$, $\pi_A(1) = 3$, $\pi_A(2) = 2$, $\pi_A(3) = 1$. To see why $\pi_A(3) = 1$, observe that, working over GF(2),
\[ \pi_A(3) = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 + 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \]
which is the binary representation of 1. Note that we multiply the matrix by the vector over GF(2), and all matrix and vector entries are 0 or 1.
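The example permutation can be reproduced mechanically. A sketch (not from the text); the bit convention assumed here is that vector entry $i$ is bit $i$ of $x$, matching $x = \sum_i x_i 2^i$:

```python
# Computing the permutation pi_A over GF(2) from the example above
# (a sketch). Vector entry i is bit i of x; all arithmetic is mod 2.
def gf2_perm(A, x, n):
    bits = [(x >> i) & 1 for i in range(n)]
    out = [sum(A[i][j] * bits[j] for j in range(n)) % 2 for i in range(n)]
    return sum(bit << i for i, bit in enumerate(out))

A = [[1, 0], [1, 1]]
print([gf2_perm(A, x, 2) for x in range(4)])   # [0, 3, 2, 1]
```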
For the remainder of this problem, we work over GF(2). We define the rank of a 0-1 matrix over GF(2) the same as for a regular matrix, but with all arithmetic that determines linear independence performed over GF(2). We define the range of an $n \times n$ 0-1 matrix $A$ by
\[ R(A) = \{ y : y = Ax \text{ for some } x \in S_n \} , \]
so that $R(A)$ is the set of numbers in $S_n$ that we can produce by multiplying each value $x$ in $S_n$ by $A$.

a. If $r$ is the rank of matrix $A$, prove that $|R(A)| = 2^r$. Conclude that $A$ defines a permutation on $S_n$ only if $A$ has full rank.

For a given $n \times n$ matrix $A$ and a given value $y \in R(A)$, we define the preimage of $y$ by
\[ P(A, y) = \{ x : Ax = y \} , \]
so that $P(A, y)$ is the set of values in $S_n$ that map to $y$ when multiplied by $A$.

b. If $r$ is the rank of the $n \times n$ matrix $A$ and $y \in R(A)$, prove that $|P(A, y)| = 2^{n-r}$.

Let $0 \le m \le n$, and suppose that we partition the set $S_n$ into blocks of consecutive numbers, where the $i$th block consists of the $2^m$ numbers $i 2^m, i 2^m + 1, \ldots, (i+1) 2^m - 1$. For any subset $S \subseteq S_n$, define $B(S, m)$ to be the set of size-$2^m$ blocks of $S_n$ containing some element of $S$. As an example, when $n = 3$, $m = 1$, and $S = \{1, 4, 5\}$, then $B(S, m)$ consists of blocks 0 (since 1 is in the 0th block) and 2 (since both 4 and 5 are in block 2).

c. Let $r$ be the rank of the lower left $(n-m) \times m$ submatrix of $A$, that is, the matrix formed by intersecting the bottom $n - m$ rows and the leftmost $m$ columns of $A$. Let $S$ be any size-$2^m$ block of $S_n$, and let $S' = \{ y : y = Ax \text{ for some } x \in S \}$. Prove that $|B(S', m)| = 2^r$ and that for each block in $B(S', m)$, exactly $2^{m-r}$ numbers in $S$ map to that block.

Because multiplying the zero vector by any matrix yields a zero vector, the set of permutations of $S_n$ defined by multiplying by $n \times n$ 0-1 matrices with full rank over GF(2) cannot include all permutations of $S_n$. Let us extend the class of permutations defined by matrix-vector multiplication to include an additive term, so that $x \in S_n$ maps to $Ax + c$, where $c$ is an $n$-bit vector and addition is performed over GF(2). For example, when
\[ A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \quad\text{and}\quad c = \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \]
we get the following permutation $\pi_{A,c}$: $\pi_{A,c}(0) = 2$, $\pi_{A,c}(1) = 1$, $\pi_{A,c}(2) = 0$, $\pi_{A,c}(3) = 3$. We call any permutation mapping $x$ to $Ax + c$, for some full-rank 0-1 matrix $A$ and some 0-1 vector $c$, a linear permutation.

d. Use a counting argument to show that the number of linear permutations of $S_n$ is much smaller than the number of permutations of $S_n$.

e. Give an example of a value of $n$ and a permutation of $S_n$ that cannot be achieved by any linear permutation. (Hint: For a given permutation, think about how multiplying a matrix by a unit vector relates to the columns of the matrix.)
Appendix notes

The books by Strang [323, 324] are particularly good.

Bibliography

[1] Milton Abramowitz and Irene A. Stegun, editors. Handbook of Mathematical Functions. Dover, 1965.

[2] G. M. Adel'son-Vel'skii and E. M. Landis. An algorithm for the organization of information. Soviet Mathematics Doklady, 3:1259-1263, 1962.
[3] Alok Aggarwal and Jeffrey Scott Vitter
...
Communications of the ACM, 31(9):1116–1127, 1988
...
PRIMES is in P
...

[5] Alfred V
...
Hopcroft, and Jeffrey D
...
The Design and Analysis of
Computer Algorithms
...

[6] Alfred V
...
Hopcroft, and Jeffrey D
...
Data Structures and Algorithms
...

[7] Ravindra K
...
Magnanti, and James B
...
Network Flows: Theory,
Algorithms, and Applications
...

[8] Ravindra K
...
Orlin, and Robert E
...
Faster algorithms
for the shortest path problem
...

[9] Ravindra K
...
Orlin
...
Operations Research, 37(5):748–759, 1989
...
Ahuja, James B
...
Tarjan
...
SIAM Journal on Computing, 18(5):939–954, 1989
...
Improved algorithms and analysis for
o
secretary problems and generalizations
...

[12] Selim G
...
The Design and Analysis of Parallel Algorithms
...

[13] Mohamad Akra and Louay Bazzi
...
Computational Optimization and Applications, 10(2):195–210, 1998
...
Generating pseudo-random permutations and maximum flow algorithms
...



[15] Arne Andersson
...
In Proceedings of the Third Workshop
on Algorithms and Data Structures, volume 709 of Lecture Notes in Computer Science,
pages 60–71
...

[16] Arne Andersson
...
In Proceedings
of the 37th Annual Symposium on Foundations of Computer Science, pages 135–141, 1996
...
Sorting in linear
time? Journal of Computer and System Sciences, 57:74–93, 1998
...
Apostol
...
Blaisdell Publishing Company, second edition, 1967
...
Arora, Robert D
...
Greg Plaxton
...
In Proceedings of the 10th Annual ACM Symposium on Parallel
Algorithms and Architectures, pages 119–129, 1998
...
Probabilistic checking of proofs and the hardness of approximation problems
...

[21] Sanjeev Arora
...
In Proceedings of the 30th
Annual ACM Symposium on Theory of Computing, pages 337–348, 1998
...
Polynomial time approximation schemes for euclidean traveling salesman
and other geometric problems
...

[23] Sanjeev Arora and Carsten Lund
...
In Dorit S
...
PWS Publishing
Company, 1997
...
Aslam
...
Technical Report TR2001-387, Dartmouth College Department of Computer Science,
2001
...
Atallah, editor
...
CRC Press,
1999
...
Ausiello, P
...
Gambosi, V
...
Marchetti-Spaccamela, and M
...

Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties
...

[27] Shai Avidan and Ariel Shamir
...
ACM Transactions on Graphics, 26(3), article 10, 2007
...
Computer Algorithms: Introduction to Design and Analysis
...

[29] Eric Bach
...

[30] Eric Bach
...
In Annual Review of Computer Science, volume 4,
pages 119–172
...
, 1990
...
Algorithmic Number Theory—Volume I: Efficient Algorithms
...

[32] David H
...
Simon
...
The Journal of Supercomputing, 4(4):357–371, 1990
...
Improved decremental algorithms for maintaining transitive closure and all-pairs shortest paths
...

[34] R
...
Symmetric binary B-trees: Data structure and maintenance algorithms
...

[35] R
...
M
...
Organization and maintenance of large ordered indexes
...

[36] Pierre Beauchemin, Gilles Brassard, Claude Cr´ peau, Claude Goutier, and Carl Pomerance
...
Journal of Cryptology, 1(1):53–
64, 1988
...
Dynamic Programming
...

[38] Richard Bellman
...
Quarterly of Applied Mathematics, 16(1):87–90,
1958
...
Lower bounds for algebraic computation trees
...

[40] Michael A
...
Demaine, and Martin Farach-Colton
...

In Proceedings of the 41st Annual Symposium on Foundations of Computer Science, pages
399–409, 2000
...
Bent and John W
...
Finding the median requires 2n comparisons
...

[42] Jon L
...
Writing Efficient Programs
...

[43] Jon L
...
Programming Pearls
...

[44] Jon L
...
Saxe
...
SIGACT News, 12(3):36–44, 1980
...
Tightening simplex mixed-integer sets with
guaranteed bounds
...

[46] Patrick Billingsley. Probability and Measure. John Wiley & Sons, second edition, 1986.
...
PhD thesis, Department of
Electrical Engineering and Computer Science, MIT, 1989
...

[48] Guy E
...
Programming parallel algorithms
...


Communications of the ACM,

[49] Guy E
...
Gibbons, and Yossi Matias
...
In Proceedings of the 7th Annual ACM Symposium
on Parallel Algorithms and Architectures, pages 1–12, 1995
...
Floyd, Vaughan Pratt, Ronald L
...
Tarjan
...
Journal of Computer and System Sciences, 7(4):448–461, 1973
...
Blumofe, Christopher F
...
Kuszmaul, Charles E
...
Randall, and Yuli Zhou
...
Journal
of Parallel and Distributed Computing, 37(1):55–69, 1996
...
Blumofe and Charles E
...
Scheduling multithreaded computations by
work stealing
...

[53] B´ la Bollob´ s
...
Academic Press, 1985
...
Fundamentals of Algorithmics
...

[55] Richard P
...
The parallel evaluation of general arithmetic expressions
...

[56] Richard P
...
An improved Monte Carlo factorization algorithm
...

[57] J
...
Buhler, H
...
Lenstra, Jr
...
Factoring integers with the number
field sieve
...
K
...
W
...
, editors, The Development of the Number
Field Sieve, volume 1554 of Lecture Notes in Mathematics, pages 50–94
...

[58] J
...
Wegman
...
Journal of
Computer and System Sciences, 18(2):143–154, 1979
...
Using OpenMP: Portable Shared
Memory Parallel Programming
...

[60] Bernard Chazelle
...
Journal of the ACM, 47(6):1028–1047, 2000
...
A randomized maximum-flow algorithm
...

[62] Joseph Cheriyan and S
...
Maheshwari
...
SIAM Journal on Computing, 18(6):1057–1086, 1989
...
Cherkassky and Andrew V
...
On implementing the push-relabel method
for the maximum flow problem
...

[64] Boris V
...
Goldberg, and Tomasz Radzik
...
Mathematical Programming, 73(2):129–174, 1996
...
Cherkassky, Andrew V
...
Buckets, heaps, lists and
monotone priority queues
...

[66] H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23:493-507, 1952.

[67] Kai Lai Chung. Elementary Probability Theory with Stochastic Processes. Springer, 1974.
...
Chv´ tal
...
Mathematics of Operations
a
Research, 4(3):233–235, 1979
...
Chv´ tal
...
W
...
Freeman and Company, 1983
...
Chv´ tal, D
...
Klarner, and D
...
Knuth
...

a
Technical Report STAN-CS-72-292, Computer Science Department, Stanford University,
1972
...
, Burlington, Massachusetts
...
Available
at http://www
...
com/archive/docs/cilk1guide
...
The intrinsic computational difficulty of functions
...
NorthHolland, 1964
...
Cohen and H
...
Lenstra, Jr
...
Mathematics of Computation, 42(165):297–330, 1984
...
Comer
...
ACM Computing Surveys, 11(2):121–137, 1979
...
The complexity of theorem proving procedures
...

[76] James W
...
Tukey
...
Mathematics of Computation, 19(90):297–301, 1965
...
Modifications to the number field sieve
...


Journal of Cryptology,

[78] Don Coppersmith and Shmuel Winograd
...

Journal of Symbolic Computation, 9(3):251–280, 1990
...
Cormen, Thomas Sundquist, and Leonard F
...
Asymptotically tight
bounds for performing BMMC permutations on parallel disk systems
...

[80] Don Dailey and Charles E
...
Using Cilk to write multiprocessor chess programs
...
J
...
Monien, editors, Advances in Computer Games, volume 9,
pages 25–52
...

[81] Paolo D’Alberto and Alexandru Nicolau
...
In
Proceedings of the 21st Annual International Conference on Supercomputing, pages 284–
292, June 2007
...
Algorithms
...

[83] Roman Dementiev, Lutz Kettner, Jens Mehnert, and Peter Sanders
...
In Proceedings of the Sixth Workshop on Algorithm Engineering and Experiments and the First Workshop on Analytic Algorithmics and Combinatorics,
pages 142–151, January 2004
...
Italiano
...
Journal of Computer and System Sciences, 72(5):813–837, 2006
...
Denardo and Bennett L
...
Shortest-route methods: 1
...
Operations Research, 27(1):161–186, 1979
...
Tarjan
...
SIAM
Journal on Computing, 23(4):738–761, 1994
...
Hellman
...
IEEE Transactions on Information Theory, IT-22(6):644–654, 1976
...
W
...
A note on two problems in connexion with graphs
...



[89] E
...
Dinic
...
Soviet Mathematics Doklady, 11(5):1277–1280, 1970
...
Tarjan
...
SIAM Journal on Computing, 21(6):1184–1192,
1992
...
Dixon
...
The American Mathematical Monthly,
91(6):333–352, 1984
...
On lower bounds for selecting
a
the median
...

[93] Dorit Dor and Uri Zwick
...
SIAM Journal on Computing, 28(5):1722–
1758, 1999
...
Median selection requires
...
SIAM Journal
on Discrete Mathematics, 14(3):312–325, 2001
...
[95] Alvin W. Drake. Fundamentals of Applied Probability Theory. McGraw-Hill, 1967.
...
Driscoll, Harold N
...
Tarjan
...
Communications of the ACM, 31(11):1343–1354, 1988
...
Driscoll, Neil Sarnak, Daniel D
...
Tarjan
...
Journal of Computer and System Sciences, 38(1):86–124, 1989
...
Eager, John Zahorjan, and Edward D
...
Speedup versus efficiency in
parallel systems
...

[99] Herbert Edelsbrunner
...
Springer, 1987
...
Paths, trees, and flowers
...

[101] Jack Edmonds
...
Mathematical Programming, 1(1):127–
136, 1971
...
Karp
...
Journal of the ACM, 19(2):248–264, 1972
...
Graph Algorithms
...

[104] William Feller. An Introduction to Probability Theory and Its Applications, Volume 1. John Wiley & Sons, third edition, 1968.
...
Floyd
...
Communications of the ACM,
5(6):345, 1962
...
Floyd
...
Communications of the ACM, 7(12):701,
1964
...
Floyd
...
In Raymond E
...
Thatcher, editors, Complexity of Computer Computations, pages 105–
109
...



[108] Robert W
...
Rivest
...
Communications of the ACM, 18(3):165–172, 1975
...
Ford, Jr
...
R
...
Flows in Networks
...

[110] Lestor R
...
and Selmer M
...
A tournament problem
...

[111] Michael L
...
New bounds on the complexity of the shortest path problem
...

[112] Michael L
...
Storing a sparse table with O
...
Journal of the ACM, 31(3):538–544, 1984
...
Fredman and Michael E
...
The cell probe complexity of dynamic data structures
...

[114] Michael L
...
Tarjan
...
Journal of the ACM, 34(3):596–615, 1987
...
Fredman and Dan E
...
Surpassing the information theoretic bound with
fusion trees
...

[116] Michael L
...
Willard
...
Journal of Computer and System Sciences, 48(3):533–551,
1994
...
Johnson
...
Proceedings of the IEEE, 93(2):216–231, 2005
...
Leiserson, and Keith H
...
The implementation of the Cilk-5
multithreaded language
...

[119] Harold N
...
Path-based depth-first search for strong and biconnected components
...

[120] Harold N
...
Galil, T
...
Tarjan
...
Combinatorica, 6(2):109–
122, 1986
...
Gabow and Robert E
...
A linear-time algorithm for a special case of disjoint
set union
...

[122] Harold N
...
Tarjan
...

SIAM Journal on Computing, 18(5):1013–1036, 1989
...
All pairs shortest distances for graphs with small integer
length edges
...

[124] Zvi Galil and Oded Margalit
...
Journal of Computer and System Sciences, 54(2):243–254, 1997
...
Dynamic programming with convexity, concavity and sparsity
...



[126] Zvi Galil and Joel Seiferas
...
Journal of Computer and
System Sciences, 26(3):280–294, 1983
...
Rivest
...
In Proceedings of the 4th ACM-SIAM
Symposium on Discrete Algorithms, pages 165–174, 1993
...
Garey, R
...
Graham, and J
...
Ullman
...
In Proceedings of the Fourth Annual ACM Symposium on Theory of
Computing, pages 143–150, 1972
...
Garey and David S
...
Computers and Intractability: A Guide to the
Theory of NP-Completeness
...
H
...

[130] Saul Gass
...
International Thomson Publishing, fourth edition, 1975
...
Algorithms for minimum coloring, maximum clique, minimum covering by
a a
cliques, and maximum independent set of a chordal graph
...

[132] Alan George and Joseph W-H Liu
...
Prentice Hall, 1981
...
N
...
F
...
Variable-length binary encodings
...

[134] Michel X
...
Williamson
...
Journal of the
ACM, 42(6):1115–1145, 1995
...
Goemans and David P
...
The primal-dual method for approximation
algorithms and its application to network design problems
...
Hochbaum, editor,
Approximation Algorithms for NP-Hard Problems, pages 144–191
...

[136] Andrew V
...
Efficient Graph Algorithms for Sequential and Parallel Computers
...

[137] Andrew V
...
Scaling algorithms for the shortest paths problem
...

[138] Andrew V
...
Beyond the flow decomposition barrier
...

´
[139] Andrew V
...
Tarjan
...
In Bernhard Korte, L´ szl´ Lov´ sz, Hans J¨ rgen Pr¨ mel, and Alexander Schrijver, editors, Paths,
a o
a
u
o
Flows, and VLSI-Layout, pages 101–164
...

[140] Andrew V
...
Tarjan
...

Journal of the ACM, 35(4):921–940, 1988
...
Goldfarb and M
...
Todd
...
In G
...
Nemhauser, A
...
G
...
J
...
1, Optimization, pages 73–170
...

[142] Shafi Goldwasser and Silvio Micali
...
Journal of Computer and
System Sciences, 28(2):270–299, 1984
...
Rivest
...
SIAM Journal on Computing, 17(2):281–308,
1988
...
Golub and Charles F
...
Matrix Computations
...

[145] G
...
Gonnet
...
Addison-Wesley, 1984
...
Gonzalez and Richard E
...
Digital Image Processing
...

[147] Michael T
...
Data Structures and Algorithms in Java
...

[148] Michael T
...
Algorithm Design: Foundations, Analysis, and
Internet Examples
...

[149] Ronald L
...
Bounds for certain multiprocessor anomalies
...

[150] Ronald L
...
An efficient algorithm for determining the convex hull of a finite planar
set
...

[151] Ronald L
...
On the history of the minimum spanning tree problem
...

[152] Ronald L
...
Knuth, and Oren Patashnik
...


Concrete Mathematics
...
The Science of Programming
...

[154] M
...
Geometric Algorithms and Combio
a o
a
natorial Optimization
...

[155] Leo J
...
A dichromatic framework for balanced trees
...

[156] Dan Gusfield
...
Cambridge University Press, 1997
...
Halberstam and R
...
Ingram, editors
...
Cambridge University Press, 1967
...
Improved fast integer sorting in linear space
...

[159] Yijie Han
...
n3
...
Algorithmica, 51(4):428–434, 2008
...
Graph Theory
...

[161] Gregory C
...
Reingold
...
SIGACT News, 31(3):86–95, 2000
...
Hartmanis and R
...
Stearns
...
Transactions of the American Mathematical Society, 117:285–306, May 1965
...
Heideman, Don H
...
Sidney Burrus
...
IEEE ASSP Magazine, 1(4):14–21, 1984
...
Henzinger and Valerie King
...
In Proceedings of the 36th Annual Symposium on Foundations of Computer Science,
pages 664–672, 1995
...
Henzinger and Valerie King
...
Journal of the ACM, 46(4):502–516, 1999
...
Henzinger, Satish Rao, and Harold N
...
Computing vertex connectivity:
New bounds from old techniques
...

[167] Nicholas J
...
Exploiting fast matrix multiplication within the level 3 BLAS
...

[168] W
...
Guy L
...
Data parallel algorithms
...

[169] C
...
R
...
Algorithm 63 (PARTITION) and algorithm 65 (FIND)
...

[170] C
...
R
...
Quicksort
...

[171] Dorit S
...
Efficient bounds for the stable set, vertex cover and set packing problems
...

[172] Dorit S
...
Approximation Algorithms for NP-Hard Problems
...

[173] W. Hoeffding. On the distribution of the number of successes in independent trials. Annals of Mathematical Statistics, 27:713-721, 1956.

[174] Micha Hofri
...
Springer, 1987
...
Analysis of Algorithms
...

[176] John E
...
Karp
...
SIAM Journal on Computing, 2(4):225–231, 1973
...
Hopcroft, Rajeev Motwani, and Jeffrey D
...
Introduction to Automata Theory, Languages, and Computation
...

[178] John E
...
Tarjan
...
Communications of the ACM, 16(6):372–378, 1973
...
Hopcroft and Jeffrey D
...
Set merging algorithms
...

[180] John E
...
Ullman
...
Addison-Wesley, 1979
...
Computer Algorithms
...

[182] T
...
Hu and M
...
Shing
...
Part I
...

[183] T
...
Hu and M
...
Shing
...
Part II
...



[184] T
...
Hu and A
...
Tucker
...
SIAM Journal on Applied Mathematics, 21(4):514–532, 1971
...
Huffman
...
Proceedings of the IRE, 40(9):1098–1101, 1952
...
Jacobson, Jeremy R
...
Implementation of Strassen’s algorithm for matrix multiplication
...

[187] Oscar H
...
Kim
...
Journal of the ACM, 22(4):463–468, 1975
...
J
...
C
...
Sorting by address calculation
...

[189] R
...
Jarvis
...

Information Processing Letters, 2(1):18–21, 1973
...
Johnson
...
Journal of Computer and System Sciences, 9(3):256–278, 1974
...
Johnson
...
Journal of Algorithms, 13(3):502–524, 1992
...
Johnson
...
Journal of
the ACM, 24(1):1–13, 1977
...
Algorithms
...

[194] A
...
Ofman
...
Soviet
Physics—Doklady, 7(7):595–596, 1963
...

[195] David R
...
Klein, and Robert E
...
A randomized linear-time algorithm
to find minimum spanning trees
...

[196] David R
...
Phillips
...
SIAM Journal on Computing, 22(6):1199–1217, 1993
...
Linear Programming
...

a
[198] N
...
A new polynomial-time algorithm for linear programming
...

[199] Richard M
...
Reducibility among combinatorial problems
...
Miller and
James W
...
Plenum
Press, 1972
...
Karp
...
Discrete Applied Mathematics, 34(1–3):165–201, 1991
...
Karp and Michael O
...
Efficient randomized pattern-matching algorithms
...

[202] A
...
Karzanov
...

Soviet Mathematics Doklady, 15(2):434–437, 1974
...
A simpler minimum spanning tree verification algorithm
...

[204] Valerie King, Satish Rao, and Robert E
...
A faster deterministic maximum flow algorithm
...

[205] Jeffrey H
...
Algorithms and Data Structures: Design, Correctness, Analysis
...

[206] D
...
Kirkpatrick and R
...
The ultimate planar convex hull algorithm? SIAM Journal
on Computing, 15(2):287–299, 1986
...
Klein and Neal E
...
Approximation algorithms for NP-hard optimization
problems
...
CRC Press, 1999
...
Algorithm Design
...

[209] Donald E
...
Fundamental Algorithms, volume 1 of The Art of Computer Programming
...
Third edition, 1997
...
Knuth
...
Addison-Wesley, 1969
...

[211] Donald E
...
Sorting and Searching, volume 3 of The Art of Computer Programming
...
Second edition, 1998
...
Knuth
...
Acta Informatica, 1(1):14–25, 1971
...
Knuth
...
SIGACT News, 8(2):18–23,
1976
...
Knuth, James H
...
, and Vaughan R
...
Fast pattern matching in
strings
...

[215] J
...
Linear verification for spanning trees
...

o
[216] Bernhard Korte and L´ szl´ Lov´ sz
...

a o
a
In F
...
Springer, 1981
...
Structural properties of greedoids
...

[218] Bernhard Korte and L´ szl´ Lov´ sz
...
In W
...
Academic Press, 1984
...
Greedoids and linear objective functions
...

[220] Dexter C
...
The Design and Analysis of Algorithms
...

[221] David W
...
N
...
Gossiping in minimal time
...

[222] Joseph B
...
On the shortest spanning subtree of a graph and the traveling salesman
problem
...

[223] Leslie Lamport
...
IEEE Transactions on Computers, C-28(9):690–691, 1979
...
Lawler
...
Holt, Rinehart,
and Winston, 1976
...
Lawler, J
...
Lenstra, A
...
G
...
B
...
The
Traveling Salesman Problem
...

[226] C
...
Lee
...
IRE Transactions on
Electronic Computers, EC-10(3):346–365, 1961
...
Tight bounds on the complexity of parallel sorting
...

[228] Tom Leighton
...
Class
notes
...
ist
...
edu/252350
...

[229] Tom Leighton and Satish Rao
...
Journal of the ACM, 46(6):787–832, 1999
...
Optimize managed code for multi-core machines
...

[231] Debra A
...
Hirschberg
...
ACM Computing Surveys,
19(3):261–296, 1987
...
K
...
W
...
, M
...
Manasse, and J
...
Pollard
...

In A
...
Lenstra and H
...
Lenstra, Jr
...
Springer, 1993
...
W
...
Factoring integers with elliptic curves
...


Annals of Mathematics,

[234] L
...
Levin
...
Problemy Peredachi Informatsii, 9(3):265–266,
1973
...

[235] Anany Levitin
...
Addison-Wesley,
2007
...
Lewis and Christos H
...
Elements of the Theory of Computation
...

[237] C
...
Liu
...
McGraw-Hill, 1968
...
On the ratio of optimal integral and fractional covers
...

[239] L´ szl´ Lov´ sz and M
...
Plummer
...
North Holland, 1986
...
Maggs and Serge A
...
Minimum-cost spanning tree as a path-finding
problem
...

[241] Michael Main
...
Addison-Wesley, 1999
...
Introduction to Algorithms: A Creative Approach
...

[243] Conrado Mart´nez and Salvador Roura
...
Journal of the
ı
ACM, 45(2):288–323, 1998
...
Masek and Michael S
...
A faster algorithm computing string edit distances
...



[245] H
...
Maurer, Th
...
-W
...
Implementing dictionaries using binary trees of
very small height
...

[246] Ernst W
...
Lectures on Proof Verifiu
o
cation and Approximation Algorithms, volume 1367 of Lecture Notes in Computer Science
...

[247] C
...
McGeoch
...

13(5):426–441, 1995
...
D
...
A killer adversary for quicksort
...

[249] Kurt Mehlhorn
...

Springer, 1984
...
Graph Algorithms and NP-Completeness, volume 2 of Data Structures and
Algorithms
...

[251] Kurt Mehlhorn
...
Springer, 1984
...
Bounded ordered dictionaries in O
...
n/ space
...

[253] Kurt Mehlhorn and Stefan N¨ her
...
Cambridge University Press, 1999
...
Menezes, Paul C
...
Vanstone
...
CRC Press, 1997
...
Miller
...
Journal of Computer and
System Sciences, 13(3):300–317, 1976
...
Mitchell
...
The MIT Press, 1996
...
B
...
Guillotine subdivisions approximate polygonal subdivisions: A simple polynomial-time approximation scheme for geometric TSP, k-MST, and related problems
...

[258] Louis Monier
...
PhD thesis, L’Universit´ Paris-Sud,
e
1980
...
Evaluation and comparison of two efficient probabilistic primality testing
algorithms
...

[260] Edward F
...
The shortest path through a maze
...
Harvard University Press, 1959
...
Randomized approximation algorithms in combinatorial optimization
...
PWS Publishing Company,
1997
...
Randomized Algorithms
...

[263] J
...
Munro and V
...
Fast stable in-place sorting with O
...
Algorithmica,
16(2):151–160, 1996
...
Nievergelt and E
...
Reingold
...
SIAM Journal
on Computing, 2(1):33–43, 1973
...
Zuckerman
...
John
Wiley & Sons, fourth edition, 1980
...
Oppenheim and Ronald W
...
Buck
...
Prentice Hall, second edition, 1998
...
Oppenheim and Alan S
...
Hamid Nawab
...

Prentice Hall, second edition, 1997
...
Orlin
...
Mathematical Programming, 78(1):109–129, 1997
...
Computational Geometry in C
...

[270] Christos H
...
Computational Complexity
...

[271] Christos H
...
Combinatorial Optimization: Algorithms
and Complexity
...

[272] Michael S
...
Progress in selection
...

[273] Mihai Pˇ trascu and Mikkel Thorup
...
In Proa ¸
ceedings of the 38th Annual ACM Symposium on Theory of Computing, pages 232–240,
2006
...
Randomization does not help searching predecessors
...

[275] Pavel A
...
Computational Molecular Biology: An Algorithmic Approach
...

[276] Steven Phillips and Jeffery Westbrook
...
In Proceedings of the 25th Annual ACM Symposium on Theory of Computing, pages 402–411,
1993
...
M
...
A Monte Carlo method for factorization
...

[278] J
...
Pollard
...
In A
...
Lenstra and H
...
Lenstra, Jr
...
Springer, 1993
...
On the distribution of pseudoprimes
...

[280] Carl Pomerance, editor
...
American Mathematical Society, 1990
...
Pratt
...
John Wiley & Sons, fourth edition, 2007
...
Preparata and Michael Ian Shamos
...

Springer, 1985
...
Press, Saul A
...
Vetterling, and Brian P
...
Numerical Recipes in C++: The Art of Scientific Computing
...

[284] William H
...
Teukolsky, William T
...
Flannery
...
Cambridge University Press, third edition,
2007
...
C
...
Shortest connection networks and some generalizations
...

[286] William Pugh
...
Communications of
the ACM, 33(6):668–676, 1990
...
Purdom, Jr
...
Brown
...
Holt, Rinehart,
and Winston, 1985
...
Rabin
...
In J
...
Traub, editor, Algorithms and Complexity: New Directions and Recent Results, pages 21–39
...

[289] Michael O
...
Probabilistic algorithm for testing primality
...

[290] P
...
D
...
Randomized rounding: A technique for provably good
algorithms and algorithmic proofs
...

[291] Rajeev Raman
...
SIGACT News,
28(2):81–87, 1997
...
Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor
Parallelism
...
, 2007
...
Reingold, J¨ rg Nievergelt, and Narsingh Deo
...
Prentice Hall, 1977
...
Reingold, Kenneth J
...
K-M-P string matching revisited
...

[295] Hans Riesel
...
Birkh¨ user, second edition, 1994
...
Rivest, Adi Shamir, and Leonard M
...
A method for obtaining digital
signatures and public-key cryptosystems
...
See also U
...
Patent 4,405,829
...
A remark on Stirling’s formula
...


American Mathematical Monthly,

[298] D
...
Rosenkrantz, R
...
Stearns, and P
...
Lewis
...
SIAM Journal on Computing, 6(3):563–581, 1977
...
An improved master theorem for divide-and-conquer recurrences
...
Springer,
1997
...
[300] Y. A. Rozanov. Probability Theory: A Concise Course. Dover, 1969.
[301] S
...
Gonzalez
...
Journal of the ACM,
23(3):555–565, 1976
...
Sch¨ nhage, M
...
Pippenger
...
Journal of Computer
o
and System Sciences, 13(2):184–199, 1976
...
Theory of Linear and Integer Programming
...

[304] Alexander Schrijver
...
CWI Quarterly, 6(3):169–183,
1993
...
Implementing quicksort programs
...


Communications of the ACM,

[306] Robert Sedgewick
...
Addison-Wesley, second edition, 1988
...
An Introduction to the Analysis of Algorithms
...

[308] Raimund Seidel
...

Journal of Computer and System Sciences, 51(3):400–403, 1995
...
R
...
Randomized search trees
...

[310] Jo˜ o Setubal and Jo˜ o Meidanis
...
PWS
a
a
Publishing Company, 1997
...
Shaffer
...

Prentice Hall, second edition, 2001
...
Origins of the analysis of the Euclidean algorithm
...

[313] Michael I
...
Geometric intersection problems
...

[314] M
...
A strong-connectivity algorithm and its applications in data flow analysis
...

[315] David B
...
Computing near-optimal solutions to combinatorial optimization problems
...
American Mathematical Society, 1995
...
All pairs shortest paths in undirected graphs with integer
weights
...

[317] Michael Sipser
...
Thomson Course Technology,
second edition, 2006
...
Skiena
...
Springer, second edition, 1998
...
Sleator and Robert E
...
A data structure for dynamic trees
...



[320] Daniel D
...
Tarjan
...
Journal of the
ACM, 32(3):652–686, 1985
...
Ten Lectures on the Probabilistic Method, volume 64 of CBMS-NSF Regional
Conference Series in Applied Mathematics
...

[322] Daniel A
...
Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time
...

[323] Gilbert Strang. Introduction to Applied Mathematics. Wellesley-Cambridge Press, 1986.

[324] Gilbert Strang. Linear Algebra and Its Applications. Thomson Brooks/Cole, fourth edition, 2006.

[325] Volker Strassen
...
Numerische Mathematik, 14(3):354–
356, 1969
...
G
...
A special case of the maximal common subsequence problem
...

[327] Robert E
...
Depth first search and linear graph algorithms
...

[328] Robert E
...
Efficiency of a good but not linear set union algorithm
...

[329] Robert E
...
A class of algorithms which require nonlinear time to maintain disjoint
sets
...

[330] Robert E
...
Data Structures and Network Algorithms
...

[331] Robert E
...
Amortized computational complexity
...

[332] Robert E
...
Class notes: Disjoint set union
...

[333] Robert E
...
Worst-case analysis of set union algorithms
...

[334] George B
...
, Maurice D
...
Giordano
...
Addison-Wesley, eleventh edition, 2005
...
Faster deterministic sorting and priority queues in linear space
...

[336] Mikkel Thorup
...
Journal of the ACM, 46(3):362–394, 1999
...
On RAM priority queues
...

[338] Richard Tolimieri, Myoung An, and Chao Lu
...
Springer, second edition, 1997
...
van Emde Boas
...
In Proceedings
of the 16th Annual Symposium on Foundations of Computer Science, pages 75–84, 1975
...
van Emde Boas
...
Information Processing Letters, 6(3):80–82, 1977
...
van Emde Boas, R
...
Zijlstra
...
Mathematical Systems Theory, 10(1):99–127, 1976
...
Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity
...

[343] Charles Van Loan
...
Society for
Industrial and Applied Mathematics, 1992
...
Vanderbei
...
Kluwer Academic
Publishers, 1996
...
Vazirani
...
Springer, 2001
...
Verma
...

SIAM Journal on Computing, 26(2):568–581, 1997
...
Pipelined van Emde Boas tree: Algorithms, analysis, and applications
...

[348] Antony F
...
Fast approximate Fourier transforms for irregularly spaced data
...

[349] Stephen Warshall
...
Journal of the ACM, 9(1):11–12, 1962
...
Waterman
...
Chapman & Hall, 1995
...
Data Structures and Problem Solving Using C++
...

[352] Mark Allen Weiss
...
Addison-Wesley,
third edition, 2006
...
Data Structures and Algorithm Analysis in C++
...

[354] Mark Allen Weiss
...
Addison-Wesley,
second edition, 2007
...
On the abstract properties of linear dependence
...

[356] Herbert S
...
Algorithms and Complexity
...

[357] J
...
J
...
Algorithm 232 (HEAPSORT)
...

[358] Shmuel Winograd
...
In Actes du Congr` s Internae
tional des Math´ maticiens, volume 3, pages 283–288, 1970
...
-C
...
A lower bound to finding convex hulls
...

[360] Chee Yap
...
Unpublished manuscript
...
nyu
...



[361] Yinyu Ye
...
John Wiley & Sons, 1997
...
CRC Standard Mathematical Tables and Formulae
...


Index

This index uses the following conventions. Numbers are alphabetized as if spelled out; for example, "2-3-4 tree" is indexed as if it were "two-three-four tree." When an entry refers to a place other than the main text, the page number is followed by a tag: ex. for exercise, pr. for problem, fig. for figure, and n. for footnote.

φ (golden ratio), 59
φ̂ (conjugate of the golden ratio), 59
ρ(n)-approximation algorithm, 1106, 1123
o-notation, 50-51, 64
O-notation, 45 fig., 47-48, 64
O′-notation, 62 pr.
Õ-notation, 62 pr.
Ω-notation, 45 fig., 48-49, 64
Ω∞-notation, 62 pr.
Θ-notation, 44-47, 45 fig., 64
{ } (set), 1158
∈ (set member), 1158
∉ (not a set member), 1158
∅ (empty language), 1058
∅ (empty set), 1158
⊆ (subset), 1159
⊊ (proper subset), 1159
: (such that), 1159
∩ (set intersection), 1159
∪ (set union), 1159
− (set difference), 1159
| | (flow value), 710
| | (length of a string), 986
| | (set cardinality), 1161
× (Cartesian product), 1162
× (cross product), 1016
⟨ ⟩ (sequence), 1166
⟨ ⟩ (standard encoding), 1057
(n choose k), 1185
‖ ‖ (euclidean norm), 1222
! (factorial), 57
⌈ ⌉ (ceiling), 54
⌊ ⌋ (floor), 54
⌊√ ⌋ (lower square root), 546
⌈√ ⌉ (upper square root), 546
Σ (sum), 1145
Π (product), 1148
→ (adjacency relation), 1169
⇝ (reachability relation), 1170
∧ (AND), 697, 1071
¬ (NOT), 1071
∨ (OR), 697, 1071
⊕ (group operator), 939
⊗ (convolution operator), 901
∗ (closure operator), 1058
| (divides relation), 927
∤ (does-not-divide relation), 927
≡ (equivalent modulo n), 54, 1165 ex.
(a | p) (Legendre symbol), 982 pr.

AA-tree, 338
abelian group, 940
ABOVE, 1024
above relation, 1024
absent child, 1178
absolutely convergent series, 1146
absorption laws for sets, 1160
abstract problem, 1054
acceptable pair of integers, 972
acceptance
by an algorithm, 1058
by a finite automaton, 996
accepting state, 995
accounting method, 456–459
for binary counters, 458
for dynamic tables, 465–466
for stack operations, 457–458, 458 ex
...

add instruction, 23
addition
of binary integers, 22 ex
...

adjacency-list representation, 590
replaced by a hash table, 593 ex
...

for dynamic tables, 465
for Fibonacci heaps, 518, 522 ex
...

Floyd-Warshall algorithm for, 693–697, 706
Johnson’s algorithm for, 700–706
by matrix multiplication, 686–693, 706–707
by repeated squaring, 689–691
alphabet, 995, 1057
˛
...

for breadth-first search, 597
for depth-first search, 606
for Dijkstra’s algorithm, 661
for disjoint-set data structures, 566–567,
568 ex
...
, 575–581, 581–582 ex
...

for the generic push-relabel algorithm, 746
for Graham’s scan, 1036
for the Knuth-Morris-Pratt algorithm, 1006
for making binary search dynamic, 473 pr
...

for self-organizing lists with move-to-front,
476 pr
...

for weight-balanced trees, 473 pr
...

AND function (^), 697, 1071
AND gate, 1070
and, in pseudocode, 22
antiparallel edges, 711–712
antisymmetric relation, 1164
A NY-S EGMENTS -I NTERSECT , 1025
approximation
by least squares, 835–839
of summation by integrals, 1154–1156
approximation algorithm, 10, 1105–1140
for bin packing, 1134 pr
...

for maximum clique, 1111 ex
...

for maximum matching, 1135 pr
...

for maximum-weight cut, 1127 ex
...

randomized, 1123


for set cover, 1117–1122, 1139
for subset sum, 1128–1134, 1139
for traveling-salesman problem, 1111–1117,
1139
for vertex cover, 1108–1111, 1139
for weighted set cover, 1135 pr
...
, 1139
approximation error, 836
approximation ratio, 1106, 1123
approximation scheme, 1107
A PPROX -M IN -W EIGHT-VC, 1126
A PPROX -S UBSET-S UM, 1131
A PPROX -TSP-T OUR, 1112
A PPROX -V ERTEX -C OVER, 1109
arbitrage, 679 pr
...

passing as a parameter, 21
articulation point, 621 pr
...

and graph algorithms, 588
and linearity of summations, 1146
asymptotic upper bound, 47
attribute of an object, 21
augmentation of a flow, 716
augmenting data structures, 339–355
augmenting path, 719–720, 763 pr
...
, 960–961, 964


automaton
finite, 995
string-matching, 996–1002
auxiliary hash function, 272
auxiliary linear program, 886
average-case running time, 28, 116
AVL-I NSERT , 333 pr
...
, 337
axioms, for probability, 1190
babyface, 602 ex
...

BALANCE , 333 pr
...
, 337
B-trees, 484–504
k-neighbor trees, 338
red-black trees, 308–338
scapegoat trees, 338
splay trees, 338, 482
treaps, 333 pr
...

2-3 trees, 337, 504
weight-balanced trees, 338, 473 pr
...

base-a pseudoprime, 967
base case, 65, 84
base, in DNA, 391
basic feasible solution, 866
basic solution, 866
basic variable, 855
basis function, 835
Bayes’s theorem, 1194
B ELLMAN -F ORD, 651
Bellman-Ford algorithm, 651–655, 682
for all-pairs shortest paths, 684
in Johnson’s algorithm, 702–704
and objective functions, 670 ex
...

B ELOW, 1024
Bernoulli trial, 1201
and balls and bins, 133–134
and streaks, 135–139

best-case running time, 29 ex
...

biconnected component, 621 pr
...
, 47–48, 64
big-omega notation, 45 fig
...

binary entropy function, 1187
binary gcd algorithm, 981 pr
...

with fast insertion, 473 pr
...

in multithreaded merging, 799–800
in searching B-trees, 499 ex
...
, 337
deletion from, 295–298, 299 ex
...

insertion into, 294–295
k-neighbor trees, 338
maximum key of, 291
minimum key of, 291
optimal, 397–404, 413
predecessor in, 291–292
querying, 289–294
randomly built, 299–303, 304 pr
...

scapegoat trees, 338
searching, 289–291
for sorting, 299 ex
...

weight-balanced trees, 338
see also red-black tree
binary-search-tree property, 287
in treaps, 333 pr
...
min-heap property, 289 ex
...

representation of, 246
superimposed upon a bit vector, 533–534
see also binary search tree
binomial coefficient, 1186–1187
binomial distribution, 1203–1206
and balls and bins, 133
maximum value of, 1207 ex
...

binomial tree, 527 pr
...

bipartite graph, 1172
corresponding flow network of, 732
d -regular, 736 ex
...

bipartite matching, 530, 732–736, 747 ex
...

birthday paradox, 130–133, 142 ex
...

bitonic euclidean traveling-salesman problem,
405 pr
...

bitonic tour, 405 pr
...

bit-reversal permutation, 472 pr
...

B IT-R EVERSED -I NCREMENT, 472 pr
...
, 532–536
black-height, 309
black vertex, 594, 603
blocking flow, 765
block structure in pseudocode, 20
Bob, 959
Boole’s inequality, 1195 ex
...
, 1079,
1086 ex
...

boolean matrix multiplication, 832 ex
...

bottleneck traveling-salesman problem,
1117 ex
...

bounding a summation, 1149–1156
box, nesting, 678 pr
...

breadth-first tree, 594, 600
bridge, 621 pr
...

B-tree, 484–504
compared with red-black trees, 484, 490
creating, 492
deletion from, 499–502
full node in, 489
height of, 489–490
insertion into, 493–497
minimum degree of, 489
minimum key of, 497 ex
...

B-T REE -S PLIT-C HILD, 494
B UBBLESORT , 40 pr
...

B UILD -M IN -H EAP, 159
butterfly operation, 915
by, in pseudocode, 21
cache, 24, 449 pr
...

cache miss, 449 pr
...

call
in a multithreaded computation, 776
of a subroutine, 23, 25 n
...

capacity constraint, 709–710
cardinality of a set (j j), 1161
Carmichael number, 968, 975 ex
...

cascading cut, 520
C ASCADING -C UT , 519
Catalan numbers, 306 pr
...

chain of a convex hull, 1038
changing a key, in a Fibonacci heap, 529 pr
...

choose n , 1185
k
chord, 345 ex
...

in depth-first search, 609–610, 611 ex
...

clique, 1086–1089, 1105
approximation algorithm for, 1111 ex
...

CLIQUE, 1087
closed interval, 348
closed semiring, 707
closest pair, finding, 1039–1044, 1047
closest-point heuristic, 1117 ex
...

coarsening leaves of recursion
in merge sort, 39 pr
...

colinearity, 1016
collision, 257
resolution by chaining, 257–260
resolution by open addressing, 269–277
collision-resistant hash function, 964
coloring, 1103 pr
...

color, of a red-black-tree node, 308
column-major order, 208 pr
...

column vector, 1218
combination, 1185
combinational circuit, 1071
combinational element, 1070
combine step, in divide-and-conquer, 30, 65
comment, in pseudocode (/ 21
/),
commodity, 862
common divisor, 929
greatest, see greatest common divisor


common multiple, 939 ex
...

compact list, 250 pr
...

C OMPACT-L IST-S EARCH0, 251 pr
...

compare-exchange operation, 208 pr
...

randomized, 205 pr
...

complement
of an event, 1190
of a graph, 1090
of a language, 1058
Schur, 820, 834
of a set, 1160
complementary slackness, 894 pr
...

complete step, 782
completion time, 447 pr
...

complexity class, 1059
co-NP, 1064
NP, 1049, 1064
NPC, 1050, 1069
P, 1049, 1055
complexity measure, 1059
complex numbers
inverting matrices of, 832 ex
...

complex root of unity, 906
interpolation at, 912–913
component
biconnected, 621 pr
...

computational depth, 812
computational geometry, 1014–1047
computational problem, 5–6
computation dag, 777
computation, multithreaded, 777
C OMPUTE -P REFIX -F UNCTION, 1006
C OMPUTE -T RANSITION -F UNCTION, 1001
concatenation
of languages, 1058
of strings, 986
concrete problem, 1055
concurrency keywords, 774, 776, 785
concurrency platform, 773
conditional branch instruction, 23
conditional independence, 1195 ex
...

conjunctive normal form, 1049, 1082
connected component, 1170
identified using depth-first search, 612 ex
...
, 852–853
inequality, 852–853
linear, 846
nonnegativity, 851, 853
tight, 865

violation of, 865
constraint graph, 666–668
contain, in a path, 1170
continuation edge, 778
continuous uniform probability distribution,
1192
contraction
of a dynamic table, 467–471
of a matroid, 442
of an undirected graph by an edge, 1172
control instructions, 23
convergence property, 650, 672–673
convergent series, 1146
converting binary to decimal, 933 ex
...

convex layers, 1044 pr
...

convex set, 714 ex
...

counting sort, 194–197
in radix sort, 198
C OUNTING -S ORT , 195
coupon collector’s problem, 134
cover
path, 761 pr
...

credit, 456
critical edge, 729
critical path
of a dag, 657
of a multithreaded computation, 779
cross a cut, 626
cross edge, 609
cross product ( ), 1016


cryptosystem, 958–965, 983
cubic spline, 840 pr
...
, 679 pr
...

net flow across, 720
of an undirected graph, 626
weight of, 1127 ex
...

negative-weight, see negative-weight cycle
and shortest paths, 646–647
cyclic group, 955
cyclic rotation, 1012 ex
...

in shortest-paths algorithms, 706 pr
...
, 337
binary search trees, 286–307
binomial heaps, 527 pr
...
, 532–536
B-trees, 484–504
deques, 236 ex
...
, 482
potential of, 459
priority queues, 162–166
proto van Emde Boas structures, 538–545
queues, 232, 234–235
radix trees, 304 pr
...
, 338
2-3-4 heaps, 529 pr
...

2-3 trees, 337, 504
van Emde Boas trees, 531–560
weight-balanced trees, 338
data type, 23
deadline, 444
deallocation of objects, 243–244
decision by an algorithm, 1058–1059
decision problem, 1051, 1054
and optimization problems, 1051
decision tree, 192–193
D ECREASE -K EY, 162, 505
decreasing a key
in Fibonacci heaps, 519–522
in 2-3-4 heaps, 529 pr
...

degeneracy, 874
degree
of a binomial-tree root, 527 pr
...

deletion
from binary search trees, 295–298, 299 ex
...

from heaps, 166 ex
...

from van Emde Boas trees, 554–556
DeMorgan’s laws
for propositional logic, 1083
for sets, 1160, 1162 ex
...

density
of prime numbers, 965–966
of a rod, 370 ex
...

of a circuit, 919
of a node in a rooted tree, 1177
of quicksort recursion tree, 178 ex
...

depth-determination problem, 583 pr
...

in finding strongly connected components,
615–621, 623
in topological sorting, 612–615
depth-first tree, 603
deque, 236 ex
...

deterministic algorithm, 123
multithreaded, 787
D ETERMINISTIC -S EARCH, 143 pr
...

diameter of a tree, 602 ex
...

differentiation of a series, 1147
digital signature, 960
digraph, see directed graph
D IJKSTRA, 658
Dijkstra’s algorithm, 658–664, 682
for all-pairs shortest paths, 684, 704
implemented with a Fibonacci heap, 662
implemented with a min-heap, 662
with integer edge weights, 664 ex
...

similarity to Prim’s algorithm, 634, 662
D IRECT-A DDRESS -D ELETE, 254
direct addressing, 254–255, 532–536
D IRECT-A DDRESS -I NSERT , 254
D IRECT-A DDRESS -S EARCH, 254
direct-address table, 254–255
directed acyclic graph (dag), 1172


and back edges, 613
and component graphs, 617
and hamiltonian paths, 1066 ex
...

for representing a multithreaded
computation, 777
single-source shortest-paths algorithm for,
655–658
topological sort of, 612–615, 623
directed graph, 1168
all-pairs shortest paths in, 684–707
constraint graph, 666
Euler tour of, 623 pr
...

PERT chart, 657, 657 ex
...

shortest path in, 643
single-source shortest paths in, 643–683
singly connected, 612 ex
...

transitive closure of, 697
transpose of, 592 ex
...

see also directed acyclic graph, graph,
network
directed segment, 1015–1017
directed version of an undirected graph, 1172
D IRECTION, 1018
dirty area, 208 pr
...

in connected components, 562–564
in depth determination, 583 pr
...

in off-line minimum, 582 pr
...

disjoint-set forest, 568–572
analysis of, 575–581, 581 ex
...

see also disjoint-set data structure
disjoint sets, 1161
disjunctive normal form, 1083
disk, 1028 ex
...

euclidean, 1039
Lm , 1044 ex
...
, 1044 ex
...

uniform, 1191
distributive laws for sets, 1160
divergent series, 1146
divide-and-conquer method, 30–35, 65
analysis of, 34–35
for binary search, 39 ex
...

for fast Fourier transform, 909–912
for finding the closest pair of points,
1040–1043
for finding the convex hull, 1030
for matrix inversion, 829–831
for matrix multiplication, 76–83, 792–797
for maximum-subarray problem, 68–75
for merge sort, 30–37, 797–805
for multiplication, 920 pr
...

division theorem, 928
divisor, 927–928
common, 929
see also greatest common divisor
DNA, 6–7, 390–391, 406 pr
...

double hashing, 272–274, 277 ex
...

duality, 879–886, 895 pr
...

dual linear program, 879
dummy key, 397
dynamic graph, 562 n
...

transitive closure of, 705 pr
...

for all-pairs shortest paths, 686–697
for bitonic euclidean traveling-salesman
problem, 405 pr
...


compared with greedy algorithms, 381,
390 ex
...

elements of, 378–390
for Floyd-Warshall algorithm, 693–697
for inventory planning, 411 pr
...

for longest simple path in a weighted
directed acyclic graph, 404 pr
...

reconstructing an optimal solution in, 387
relation to divide-and-conquer, 359
for rod-cutting, 360–370
for seam carving, 409 pr
...

top-down with memoization, 365
for transitive closure, 697–699
for Viterbi algorithm, 408 pr
...

dynamic set, 229–231
see also data structure
dynamic table, 463–471
analyzed by accounting method, 465–466
analyzed by aggregate analysis, 465
analyzed by potential method, 466–471
load factor of, 463
dynamic tree, 482
e, 55
E Œ  (expected value), 1197
early-first form, 444
early task, 444
edge, 1168
admissible, 749
antiparallel, 711–712
attributes of, 592
back, 609
bridge, 621 pr
...

classification in depth-first search, 609–610


continuation, 778
critical, 729
cross, 609
forward, 609
inadmissible, 749
light, 626
negative-weight, 645–646
residual, 716
return, 779
safe, 626
saturated, 739
spawn, 778
tree, 601, 603, 609
weight of, 591
edge connectivity, 731 ex
...

Edmonds-Karp algorithm, 727–730
elementary event, 1189
elementary insertion, 465
element of a set (2), 1158
ellipsoid algorithm, 850, 897
elliptic-curve factorization method, 984
elseif, in pseudocode, 20 n
...

-universal hash function, 269 ex
...
, 852
and inequality constraints, 853
tight, 865


violation of, 865
equation
and asymptotic notation, 49–50
normal, 837
recurrence, see recurrence
equivalence class, 1164
modulo n (Œan ), 928
equivalence, modular (Á), 54, 1165 ex
...

equivalent linear programs, 852
error, in pseudocode, 22
escape problem, 760 pr
...
, 983
euclidean distance, 1039
euclidean norm (k k), 1222
Euler’s constant, 943
Euler’s phi function, 943
Euler’s theorem, 954, 975 ex
...
, 1048
and hamiltonian cycles, 1048
evaluation of a polynomial, 41 pr
...

derivatives of, 922 pr
...

event, 1190
event point, 1023
event-point schedule, 1023
E XACT-S UBSET-S UM, 1129
excess flow, 736
exchange property, 437
exclusion and inclusion, 1163 ex
...

expansion of a dynamic table, 464–467
expectation, see expected value
expected running time, 28, 117
expected value, 1197–1199
of a binomial distribution, 1204
of a geometric distribution, 1202
of an indicator random variable, 118
explored vertex, 605
exponential function, 55–56
exponential height, 300
exponential search tree, 212, 483
exponential series, 1147
exponentiation instruction, 24
exponentiation, modular, 956
E XTENDED -B OTTOM -U P -C UT-ROD, 369


E XTENDED -E UCLID, 937
E XTEND -S HORTEST-PATHS, 688
extension of a set, 438
exterior of a polygon, 1020 ex
...

extracting the maximum key
from d -ary heaps, 167 pr
...

from Young tableaus, 167 pr
...

farthest-pair problem, 1030
FASTER -A LL -PAIRS -S HORTEST-PATHS, 691,
692 ex
...

multithreaded algorithm for, 804 ex
...

feasibility problem, 665, 894 pr
...

F IB -H EAP -D ECREASE -K EY, 519
F IB -H EAP -D ELETE, 522
F IB -H EAP -E XTRACT-M IN, 513
F IB -H EAP -I NSERT , 510

F IB -H EAP -L INK, 516
F IB -H EAP -P RUNE , 529 pr
...

compared with binary heaps, 506–507
creating, 510
decreasing a key in, 519–522
deletion from, 522, 526 pr
...

running times of operations on, 506 fig
...
, 523
computation of, 774–780, 981 pr
...

F IND -M AX -C ROSSING -S UBARRAY, 71
F IND -M AXIMUM -S UBARRAY, 72
find path, 569
F IND -S ET , 562
disjoint-set-forest implementation of, 571,
585
linked-list implementation of, 564
finished vertex, 603
finishing time, in depth-first search, 605
and strongly connected components, 618
finish time, in activity selection, 415
finite automaton, 995
for string matching, 996–1002
F INITE -AUTOMATON -M ATCHER, 999
finite group, 940
finite sequence, 1166
finite set, 1161
first-fit heuristic, 1134 pr
...

Floyd-Warshall algorithm, 693–697,
699–700 ex
...

F ORD -F ULKERSON, 724
Ford-Fulkerson method, 714–731, 765
F ORD -F ULKERSON -M ETHOD, 715
forest, 1172–1173
depth-first, 603
disjoint-set, 568–572
for, in pseudocode, 20–21
and loop invariants, 19 n
...

formula satisfiability, 1079–1081, 1105
forward edge, 609
forward substitution, 816–817
Fourier transform, see discrete Fourier
transform, fast Fourier transform
fractional knapsack problem, 426, 428 ex
...

freeing of objects, 243–244
free list, 243
F REE -O BJECT , 244
free tree, 1172–1176
frequency domain, 898
full binary tree, 1178, 1180ex.
functional iteration, 58
fundamental theorem of linear programming,
892
furthest-in-future strategy, 449pr.
Gabow's scaling algorithm for single-source shortest paths, 679pr.
gap character, 1002ex.
, 766
garbage collection, 151, 243
gate, 1070
Gaussian elimination, 819, 842
gcd, see greatest common divisor
general number-field sieve, 984
generating function, 108pr.
, 1227pr.
gossiping, 478
GRAFT, 583pr.
graph
complement of, 1090
component, 617
constraint, 666–668
dense, 589
depth-first search of, 603–612, 623
dynamic, 562n.
hamiltonian, 1061
incidence matrix of, 448pr.
interval, 422ex.
sparse, 589
static, 562n.
gray vertex, 594, 603
greatest common divisor (gcd), 929–930, 933ex.
Euclid's algorithm for, 933–939, 981pr.
recursion theorem for, 934
greedoid, 450
GREEDY, 440
GREEDY-ACTIVITY-SELECTOR, 421
greedy algorithm, 414–450
for activity selection, 415–422
for coin changing, 446pr.
compared with dynamic programming, 418, 423–427
Dijkstra’s algorithm, 658–664
elements of, 423–428
for fractional knapsack problem, 426
greedy-choice property in, 424–425
for Huffman code, 428–437
Kruskal’s algorithm, 631–633
and matroids, 437–443
for minimum spanning tree, 631–638
for multithreaded scheduling, 781–783
for off-line caching, 449pr.
on a weighted matroid, 439–442
for weighted set cover, 1135pr.
group, 939–946
cyclic, 955
operator (⊕), 939
guessing the solution, in the substitution
method, 84–85
half 3-CNF satisfiability, 1101ex.
halting problem, 1048
halving lemma, 908
HAM-CYCLE, 1062
hamiltonian cycle, 1049, 1061, 1091–1096,
1105
hamiltonian graph, 1061
hamiltonian path, 1066ex.
HAM-PATH, 1066ex.
harmonic number, 1147, 1153–1154
harmonic series, 1147, 1153–1154
HASH-DELETE, 277ex.
hash function, 256, 262–269
ε-universal, 269ex.
hashing, 253–285
double, 272–274, 277ex.
in memoization, 365, 387
with open addressing, 269–277
perfect, 277–282, 285
to replace adjacency lists, 593ex.
HASH-SEARCH, 271, 277ex.
hash table, 256–261
secondary, 278
see also hashing
hash value, 256
hat-check problem, 122ex.
heap, 151–169
binomial, 527pr.
compared with Fibonacci heaps, 506–507
d-ary, 167pr.
deletion from, 166ex.
and treaps, 333pr.
HEAP-DECREASE-KEY, 165ex.
HEAP-EXTRACT-MAX, 163
HEAP-EXTRACT-MIN, 165ex.
heap property, 152
maintenance of, 154–156
vs. binary-search-tree property, 289ex.
heapsort, 151–169
HEAPSORT, 160
heel, 602ex.
height
black-, 309
of a B-tree, 489–490
of a d-ary heap, 167pr.
of a node in a tree, 1177
of a red-black tree, 309
of a tree, 1177
height-balanced tree, 333pr.
high endpoint of an interval, 348
high function, 537, 546
HIRE-ASSISTANT, 115
hiring problem, 114–115, 123–124, 145
on-line, 139–141
probabilistic analysis of, 120–121
hit
cache, 449pr.
HOPCROFT-KARP, 764pr.
horizontal ray, 1021ex.
Horner's rule, 41pr., 900
in the Rabin-Karp algorithm, 990
HUFFMAN, 431
Huffman code, 428–437, 450
hull, convex, 8, 1029–1039, 1046pr.
ideal parallel computer, 779
idempotency laws for sets, 1159
identity, 939
identity matrix, 1218
if, in pseudocode, 20
image, 1167
image compression, 409pr.
in-degree, 1169
incidence matrix, 448pr., 593ex.
inclusion and exclusion, 1163ex.
independence
of random variables, 1197
of subproblems in dynamic programming, 383–384
independent family of subsets, 437
independent set, 1101pr.
indicator random variable, 118–120
in analysis of streaks, 138–139
in analysis of the birthday paradox, 132–133
in approximation algorithm for
MAX-3-CNF satisfiability, 1124
in bounding the right tail of the binomial
distribution, 1212–1213
in bucket sort analysis, 202–204
expected value of, 118
in hashing analysis, 259–260
in hiring-problem analysis, 120–121
and linearity of expectation, 119
in quicksort analysis, 182–184, 187pr.
in universal-hashing analysis, 265–266
induced subgraph, 1171
inequality constraint, 852
and equality constraints, 853
inequality, linear, 846
infeasible linear program, 851
infeasible solution, 851
infinite sequence, 1166
infinite set, 1161
infinite sum, 1145
infinity, arithmetic with, 650
INITIALIZE-PREFLOW, 740
INITIALIZE-SIMPLEX, 871, 887
INITIALIZE-SINGLE-SOURCE, 648
initial strand, 779
injective function, 1167
inner product, 1222
inorder tree walk, 287, 293ex.
input
to an algorithm, 5
to a combinational circuit, 1071
distribution of, 116, 122
to a logic gate, 1070
size of, 25
input alphabet, 995
INSERT, 162, 230, 463ex.
insertion
into direct-address tables, 254
into dynamic tables, 464–467
elementary, 465
into Fibonacci heaps, 510–511
into heaps, 164
into interval trees, 349
into linked lists, 237–238
into open-address hash tables, 270
into order-statistic trees, 343
into proto van Emde Boas structures, 544
into queues, 234
into red-black trees, 315–323
into stacks, 232
into sweep-line statuses, 1024
into treaps, 333pr.
into van Emde Boas trees, 552–554
into Young tableaus, 167pr.
insertion sort, 12, 16–20
compared with quicksort, 178ex.
in merge sort, 39pr.
using binary search, 39ex.
instance
of an abstract problem, 1051, 1054
of a problem, 5
instructions of the RAM model, 23
integer data type, 23
integer linear programming, 850, 895pr.
integers (Z), 1158
integer-valued flow, 733
integrality theorem, 734
integral, to approximate summations,
1154–1156
integration of a series, 1147
interior of a polygon, 1020ex.
interpolation by a cubic spline, 840pr.
interpolation of a polynomial
at complex roots of unity, 912–913
intersection
of chords, 345ex.
INTERVAL-DELETE, 349
interval graph, 422ex.
interval tree, 348–354
interval trichotomy, 348
intractability, 1048
invalid shift, 985
inventory planning, 411pr.
inverse
multiplicative, modulo n, 949
inversion
in a self-organizing list, 476pr.
in a sequence, 41pr., 122ex.
inverter, 1070
invertible matrix, 1223
isolated vertex, 1169
isomorphic graphs, 1171
iterated function, 63pr.
joining
of 2-3-4 trees, 503pr.
Karmarkar’s algorithm, 897
Karp's minimum mean-weight cycle algorithm, 680pr.
, 1180pr.
key, 16, 147, 162, 229
dummy, 397
interpreted as a natural number, 263
median, of a B-tree node, 493
public, 959, 962
secret, 959, 962
static, 277
keywords, in pseudocode, 20–22
multithreaded, 774, 776–777, 785–786
“killer adversary” for quicksort, 190
Kirchhoff's current law, 708
Kleene star (*), 1058
KMP algorithm, 1002–1013
KMP-MATCHER, 1005
knapsack problem
fractional, 426, 428ex.
0-1, 425, 427ex., 1137pr.
Knuth-Morris-Pratt algorithm, 1002–1013
k-permutation, 126, 1184
Kraft inequality, 1180ex.
k-sorted, 207pr.
k-universal hashing, 284pr.
language, 1057–1058
proving NP-completeness of, 1078–1079
verification of, 1063
last-in, first-out, 232
see also stack
late task, 444
layers
convex, 1044pr.
LCA, 584pr.
LCS, 7, 390–397, 413
LCS-LENGTH, 394
leading submatrix, 833, 839ex.
least common multiple, 939ex.
LEFT-ROTATE, 313, 353ex.
left subtree, 1178
Legendre symbol (a/p), 982pr.
length
of a path, 1170
of a sequence, 1166
of a spine, 333pr.
lexicographic sorting, 304pr.
linear function, 26, 845
linear independence, 1223
linear inequality, 846
linear-inequality feasibility problem, 894pr.
linear probing, 272
linear programming, 7, 843–897
algorithms for, 850
applications of, 849
duality in, 879–886
ellipsoid algorithm for, 850, 897
finding an initial solution in, 886–891
fundamental theorem of, 892
interior-point methods for, 850, 897
Karmarkar’s algorithm for, 897
and maximum flow, 860–861
and minimum-cost circulation, 896 pr
...

and multicommodity flow, 862–863
simplex algorithm for, 864–879, 896
and single-pair shortest path, 859–860
and single-source shortest paths, 664–670, 863ex.
linear speedup, 780
line segment, 1015
comparable, 1022
determining turn of, 1017
determining whether any intersect,
1021–1029, 1047
determining whether two intersect,
1017–1019
link
of binomial trees, 527pr.
linked list, 236–241, 250pr.
self-organizing, 476pr.
ln (natural logarithm), 56
load factor
of a dynamic table, 463
of a hash table, 258
load instruction, 23
local variable, 21
logarithm function (log), 56–57
discrete, 955
iterated (lg*), 58–59
logical parallelism, 777
logic gate, 1070
longest common subsequence, 7, 390–397, 413
longest palindrome subsequence, 405pr.
LONGEST-PATH-LENGTH, 1060ex.
longest simple path, 1048
in an unweighted graph, 382
in a weighted directed acyclic graph, 404pr.
loop invariant, 18–19
for the generic minimum-spanning-tree method, 625
for the generic push-relabel algorithm, 743
for Graham’s scan, 1034
for heapsort, 160ex.
for increasing a key in a heap, 166ex.
for red-black tree insertion, 318
for the relabel-to-front algorithm, 755
for searching an interval tree, 352
for the simplex algorithm, 872
for string-matching automata, 998, 1000
and termination, 19
low endpoint of an interval, 348
lower bounds
on approximations, 1140
asymptotic, 48
for average sorting, 207pr.
for convex hull, 1038ex.
for minimum-weight vertex cover,
1124–1126
for multithreaded computations, 780
and potential functions, 478
for priority-queue operations, 531
and recurrences, 67
for simultaneous minimum and maximum, 215ex.
for sorting, 191–194, 205pr.
on summations, 1152, 1154
lower median, 213
lower square root, 546
lower-triangular matrix, 1219, 1222ex.
low function, 537, 546
LU decomposition, 806pr.
LUP decomposition, 815
computation of, 822–825
of a diagonal matrix, 827ex.
of a permutation matrix, 827ex.
MAKE-TREE, 583pr.
Manhattan distance, 225pr., 1044ex.
master method for solving a recurrence, 93–97
master theorem, 94
proof of, 97–106
matched vertex, 732
matching
bipartite, 732, 763pr.
maximum, 1135pr.
perfect, 735ex.
matrix, 1217–1229
determinant of, 1224–1225
diagonal, 1218
Hermitian, 832ex.
incidence, 448pr., 593ex.
inverse of, 827–831, 842
lower-triangular, 1219, 1222ex.
multiplication of, see matrix multiplication
negative of, 1220
permutation, 1220, 1222ex.
pseudoinverse of, 837
scalar multiple of, 1220
subtraction of, 1221
symmetric, 1220
symmetric positive-definite, 832–835, 842
Toeplitz, 921pr.
transpose of, 1217
transpose of, multithreaded, 792ex.
Vandermonde, 902, 1226pr.
matrix multiplication
and computing the determinant, 832ex.
and matrix inversion, 828–831, 842
multithreaded algorithm for, 792–797, 806pr.
Strassen’s algorithm for, 79–83, 111–112
MATRIX-MULTIPLY, 371
matrix-vector multiplication, multithreaded, 785–787, 792ex.
matroid, 437–443, 450, 642
for task scheduling, 443–446
MAT-VEC, 785
MAT-VEC-MAIN-LOOP, 786
MAT-VEC-WRONG, 790
MAX-CNF satisfiability, 1127ex.
MAX-FLOW-BY-SCALING, 763pr.
max-heap, 152
deletion from, 166ex.
, 481n.
MAX-HEAPIFY, 154
MAX-HEAP-INSERT, 164
building a heap with, 166pr.
maximal matching, 1110, 1135pr.
maximal subset, in a matroid, 438
maximization linear program, 846
and minimization linear programs, 852
maximum, 213
in binary search trees, 291
of a binomial distribution, 1207ex.
in proto van Emde Boas structures, 544ex.
maximum bipartite matching, 732–736, 766
Hopcroft-Karp algorithm for, 763pr.
maximum flow, 708–766
push-relabel algorithms for, 736–760, 765
relabel-to-front algorithm for, 748–760
scaling algorithm for, 762pr.
maximum matching, 1135pr.
maximum-subarray problem, 68–75, 111
max-priority queue, 162
MAX-3-CNF satisfiability, 1123–1124, 1139
MAYBE-MST-A, 641pr.
MAYBE-MST-B, 641pr.
MAYBE-MST-C, 641pr.
median, 213–227
multithreaded algorithm for, 805ex.
of two sorted lists, 804ex.
weighted, 225pr.
median key, of a B-tree node, 493
median-of-3 method, 188pr.
mergeable heap
linked-list implementation of, 250pr.
2-3-4 heaps, 529pr.
mergeable max-heap, 481n.
mergeable min-heap, 250n., 505
MERGE-LISTS, 1129
merge sort, 12, 30–37
compared with insertion sort, 14ex.
MERGE-SORT, 34
MERGE-SORT′, 797
merging
of k sorted lists, 166ex.
multithreaded algorithm for, 798–801
of two sorted arrays, 30
MILLER-RABIN, 970
Miller-Rabin primality test, 968–975, 983
MIN-GAP, 354ex.
min-heap, 153
building, 156–159
d-ary, 706pr.
, 481n.
in Prim's algorithm, 636
MIN-HEAPIFY, 156ex.
min-heap ordering, 507
min-heap property, 153, 507
maintenance of, 156ex.
vs. binary-search-tree property, 289ex.
minimization linear program, 846
and maximization linear programs, 852
minimum, 213
in binary search trees, 291
in a bit vector with a superimposed binary
tree, 533
in a bit vector with a superimposed tree of
constant height, 535
in B-trees, 497ex.
in order-statistic trees, 347ex.
in van Emde Boas trees, 550
M INIMUM, 162, 214, 230, 505
minimum-cost circulation, 896pr.
minimum-cost spanning tree, see minimum
spanning tree
minimum cut, 721, 731ex.
minimum node, of a Fibonacci heap, 508
minimum path cover, 761pr.
minimum spanning tree, 624–642
generic method for, 625–630
Kruskal’s algorithm for, 631–633
Prim’s algorithm for, 634–636
relation to matroids, 437, 439–440
second-best, 638pr.
missing child, 1178
mod, 54, 928
modifying operation, 230
modular arithmetic, 54, 923pr.
modular exponentiation, 956
MODULAR-EXPONENTIATION, 957
modular linear equations, 946–950
MODULAR-LINEAR-EQUATION-SOLVER, 949
modulo, 54, 928
Monge array, 110pr.
move-to-front heuristic, 476pr.
MST-KRUSKAL, 631
MST-PRIM, 634
MST-REDUCE, 639pr.
much-greater-than (≫), 574
much-less-than (≪), 783
multicommodity flow, 862–863
minimum-cost, 864ex.
multigraph, 1172
converting to equivalent undirected graph, 593ex.
multiple
scalar, 1220
multiple assignment, 21
multiple sources and sinks, 712
multiplication
of complex numbers, 83ex.
of matrices, see matrix multiplication
of a matrix chain, 370–378
matrix-vector, multithreaded, 785–787, 789–790, 792ex.
multiset, 1158n.
multithreaded algorithm, 772–812
Floyd-Warshall algorithm, 797ex.
for LUP decomposition, 806pr.
for matrix multiplication, 792–797, 806pr.
, 797ex.
for median, 805ex.
for partitioning, 804ex.
for quicksort, 811pr.
for a simple stencil calculation, 809pr.
Strassen’s algorithm, 795–796
multithreaded composition, 784fig.
natural numbers (N), 1158
keys interpreted as, 263
negative of a matrix, 1220
negative-weight cycle
and difference constraints, 667
and relaxation, 677ex.
, 700ex.
neighbor list, 750
nested parallelism, 776, 805pr.
net flow across a cut, 720
network
admissible, 749–750
flow, see flow network
residual, 715–719
for sorting, 811
NEXT-TO-TOP, 1031
NIL, 21
node, 1176
see also vertex
nonbasic variable, 855
nondeterministic multithreaded algorithm, 787
nondeterministic polynomial time, 1064n.
noninvertible matrix, 1223
nonnegativity constraint, 851, 853
nonoverlappable string pattern, 1002ex.
nontrivial square root of 1, modulo n, 956
no-path property, 650, 672
normal equation, 837
norm of a vector, 1222
NOT function (¬), 1071
not a set member (∉), 1158
not equivalent (≢), 54
NOT gate, 1070
NP (complexity class), 1049, 1064, 1066ex.
NP-completeness, 1048–1105
of the formula-satisfiability problem, 1079–1081, 1105
of the graph-coloring problem, 1103pr.
of the hamiltonian-cycle problem,
1091–1096, 1105
of the hamiltonian-path problem, 1101ex.
of integer linear programming, 1101ex.
proving, of a language, 1078–1079
of scheduling with profits and deadlines, 1104pr.
of the set-partition problem, 1101ex.
of the subset-sum problem, 1097–1100
of the 3-CNF-satisfiability problem,
1082–1085, 1105
of the traveling-salesman problem,
1096–1097
of the vertex-cover problem, 1089–1091,
1105
of 0-1 integer programming, 1100ex.
NP-hard, 1069
O-notation, 47–48, 64
O′-notation, 62pr.
Õ-notation, 62pr.
object, 21
allocation and freeing of, 243–244
array implementation of, 241–246
passing as parameter, 21
objective function, 664, 847, 851
objective value, 847, 851
oblivious compare-exchange algorithm, 208pr.
off-line problem
caching, 449pr.
least common ancestors, 584pr.
minimum, 582pr.
Ω-notation, 48–49, 64
1-approximation algorithm, 1107
one-pass method, 585
one-to-one correspondence, 1167
one-to-one function, 1167
on-line convex-hull problem, 1039ex.
open addressing, 269–277
with linear probing, 272
with quadratic probing, 272, 283pr.
order-statistic tree, 339–345
querying, 347ex.
OS-RANK, 342
OS-SELECT, 341
out-degree, 1169
outer product, 1222
output
of an algorithm, 5
of a combinational circuit, 1071
of a logic gate, 1070
overdetermined system of linear equations, 814
overflow
of a queue, 235
of a stack, 233
overflowing vertex, 736
discharge of, 751
overlapping intervals, 348
finding all, 354ex.
overlapping rectangles, 354ex.
P (complexity class), 1049, 1055, 1105
package wrapping, 1037, 1047
page on a disk, 486, 499ex.
pair, ordered, 1161
pairwise disjoint sets, 1161
pairwise independence, 1193
pairwise relatively prime, 931
palindrome, 405pr.
parallel algorithm, 10, 772
see also multithreaded algorithm
parallel computer, 772
ideal, 779
parallel for, in pseudocode, 785–786
parallelism
logical, 777

of a multithreaded computation, 780
nested, 776
of a randomized multithreaded algorithm, 811pr.
parallel-machine-scheduling problem, 1136pr.
parallel random-access machine, 811
parallel slackness, 781
rule of thumb, 783
parallel, strands being logically in, 778
parameter, 21
costs of passing, 107pr.
partition function, 361n.
partitioning, 171–173
Hoare's method for, 185pr.
randomized, 179
partition of a set, 1161, 1164
Pascal's triangle, 1188ex.
path, 1170
critical, 657
find, 569
hamiltonian, 1066ex.
path length, of a tree, 304pr.
path-relaxation property, 650, 673
pattern, in string matching, 985
nonoverlappable, 1002ex.
permutation, 1167
bit-reversal, 472pr.
k-permutation, 126, 1184
linear, 1229pr.
permutation matrix, 1220, 1222ex., 1226ex.
PERMUTE-BY-CYCLIC, 129ex.
PERMUTE-BY-SORTING, 125
PERMUTE-WITHOUT-IDENTITY, 128ex.
persistent dynamic set, 331pr., 482
PERSISTENT-TREE-INSERT, 331pr.
P-FIB, 776
phase, of the relabel-to-front algorithm, 758
phi function (φ), 943
pivot
in linear programming, 867, 869–870, 878ex.
Pollard's rho heuristic, 976–980, 980ex.
polygon, 1020ex.
kernel of, 1038ex.
polylogarithmically bounded, 57
polynomial, 55, 898
addition of, 898
asymptotic behavior of, 61pr.
evaluation of, 41pr.
, 923pr.
multiplication of, 899, 903–905, 920pr.
polynomial-time computability, 1056
polynomial-time decision, 1059
polynomial-time reducibility (≤P), 1067, 1077ex.
positional tree, 1178
positive-definite matrix, 1225
post-office location problem, 225pr.
potential method, 459–463
for dynamic tables, 466–471
for Fibonacci heaps, 509–512, 517–518,
520–522
for the generic push-relabel algorithm, 746
for min-heaps, 462ex.
for self-organizing lists with move-to-front, 476pr.
power
nontrivial, 933ex.
power set, 1161
Pr { } (probability distribution), 1190
PRAM, 811
predecessor
in binary search trees, 291–292
in a bit vector with a superimposed binary
tree, 534
in a bit vector with a superimposed tree of
constant height, 535
in breadth-first trees, 594
in B-trees, 497ex.
in proto van Emde Boas structures, 544ex.
prefix
of a sequence, 392
of a string (⊏), 986
prefix code, 429
prefix computation, 807pr.
preorder, total, 1165
preorder tree walk, 287
presorting, 1043
Prim’s algorithm, 634–636, 642
with an adjacency matrix, 637ex.
similarity to Dijkstra’s algorithm, 634, 662
for sparse graphs, 638pr.
PRINT-ALL-PAIRS-SHORTEST-PATH, 685
PRINT-CUT-ROD-SOLUTION, 369
PRINT-INTERSECTING-SEGMENTS, 1028ex.
priority queue, 162–166
in constructing Huffman codes, 431
in Dijkstra’s algorithm, 661
heap implementation of, 162–166
lower bounds for, 531
max-priority queue, 162
min-priority queue, 162, 165ex.
probabilistic analysis, 114–145
of balls and bins, 133–134
of birthday paradox, 130–133
of bucket sort, 201–204, 204ex.
, 282ex.
of file comparison, 995ex.
of hashing with chaining, 258–260
of height of a randomly built binary search
tree, 299–303
of hiring problem, 120–121, 139–141
of insertion into a binary search tree with equal keys, 303pr.
of lower bound for sorting, 205pr.
of on-line hiring problem, 139–141
of open-address hashing, 274–276, 277ex.
, 185ex.
of perfect hashing, 279–282
of Pollard’s rho heuristic, 977–980
of probabilistic counting, 143pr.
of quicksort, 181–184, 303ex.
of searching a compact list, 250pr.
of sorting points by distance from origin, 204ex.
probability, 1189–1196
probability density function, 1196
probability distribution, 1190
probability distribution function, 204ex.
probing, 270
see also linear probing, quadratic probing, double hashing
problem
abstract, 1054
computational, 5–6
concrete, 1055
decision, 1051, 1054
intractable, 1048
optimization, 359, 1050, 1054
solution to, 6, 1054–1055
tractable, 1048
procedure, 6, 16–17
product (Π)
outer, 1222
of polynomials, 899
rule of, 1184
scalar flow, 714ex.
program counter, 1073
programming, see dynamic programming,
linear programming
proper ancestor, 1176
proper descendant, 1176
proper subgroup, 944
proper subset (⊂), 1159
proto van Emde Boas structure, 538–545
cluster in, 538
compared with van Emde Boas trees, 547
deletion from, 544
insertion into, 544
maximum in, 544ex.
successor in, 543–544
summary in, 540
PROTO-VEB-INSERT, 544
PROTO-VEB-MEMBER, 541
PROTO-VEB-MINIMUM, 542
proto-vEB structure, see proto van Emde Boas structure
PROTO-VEB-SUCCESSOR, 543
prune-and-search method, 1030
pruning a Fibonacci heap, 529pr.
P-SCAN-1, 808pr.
P-SCAN-2, 808pr.
P-SCAN-3, 808pr.
P-SCAN-DOWN, 809pr.
P-SCAN-UP, 809pr.
pseudocode, 16, 20–22
pseudoinverse, 837
pseudoprime, 966–968
PSEUDOPRIME, 967
pseudorandom-number generator, 117
P-SQUARE-MATRIX-MULTIPLY, 793
P-TRANSPOSE, 792ex.
public-key cryptosystem, 958–965
push operation (in push-relabel algorithms),
738–739
nonsaturating, 739, 745
saturating, 739, 745
push-relabel algorithm, 736–760, 765
basic operations in, 738–740
by discharging an overflowing vertex of maximum height, 760ex.
gap heuristic for, 760ex.
relabel-to-front algorithm, 748–760
quadratic function, 27
quadratic probing, 272, 283pr.
quadratic residue, 982pr.
quantile, 223ex.
queue, 234–235
linked-list implementation of, 240ex.
quicksort, 170–190
analysis of, 174–185
average-case analysis of, 181–184
compared with insertion sort, 178ex.
good worst-case implementation of, 223ex.
multithreaded algorithm for, 811pr.
stack depth of, 188pr.
use of insertion sort in, 185ex.
QUICKSORT, 171
quotient, 928
R (set of real numbers), 1158
Rabin-Karp algorithm, 990–995, 1013
RABIN-KARP-MATCHER, 993
race, 787–790
RACE-EXAMPLE, 788
radix sort, 197–200
compared with quicksort, 199
RADIX-SORT, 198
radix tree, 304pr.
randomized algorithm, 115–117, 122–124
for fuzzy sorting of intervals, 189pr.
for MAX-3-CNF satisfiability, 1123–1124,
1139
Miller-Rabin primality test, 968–975, 983
multithreaded, 811pr.
, 187–188pr., 984
and probabilistic analysis, 123–124
quicksort, 179–180, 185ex.
for searching a compact list, 250pr.
RANDOMIZED-HIRE-ASSISTANT, 124
RANDOMIZED-PARTITION, 179
RANDOMIZED-QUICKSORT, 179, 303ex.
randomized rounding, 1139
RANDOMIZED-SELECT, 216
RANDOMIZE-IN-PLACE, 126
randomly built binary search tree, 299–303, 304pr.
random sampling, 129ex.
random variable, 1196–1201
indicator, see indicator random variable
range, 1167
rank
of a matrix, 1228pr.
of a node in a disjoint-set forest, 569, 575, 581ex.
row, 1223
rate of growth, 28
ray, 1021ex.
RB-DELETE, 324
RB-DELETE-FIXUP, 326
RB-INSERT, 315
RB-INSERT-FIXUP, 316
RB-JOIN, 332pr.
recurrence, 34, 65–67, 83–113
solution by Akra-Bazzi method, 112–113
solution by master method, 93–97
solution by recursion-tree method, 88–93
solution by substitution method, 83–88
recurrence equation, see recurrence
recursion, 30
recursion tree, 37, 88–93
in proof of master theorem, 98–100
and the substitution method, 91–92
RECURSIVE-ACTIVITY-SELECTOR, 419
recursive case, 65
RECURSIVE-FFT, 911
RECURSIVE-MATRIX-CHAIN, 385
red-black tree, 308–338
augmentation of, 346–347
compared with B-trees, 484, 490
deletion from, 323–330
in determining whether any line segments
intersect, 1024
for enumerating keys in a range, 348ex.
maximum key of, 311
minimum key of, 311
predecessor in, 311
properties of, 308–312
relaxed, 311ex.
rotation in, 312–314
searching in, 311
successor in, 311
see also interval tree, order-statistic tree
REDUCE, 807pr.
reducibility, 1067–1068
reduction algorithm, 1052, 1067
reduction function, 1067
reduction, of an array, 807pr.
release time, 447pr.
REPETITION-MATCHER, 1013pr.
residual capacity, 716, 719
residual edge, 716
residual network, 715–719
residue, 54, 928, 982pr.
rho heuristic, 976–980, 980ex.
ρ(n)-approximation algorithm, 1106, 1123
RIGHT, 152
right child, 1178
right-conversion, 314ex.
RIGHT-ROTATE, 313
right rotation, 312
right spine, 333pr.
root
of a tree, 1176
of unity, 906–907
of Zn , 955
rooted tree, 1176
representation of, 246–249
root list, of a Fibonacci heap, 509
rotation
cyclic, 1012ex.
rule of product, 1184
rule of sum, 1183
running time, 25
average-case, 28, 116
best-case, 29ex.
RSA public-key cryptosystem, 958–965
safe edge, 626
SAME-COMPONENT, 563
sample space, 1189
sampling, 129ex.
, 1139
satisfiable formula, 1049, 1079
satisfying assignment, 1072, 1079
saturated edge, 739
saturating push, 739, 745
scalar flow product, 714ex.
scaling
in maximum flow, 762pr., 765
in single-source shortest paths, 679pr.
SCAN, 807pr.
schedule
event-point, 1023
scheduler, for multithreaded computations,
777, 781–783, 812
centralized, 782
greedy, 782
work-stealing algorithm for, 812
scheduling, 443–446, 447pr., 1136pr.
seam carving, 409pr.
searching
binary search, 39ex.
in direct-address tables, 254
for an exact interval, 354ex.
in linked lists, 237
in open-address hash tables, 270–271
in proto van Emde Boas structures, 540–541
in red-black trees, 311
in an unsorted array, 143pr.
second-best minimum spanning tree, 638pr.
selection, 213–227
in order-statistic trees, 340–341
in worst-case linear time, 220–224
selection sort, 29ex.
self-organizing list, 476pr., 478
semiconnected graph, 621ex.
sequence, 1166
finite, 1166
infinite, 1166
inversion in, 41pr., 345ex.
series, 1146–1148
strands being logically in, 778
set ({ }), 1158–1163
cardinality (| |), 1161
convex, 714ex.
intersection (∩), 1159
member (∈), 1158
not a member (∉), 1158
union (∪), 1159
set-covering problem, 1117–1122, 1139
weighted, 1135pr.
set-partition problem, 1101ex.
shadow of a point, 1038ex.
shortest paths, 643–707
and breadth-first search, 597–600, 644
convergence property of, 650, 672–673
and difference constraints, 664–670
Dijkstra’s algorithm for, 658–664
in a directed acyclic graph, 655–658
in ε-dense graphs, 706pr.
, 706
Gabow's scaling algorithm for, 679pr.
and matrix multiplication, 686–693, 700ex.
signature, 960
simple cycle, 1170
simple graph, 1170
simple path, 1170
longest, 382, 1048
simple polygon, 1020ex.
simple uniform hashing, 259
simplex, 848
SIMPLEX, 871
simplex algorithm, 848, 864–879, 896–897
single-destination shortest paths, 644
single-pair shortest path, 381, 644
as a linear program, 859–860
single-source shortest paths, 643–683
Bellman-Ford algorithm for, 651–655
with bitonic paths, 682pr.
Gabow's scaling algorithm for, 679pr.
and longest paths, 1048
singleton, 1161
singly connected graph, 612ex.
sink vertex, 709, 712
size
of an algorithm’s input, 25, 926–927,
1055–1057
of a binomial tree, 527pr.
parallel, 781
slack variable, 855
slot
of a direct-access table, 254
of a hash table, 256
SLOW-ALL-PAIRS-SHORTEST-PATHS, 689
smoothed analysis, 897
*Socrates, 790
solution
to an abstract problem, 1054
basic, 866
to a computational problem, 6
to a concrete problem, 1055
feasible, 665, 846, 851
infeasible, 851
optimal, 851
to a system of linear equations, 814
sorted linked list, 236
see also linked list
sorting, 5, 16–20, 30–37, 147–212, 797–805
bubblesort, 40pr.
comparison sort, 191
counting sort, 194–197
fuzzy, 189pr.
lexicographic, 304pr.
lower bounds for, 191–194, 211, 531
merge sort, 12, 30–37, 797–805
by oblivious compare-exchange algorithms, 208pr.
of points by polar angle, 1020ex.
quicksort, 170–190
radix sort, 197–200
selection sort, 29ex.
with variable-length items, 206pr.
sorting network, 811
source vertex, 594, 644, 709, 712
span law, 780
spanning tree, 439, 624
bottleneck, 640pr.
verification of, 642
see also minimum spanning tree
span, of a multithreaded computation, 779
sparse graph, 589
all-pairs shortest paths for, 700–705
and Prim's algorithm, 638pr.
spawn, in pseudocode, 776–777
spawn edge, 778
speedup, 780
of a randomized multithreaded algorithm, 811pr.
spine
of a treap, 333pr.
splitting
of B-tree nodes, 493–495
of 2-3-4 trees, 503pr.
square root, modulo a prime, 982pr.
stack, 232–233
in Graham’s scan, 1030
implemented by queues, 236ex.
operations analyzed by accounting method,
457–458
operations analyzed by aggregate analysis,
452–454
operations analyzed by potential method,
460–461
for procedure execution, 188pr.
STACK-EMPTY, 233
standard deviation, 1200
standard encoding (⟨ ⟩), 1057
standard form, 846, 850–854
star-shaped polygon, 1038ex.
static set of keys, 277
static threading, 773
stencil, 809pr.
Stirling’s approximation, 57
storage management, 151, 243–244, 245ex.
store instruction, 23
straddle, 1017
strand, 777
final, 779
independent, 789
initial, 779
logically in parallel, 778
logically in series, 778
Strassen’s algorithm, 79–83, 111–112
multithreaded, 795–796
streaks, 135–139
strictly decreasing, 53
strictly increasing, 53
string, 985, 1184
string matching, 985–1013
based on repetition factors, 1012pr.
, 1002ex.
strongly connected component, 1170
decomposition into, 615–621, 623
STRONGLY-CONNECTED-COMPONENTS, 617
strongly connected graph, 1170
subgraph, 1171
predecessor, see predecessor subgraph
subgraph-isomorphism problem, 1100ex.
subroutine
executing, 25n.
substitution method, 83–88
and recursion trees, 91–92
substring, 1184
subtract instruction, 23
subtraction of matrices, 1221
subtree, 1176
maintaining sizes of, in order-statistic trees,
343–344

success, in a Bernoulli trial, 1201
successor
in binary search trees, 291–292
in a bit vector with a superimposed binary
tree, 533
in a bit vector with a superimposed tree of
constant height, 535
finding ith, of a node in an order-statistic tree, 344ex.
in proto van Emde Boas structures, 543–544
in red-black trees, 311
in van Emde Boas trees, 550–551
SUCCESSOR, 230
such that (:), 1159
suffix (⊐), 986
suffix function, 996
suffix-function inequality, 999
suffix-function recursion lemma, 1000
sum (Σ)
infinite, 1145
of matrices, 1220
of polynomials, 898
rule of, 1183
telescoping, 1148
SUM-ARRAYS, 805pr.
summary
in a bit vector with a superimposed tree of
constant height, 534
in proto van Emde Boas structures, 540
in van Emde Boas trees, 546
summation, 1145–1157
in asymptotic notation, 49–50, 1146
bounding, 1149–1156
formulas and properties of, 1145–1149
linearity of, 1146
summation lemma, 908
supercomputer, 772
superpolynomial time, 1048
supersink, 712
supersource, 712
surjection, 1167
SVD, 842
sweeping, 1021–1029, 1045pr.
symmetric matrix, 1220, 1222ex.
symmetric positive-definite matrix, 832–835,
842
symmetric relation, 1163
symmetry of Θ-notation, 52
sync, in pseudocode, 776–777
system of difference constraints, 664–670
system of linear equations, 806pr.
TABLE-DELETE, 468
TABLE-INSERT, 464
tail
of a binomial distribution, 1208–1215
of a linked list, 236
of a queue, 234
tail recursion, 188pr.
target, 1097
Tarjan's off-line least-common-ancestors algorithm, 584pr.
task scheduling, 443–446, 450
tautology, 1066ex.
Taylor series, 306pr.
text, in string matching, 985
Θ-notation, 44–47, 64
thread, 773
Threading Building Blocks, 774
3-CNF, 1082
3-CNF-SAT, 1082
3-CNF satisfiability, 1082–1085, 1105
approximation algorithm for, 1123–1124,
1139
and 2-CNF satisfiability, 1049
3-COLOR, 1103pr.
time domain, 898
timestamp, 603
Toeplitz matrix, 921pr.
topological sort, 612–615
TOPOLOGICAL-SORT, 613
total order, 1165
total preorder, 1165
total relation, 1165
tour
bitonic, 405pr.
Euler, 623pr., 1048
of a graph, 1096
track, 486
tractability, 1048
trailing pointer, 295
transition function, 995, 1001–1002, 1012ex.
transitive closure, 697–699
of dynamic graphs, 705pr.
transpose
of a directed graph, 592ex.
transpose symmetry of asymptotic notation, 52
traveling-salesman problem
approximation algorithm for, 1111–1117,
1139
bitonic euclidean, 405pr.
NP-completeness of, 1096–1097
with the triangle inequality, 1112–1115
without the triangle inequality, 1115–1116
traversal of a tree, 287, 293ex.
treap, 333pr., 338
TREAP-INSERT, 333pr.
, 337
tree
binary, see binary tree
binomial, 527pr.
breadth-first, 594, 600
B-trees, 484–504
decision, 192–193
depth-first, 603
diameter of, 602ex.
height of, 1177
interval, 348–354
k-neighbor, 338
minimum spanning, see minimum spanning
tree
optimal binary search, 397–404, 413
order-statistic, 339–345
parse, 1082
recursion, 37, 88–93
red-black, see red-black tree
rooted, 246–249, 1176
scapegoat, 338
search, see search tree
shortest-paths, 647–648, 673–676
spanning, see minimum spanning tree,
spanning tree
splay, 338, 482
treap, 333pr.
van Emde Boas, 531–560
walk of, 287, 293ex.
TREE-DELETE, 298, 323–324
tree edge, 601, 603, 609
TREE-INSERT, 294, 315
TREE-MAXIMUM, 291
TREE-MINIMUM, 291
TREE-PREDECESSOR, 292
TREE-SEARCH, 290
TREE-SUCCESSOR, 292
tree walk, 287, 293ex.
, 1225ex.
tridiagonal matrix, 1219
trie (radix tree), 304pr.
TRIM, 1130
trimming a list, 1130
trivial divisor, 928
truth assignment, 1072, 1079
truth table, 1070
TSP, 1096
tuple, 1162
twiddle factor, 912
2-CNF-SAT, 1086ex.
2-CNF satisfiability
and 3-CNF satisfiability, 1049
two-pass method, 571
2-3-4 heap, 529pr.
2-3-4 tree, 503pr.
splitting, 503pr.
undirected graph
biconnected component of, 621pr.
clique in, 1086
coloring of, 1103pr.
computing a minimum spanning tree in,
624–642
converting to, from a multigraph, 593ex.
grid, 760pr.
matching of, 732
nonhamiltonian, 1061
vertex cover of, 1089, 1108
see also graph
undirected version of a directed graph, 1172
uniform hashing, 271
uniform probability distribution, 1191–1192
uniform random permutation, 116, 125
union
of dynamic sets, see uniting
of languages, 1058
of sets (∪), 1159
UNION, 505, 562
disjoint-set-forest implementation of, 571
linked-list implementation of, 565–567, 568ex.
uniting
of 2-3-4 heaps, 529pr.
universe, 1160
of keys in van Emde Boas trees, 532
universe size, 532
unmatched vertex, 732
unsorted linked list, 236
see also linked list
until, in pseudocode, 20
unweighted longest simple paths, 382
unweighted shortest paths, 381
upper bound, 47
upper-bound property, 650, 671–672
upper median, 213
upper square root, 546
upper-triangular matrix, 1219, 1225ex.
Vandermonde matrix, 902, 1226pr.
van Emde Boas tree, 531–560
successor in, 550–551
summary in, 546
Var Œ  (variance), 1199
variable
basic, 855
entering, 867
leaving, 867
nonbasic, 855
in pseudocode, 21
random, 1196–1201
slack, 855
see also indicator random variable
variable-length code, 429
variance, 1199
of a binomial distribution, 1205
of a geometric distribution, 1203
VEB-EMPTY-TREE-INSERT, 553
vEB tree, see van Emde Boas tree
VEB-TREE-DELETE, 554
VEB-TREE-INSERT, 553
VEB-TREE-MAXIMUM, 550
VEB-TREE-MEMBER, 550
VEB-TREE-MINIMUM, 550
VEB-TREE-PREDECESSOR, 552
VEB-TREE-SUCCESSOR, 551
vector, 1218, 1222–1224
convolution of, 901
cross product of, 1016
orthonormal, 842
in the plane, 1015
Venn diagram, 1160
verification, 1061–1066
of spanning trees, 642
verification algorithm, 1063
vertex
articulation point, 621pr.
in a graph, 1168
intermediate, 693
isolated, 1169
overflowing, 736
of a polygon, 1020ex.
VORP, 411pr.
, 342, 1114
weak duality, 880–881, 886ex.
weight
of a cut, 1127ex.
of a path, 643
weight-balanced tree, 338, 473pr.
weighted set-covering problem, 1135pr.
y-fast trie, 558pr.
Young tableau, 167pr.
Z (set of integers), 1158
Zn (equivalence classes modulo n), 928
Zn* (elements of multiplicative group modulo n), 941
Zn+ (nonzero elements of Zn), 967
zero matrix, 1218
zero of a polynomial modulo a prime, 950ex.
0-1 integer programming, 1100ex., 1125
0-1 knapsack problem, 425, 427ex., 1137pr., 1139
0-1 sorting lemma, 208pr.