Search for notes by fellow students, in your own course and all over the country.
Browse our notes for titles which look like what you need, you can preview any of the notes via a sample of the contents. After you're happy these are the notes you're after simply pop them into your shopping cart.
Document Preview
Extracts from the notes are below, to see the PDF you'll receive please use the links above
Problem Solving with Algorithms and
Data Structures
Release 3
...
1 Objectives
...
2 Getting Started
...
3 What Is Computer Science?
1
...
1
...
1
...
1
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
Algorithm Analysis
2
...
2
...
2
...
4 Summary
...
5 Key Terms
...
6 Discussion Questions
...
7 Programming Exercises
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
3
3
3
4
8
38
38
38
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
41
41
41
52
59
59
59
60
Basic Data Structures
3
...
3
...
3
...
3
...
3
...
3
...
3
...
3
...
3
...
10 The Ordered List Abstract Data Type
...
11 Summary
...
12 Key Terms
...
13 Discussion Questions
...
14 Programming Exercises
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
Recursion
117
4
...
117
4
...
117
i
4
...
4
4
...
6
4
...
8
4
...
10
5
6
7
ii
Stack Frames: Implementing Recursion
Visualising Recursion
...
Exploring a Maze
...
Key Terms
...
Programming Exercises
...
1 Objectives
...
2 Searching
...
3 Sorting
...
4 Summary
...
5 Key Terms
...
6 Discussion Questions
...
7 Programming Exercises
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
Trees and Tree Algorithms
6
...
6
...
6
...
6
...
6
...
6 Binary Tree Applications
...
7 Tree Traversals
...
8 Binary Search Trees
...
9 Summary
...
10 Key Terms
...
11 Discussion Questions
...
12 Programming Exercises
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
147
147
147
163
181
182
182
183
...
...
...
...
...
...
185
185
185
188
190
198
206
212
215
231
232
232
233
JSON
235
7
...
235
7
...
235
7
...
235
Problem Solving with Algorithms and Data Structures, Release 3
...
0
2
CONTENTS
CHAPTER
ONE
INTRODUCTION
1
...
• To understand abstraction and the role it plays in the problem-solving process
...
• To review the Python programming language
...
2 Getting Started
The way we think about programming has undergone many changes in the years since the first
electronic computers required patch cables and switches to convey instructions from human
to machine
...
Advances such as faster processors, high-speed networks, and large memory capacities have created a spiral of complexity through which computer scientists must navigate
...
The science of computing is concerned with using computers to solve problems
...
You have
also learned that writing computer programs is often hard
...
This chapter emphasizes two important areas for the rest of the text
...
Second, we review the Python programming
language
...
3
Problem Solving with Algorithms and Data Structures, Release 3
...
3 What Is Computer Science?
Computer science is often difficult to define
...
As you are perhaps aware, computer science is not simply
the study of computers
...
Computer science is the study of problems, problem-solving, and the solutions that come out
of the problem-solving process
...
Algorithms are finite processes that if followed will solve the problem
...
Computer science can be thought of as the study of algorithms
...
Although proving this statement
is beyond the scope of this text, the fact that some problems cannot be solved is important for
those who study computer science
...
It is also very common to include the word computable when describing problems and solutions
...
An alternative
definition for computer science, then, is to say that computer science is the study of problems
that are and that are not computable, the study of the existence and the nonexistence of algorithms
...
Solutions
are considered independent from the machine
...
Abstraction allows us to view the problem and solution in such a way as to
separate the so-called logical and physical perspectives
...
Consider the automobile that you may have driven to school or work today
...
You get in, insert the key, start the car, shift, brake, accelerate, and steer in order to
drive
...
You are using the functions provided by the car designers for the purpose of
transporting you from one location to another
...
On the other hand, the mechanic who must repair your automobile takes a very different point
of view
...
She needs to understand how the engine works,
how the transmission shifts gears, how temperature is controlled, and so on
...
”
The same thing happens when we use computers
...
They
view computers from a logical or user perspective
...
They
4
Chapter 1
...
0
Figure 1
...
They must be able to control the low-level
details that a user simply assumes
...
This interface is the way we as users communicate with the underlying
complexities of the implementation
...
Once we import the module, we can perform computations such as
>>> import math
>>> math
...
0
>>>
This is an example of procedural abstraction
...
If we
perform the import correctly, we can assume that the function will provide us with the correct
results
...
This is sometimes referred to as a “black box” view of a process
...
The details are hidden inside (see Figure 1
...
1
...
1 What Is Programming?
Programming is the process of taking an algorithm and encoding it into a notation, a programming language, so that it can be executed by a computer
...
Without an algorithm there can be no program
...
Programming, however, is an important
part of what a computer scientist does
...
Therefore, this language representation and the process of creating
it becomes a fundamental part of the discipline
...
Programming
languages must provide a notational way to represent both the process and the data
...
1
...
What Is Computer Science?
5
Problem Solving with Algorithms and Data Structures, Release 3
...
At a minimum, algorithms require constructs that perform sequential processing, selection
for decision-making, and iteration for repetitive control
...
All data items in the computer are represented as strings of binary digits
...
Data types provide an interpretation for this
binary data so that we can think about the data in terms that make sense with respect to the
problem being solved
...
For example, most programming languages provide a data type for integers
...
g
...
In addition, a data type also
provides a description of the operations that the data items can participate in
...
We have come to
expect that numeric types of data can participate in these arithmetic operations
...
These simple, language-provided constructs and data types, although certainly sufficient to represent complex solutions, are typically at a disadvantage as we work through the
problem-solving process
...
1
...
2 Why Study Data Structures and Abstract Data Types?
To manage the complexity of problems and the problem-solving process, computer scientists
use abstractions to allow them to focus on the “big picture” without getting lost in the details
...
These models allow us to describe the data that our algorithms will
manipulate in a much more consistent way with respect to the problem itself
...
We now turn our attention to a
similar idea, that of data abstraction
...
This means that we are concerned only with what the data
is representing and not with how it will eventually be constructed
...
The idea is that by encapsulating
the details of the implementation, we are hiding them from the user’s view
...
Figure 1
...
The user
interacts with the interface, using the operations that have been specified by the abstract data
type
...
The implementation is
hidden one level deeper
...
The implementation of an abstract data type, often referred to as a data structure, will require
that we provide a physical view of the data using some collection of programming constructs
and primitive data types
...
Introduction
Problem Solving with Algorithms and Data Structures, Release 3
...
2: Abstract Data Type
allow us to define the complex data models for our problems without giving any indication
as to the details of how the model will actually be built
...
Since there will usually be many different ways to implement
an abstract data type, this implementation independence allows the programmer to switch the
details of the implementation without changing the way the user of the data interacts with it
...
1
...
3 Why Study Algorithms?
Computer scientists learn by experience
...
Being exposed to different problem-solving techniques and
seeing how different algorithms are designed helps us to take on the next challenging problem
that we are given
...
Algorithms are often quite different from one another
...
It is entirely possible that there are many different ways to implement the details to
compute the square root function
...
One algorithm might take 10 times as long to return the result as the other
...
Even though they both work, one is perhaps
“better” than the other
...
As we study algorithms, we can learn analysis techniques that
allow us to compare and contrast solutions based solely on their own characteristics, not the
characteristics of the program or computer used to implement them
...
It is important to be able
to distinguish between those problems that have solutions, those that do not, and those where
solutions exist but require too much time or other resources to work reasonably
...
As computer
scientists, in addition to our ability to solve problems, we will also need to know and understand
1
...
What Is Computer Science?
7
Problem Solving with Algorithms and Data Structures, Release 3
...
In the end, there are often many ways to solve a problem
...
1
...
If you are new to Python or find that
you need more information about any of the topics presented, we recommend that you consult
a resource such as the Python Language Reference or a Python Tutorial
...
Python is a modern, easy-to-learn, object-oriented programming language
...
Since Python is an interpreted
language, it is most easily reviewed by simply looking at and describing interactive sessions
...
For example,
>>> print("Algorithms and Data Structures")
Algorithms and Data Structures
>>>
shows the prompt, the print function, the result, and the next prompt
...
4
...
This means
that Python considers data to be the focal point of the problem-solving process
...
Classes are
analogous to abstract data types because a user of a class only sees the state and behavior of
a data item
...
An object is an
instance of a class
...
Python has two main built-in
numeric classes that implement the integer and floating point data types
...
The standard arithmetic operations, +, −, *, /, and ** (exponentiation), can be used with parentheses forcing the order of operations away from normal operator
precedence
...
Note that when two integers are divided, the result is a floating point
...
8
Chapter 1
...
0
Operation Name
less than
greater than
less than or equal
greater than or equal
equal
not equal
logical and
logical or
logical not
Operator
<
>
<=
>=
==
=!
and
or
not
Explanation
Less than operator
Greater than operator
Less than or equal to operator
Greater than or equal to operator
Equality operator
Not equal operator
Both operands True for result to be True
Either operand True for result to be True
Negates the truth value: False becomes
True, True becomes False
Table 1
...
0
print(7/3) #2
...
5
print(3//6) #0
print(3%6) #3
print(2**100) # 1267650600228229401496703205376
The boolean data type, implemented as the Python bool class, will be quite useful for
representing truth values
...
>>> True
True
>>> False
False
>>> False or True
True
>>> not (False or True)
False
>>> True and True
True
Boolean data objects are also used as results for comparison operators such as equality (==)
and greater than (>)
...
Table 1
...
print(5 == 10)
print(10 > 5)
1
...
Review of Basic Python
9
Problem Solving with Algorithms and Data Structures, Release 3
...
3: Variables Hold References to Data Objects
Figure 1
...
In Python, identifiers start with a
letter or an underscore (_), are case sensitive, and can be of any length
...
A Python variable is created when a name is used for the first time on the left-hand side of
an assignment statement
...
The variable will hold a reference to a piece of data and not the data itself
...
3)
...
At this point in our example, the type of the variable is integer as that is
the type of the data currently being referred to by “the_sum
...
4), as shown above with the boolean value True, so does the type of the variable
(the_sum is now of the type boolean)
...
This is a dynamic characteristic of Python
...
10
Chapter 1
...
0
Operation Name
indexing
concatenation
repetition
membership
length
slicing
Operator
[ ]
+
*
in
len
[ : ]
Explanation
Access an element of a sequence
Combine sequences together
Concatenate a repeated number of times
Ask whether an item is in a sequence
Ask the number of items in the sequence
Extract a part of a sequence
Table 1
...
Lists, strings, and tuples are ordered collections that are very similar in
general structure but have specific differences that must be understood for them to be used
properly
...
A list is an ordered collection of zero or more references to Python data objects
...
The empty list is simply [ ]
...
The following fragment shows a variety
of Python data objects in a list
...
5]
3, True, 6
...
5]
my_list
3, True, 6
...
However, in order to remember
the list for later processing, its reference needs to be assigned to a variable
...
Table 1
...
Note that the indices for lists (sequences) start counting with 0
...
Sometimes, you will want to initialize a list
...
For example,
>>> my_list = [0] * 6
>>> my_list
[0, 0, 0, 0, 0, 0]
One very important aside relating to the repetition operator is that the result is a repetition
of references to the data objects in the sequence
...
4
...
0
Method Name
Use
append
insert
pop
pop
sort
reverse
del
index
count
remove
a_list
...
insert(i,item)
a_list
...
pop(i)
a_list
...
reverse()
del a_list[i]
a_list
...
count(item)
a_list
...
3: Methods Provided by Lists in Python
following session:
my_list = [1,2,3,4]
A = [my_list]*3
print(A)
my_list[2]=45
print(A)
The variable A holds a collection of three references to the original list called my_list
...
Lists support a number of methods that will be used to build data structures
...
3 provides
a summary
...
my_list = [1024, 3, True, 6
...
append(False)
print(my_list)
my_list
...
5)
print(my_list)
print(my_list
...
pop(1))
print(my_list)
my_list
...
sort()
print(my_list)
my_list
...
count(6
...
index(4
...
remove(6
...
Introduction
Problem Solving with Algorithms and Data Structures, Release 3
...
Others, such as reverse, simply modify the list with no return value
...
The index range starting
from 0 is again used for these methods
...
my_list
...
” Even simple
data objects such as integers can invoke methods in this way
...
__add__(21)
75
>>>
In this fragment we are asking the integer object 54 to execute its add method (called __add__
in Python) and passing it 21 as the value to add
...
Of course, we usually
write this as 54 + 21
...
One common Python function that is often discussed in conjunction with lists is the range
function
...
By using the
list function, it is possible to see the value of the range object as a list
...
>>> range(10)
range(0, 10)
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> range(5,10)
range(5, 10)
>>> list(range(5,10))
[5, 6, 7, 8, 9]
>>> list(range(5,10,2))
[5, 7, 9]
>>> list(range(10,1,-1))
[10, 9, 8, 7, 6, 5, 4, 3, 2]
>>>
The range object represents a sequence of integers
...
If you
provide more parameters, it will start and end at particular points and can even skip items
...
In our second example, range(5, 10) starts at 5 and goes up to but not including 10
...
Strings are sequential collections of zero or more letters, numbers and other symbols
...
Literal string values are differentiated
from identifiers by using quotation marks (either single or double)
...
4
...
0
Method Name
Use
center
count
a_string
...
count(item)
ljust
a_string
...
lower()
a_string
...
find(item)
split
a_string
...
4: Methods Provided by Strings in Python
'i'
>>> my_name*2
'DavidDavid'
>>> len(my_name)
5
>>>
Since strings are sequences, all of the sequence operations described above work as you would
expect
...
4
...
upper()
'DAVID'
>>> my_name
...
find('v')
2
>>> my_name
...
split will take a string and return
a list of strings using the split character as a division point
...
If no division is specified, the split method looks for whitespace characters such as tab,
newline and space
...
This is referred to as mutability
...
For example, you
can change an item in a list by using indexing and assignment
...
14
Chapter 1
...
0
>>> my_list
[1, 3, True, 6
...
5]
>>> my_name
'David'
>>> my_name[0]='X'
Traceback (most recent call last):
File "
my_name[0]='X'
TypeError: 'str' object does not support item assignment
>>>
Note that the error (or traceback) message displayed above is obtained on a Mac OS X machine
...
>>> my_name[0]='X'
Traceback (most recent call last):
File "
TypeError: object doesn't support item assignment
>>>
Depending on your operating system, or version of Python, the output may slightly vary
...
You may want to experiment for yourself
and get acquainted with the error message for easier and faster debugging
...
Tuples are very similar to lists in that they are heterogeneous sequences of data
...
A tuple cannot be changed
...
As sequences, they can use any operation
described above
...
96)
my_tuple
True, 4
...
96, 2, True, 4
...
96)
my_tuple[0:2]
True)
However, if you try to change an item in a tuple, you will get an error
...
1
...
Review of Basic Python
15
Problem Solving with Algorithms and Data Structures, Release 3
...
in(set)
len(set)
set1 | set2
set1 & set2
set1 - set2
set1 <= set2
Explanation
Set membership
Returns the cardinality (i
...
the length) of the set
Returns a new set with all elements from both sets
Returns a new set with only the elements common to both sets
Returns a new set with all items from the first set not in second
Asks whether all elements of the first set are in the second
Table 1
...
Sets do not
allow duplicates and are written as comma-delimited values enclosed in curly braces
...
Sets are heterogeneous, and the collection can be assigned
to a variable as below
...
5,False}
{False, 4
...
5,False}
>>> my_set
{False, 3, 4
...
Table 1
...
>>> my_set
{False, 3, 4
...
Table 1
...
Examples of their use follow
...
16
Chapter 1
...
0
Method Name
Use
union
set1
...
intersection(set2) Returns a new set with only the elements
common to both sets
difference
set1
...
issubset(set2)
Asks whether all elements of one set are in
the other
add
set
...
remove(item)
Removes item from the set
pop
set
...
clear()
Removes all elements from the set
Table 1
...
5, 6, 'cat'}
>>> your_set = {99,3,100}
>>> my_set
...
5, 6, 99, 'cat', 100}
>>> my_set | your_set
{False, 3, 4
...
intersection(your_set)
{3}
>>> my_set & your_set
{3}
>>> my_set
...
5, 6, 'cat'}
>>> my_set - your_set
{False, 4
...
issubset(your_set)
True
>>> {3,100} <= your_set
True
>>> my_set
...
5, 6, 'house', 'cat'}
>>> my_set
...
5)
>>> my_set
{False, 3, 6, 'house', 'cat'}
>>> my_set
...
clear()
>>> my_set
set()
>>>
1
...
Review of Basic Python
17
Problem Solving with Algorithms and Data Structures, Release 3
...
7: Operators Provided by Dictionaries in Python
Our final Python collection is an unordered structure called a dictionary
...
This
key-value pair is typically written as key:value
...
For example,
>>> capitals = {'Iowa':'DesMoines','Wisconsin':'Madison'}
>>> capitals
{'Wisconsin': 'Madison', 'Iowa': 'DesMoines'}
>>>
We can manipulate a dictionary by accessing a value via its key or by adding another key-value
pair
...
To add a new value is similar
...
The first pair added ('Utah': 'SaltLakeCity') was placed first in the dictionary
and the second pair added ('California': 'Sacramento') was placed last
...
We also show the length function performing the same role as with previous
collections
...
Table 1
...
8 describe them, and the
session shows them in action
...
You can use the list function to convert them to lists
...
If the key is not present in the
dictionary, get will return None
...
>>> phone_ext={'david':1410, 'brad':1137}
>>> phone_ext
{'brad': 1137, 'david': 1410}
>>> phone_ext
...
Introduction
Problem Solving with Algorithms and Data Structures, Release 3
...
values()
Returns the values of the dictionary in a
dict_values object
my_dict
...
get(k)
Returns the value associated with 𝑘, None otherwise
my_dict
...
keys()
Table 1
...
keys())
['brad', 'david']
>>> "brad" in phone_ext
>>> True
>>> 1137 in phone_ext
>>> False
# 1137 is not a key in phone_ext
>>> phone_ext
...
values())
[1137, 1410]
>>> phone_ext
...
items())
[('brad', 1137), ('david', 1410)]
>>> phone_ext
...
get("kent","NO ENTRY")
'NO ENTRY'
>>> del phone_ext["david"]
>>> phone_ext
{'brad': 1137}
>>>
1
...
2 Input and Output
We often have a need to interact with users, either to get data or to provide some sort of result
...
While Python does have a way to create dialog boxes, there is a much simpler function
that we can use
...
The function is called input
...
This string is often called
the prompt because it contains some helpful text prompting the user to enter something
...
4
...
0
user_name = input('Please enter your name: ')
Now whatever the user types after the prompt will be stored in the user_name variable
...
For example, in the following two
statements, the first asks the user for their name and the second prints the result of some simple
processing based on the string that is provided
...
upper(),
"and has length", len(user_name))
It is important to note that the value returned from the input function will be a string
representing the exact characters that were entered after the prompt
...
In the statements
below, the string that is entered by the user is converted to a float so that it can be used in
further arithmetic processing
...
print takes zero or more parameters and displays them using a single
blank as the default separator
...
In addition, each print ends with a newline character by default
...
These variations are shown in the following
session:
>>> print("Hello")
Hello
>>> print("Hello","World")
Hello World
>>> print("Hello","World", sep="***")
Hello***World
>>> print("Hello","World", end="***")
Hello World***
>>> print("Hello", end="***"); print("World")
Hello***World
>>>
It is often useful to have more control over the look of your output
...
A formatted string is a template in
20
Chapter 1
...
0
Character Output Format
d,i
Integer
u
Unsigned Integer
f
Floating point as m
...
ddddde+/-xx
E
Floating point as m
...
9: String Formatting Conversion Characters
which words or spaces that will remain constant are combined with placeholders for variables
that will be inserted into the string
...
")
contains the words is and years old, but the name and the age will change depending on
the variable values at the time of execution
...
" % (name, age))
This simple example illustrates a new string expression
...
The left side of the expression holds the template or format string,
and the right side holds a collection of values that will be substituted into the format string
...
Values are taken in order, left to right from the collection
and inserted into the format string
...
The format string may
contain one or more conversion specifications
...
In the example
above, the %s specifies a string, while the %d specifies an integer
...
Table 1
...
In addition to the format character, you can also include a format modifier between the % and
the format character
...
Modifiers can also be used to specify the field width along with a
number of digits after the decimal point
...
10 explains these format modifiers
...
The collection will be either a tuple or a dictionary
...
That is, the first element in the tuple corresponds
to the first format character in the format string
...
In this case all format characters must use the (name)
1
...
Review of Basic Python
21
Problem Solving with Algorithms and Data Structures, Release 3
...
(name)
%-20d
%+20d
%020d
%20
...
Get the value from the supplied dictionary using name as the key
...
10: Additional formatting options
modifier to specify the name of the key
...
2f cents"%(item,price))
banana costs 24
...
2f cents"%(item,price))
banana costs 24
...
1f cents"%item_dict)
banana costs 24
...
More about these features can be found in the Python
library reference manual
...
4
...
Both of these are supported by Python in various forms
...
For iteration, Python provides a standard while statement and a very powerful for statement
...
For example,
>>> counter = 1
>>> while counter <= 5:
print("Hello, world")
counter = counter + 1
Hello, world
Hello, world
22
Chapter 1
...
0
Hello, world
Hello, world
Hello, world
>>>
prints out the phrase “Hello, world” five times
...
If the condition is True, the body of the statement will
execute
...
The while statement is a very general purpose iterative structure that we will use in a number
of different algorithms
...
A
fragment such as
while counter <= 10 and not done:
...
The value of the variable counter would need to be less than or equal to
10 and the value of the variable done would need to be False (not False is True) so that
True and True results in True
...
The for statement can be used to iterate over the members of a collection, so long
as the collection is a sequence
...
The body of the
iteration is then executed
...
A common use of the for statement is to implement definite iteration over a range of values
...
4
...
0
4
9
16
>>>
will perform the print function five times
...
This value is then squared and printed
...
The following code fragment iterates over a list of strings and for each string processes
each character by appending it to a list
...
word_list = ['cat','dog','rabbit']
letter_list = [ ]
for a_word in word_list:
for a_letter in a_word:
letter_list
...
Most programming languages provide two versions of this useful construct:
the ifelse and the if
...
if n < 0:
print("Sorry, value is negative")
else:
print(math
...
If it is, a
message is printed stating that it is negative
...
Selection constructs, as with any control construct, can be nested so that the result of one
question helps decide whether to ask the next
...
if score >= 90:
print('A')
else:
if score >= 80:
print('B')
else:
if score >= 70:
print('C')
else:
if score >= 60:
print('D')
else:
24
Chapter 1
...
0
print('F')
Python also has a single way selection construct, the if statement
...
In the case where the condition is false, processing
simply continues on to the next statement after the if
...
If it is, then it is modified by the
absolute value function
...
if n < 0:
n = abs(n)
print(math
...
The is known as a list comprehension
...
For example, if we would like to create a list of the first 10 perfect squares, we could use a for statement:
>>> sq_list = []
>>> for x in range(1, 11):
sq_list
...
The value
of x * x is then computed and added to the list that is being constructed
...
For example,
>>> sq_list = [x * x for x in range(1, 11) if x % 2 != 0]
>>> sq_list
[1, 9, 25, 49, 81]
>>>
This list comprehension constructed a list that only contained the squares of the odd numbers
in the range from 1 to 10
...
>>>[ch
...
4
...
0
['C', 'M', 'P', 'R', 'H', 'N', 'S', 'N']
>>>
Self Check
Test your understanding of what we have covered so far by trying the following two exercises
...
word_list = ['cat','dog','rabbit']
letter_list = [ ]
for a_word in word_list:
for a_letter in a_word:
letter_list
...
Modify the given code so that the final list only contains a single copy of each letter
...
Redo the given code using list comprehensions
...
# the answer is: ['c', 'a', 't', 'd', 'o', 'g', 'r', 'a',
'b', 'b', 'i', 't']
1
...
4 Exception Handling
There are two types of errors that typically occur when writing programs
...
For example, it is incorrect to write a for statement and forget the
colon
...
Syntax errors are usually
more frequent when you are first learning a language
...
This can be due to an error in the underlying algorithm or an error in
your translation of that algorithm
...
In this case, the logic error leads to a runtime error that causes
the program to terminate
...
26
Chapter 1
...
0
Most of the time, beginning programmers simply think of exceptions as fatal runtime errors
that cause the end of execution
...
In addition, programmers can create their own exceptions if they detect a situation in
the program execution that warrants it
...
” You can “handle” the exception
that has been raised by using a try statement
...
If the user enters a value that is greater than or equal to 0, the print will show the square root
...
>>> a_number = int(input("Please enter an integer "))
Please enter an integer -23
>>> print(math
...
sqrt(a_number))
ValueError: math domain error
>>>
We can handle this exception by calling the print function from within a try block
...
For example:
>>> try:
print(math
...
sqrt(abs(a_number)))
Bad Value for square root
Using absolute value instead
4
...
This means that the program will not terminate but instead will continue on
to the next statements
...
For example, instead of calling the square root function with a negative number, we could have
checked the value first and then raised our own exception
...
Note that the program would still
terminate but now the exception that caused the termination is something explicitly created by
1
...
Review of Basic Python
27
Problem Solving with Algorithms and Data Structures, Release 3
...
>>> if a_number < 0:
...
else:
...
sqrt(a_number))
...
See the Python reference manual for a list of all the available exception types and for
how to create your own
...
4
...
In general, we can hide the details of
any computation by defining a function
...
It may also explicitly return a value
...
>>> def square(n):
...
>>> square(3)
9
>>> square(square(3))
81
>>>
The syntax for this function definition includes the name, square, and a parenthesized list
of formal parameters
...
The details, hidden “inside the box,”
simply compute the result of n ** 2 and return it
...
Note that the call to square returns an integer that can in turn be passed to another
invocation
...
” Newton’s Method for approximating square roots performs an iterative
computation that converges on the correct value
...
Introduction
Problem Solving with Algorithms and Data Structures, Release 3
...
The initial guess used here is 2𝑛
...
1 shows a
function definition that accepts a value 𝑛 and returns the square root of 𝑛 after making 20
guesses
...
Listing 1
...
Any
characters that follow the # on a line are ignored
...
1: square_root Function
def square_root(n):
root = n / 2 #initial guess will be 1/2 of n
for k in range(20):
root = (1 / 2) * (root + (n / root))
return root
>>>square_root(9)
3
...
549981495186216
>>>
Self Check
Here is a self check that really covers everything so far
...
Well, suppose we replace a monkey with a Python function
...
The
way we will simulate this is to write a function that generates a string that is 27 characters long
by choosing random letters from the 26 letters in the alphabet plus the space
...
A third function will repeatedly call generate and score, then if 100% of the letters are correct
we are done
...
To make
it easier to follow your program’s progress this third function should print out the best string
generated so far and its score every 1000 tries
...
This is a type of algorithm in the
1
...
Review of Basic Python
29
Problem Solving with Algorithms and Data Structures, Release 3
...
1
...
6 Object-Oriented Programming in Python: Defining Classes
We stated earlier that Python is an object-oriented programming language
...
One of the
most powerful features in an object-oriented programming language is the ability to allow a
programmer (problem solver) to create new classes that model data that is needed to solve the
problem
...
By building a class that implements
an abstract data type, a programmer can take advantage of the abstraction process and at the
same time provide the details necessary to actually use the abstraction in a program
...
A Fraction Class
A very common example to show the details of implementing a user-defined class is to construct
a class to implement the abstract data type Fraction
...
There are times, however, that it would be
most appropriate to be able to create data objects that “look like” fractions
...
The top value, known as the numerator, can be
5
any integer
...
Although it is possible to create a floating point
approximation for any fraction, in this case we would like to represent the fraction as an exact
value
...
We need to be able to add, subtract, multiply, and divide fractions
...
In
5
addition, all fraction methods should return results in their lowest terms so that no matter what
computation is performed, we always end up with the most common form
...
For this example,
class Fraction:
#the methods go here
provides the framework for us to define the methods
...
The constructor defines the way in which data objects are created
...
In Python, the constructor method is always called __init__(two
30
Chapter 1
...
0
Figure 1
...
2
...
2: Fraction Class and its constructor
class Fraction:
def __init__(self,top,bottom):
self
...
den = bottom
Notice that the formal parameter list contains three items (self, top, bottom)
...
It must
always be the first formal parameter; however, it will never be given an actual parameter value
upon invocation
...
The notation self
...
Likewise, self
...
The values of the two formal parameters are initially assigned to the state,
allowing the new fraction object to know its starting value
...
This happens
by using the name of the class and passing actual values for the necessary state (note that we
never directly invoke __init__)
...
3
5
(three-fifths)
...
5
The next thing we need to do is implement the behavior that the abstract data type requires
...
>>> my_f = Fraction(3, 5)
>>> print(my_f)
1
...
Review of Basic Python
31
Problem Solving with Algorithms and Data Structures, Release 3
...
Fraction object at 0x409b1acc>
The Fraction object, my_f, does not know how to respond to this request to print
...
The only choice my_f has is to show the actual reference that is stored in the variable
(the address itself)
...
There are two ways we can solve this problem
...
We can implement this method as shown
in Listing 1
...
If we create a Fraction object as before, we can ask it to show itself, in other
words, print itself in the proper format
...
In order
to make printing work properly, we need to tell the Fraction class how to convert itself into
a string
...
Listing 1
...
num, "/", self
...
show()
3 / 5
>>> print(my_f)
<__main__
...
One of these, __str__, is the method to convert an object into a string
...
What we need to do is provide a “better” implementation for this method
...
To do this, we simply define a method with the name __str__ and give it a new implementation as shown in Listing 1
...
This definition does not need any other information except the
special parameter self
...
The resulting string will be returned any time a Fraction object
is asked to convert itself to a string
...
Listing 1
...
num) + "/" + str(self
...
__str__()
32
Chapter 1
...
0
'3/5'
>>> str(my_f)
'3/5'
>>>
We can override many other methods for our new Fraction class
...
We would like to be able to create two
Fraction objects and then add them together using the standard “+” notation
...
We can fix this by providing the Fraction class with a
method that overrides the addition method
...
The first, self, is always needed, and the second represents the other
operand in the expression
...
__add__(f2)
would ask the Fraction object f1 to add the Fraction object f2 to itself
...
Two fractions must have the same denominator to be added
...
5
...
We can use this method by writing
a standard arithmetic expression involving fractions, assigning the result of the addition, and
then printing our result
...
5: Adding Fractions
def __add__(self, other_fraction):
new_num = self
...
den +
self
...
num
new_den = self
...
den
1
...
Review of Basic Python
33
Problem Solving with Algorithms and Data Structures, Release 3
...
Note that 6 is the correct
8
result ( 1 + 12) but that it is not in the “lowest terms” representation
...
In order to be sure that our results are always in the lowest terms, we need a helper
4
function that knows how to reduce fractions
...
We can then divide the numerator and the denominator by the GCD
and the result will be reduced to lowest terms
...
Euclid’s Algorithm states that the greatest common divisor of two integers 𝑚 and 𝑛 is 𝑛 if
𝑛 divides 𝑚 evenly
...
We will simply provide an iterative
implementation here
...
This is acceptable for our fraction class because we have said that
a negative fraction will be represented by a negative numerator
...
To put a fraction in lowest terms,
we will divide the numerator and the denominator by their greatest common divisor
...
Dividing the top and the bottom by 2 creates
8
a new fraction, 3 (see Listing 1
...
4
Listing 1
...
num*other_fraction
...
den*other_fraction
...
den * other_fraction
...
Introduction
Problem Solving with Algorithms and Data Structures, Release 3
...
6: An Instance of the Fraction Class with Two Methods
>>>
>>>
>>>
>>>
3/4
>>>
f1 = Fraction(1, 4)
f2 = Fraction(1, 2)
f3 = f1 + f2
print(f3)
Our Fraction object now has two very useful methods and looks like Figure 1
...
An additional group of methods that we need to include in our example Fraction class will allow
two fractions to compare themselves to one another
...
f1 == f2 will only be True if they are references to the same object
...
This is called shallow equality (see Figure 1
...
We can create deep equality (see Figure 1
...
The __eq__ method is another standard method
available in any class
...
In the Fraction class, we can implement the __eq__ method by again putting the two
fractions in common terms and then comparing the numerators (see Listing 1
...
It is
important to note that there are other relational operators that can be overridden
...
Listing 1
...
num * other
...
num * self
...
We leave the remaining
arithmetic and relational methods as exercises
...
4
...
0
Figure 1
...
Introduction
Problem Solving with Algorithms and Data Structures, Release 3
...
num = top
self
...
num) + "/" + str(self
...
num, "/", self
...
num * other_fraction
...
den * other_fraction
...
den * other_fraction
...
num * other
...
num * self
...
Also implement comparison operators > and <
...
4
...
0
1
...
• Computer science uses abstraction as a tool for representing both processes and data
...
• Python is a powerful, yet easy-to-use, object-oriented language
...
• Dictionaries and sets are nonsequential collections of data
...
• Programmers can override standard methods as well as create new methods
...
1
...
7 Programming Exercises
1
...
2
...
Modify the constructor for the Fraction class so that GCD is used to
reduce fractions immediately
...
Make the necessary modifications
...
Implement the remaining simple arithmetic operators (__sub__, __mul__, and
__truediv__)
...
Implement the remaining relational operators (__gt__, __ge__, __lt__, __le__, and
__ne__)
38
Chapter 1
...
0
5
...
If either is not an integer the constructor
should raise an exception
...
In the definition of fractions we assumed that negative fractions have a negative numerator and a positive denominator
...
In general, this is an unnecessary constraint
...
7
...
How does it differ from __add__? When is it used?
Implement __radd__
...
Repeat the last question but this time consider the __iadd__ method
...
Research the __repr__ method
...
10
...
Now design a class to represent a deck of
cards
...
11
...
Write a program to solve the puzzle
...
7
...
0
40
Chapter 1
...
1 Objectives
• To understand why algorithm analysis is important
...
• To understand the “Big-O” execution time of common operations on Python lists and
dictionaries
...
• To understand how to benchmark simple Python programs
...
2 What Is Algorithm Analysis?
It is very common for beginning computer science students to compare their programs with
one another
...
An interesting question often arises
...
As we stated
in Chapter 1, an algorithm is a generic, step-by-step list of instructions for solving a problem
...
A program, on the other hand, is an algorithm that has
been encoded into some programming language
...
To explore this difference further, consider the function shown below
...
The algorithm uses the idea of an
accumulator variable that is initialized to 0
...
def sum_of_n(n):
the_sum = 0
for i in range(1,n+1):
41
Problem Solving with Algorithms and Data Structures, Release 3
...
At first glance it may look strange, but upon further
inspection you can see that this function is essentially doing the same thing as the previous
one
...
We did not use good identifier names to
assist with readability, and we used an extra assignment statement during the accumulation
step that was not really necessary
...
The answer
depends on your criteria
...
In fact, you have probably seen many examples of this
in your introductory programming course since one of the goals there is to help you write
programs that are easy to read and easy to understand
...
(We certainly hope that you will continue to
strive to write readable, understandable code
...
We want to be able to consider two algorithms and
say that one is better than the other because it is more efficient in its use of those resources
or perhaps because it simply uses fewer
...
They both use essentially the same algorithm to solve the summation problem
...
There are two different ways to look at this
...
The amount of space required by a
problem solution is typically dictated by the problem instance itself
...
As an alternative to space requirements, we can analyze and compare algorithms based on
the amount of time they require to execute
...
One way we can measure the execution
time for the function sum_of_n is to do a benchmark analysis
...
In Python, we can benchmark a
function by noting the starting time and ending time with respect to the system we are using
...
Algorithm Analysis
Problem Solving with Algorithms and Data Structures, Release 3
...
By calling this function twice, at the beginning
and at the end, and then computing the difference, we can get an exact number of seconds
(fractions in most cases) for execution
...
time()
the_sum = 0
for i in range(1, n+1):
the_sum = the_sum + i
end = time
...
The function returns a tuple consisting of the result and the amount of
time (in seconds) required for the calculation
...
7f seconds" % sum_of_n_2(10000))
Sum is 50005000 required 0
...
0018620 seconds
Sum is 50005000 required 0
...
0019162 seconds
Sum is 50005000 required 0
...
For n equal to 1, 000, 000 we get:
>>>for i in range(5):
print("Sum is %d required %10
...
1948988 seconds
Sum is 500000500000 required 0
...
1809771 seconds
Sum is 500000500000 required 0
...
1646299 seconds
>>>
In this case, the average again turns out to be about 10 times the previous
...
This function, sum_of_n_3, takes advantage of a closed equation 𝑖=0 𝑖 = (𝑛)(𝑛+1)
2
2
...
What Is Algorithm Analysis?
43
Problem Solving with Algorithms and Data Structures, Release 3
...
def sum_of_n_3(n):
return (n * (n + 1)) / 2
print(sum_of_n_3(10))
we do the same benchmark measurement for sum_of_n_3, using five different values for n
(10, 000, 100, 000, 1, 000, 000, 10, 000, 000, and 100, 000, 000), we get the following results:
Sum
Sum
Sum
Sum
Sum
is
is
is
is
is
50005000 required 0
...
00000191 seconds
500000500000 required 0
...
00000095 seconds
5000000050000000 required 0
...
First, the times recorded above are
shorter than any of the previous examples
...
It appears that sum_of_n_3 is hardly impacted by the number of integers being
added
...
This is likely the
reason it is taking longer
...
However, there is a problem
...
It could take even longer to perform sum_of_n_3 if the computer were older
...
The
benchmark technique computes the actual time to execute
...
Instead, we would like to have a characterization that
is independent of the program or computer being used
...
2
...
1 Big-O Notation
When trying to characterize an algorithm’s efficiency in terms of execution time,independent
of any particular program or computer, it is important to quantify the number of operations or
steps that the algorithm will require
...
Deciding on an appropriate basic unit of computation can be a
complicated problem and will depend on how the algorithm is implemented
...
In the function
sum_of_n, the number of assignment statements is 1 (the_sum= 0) plus the value of n (the
number of times we perform the_sum=the_sum+𝑖)
...
Algorithm Analysis
Problem Solving with Algorithms and Data Structures, Release 3
...
The parameter 𝑛 is often referred to as the “size of the problem,” and
we can read this as “𝑇 (𝑛) is the time it takes to solve a problem of size 𝑛, namely 1 + 𝑛 steps
...
We can then say that the sum of the first 100, 000
integers is a bigger instance of the summation problem than the sum of the first 1, 000
...
Our goal then is to show how the algorithm’s execution time changes
with respect to the size of the problem
...
It turns out that the
exact number of operations is not as important as determining the most dominant part of the
𝑇 (𝑛) function
...
This dominant term is what, in the end, is used for comparison
...
Order of magnitude is often called Big-O notation (for “order”) and written as
𝑂(𝑓 (𝑛))
...
The function 𝑓 (𝑛) provides a simple representation of the dominant part of the original 𝑇 (𝑛)
...
As 𝑛 gets large, the constant 1 will become less and
less significant to the final result
...
It is important to note that the 1
is certainly significant for 𝑇 (𝑛)
...
As another example, suppose that for some algorithm, the exact number of steps is 𝑇 (𝑛) =
5𝑛2 + 27𝑛 + 1005
...
However, as 𝑛 gets larger, the 𝑛2 term becomes the most important
...
Again, to approximate 𝑇 (𝑛) as 𝑛 gets large, we can ignore the
other terms and focus on 5𝑛2
...
We would say then that the function 𝑇 (𝑛) has an order of magnitude 𝑓 (𝑛) = 𝑛2 , or
simply that it is 𝑂(𝑛2 )
...
For these kinds of algorithms we need to characterize their performance in terms of best case,
worst case, or average case performance
...
Whereas a different data set for the
exact same algorithm might have extraordinarily good performance
...
It is important
for a computer scientist to understand these distinctions so they are not misled by one particular
case
...
These are shown in Table 2
...
In order to decide which of these functions is
the dominant part of any 𝑇 (𝑛) function, we must see how they compare with one another as 𝑛
gets large
...
1 shows graphs of the common functions from Table 2
...
Notice that when 𝑛 is small,
2
...
What Is Algorithm Analysis?
45
Problem Solving with Algorithms and Data Structures, Release 3
...
1: Common Functions for Big-O
Figure 2
...
It is hard to tell which is
dominant
...
As a final example, suppose that we have the fragment of Python code shown below
...
a = 5
b = 6
c = 10
for i in range(n):
for j in range(n):
x = i * i
y = j * j
46
Chapter 2
...
0
Figure 2
...
The first term is the
constant 3, representing the three assignment statements at the start of the fragment
...
The third term is 2𝑛, two statements iterated 𝑛 times
...
This gives us
𝑇 (𝑛) = 3 + 3𝑛2 + 2𝑛 + 1 = 3𝑛2 + 2𝑛 + 4
...
Note that all of
the other terms as well as the coefficient on the dominant term can be ignored as 𝑛 grows larger
...
2 shows a few of the common Big-O functions as they compare with the 𝑇 (𝑛) function
discussed above
...
However, as 𝑛 grows,
the cubic function quickly overtakes 𝑇 (𝑛)
...
Self Check
Write two Python functions to find the minimum number in a list
...
𝑂(𝑛2 )
...
2
...
What Is Algorithm Analysis?
47
Problem Solving with Algorithms and Data Structures, Release 3
...
2
...
One string is an anagram of another if the second
is simply a rearrangement of the first
...
The
strings 'python' and 'typhon' are anagrams as well
...
Our goal is to write a boolean function that
will take two strings and return whether they are anagrams
...
If it is possible to “checkoff” each character, then the
two strings must be anagrams
...
However, since strings in Python are immutable, the first
step in the process will be to convert the second string to a list
...
def anagram_solution1(s1,s2):
a_list = list(s2)
pos1 = 0
still_ok = True
while pos1 < len(s1) and still_ok:
pos2 = 0
found = False
while pos2 < len(a_list) and not found:
if s1[pos1] == a_list[pos2]:
found = True
else:
pos2 = pos2 + 1
if found:
a_list[pos2] = None
else:
still_ok = False
pos1 = pos1 + 1
return still_ok
print(anagram_solution1('abcd','dcba'))
To analyze this algorithm, we need to note that each of the 𝑛 characters in s1 will cause an
iteration through up to 𝑛 characters in the list from s2
...
The number of visits then becomes the sum of
48
Chapter 2
...
0
the integers from 1 to 𝑛
...
1
2
can be ignored
...
So, if we
begin by sorting each string alphabetically, from a to z, we will end up with the same string if
the original two strings are anagrams
...
Again, in Python
we can use the built-in sort method on lists by simply converting each string to a list at the
start
...
sort()
a_list2
...
However, the two calls to the
Python sort method are not without their own cost
...
In the
end, this algorithm will have the same order of magnitude as that of the sorting process
...
2
...
For
the anagram detection problem, we can simply generate a list of all possible strings using the
2
...
What Is Algorithm Analysis?
49
Problem Solving with Algorithms and Data Structures, Release 3
...
However, there is a difficulty with this approach
...
The total number of candidate
strings is 𝑛 * (𝑛 − 1) * (𝑛 − 2) * · · · * 3 * 2 * 1, which is 𝑛!
...
It turns out that 𝑛! grows even faster than 2 𝑛 as 𝑛 gets large
...
If we processed one possibility every second, it would still take us 77, 146, 816, 596 years to go through
the entire list
...
2
...
4 Solution 4: Count and Compare
Our final solution to the anagram problem takes advantage of the fact that any two anagrams
will have the same number of a’s, the same number of b’s, the same number of c’s, and so on
...
Since there are 26 possible characters, we can use a list of 26 counters,
one for each possible character
...
In the end, if the two lists of counters are identical, the strings must be
anagrams
...
However, unlike the first solution, none of them
are nested
...
The third
iteration, comparing the two lists of counts, always takes 26 steps since there are 26 possible
50
Chapter 2
...
0
characters in the strings
...
That is 𝑂(𝑛)
...
Before leaving this example, we need to say something about space requirements
...
In other words, this algorithm sacrificed space in order
to gain time
...
On many occasions you will need to make decisions between
time and space trade-offs
...
However,
if the underlying alphabet had millions of characters, there would be more concern
...
Self Check
Q-1: Given the following code fragment, what is its Big-O running time?
test = 0
for i in range(n):
for j in range(n):
test = test + i * j
1
...
𝑂(𝑛2 )
3
...
𝑂(𝑛3 )
Q-2: Given the following code fragment what is its Big-O running time?
test = 0
for i in range(n):
test = test + 1
for j in range(n):
test = test - 1
1
...
𝑂(𝑛2 )
3
...
𝑂(𝑛3 )
Q-3: Given the following code fragment what is its Big-O running time?
i = n
while i > 0:
k = 2 + 2
i = i // 2
2
...
What Is Algorithm Analysis?
51
Problem Solving with Algorithms and Data Structures, Release 3
...
𝑂(𝑛)
2
...
𝑂(log 𝑛)
4
...
3 Performance of Python Data Structures
Now that you have a general idea of Big-O notation and the differences between the different
functions, our goal in this section is to tell you about the Big-O performance for the operations
on Python lists and dictionaries
...
It is important for you
to understand the efficiency of these Python data structures because they are the building blocks
we will use as we implement other data structures in the remainder of the book
...
In later chapters you will see
some possible implementations of both lists and dictionaries and how the performance depends
on the implementation
...
3
...
Each of these choices could have an impact on how fast list operations perform
...
Of course they also tried to make the less common operations fast,
but when a tradeoff had to be made the performance of a less common operation was often
sacrificed in favor of the more common operation
...
Both of these operations take the same amount of time no matter how large the list becomes
...
Another very common programming task is to grow a list
...
You can use the append method or the concatenation operator
...
However, the concatenation operator is 𝑂(𝑘) where 𝑘 is the size of the list that is
being concatenated
...
Let us look at four different ways we might generate a list of n numbers starting with 0
...
Next, we will try creating the list using list comprehension and finally,
and perhaps the most obvious way, using the range function wrapped by a call to the list
constructor
...
def test1():
l = []
52
Chapter 2
...
0
for i in range(1000):
l = l + [i]
def test2():
l = []
for i in range(1000):
l
...
The timeit module is designed to allow Python developers to make cross-platform
timing measurements by running functions in a consistent environment and using timing mechanisms that are as similar as possible across operating systems
...
The
first parameter is a Python statement that you want to time; the second parameter is a statement
that will run once to set up the test
...
By default timeit will try to run the statement
one million times
...
However, since it executes the statement a million times you can
read the result as the number of microseconds to execute the test one time
...
The following session shows how long it takes to run each of our test
functions 1000 times
...
Timer when defining a Timer object
t1 = Timer("test1()", "from __main__ import test1")
print("concat ",t1
...
timeit(number=1000), "milliseconds")
t3 = Timer("test3()", "from __main__ import test3")
print("comprehension ",t3
...
timeit(number=1000), "milliseconds")
concat 6
...
306292057037 milliseconds
comprehension 0
...
0655000209808 milliseconds
2
...
Performance of Python Data Structures
53
Problem Solving with Algorithms and Data Structures, Release 3
...
The setup statement may look very strange to you, so let us consider it in more detail
...
In this case the statement
from __main__ import test1 imports the function test1 from the __main__ namespace into the namespace that timeit sets up for the timing experiment
...
From the experiment above it is clear that the append operation at 0
...
54 milliseconds
...
It is interesting to note that the list comprehension is twice as fast as
a for loop with an append operation
...
So it would not be accurate to say that the concatenation operation takes 6
...
54 milliseconds
...
Now that we have seen how performance can be measured concretely you can look at Table 2
...
After thinking carefully about
Table 2
...
When pop is called
on the end of the list it takes 𝑂(1) but when pop is called on the first element in the list or
anywhere in the middle it is 𝑂(𝑛)
...
When an item is taken from the front of the list, in Python’s implementation, all the other
elements in the list are shifted one position closer to the beginning
...
2 you will see that this implementation also allows the index
operation to be 𝑂(1)
...
As a way of demonstrating this difference in performance let us do another experiment using
the timeit module
...
We will also want to measure this time for lists of
different sizes
...
The code below shows one attempt to measure the difference between the two uses of pop
...
0003 milliseconds, whereas
popping from the beginning takes 4
...
For a list of two million elements this is a
factor of 16, 000
...
The first is the statement from
__main__ import $x$
...
This approach allows us to time just the single pop statement and
get the most accurate measure of the time for that single operation
...
Algorithm Analysis
Problem Solving with Algorithms and Data Structures, Release 3
...
2: Big-O Efficiency of Python List Operators
1000 times it is also important to point out that the list is decreasing in size by 1 each time
through the loop
...
05%
...
pop(0)",
"from __main__ import x")
pop_end = Timer("x
...
timeit(number=1000)
4
...
timeit(number=1000)
0
...
To validate that claim we need to look at
the performance of both calls over a range of list sizes
...
pop(0)", "from __main__ import x")
pop_end = Timer("x
...
timeit(number=1000)
x = list(range(i))
2
...
Performance of Python Data Structures
55
Problem Solving with Algorithms and Data Structures, Release 3
...
3: Comparing the Performance of pop and pop(0)
pz = pop_zero
...
5f, %15
...
3 shows the results of our experiment
...
This is
exactly what we would expect to see for a 𝑂(𝑛) and 𝑂(1) algorithm
...
That is why the loop runs the test one thousand times in the first place to statistically gather
enough information to make the measurement reliable
...
3
...
As you probably recall, dictionaries
differ from lists in that you can access items in a dictionary by a key rather than a position
...
The thing that is
most important to notice right now is that the get item and set item operations on a dictionary
are 𝑂(1)
...
Checking to see
whether a key is in the dictionary or not is also 𝑂(1)
...
Algorithm Analysis
Problem Solving with Algorithms and Data Structures, Release 3
...
3: Big-O Efficiency of Python Dictionary Operations
is summarized in Table 2
...
One important side note on dictionary performance is that the efficiencies we provide in the table are for average performance
...
For our last performance experiment we will compare the performance of the contains operation
between lists and dictionaries
...
The experiment we will use to
compare the two is simple
...
Then we will pick
numbers at random and check to see if the numbers are in the list
...
We will repeat the same experiment for a dictionary that contains numbers as the keys
...
The code below implements this comparison
...
The difference is that on line 7 x is a list, and on line 9 x
is a dictionary
...
Timer("random
...
timeit(number=1000)
x = {j:None for j in range(i)}
d_time = t
...
3f,%10
...
4 summarizes the results
...
For the
smallest list size of 10, 000 elements a dictionary is 89
...
For the largest
list size of 990, 000 elements the dictionary is 11, 603 times faster! You can also see that the
time it takes for the contains operator on the list grows linearly with the size of the list
...
It can also be seen that the
2
...
Performance of Python Data Structures
57
Problem Solving with Algorithms and Data Structures, Release 3
...
4: Comparing the in Operator for Python Lists and Dictionaries
time for the contains operator on a dictionary is constant even as the dictionary size grows
...
004 milliseconds and for the
dictionary size of 990, 000 it also took 0
...
Since Python is an evolving language, there are always changes going on behind the scenes
...
As of this writing the Python wiki has a nice time complexity page that can be found
at the Time Complexity Wiki
...
list
...
list
...
list
...
list[10]
5
...
Algorithm Analysis
Problem Solving with Algorithms and Data Structures, Release 3
...
‘𝑥’ in my_dict
2
...
my_dict[‘𝑥’] == 10
4
...
all of the above are 𝑂(1)
2
...
• Big-O notation allows algorithms to be classified by their dominant process with respect
to the size of the problem
...
5 Key Terms
average case
checking off
log linear
quadratic
Big-O notation
exponential
logarithmic
time complexity
brute force
linear
order of magnitude
worst case
2
...
Give the Big-O performance of the following code fragment:
for i in range(n):
for j in range(n):
k = 2 + 2
2
...
Give the Big-O performance of the following code fragment:
i = n
while i > 0:
k = 2 + 2
i = i // 2
4
...
4
...
0
for i in range(n):
for j in range(n):
for k in range(n):
k = 2 + 2
5
...
Give the Big-O performance of the following code fragment:
for
k
for
k
for
k
i
=
j
=
k
=
in range(n):
2 + 2
in range(n):
2 + 2
in range(n):
2 + 2
2
...
Devise an experiment to verify that the list index operator is 𝑂(1)
...
Devise an experiment to verify that get item and set item are 𝑂(1) for dictionaries
...
Devise an experiment that compares the performance of the del operator on lists and
dictionaries
...
Given a list of numbers in random order write a linear time algorithm to find the 𝑘th
smallest number in the list
...
5
...
Algorithm Analysis
CHAPTER
THREE
BASIC DATA STRUCTURES
3
...
• To be able to implement the ADTs stack, queue, and deque using Python lists
...
• To understand prefix, infix, and postfix expression formats
...
• To use stacks to convert expressions from infix to postfix
...
• To be able to recognize problem properties where stacks, queues, and deques are appropriate data structures
...
• To be able to compare the performance of our linked list implementation with Python’s
list implementation
...
2 What Are Linear Structures?
We will begin our study of data structures by considering four simple but very powerful concepts
...
Once an item is added, it stays in that
position relative to the other elements that came before and came after it
...
Linear structures can be thought of as having two ends
...
” You could also call them
the “top” and the “bottom
...
What distinguishes
one linear structure from another is the way in which items are added and removed, in particular
the location where these additions and removals occur
...
0
Figure 3
...
Some structures might allow items to be removed from
either end
...
They
appear in many algorithms and can be used to solve a variety of important problems
...
3 Stacks
3
...
1 What is a Stack?
A stack (sometimes called a “push-down stack”) is an ordered collection of items where the
addition of new items and the removal of existing items always takes place at the same end
...
” The end opposite the top is known as the “base
...
The most recently added item is the
one that is in position to be removed first
...
It provides an ordering based on length of time in the collection
...
Many examples of stacks occur in everyday situations
...
Imagine a stack of books on a desk (Figure 3
...
The only book whose cover is visible
is the one on top
...
Figure 3
...
This one contains a number of primitive Python
data objects
...
Assume you start out with a clean desktop
...
You are constructing a stack
...
The order that they are removed is exactly the reverse of the order
that they were placed
...
The order of insertion is the reverse of the order of removal
...
3 shows
the Python data object stack as it was created and then again as items are removed
...
Considering this reversal property, you can perhaps think of examples of stacks that occur as
62
Chapter 3
...
0
Figure 3
...
3: The Reversal Property of Stacks
3
...
Stacks
63
Problem Solving with Algorithms and Data Structures, Release 3
...
is_empty()
s
...
push('dog')
s
...
push(True)
s
...
is_empty()
s
...
4)
s
...
pop()
s
...
4]
[4,'dog',True]
[4,'dog']
[4,'dog']
True
'dog'
3
False
8
...
1: Sample Stack Operations
you use your computer
...
As you navigate
from web page to web page, those pages are placed on a stack (actually it is the URLs that are
going on the stack)
...
If you click on the Back button, you begin to move in reverse order
through the pages
...
4 The Stack Abstract Data Type
The stack abstract data type is defined by the following structure and operations
...
” Stacks are ordered LIFO
...
• Stack() creates a new stack that is empty
...
• push(item) adds a new item to the top of the stack
...
• pop() removes the top item from the stack
...
The stack is modified
...
It needs no parameters
...
• is_empty() tests to see whether the stack is empty
...
• size() returns the number of items on the stack
...
For example, if s is a stack that has been created and starts out empty, then Table 3
...
Under stack contents, the top item is listed at the far
right
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
4
...
Recall that when we give an abstract data type a physical
implementation we refer to the implementation as a data structure
...
The stack operations are implemented as methods
...
We will use a list
...
For example, if we have the list [2, 5, 3, 6, 7, 4], we need only to decide which end of
the list will be considered the top of the stack and which will be the base
...
The following stack implementation assumes that the end of the list will hold the top element
of the stack
...
pop operations will manipulate that same end
...
items = []
def is_empty(self):
return self
...
items
...
items
...
items[len(self
...
items)
Remember that nothing happens when we click the run button other than the definition of the
class
...
shows the Stack class in action as we
perform the sequence of operations from Table 3
...
s = Stack()
print(s
...
push(4)
s
...
4
...
0
print(s
...
push(True)
print(s
...
is_empty())
s
...
4)
print(s
...
pop())
print(s
...
In this case, the previous pop and append
methods would no longer work and we would have to index position 0 (the first item in the
list) explicitly using pop and insert
...
class Stack:
def __init__(self):
self
...
items == []
def push(self, item):
self
...
insert(0, item)
def pop(self):
return self
...
pop(0)
def peek(self):
return self
...
items)
s = Stack()
s
...
push('true')
print(s
...
However, even though the
stack will work either way, if we consider the performance of the two implementations, there
is definitely a difference
...
This
means that the first implementation will perform push and pop in constant time no matter how
many items are on the stack
...
Clearly, even
though the implementations are logically equivalent, they would have very different timings
when performing benchmark testing
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
4
...
push('x')
m
...
pop()
m
...
peek()
1
...
‘y’
3
...
The stack is empty
Given the following sequence of stack operations, what is the top item on the stack when the
sequence is complete?
m = Stack()
m
...
push('y')
m
...
is_empty():
m
...
pop()
1
...
the stack is empty
3
...
‘z’
Write a function rev_string(my_str) that uses a stack to reverse the characters in a string
...
4
...
You have
no doubt written arithmetic expressions such as
(5 + 6) * (7 + 8)/(4 + 3)
where parentheses are used to order the performance of operations
...
4
...
0
Figure 3
...
Lisp is
notorious for using lots and lots of parentheses
...
Balanced parentheses means that each opening symbol has a corresponding closing symbol and the pairs
of parentheses are properly nested
...
The challenge then is to write an algorithm that will read a string of parentheses from left to
right and decide whether the symbols are balanced
...
As you process symbols from left to right, the most recent opening
parenthesis must match the next closing symbol (see Figure 3
...
Also, the first opening symbol
processed may have to wait until the very last symbol for its match
...
This
is a clue that stacks can be used to solve the problem
...
Starting with an empty stack, process the
parenthesis strings from left to right
...
If, on the other
hand, a symbol is a closing parenthesis, pop the stack
...
If at any time there is
no opening symbol on the stack to match a closing symbol, the string is not balanced prop68
Chapter 3
...
0
erly
...
1
import Stack #import the Stack class as previously defined
2
3
4
5
6
7
8
9
10
11
12
13
14
15
def par_checker(symbol_string):
s = Stack()
balanced = True
index = 0
while index < len(symbol_string) and balanced:
symbol = symbol_string[index]
if symbol == "(":
s
...
is_empty():
balanced = False
else:
s
...
is_empty():
return True
else:
return False
23
24
25
print(par_checker('((()))'))
print(par_checker('(()'))
This function, par_checker, assumes that a Stack class is available and returns a boolean result
as to whether the string of parentheses is balanced
...
If the current symbol
is (, then it is pushed on the stack (lines 9–10)
...
The returned value is not used since we know it must be an opening
symbol seen earlier
...
3
...
4 Balanced Symbols (A General Case)
he balanced parentheses problem shown above is a specific case of a more general situation
that arises in many programming languages
...
For example, in Python
square brackets, [ and ], are used for lists; curly braces, { and }, are used for dictionaries;
and parentheses, ( and ), are used for tuples and arithmetic expressions
...
Strings of symbols
such as
{ { ( [ ] [ ] ) } ( ) }
3
...
The Stack Abstract Data Type
69
Problem Solving with Algorithms and Data Structures, Release 3
...
Compare those with the following strings that are not balanced:
( [ ) ]
( ( ( ) ] ) )
[ { ( ) ]
The simple parentheses checker from the previous section can easily be extended to handle
these new types of symbols
...
When a closing symbol
does appear, the only difference is that we must check to be sure that it correctly matches the
type of the opening symbol on top of the stack
...
Once again, if the entire string is processed and nothing is left on the stack, the
string is correctly balanced
...
Each symbol that is
removed from the stack must be checked to see that it matches the current closing symbol
...
1
import Stack # As previously defined
2
3
# Completed extended par_checker for: [,{,(,),},]
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
def par_checker(symbol_string):
s = Stack()
balanced = True
index = 0
while index < len(symbol_string) and balanced:
symbol = symbol_string[index]
if symbol in "([{":
s
...
is_empty():
balanced = False
else:
top = s
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
is_empty():
return True
else:
return False
25
26
27
28
29
def matches(open, close):
opens = "([{"
closes = ")]}"
return opens
...
index(close)
30
31
32
33
print(par_checker('{{([][])}()}'))
print(par_checker('[{()]'))
These two examples show that stacks are very important data structures for the processing
of language constructs in computer science
...
There are a number of other
important uses for stacks in computer science
...
3
...
5 Converting Decimal Numbers to Binary Numbers
In your study of computer science, you have probably been exposed in one way or another to
the idea of a binary number
...
Without
the ability to convert back and forth between common representations and binary numbers, we
would need to interact with computers in very awkward ways
...
They are used in computer programs and computation
all the time
...
The decimal number 2331 0 and its corresponding binary equivalent
111010012 are interpreted respectively as 2 * 102 + 3 * 101 + 3 * 100 and 1 * 27 + 1 * 26 + 1 *
25 + 0 * 24 + 1 * 23 + 0 * 22 + 0 * 21 + 1 * 20
But how can we easily convert integer values into binary numbers? The answer is an algorithm
called “Divide by 2” that uses a stack to keep track of the digits for the binary result
...
A simple
iteration then continually divides the decimal number by 2 and keeps track of the remainder
...
An even value
will have a remainder of 0
...
An odd value will have a
remainder of 1 and will have the digit 1 in the ones place
...
As shown in Figure 3
...
The Python code in below implements the Divide by 2 algorithm
...
Line 7 uses the
built-in modulo operator, %, to extract the remainder and line 8 then pushes it on the stack
...
Line
3
...
The Stack Abstract Data Type
71
Problem Solving with Algorithms and Data Structures, Release 3
...
5: Decimal-To-Binary Conversion)
11 creates an empty string
...
The binary string is then returned
...
push(rem)
dec_number = dec_number // 2
6
7
8
9
10
bin_string = ""
while not rem_stack
...
pop())
11
12
13
14
return bin_string
15
16
17
print(divide_by_2(42))
he algorithm for binary conversion can easily be extended to perform the conversion for any
base
...
The most
common of these are binary, octal (base 8), and hexadecimal (base 16)
...
The “Divide by 2” idea is simply replaced with a more
general “Divide by base
...
The remainders are still
pushed onto the stack until the value being converted becomes 0
...
Base 2 through base 10
72
Chapter 3
...
0
numbers need a maximum of 10 digits, so the typical digit characters 0, 1, 2, 3, 4, 5, 6, 7,
8, and 9 work fine
...
We can no longer
simply use the remainders, as they are themselves represented as two-digit decimal numbers
...
import Stack # As previously defined
def base_converter(dec_number, base):
digits = "0123456789ABCDEF"
rem_stack = Stack()
while dec_number > 0:
rem = dec_number % base
rem_stack
...
is_empty():
new_string = new_string + digits[rem_stack
...
For
example, hexadecimal uses the ten decimal digits along with the first six alphabet characters
for the 16 digits
...
0 is at position 0, 1 is at position 1, A is at position 10, B is at position
11, and so on
...
For example, if the
remainder 13 is removed from the stack, the digit D is appended to the resulting string
...
4
...
What is the value of 25 expressed as an octal number?
2
...
What is the value of 26 expressed in base 26?
3
...
7 Infix, Prefix, and Postfix Expressions
When you write an arithmetic expression such as 𝐵*𝐶, the form of the expression provides you
with information so that you can interpret it correctly
...
4
...
0
in the expression
...
Consider another infix example, 𝐴 + 𝐵 * 𝐶
...
Which operands do they work on? Does the + work on 𝐴
and 𝐵 or does the * take 𝐵 and 𝐶? The expression seems ambiguous
...
The reason for this is that you know something about the
operators + and *
...
Operators of higher precedence are
used before operators of lower precedence
...
The precedence order for arithmetic operators places multiplication
and division above addition and subtraction
...
Let’s interpret the troublesome expression 𝐴 + 𝐵 * 𝐶 using operator precedence
...
(𝐴 + 𝐵) * 𝐶 would force the addition of 𝐴
and 𝐵 to be done first before the multiplication
...
Although all this may be obvious to you, remember that computers need to know exactly what
operators to perform and in what order
...
This type of expression uses one pair of parentheses for each operator
...
There is also no
need to remember any precedence rules
...
𝐴 + 𝐵 + 𝐶 + 𝐷 can be written
as (((𝐴 + 𝐵) + 𝐶) + 𝐷) since the addition operations associate from left to right
...
Consider the infix expression 𝐴 + 𝐵
...
Likewise, we could move
the operator to the end
...
These look a bit strange
...
Prefix expression notation requires that all operators
precede the two operands that they work on
...
A few more examples should help to make this a
bit clearer (see Table 3
...
𝐴 + 𝐵 * 𝐶 would be written as +𝐴 * 𝐵𝐶 in prefix
...
The addition
operator then appears before the 𝐴 and the result of the multiplication
...
Again, the order of operations is preserved
since the * appears immediately after the 𝐵 and the 𝐶, denoting that * has precedence, with
+ coming after
...
Now consider the infix expression (𝐴 + 𝐵) * 𝐶
...
However, when
74
Chapter 3
...
0
Infix Expression
𝐴+ 𝐵
𝐴+ 𝐵* 𝐶
Prefix Expression
+𝐴𝐵
+𝐴 * 𝐵𝐶
Postfix Expression
𝐴𝐵+
𝐴𝐵𝐶 * +
Table 3
...
3: An Expression with Parentheses
𝐴+𝐵 was written in prefix, the addition operator was simply moved before the operands, +𝐴𝐵
...
The multiplication
operator is moved in front of the entire expression, giving us * + 𝐴𝐵𝐶
...
The multiplication can be done to that result and the
remaining operand 𝐶
...
Consider these three expressions again (see Table 3
...
Something very important has happened
...
Only infix notation requires the additional symbols
...
In many ways, this makes infix the least desirable notation to use
...
4 shows some additional examples of infix expressions and the equivalent prefix and
postfix expressions
...
3
...
8 Conversion of Infix Expressions to Prefix and Postfix
So far, we have used ad hoc methods to convert between infix expressions and the equivalent
prefix and postfix expression notations
...
The first technique that we will consider uses the notion of a fully parenthesized expression that
was discussed earlier
...
On closer observation, however, you
can see that each parenthesis pair also denotes the beginning and the end of an operand pair
with the corresponding operator in the middle
...
If we were to move the
Infix Expression
Prefix Expression
𝐴+ 𝐵* 𝐶 + 𝐷
+ + 𝐴 * 𝐵𝐶𝐷
(𝐴 + 𝐵) * (𝐶 + 𝐷)
* + 𝐴𝐵 + 𝐶𝐷
𝐴* 𝐵+ 𝐶 * 𝐷
+ * 𝐴𝐵 * 𝐶𝐷
𝐴+ 𝐵+ 𝐶 + 𝐷
+ + +𝐴𝐵𝐶𝐷
Postfix Expression
𝐴𝐵𝐶 * +𝐷+
𝐴𝐵 + 𝐶𝐷 + *
𝐴𝐵 * 𝐶𝐷 * +
𝐴𝐵 + 𝐶 + 𝐷+
Table 3
...
4
...
0
Figure 3
...
7: Moving Operators to the Left for Prefix Notation)
multiplication symbol to that position and remove the matching left parenthesis, giving us
𝐵𝐶*, we would in effect have converted the subexpression to postfix notation
...
6)
If we do the same thing but instead of moving the symbol to the position of the right parenthesis,
we move it to the left, we get prefix notation (see Figure 3
...
The position of the parenthesis
pair is actually a clue to the final position of the enclosed operator
...
Then move the enclosed
operator to the position of either the left or the right parenthesis depending on whether you
want prefix or postfix notation
...
Figure 3
...
3
...
9 General Infix-to-Postfix Conversion
We need to develop an algorithm to convert any infix expression to a postfix expression
...
Consider once again the expression 𝐴 + 𝐵 * 𝐶
...
We have already noted that the operands 𝐴, 𝐵, and 𝐶 stay in their relative positions
...
Let us look again at the operators in the infix
expression
...
However, in the postfix
expression, + is at the end since the next operator, *, has precedence over addition
...
8: Converting a Complex Expression to Prefix and Postfix Notations)
76
Chapter 3
...
0
of the operators in the original expression is reversed in the resulting postfix expression
...
Also, the order of these saved operators may need to
be reversed due to their precedence
...
Since the addition operator comes before the multiplication operator and has
lower precedence, it needs to appear after the multiplication operator is used
...
What about (𝐴 + 𝐵) * 𝐶? Recall that 𝐴𝐵 + 𝐶* is the postfix equivalent
...
In this case, when we see *, + has
already been placed in the result expression because it has precedence over * by virtue of the
parentheses
...
When we see
a left parenthesis, we will save it to denote that another operator of high precedence will be
coming
...
When that right parenthesis does
appear, the operator can be popped from the stack
...
This
will provide the reversal that we noted in the first example
...
Whenever we read a new operator, we will need to consider
how that operator compares in precedence with the operators, if any, already on the stack
...
The operator tokens are
*, /, +, and −, along with the left and right parentheses, ( and )
...
The following steps will produce a string of
tokens in postfix order
...
Create an empty stack called op_stack for keeping operators
...
2
...
3
...
• If the token is an operand, append it to the end of the output list
...
• If the token is a right parenthesis, pop the op_stack until the corresponding left
parenthesis is removed
...
• If the token is an operator, *, /, +, or −, push it on the op_stack
...
When the input expression has been completely processed, check the op_stack
...
Figure 3
...
Note
that the first * operator is removed upon seeing the + operator
...
At the end of the
3
...
The Stack Abstract Data Type
77
Problem Solving with Algorithms and Data Structures, Release 3
...
9: Converting 𝐴 * 𝐵 + 𝐶 * 𝐷 to Postfix Notation)
infix expression the stack is popped twice, removing both operators and placing + as the last
operator in the postfix expression
...
This dictionary will map each operator to an integer that
can be compared against the precedence levels of other operators (we have arbitrarily used the
integers 3, 2, and 1)
...
This way any
operator that is compared against it will have higher precedence and will be placed on top of it
...
The complete conversion
function is shown below
...
split()
13
for token in token_list:
if token in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" or token in
"0123456789":
postfix_list
...
push(token)
elif token == ')':
top_token = op_stack
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
append(top_token)
top_token = op_stack
...
is_empty()) and \
(prec[op_stack
...
append(op_stack
...
push(token)
29
30
31
32
while not op_stack
...
append(op_stack
...
join(postfix_list)
33
34
35
36
print(infix_to_postfix("A * B + C * D"))
print(infix_to_postfix("( A + B ) * C - ( D - E ) * ( F + G )"))
...
>>> infix_to_postfix("( A + B ) * ( C + D )")
'A B + C D + *'
>>> infix_to_postfix("( A + B ) * C")
'A B + C *'
>>> infix_to_postfix("A + B * C")
'A B C * +'
>>>
3
...
10 Postfix Evaluation
As a final stack example, we will consider the evaluation of an expression that is already in
postfix notation
...
However, as you scan
the postfix expression, it is the operands that must wait, not the operators as in the conversion
algorithm above
...
To see this in more detail, consider the postfix expression 456 * +
...
At this point, you are still unsure
what to do with them until you see the next symbol
...
In this case, the next symbol is another operand
...
Now we see an operator, *
...
By popping the stack twice, we can get the proper operands
and then perform the multiplication (in this case getting the result 30)
...
When the final operator is processed, there will be only
one value left on the stack
...
Figure 3
...
3
...
The Stack Abstract Data Type
79
Problem Solving with Algorithms and Data Structures, Release 3
...
10: Sack Contents During Evaluation)
Figure 3
...
11 shows a slightly more complex example, 78+32+/
...
First, the stack size grows, shrinks, and then grows again as the subexpressions
are evaluated
...
Recall that the
operands in the postfix expression are in their original order since postfix changes only the
placement of operators
...
Since division is not a commutative operator, in other words 15/5 is not the same as
5/15, we must be sure that the order of the operands is not switched
...
The operators are *,
/, +, and −, and the operands are assumed to be single-digit integer values
...
1
...
2
...
3
...
• If the token is an operand, convert it from a string to an integer and push the value
onto the operand_stack
...
Pop the
operand_stack twice
...
Perform the arithmetic operation
...
80
Chapter 3
...
0
4
...
Pop
the operand_stack and return the value
...
To assist
with the arithmetic, a helper function do_math is defined that will take two operands and an
operator and then perform the proper arithmetic operation
...
split()
for token in token_list:
if token in "0123456789":
operand_stack
...
pop()
operand1 = operand_stack
...
push(result)
return operand_stack
...
Using these programs as a starting
point, you can easily see how error detection and reporting can be included
...
Self Check
1
...
2
...
4
...
0
Figure 3
...
5 Queues
We now turn our attention to another linear data structure
...
Like
stacks, queues are relatively simple and yet can be used to solve a wide range of important
problems
...
5
...
” As an element enters the queue it starts at the rear and makes its way toward the
front, waiting until that time when it is the next element to be removed
...
The item that
has been in the collection the longest is at the front
...
It is also known as “first-come first-served
...
We wait in a line for a movie, we wait in the check-out line at a grocery store, and we wait in
the cafeteria line (so that we can pop the tray stack)
...
There is no jumping in the
middle and no leaving before you have waited the necessary amount of time to get to the front
...
12 shows a simple queue of Python data objects
...
Our computer laboratory has 30
computers networked with a single printer
...
The first task in is the next to be
completed
...
We will explore this interesting example in more detail later
...
The scheduling of what gets done next is typically based on
a queuing algorithm that tries to execute programs as quickly as possible and serve as many
users as it can
...
This is due to the computer doing other work at that moment
...
82
Chapter 3
...
0
Queue Operation
Queue Contents
Return Value
q
...
enqueue(4)
q
...
enqueue(True)
q
...
is_empty()
q
...
4)
q
...
dequeue()
q
...
4,True,'dog',4]
[8
...
4,True]
[8
...
5: Example Queue Operations
3
...
2 The Queue Abstract Data Type
The queue abstract data type is defined by the following structure and operations
...
” Queues maintain a FIFO
ordering property
...
• Queue() creates a new queue that is empty
...
• enqueue(item) adds a new item to the rear of the queue
...
• dequeue() removes the front item from the queue
...
The queue is modified
...
It needs no parameters and returns a
boolean value
...
It needs no parameters and returns an
integer
...
5 shows the results of a sequence of queue operations
...
4 was the first item enqueued so it is the first item returned
by dequeue
...
5
...
As before, we will use the power and simplicity of the list collection to build the
internal representation of the queue
...
The
implementation shown below assumes that the rear is at position 0 in the list
...
The pop operation
3
...
Queues
83
Problem Solving with Algorithms and Data Structures, Release 3
...
Recall that this also
means that enqueue will be 𝑂(𝑛) and dequeue will be 𝑂(1)
...
items = []
def is_empty(self):
return self
...
items
...
items
...
items)
Self Check
Suppose you have the following series of queue operations:
q = Queue()
q
...
enqueue('dog')
q
...
dequeue()
What items are left in the queue?
1
...
‘dog’, 3
3
...
‘hello’, ‘dog’, 3
3
...
4 Simulation: Hot Potato
One of the typical applications for showing a queue in action is to simulate a real situation that
requires data to be managed in a FIFO manner
...
In this game (see Figure 3
...
At a certain point in the game, the action is stopped
and the child who has the item (the potato) is removed from the circle
...
84
Chapter 3
...
0
Figure 3
...
Based on a legend
about the famous first-century historian Flavius Josephus, the story is told that in the Jewish
revolt against Rome, Josephus and 39 of his comrades held out against the Romans in a cave
...
They arranged themselves in a circle
...
Josephus, according to the legend, was among other
things an accomplished mathematician
...
When the time came, instead of killing himself, he joined the Roman side
...
Some count every third man and some allow
the last man to escape on a horse
...
We will implement a general simulation of Hot Potato
...
It will return the name of the last person
remaining after repetitive counting by num
...
To simulate the circle, we will use a queue (see Figure 3
...
Assume that the child holding
the potato will be at the front of the queue
...
She will
then wait until all the others have been at the front before it will be her turn again
...
This process will continue until only one name remains (the size of the queue
is 1)
...
import Queue
# As previously defined
def hot_potato(name_list, num):
sim_queue = Queue()
for name in name_list:
sim_queue
...
5
...
0
Figure 3
...
size() > 1:
for i in range(num):
sim_queue
...
dequeue())
sim_queue
...
dequeue()
print(hot_potato(["Bill", "David", "Susan", "Jane", "Kent",
"Brad"], 7))
Note that in this example the value of the counting constant is greater than the number of names
in the list
...
Also, notice that the list is loaded into the queue such
that the first name on the list will be at the front of the queue
...
A variation of this implementation,
described in the exercises, allows for a random counter
...
5
...
Recall that as students send printing tasks to the shared printer, the tasks
are placed in a queue to be processed in a first-come first-served manner
...
The most important of these might be whether the printer is capable of
handling a certain amount of work
...
Consider the following situation in a computer science laboratory
...
These students typically print up to twice
during that time, and the length of these tasks ranges from 1 to 20 pages
...
The printer could be
switched to give better quality, but then it would produce only five pages per minute
...
What page rate should be used?
86
Chapter 3
...
0
Figure 3
...
We will need to construct
representations for students, printing tasks, and the printer (Figure 3
...
As students submit
printing tasks, we will add them to a waiting list, a queue of print tasks attached to the printer
...
Of interest for us is the average amount of time students will wait for their
papers to be printed
...
To model this situation we need to use some probabilities
...
If each length from 1 to 20 is equally likely, the actual
length for a print task can be simulated by using a random number between 1 and 20 inclusive
...
If there are 10 students in the lab and each prints twice, then there are 20 print tasks per hour
on average
...
Twenty tasks per hour means
that on average there will be one task every 180 seconds:
1 hour
1 minute
1 task
20 tasks
×
×
=
1 hour
60 minutes 60 seconds
180 seconds
For every second we can simulate the chance that a print task occurs by generating a random
number between 1 and 180 inclusive
...
Note that it is possible that many tasks could be created in a row or we may wait quite a while
for a task to appear
...
You want to simulate the real situation as
closely as possible given that you know general parameters
...
5
...
1
...
Each task will be given a timestamp upon its arrival
...
3
...
Queues
87
Problem Solving with Algorithms and Data Structures, Release 3
...
For each second (current_second):
• Does a new print task get created? If so, add it to the queue with the current_second
as the timestamp
...
– Subtract the timestamp from the current_second to compute the waiting time
for that task
...
– Based on the number of pages in the print task, figure out how much time will
be required
...
It also subtracts one
second from the time required for that task
...
3
...
3
...
7 Python Implementation
To design this simulation we will create classes for the three real-world objects described above:
Printer, Task, and PrintQueue
...
If it does, then it is busy
and the amount of time needed can be computed from the number of pages in the task
...
The tick method
decrements the internal timer and sets the printer to idle if the task is completed
...
page_rate = ppm
self
...
time_remaining = 0
def tick(self):
if self
...
time_remaining = self
...
time_remaining <= 0:
self
...
current_task != None:
return True
else:
return False
88
Chapter 3
...
0
def start_next(self,new_task):
self
...
time_remaining = new_task
...
page_rate
The Task class will represent a single printing task
...
We have chosen to use the randrange
function from the random module
...
randrange(1, 21)
18
>>> random
...
This
timestamp will represent the time that the task was created and placed in the printer queue
...
import random
class Task:
def __init__(self, time):
self
...
pages = random
...
timestamp
def get_pages(self):
return self
...
timestamp
The main simulation implements the algorithm described above
...
A boolean helper function, new_print_task, decides
whether a new printing task has been created
...
Print tasks
arrive once every 180 seconds
...
The simulation function allows us to set the total time and
the pages per minute for the printer
...
5
...
0
import Task
# As previously defined
import random
def simulation(num_seconds, pages_per_minute):
lab_printer = Printer(pages_per_minute)
print_queue = Queue()
waiting_times = []
for current_second in range(num_seconds):
if new_print_task():
task = Task(current_second)
print_queue
...
busy()) and (not print_queue
...
dequeue()
waiting_times
...
wait_time(current_second))
lab_printer
...
tick()
average_wait = sum(waiting_times) / len(waiting_times)
print("Average Wait %6
...
"
%(average_wait, print_queue
...
randrange(1, 181)
if num == 180:
return True
else:
return False
for i in range(10):
simulation(3600, 5)
When we run the simulation, we should not be concerned that the results are different each
time
...
We are interested in the
trends that may be occurring as the parameters to the simulation are adjusted
...
First, we will run the simulation for a period of 60 minutes (3, 600 seconds) using a page
rate of five pages per minute
...
Remember that
because the simulation works with random numbers each run will return different results
...
38 secs 2 tasks remaining
...
07 secs 1 tasks remaining
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
05 secs 2 tasks remaining
...
74 secs 1 tasks remaining
...
27 secs 0 tasks remaining
...
61 secs 5 tasks remaining
...
11 secs 1 tasks remaining
...
33 secs 0 tasks remaining
...
31 secs 3 tasks remaining
...
05 secs 1 tasks remaining
...
155 seconds
...
27 seconds and a maximum of 239
...
You may also notice that in only two of
the cases were all the tasks completed
...
>>>for i in range(10):
simulation(3600, 10)
Average
Average
Average
Average
Average
Average
Average
Average
Average
Average
Wait
Wait
Wait
Wait
Wait
Wait
Wait
Wait
Wait
Wait
1
...
7
...
28
...
13
...
12
...
6
...
22
...
12
...
7
...
18
...
The code to run the simulation is as follows:
import Queue
# As previously defined
import random
# Completed program for the printer simulation
class Printer:
def __init__(self, ppm):
self
...
current_task = None
self
...
current_task != None:
self
...
time_remaining - 1
if self
...
5
...
0
self
...
current_task != None:
return True
else:
return False
def start_next(self, new_task):
self
...
time_remaining = new_task
...
page_rate
class Task:
def __init__(self, time):
self
...
pages = random
...
timestamp
def get_pages(self):
return self
...
timestamp
def simulation(num_seconds, pages_per_minute):
lab_printer = Printer(pages_per_minute)
print_queue = Queue()
waiting_times = []
for current_second in range(num_seconds):
if new_print_task():
task = Task(current_second)
print_queue
...
busy()) and (not print_queue
...
dequeue()
waiting_times
...
wait_time(current_second))
lab_printer
...
tick()
average_wait = sum(waiting_times) / len(waiting_times)
print("Average Wait %6
...
"
%(average_wait, print_queue
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
randrange(1, 181)
if num == 180:
return True
else:
return False
for i in range(10):
simulation(3600, 5)
3
...
8 Discussion
We were trying to answer a question about whether the current printer could handle the task
load if it were set to print with a better quality but slower page rate
...
The output above shows that with 5 pages per minute printing, the average waiting time varied
from a low of 17 seconds to a high of 376 seconds (about 6 minutes)
...
In addition, in 8 out of 10 runs at 5
pages per minute there were print tasks still waiting in the queue at the end of the hour
...
Students cannot afford to wait that long for their papers, especially when
they need to be getting on to their next class
...
This type of simulation analysis allows us to answer many questions, commonly known as
“what if” questions
...
For example,
• What if enrolment goes up and the average number of students increases by 20?
• What if it is Saturday and students are not needing to get to class? Can they afford to
wait?
• What if the size of the average print task decreases since Python is such a powerful
language and programs tend to be much shorter?
These questions could all be answered by modifying the above simulation
...
Real data about the number of print tasks per hour and the number of students per hour
was necessary to construct a robust simulation
...
You make need to make some reasonable assumptions
about how this simulation was put together but what would you change? Modify the code
...
Change the code to reflect that
change
...
3
...
Queues
93
Problem Solving with Algorithms and Data Structures, Release 3
...
16: A Deque of Python Data Objects Queue)
3
...
However, unlike stack and queue, the deque (pronounced “deck”)
has very few restrictions
...
”
3
...
1 What Is a Deque?
A deque, also known as a double-ended queue, is an ordered collection of items similar to the
queue
...
What makes a deque different is the unrestrictive nature of adding and removing items
...
Likewise, existing items can be removed from
either end
...
Figure 3
...
It is important to note that even though the deque can assume many of the characteristics of
stacks and queues, it does not require the LIFO and FIFO orderings that are enforced by those
data structures
...
3
...
2 The Deque Abstract Data Type
The deque abstract data type is defined by the following structure and operations
...
The deque operations are given below
...
It needs no parameters and returns an empty
deque
...
It needs the item and returns
nothing
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
is_empty()
d
...
add_rear('dog')
d
...
add_front(True)
d
...
is_empty()
d
...
4)
d
...
remove_front()
[]
[4]
['dog',4,]
['dog',4,'cat']
['dog',4,'cat',True]
['dog',4,'cat',True]
['dog',4,'cat',True]
[8
...
4
True
Table 3
...
It needs the item and returns
nothing
...
It needs no parameters and returns
the item
...
• remove_rear() removes the rear item from the deque
...
The deque is modified
...
It needs no parameters and returns a
boolean value
...
It needs no parameters and returns an
integer
...
6 shows the results of a sequence of deque operations
...
It is very important to keep track of the front and the rear as you
move items in and out of the collection as things can get a bit confusing
...
6
...
Again, the Python list will provide a very nice set of methods upon
which to build the details of the deque
...
# Completed implementation of a deque ADT
class Deque:
def __init__(self):
self
...
items == []
def add_front(self, item):
3
...
Deques
95
Problem Solving with Algorithms and Data Structures, Release 3
...
items
...
items
...
items
...
items
...
items)
In remove_front we use the pop method to remove the last element from the list
...
Likewise, we need
to use the insert method in add_rear since the append method assumes the addition of a new
element to the end of the list
...
You
are also likely to observe that in this implementation adding and removing items from the front
is 𝑂(1) whereas adding and removing from the rear is 𝑂(𝑛)
...
Again, the important thing is
to be certain that we know where the front and rear are assigned in the implementation
...
6
...
A palindrome is a string that reads the same forward and backward, for
example, radar, toot, and madam
...
The solution to this problem will use a deque to store the characters of the string
...
At this
point, the deque will be acting very much like an ordinary queue
...
The front of the deque will hold the first character of
the string and the rear of the deque will hold the last character (see Figure 3
...
If we can keep matching first and the last items, we will eventually either run out of
characters or be left with a deque of size 1 depending on whether the length of the original
string was even or odd
...
import Deque
# As previously defined
def pal_checker(a_string):
char_deque = Deque()
for ch in a_string:
96
Chapter 3
...
0
Figure 3
...
add_rear(ch)
still_equal = True
while char_deque
...
remove_front()
last = char_deque
...
7 Lists
Throughout the discussion of basic data structures, we have used Python lists to implement
the abstract data types presented
...
However, not all programming
languages include a list collection
...
A list is a collection of items where each item holds a relative position with respect to the
others
...
We can consider
the list as having a first item, a second item, a third item, and so on
...
For simplicity we will
assume that lists cannot contain duplicate items
...
Note that we have written them as comma-delimited values,
3
...
Lists
97
Problem Solving with Algorithms and Data Structures, Release 3
...
Of course, Python would show this list as
[54, 26, 93, 17, 77, 31]
...
8 The Unordered List Abstract Data Type
The structure of an unordered list, as described above, is a collection of items where each item
holds a relative position with respect to the others
...
• List() creates a new list that is empty
...
• add(item) adds a new item to the list
...
Assume the
item is not already in the list
...
It needs the item and modifies the list
...
• search(item) searches for the item in the list
...
• is_empty() tests to see whether the list is empty
...
• size() returns the number of items in the list
...
• append(item) adds a new item to the end of the list making it the last item in the collection
...
Assume the item is not already in the
list
...
It needs the item and returns the index
...
• insert(pos,item) adds a new item to the list at position pos
...
Assume the item is not already in the list and there are enough existing items to
have position pos
...
It needs nothing and returns an item
...
• pop(pos) removes and returns the item at position pos
...
Assume the item is in the list
...
9 Implementing an Unordered List: Linked Lists
In order to implement an unordered list, we will construct what is commonly known as a linked
list
...
However, there is no requirement that we maintain that positioning in contiguous memory
...
18
...
If we can maintain some explicit information in each item, namely
98
Chapter 3
...
0
Figure 3
...
19: Relative Positions Maintained by Explicit Links
...
19, then the relative position of each item can be
expressed by simply following the link from one item to the next
...
Once we know where the first item is, the first item can tell us where the second is, and so on
...
Similarly, the last item needs
to know that there is no next item
...
9
...
Each node object must
hold at least two pieces of information
...
We will
call this the data field of the node
...
To construct a node, you need to supply the initial data value for the node
...
20)
...
21
...
class Node:
def __init__(self, init_data):
self
...
next = None
def get_data(self):
return self
...
9
...
0
Figure 3
...
21: A Typical Representation for a Node
...
next
def set_data(self, new_data):
self
...
next = new_next
We create Node objects in the usual way
...
get_data()
93
The special Python reference value None will play an important role in the Node class and
later in the linked list itself
...
Note in the constructor that a node is initially created with next set to None
...
It is always a good idea to explicitly assign None
to your initial next reference values
...
9
...
As long as we know where to find the first node (containing
the first item), each item after that can be found by successively following the next links
...
The following
code shows the constructor
...
class UnorderedList:
100
Chapter 3
...
0
Figure 3
...
23: Linked List of Integers
def __init__(self):
self
...
The assignment statement
>>> mylist = UnorderedList()
creates the linked list representation shown in Figure 3
...
As we discussed in the Node class,
the special reference None will again be used to state that the head of the list does not refer
to anything
...
23
...
In turn, that node holds a reference to the next node (the next item) and so on
...
Instead it
contains a single reference to only the first node in the linked structure
...
The result of the boolean expression self
...
Since a new list is empty, the constructor and the check for empty must
be consistent with one another
...
In Python, None can be compared to any reference
...
We will use this often in our
remaining methods
...
head == None
So, how do we get items into our list? We need to implement the add method
...
Since this list is unordered, the specific location of the new item with respect to
the other items already in the list is not important
...
With that
in mind, it makes sense to place the new item in the easiest location possible
...
9
...
0
Figure 3
...
All of the other nodes can only be reached by accessing the first node and then following next
links
...
In other words, we will make the new item the first item of the list and the existing
items will need to be linked to this new first item so that they follow
...
23 was built by calling the add method a number of times
...
add(31)
mylist
...
add(17)
mylist
...
add(26)
mylist
...
Also, since 54 is the last item added, it will
become the data value in the first node of the linked list
...
Each item of the list must reside in a node object
...
Now we must complete the process by linking
the new node into the existing structure
...
24
...
Now that the rest of the list has been properly attached to the new node, we can modify the
head of the list to refer to the new node
...
The order of the two steps described above is very important
...
25
...
def add(self, item):
temp = Node(item)
temp
...
head)
self
...
Traversal refers to the process of systematically visiting each
102
Chapter 3
...
0
Figure 3
...
Figure 3
...
node
...
As we visit
each node, we move the reference to the next node by “traversing” the next reference
...
Below we show the Python code for counting the number of
nodes in the list
...
At the start of the process we have not seen any nodes so the count is set to 0
...
As long as the current reference has not seen the
end of the list (None), we move current along to the next node via the assignment statement in
line 6
...
Every time current
moves to a new node, we add 1 to count
...
Figure 3
...
def size(self):
current = self
...
get_next()
return count
Searching for a value in a linked list implementation of an unordered list also uses the traversal
technique
...
In this case, however, we may not have to traverse all the
way to the end of the list
...
Also, if we do find the item, there is no need to continue
...
9
...
0
Figure 3
...
As in the size method, the
traversal is initialized to start at the head of the list (line 2)
...
Since we
have not found the item at the start of the traversal, found can be set to False (line 3)
...
As long as there are
more nodes to visit and we have not found the item we are looking for, we continue to check
the next node
...
If so, found can be set to True
...
head
found = False
while current != None and not found:
if current
...
get_next()
return found
As an example, consider invoking the search method looking for the item 17
...
search(17)
True
Since 17 is in the list, the traversal process needs to move only to the node containing 17
...
This process can be seen in Figure 3
...
First, we need to traverse the list looking for the
item we want to remove
...
The first step is very similar to search
...
Since we
assume that item is present, we know that the iteration will stop before current gets to None
...
When found becomes True, current will be a reference to the node containing the item to be
removed
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
The problem with this
approach is the number of nodes will no longer match the number of items
...
In order to remove the node containing the item, we need to modify the link in the previous
node so that it refers to the node that comes after current
...
Since current refers to the node ahead of the node where we would
like to make the change, it is too late to make the necessary modification
...
current will behave just as it did before, marking the current location of the traverse
...
That
way, when current stops at the node to be removed, previous will be referring to the proper
place in the linked list for the modification
...
Lines 2–3 assign initial values to the two
references
...
previous, however, is assumed to always travel one node behind current
...
28)
...
In lines 6–7 we ask whether the item stored in the current node is the item we wish to
remove
...
If we do not find the item, previous and current
must both be moved one node ahead
...
previous must first be moved one node ahead to the location of current
...
This process is often referred to as “inch-worming” as previous
must catch up to current before current moves ahead
...
29 shows the movement of
previous and current as they progress down the list looking for the node containing the value 17
...
head
previous = None
found = False
while not found:
if current
...
get_next()
11
12
13
14
15
if previous == None:
self
...
get_next()
else:
previous
...
get_next())
Once the searching step of the remove has been completed, we need to remove the node from
the linked list
...
30 shows the link that must be modified
...
If the item to be removed happens to be the first item in the
list, then current will reference the first node in the linked list
...
We said earlier that previous would be referring to the node whose next reference
3
...
Implementing an Unordered List: Linked Lists
105
Problem Solving with Algorithms and Data Structures, Release 3
...
28: Initial Values for the previous and current references
Figure 3
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
30: Removing an Item from the middle of the list
Figure 3
...
In this case, it is not previous but rather
the head of the list that needs to be changed (see Figure 3
...
If
previous did not move, it will still have the value None when the boolean found becomes True
...
However, if previous is not None, the node
to be removed is somewhere down the linked list structure
...
Line 15 uses the set_next
method from previous to accomplish the removal
...
get_next()
...
We leave that for you to consider
...
Remember that
each of these must take into account whether the change is taking place at the head of the list
or someplace else
...
We will assume that position names are integers starting with 0
...
What is the time complexity of the method
you created? It was most likely 𝑂(𝑛)
...
Modify your append to be 𝑂(1)
...
3
...
Implementing an Unordered List: Linked Lists
107
Problem Solving with Algorithms and Data Structures, Release 3
...
32: An Ordered Linked List
3
...
For example, if the list of integers
shown above were an ordered list (ascending order), then it could be written as 17, 26, 31, 54,
77, and 93
...
Likewise,
since 93 is the largest, it occupies the last position
...
The ordering is typically either
ascending or descending and we assume that list items have a meaningful comparison operation that is already defined
...
• OrderedList() creates a new ordered list that is empty
...
• add(item) adds a new item to the list making sure that the order is preserved
...
Assume the item is not already in the list
...
It needs the item and modifies the list
...
• search(item) searches for the item in the list
...
• is_empty() tests to see whether the list is empty
...
• size() returns the number of items in the list
...
• index(item) returns the position of item in the list
...
Assume the item is in the list
...
It needs nothing and returns an item
...
• pop(pos) removes and returns the item at position pos
...
Assume the item is in the list
...
10
...
The ordered list of integers given above (17, 26,
31, 54, 77, and 93) can be represented by a linked structure as shown in Figure 3
...
Again, the
node and link structure is ideal for representing the relative positioning of the items
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
33: Searching an Ordered Linked List
To implement the OrderedList class, we will use the same technique as seen previously with
unordered lists
...
class OrderedList:
def __init__(self):
self
...
Likewise, the remove
method will work just fine since we still need to find the item and then link around the node to
remove it
...
The search of an unordered linked list required that we traverse the nodes one at a time until
we either find the item we are looking for or run out of nodes (None)
...
However, in the case where the item is not in the list, we can
take advantage of the ordering to stop the search as soon as possible
...
33 shows the ordered linked list as a search is looking for the value 45
...
Since 17 is not the
item we are looking for, we move to the next node, in this case 26
...
Now, at this point, something is different
...
However,
due to the fact that this is an ordered list, that will not be necessary
...
There
is no way the item could exist further out in the linked list
...
It is easy to incorporate the new
condition discussed above by adding another boolean variable, stop, and initializing it to False
(line 4)
...
If
any node is ever discovered that contains data greater than the item we are looking for, we will
set stop to True (lines 9 − −10)
...
1
2
3
4
5
def search(self, item):
current = self
...
10
...
0
Figure 3
...
get_data() == item:
found = True
else:
if current
...
get_next()
6
7
8
9
10
11
12
13
14
return found
The most significant method modification will take place in add
...
It was the easiest point
of access
...
It is now necessary that
we discover the specific place where a new item belongs in the existing ordered list
...
The add method must decide that the new item belongs between 26 and 54
...
34 shows the setup that we need
...
We know we have found
that place when either we run out of nodes (current becomes None) or the value of the current
node becomes greater than the item we wish to add
...
As we saw with unordered lists, it is necessary to have an additional reference, again called
previous, since current will not provide access to the node that must be modified
...
Lines 2 − −3 set up the two external references and
lines 9 − −10 again allow previous to follow one node behind current every time through the
iteration
...
In either case, when the iteration
fails, we have found the location for the new node
...
34
...
Again, previous
== None (line 13) can be used to provide the answer
...
head
previous = None
stop = False
110
Chapter 3
...
0
5
6
7
8
9
10
while current != None and not stop:
if current
...
get_next()
11
12
13
14
15
16
17
18
temp = Node(item)
if previous == None:
temp
...
head)
self
...
set_next(current)
previous
...
10
...
Consider a linked list that has 𝑛 nodes
...
size, on the other hand, will always
require 𝑛 steps since there is no way to know how many nodes are in the linked list without
traversing from head to end
...
Adding an item to an unordered list
will always be 𝑂(1) since we simply place the new node at the head of the linked list
...
Although
on average they may need to traverse only half of the nodes, these methods are all 𝑂(𝑛) since
in the worst case each will process every node in the list
...
This suggests that linked lists are not the way Python
lists are implemented
...
We discuss this in more detail in another chapter
...
11 Summary
• Linear data structures maintain their data in an ordered fashion
...
• The fundamental operations for a stack are push, pop, and is_empty
...
• The fundamental operations for a queue are enqueue, dequeue, and is_empty
...
• Stacks are very useful for designing algorithms to evaluate and translate expressions
...
3
...
Summary
111
Problem Solving with Algorithms and Data Structures, Release 3
...
• Simulations use random number generators to create a real-life situation and allow us to
answer “what if” types of questions
...
• The fundamental operations for a deque are add_front, add_rear, remove_front, remove_rear, and is_empty
...
• A linked list implementation maintains logical order without requiring physical storage
requirements
...
3
...
13 Discussion Questions
1
...
” Show the stack of remainders
...
Convert the following infix expressions to prefix (use full parentheses):
• (𝐴 + 𝐵) * (𝐶 + 𝐷) * (𝐸 + 𝐹 )
• 𝐴 + ((𝐵 + 𝐶) * (𝐷 + 𝐸))
• 𝐴* 𝐵* 𝐶 * 𝐷+ 𝐸+ 𝐹
3
...
4
...
Show the stack as the conversion takes place
...
Evaluate the following postfix expressions
...
112
Chapter 3
...
0
• 23 * 4+
• 12 + 3 + 4 + 5+
• 12345 * + * +
6
...
What would this mean for Big-O performance?
7
...
Explain how the linked list remove method works when the item to be removed is in the
last node
...
Explain how the remove method works when the item is in the only node in the linked
list
...
14 Programming Exercises
1
...
2
...
3
...
Your evaluator should process infix tokens
from left to right and use two stacks, one for operators and one for operands, to perform
the evaluation
...
Turn your direct infix evaluator from the previous problem into a calculator
...
Implement the Queue ADT, using a list such that the rear of the queue is at the end of the
list
...
Design and implement an experiment to do benchmark comparisons of the two queue
implementations
...
It is possible to implement a queue such that both enqueue and dequeue have 𝑂(1) performance on average
...
8
...
Formulate a question and then design a simulation that can
help to answer it
...
3
...
Programming Exercises
113
Problem Solving with Algorithms and Data Structures, Release 3
...
Modify the Hot Potato simulation to allow for a randomly chosen counting value so that
each pass is not predictable from the previous one
...
Implement a radix sorting machine
...
Each
bin acts like a queue and maintains its values in the order that they arrive
...
Then it considers each value digit by
digit
...
For example, if the ones digit is being considered, 534 is placed in digit bin
4 and 667 is placed in digit bin 7
...
The process continues with the tens digit, the hundreds, and so on
...
11
...
In HTML, tags exist in both opening and closing forms and must be
balanced to properly describe a web document
...
Write a program that can check an HTML document for proper opening and closing
tags
...
To implement the length method, we counted the number of nodes in the list
...
Modify the UnorderedList class to include this information
and rewrite the length method
...
Implement the remove method so that it works correctly in the case where the item is not
in the list
...
Modify the list classes to allow duplicates
...
Implement the __str__ method in the UnorderedList class
...
Implement __str__ method so that lists are displayed the Python way (with square brackets)
...
Basic Data Structures
Problem Solving with Algorithms and Data Structures, Release 3
...
Implement the remaining operations defined in the UnorderedList ADT (append, index,
pop, insert)
...
Implement a slice method for the UnorderedList class
...
19
...
20
...
21
...
22
...
23
...
24
...
25
...
An alternative implementation
is known as a doubly linked list
...
The head reference also contains two references, one to the
first node in the linked list and one to the last
...
26
...
3
...
Programming Exercises
115
Problem Solving with Algorithms and Data Structures, Release 3
...
Basic Data Structures
CHAPTER
FOUR
RECURSION
4
...
• To learn how to formulate programs recursively
...
• To understand recursion as a form of iteration
...
• To understand how recursion is implemented by a computer system
...
2 What is Recursion?
Recursion is a method of solving problems that involves breaking a problem down into smaller
and smaller subproblems until you get to a small enough problem that it can be solved trivially
...
While it may not seem like much on the
surface, recursion allows us to write elegant solutions to problems that may otherwise be very
difficult to program
...
2
...
Suppose that you want to calculate the sum of a list of numbers such
as: [1, 3, 5, 7, 9]
...
The function
uses an accumulator variable (the_sum) to compute a running total of all the numbers in the
list by starting with 0 and adding each number in the list
...
0
for i in num_list:
the_sum = the_sum + i
return the_sum
print(list_sum([1,3,5,7,9]))
Pretend for a minute that you do not have while loops or for loops
...
To redefine the
problem from adding a list to adding pairs of numbers, we could rewrite the list as a fully
parenthesized expression
...
In fact, we can use the following sequence of simplifications to
compute a final sum
...
We might say the the sum of the list
num_list is the sum of the first element of the list (num_list[0]), and the sum of
the numbers in the rest of the list (num_list[1 :])
...
This is easily expressed in Python as the following:
def list_sum(num_list):
if len(num_list) == 1:
return num_list[0]
else:
return num_list[0] + list_sum(num_list[1:])
print(list_sum([1,3,5,7,9]))
There are a few key ideas in this to look at
...
This check is crucial and is our escape clause from the function
...
Second, on line 5 our function calls
118
Chapter 4
...
0
Figure 4
...
A recursive function is
a function that calls itself
...
1 shows the series of recursive calls that are needed to sum the list [1, 3, 5, 7, 9]
...
Each time we make a recursive
call we are solving a smaller problem, until we reach the point where the problem cannot get
any smaller
...
Figure 4
...
When list_sum returns from the topmost problem, we have the solution to the whole problem
...
2
...
A recursive algorithm must have a base case
...
A recursive algorithm must change its state and move toward the base case
...
A recursive algorithm must call itself, recursively
...
First, a base case is the condition that allows the algorithm to stop recursing
...
In the list_sum algorithm the
base case is a list of length 1
...
A change of state means that some data that the algorithm is using is modified
...
2
...
0
Figure 4
...
In the list_sum algorithm our primary data structure is a list, so we must focus our state-changing efforts on the list
...
This is exactly what happens on line 5 of the code below when we call list_sum with a
shorter list
...
This is the very definition of recursion
...
As a novice programmer,
you have learned that functions are good because you can take a large problem and break it up
into smaller problems
...
When we talk about recursion it may seem that we are talking ourselves in circles
...
In the remainder of this chapter we will look at more examples of recursion
...
Self Check
How many recursive calls are made when computing the sum of the list [2, 4, 6, 8, 10]?
1
...
5
3
...
3
120
Chapter 4
...
0
Suppose you are going to write a recursive function to calculate the factorial of a number
...
Where the factorial of zero is defined to be 1
...
𝑛 == 0
2
...
𝑛 >= 0
4
...
2
...
For example, convert the integer 10 to its string representation in decimal as “10,” or
to its string representation in binary as “1010
...
Let’s look at a concrete example using base 10 and the number 769
...
It is
easy to convert a number less than 10 to its string equivalent by looking it up in the sequence
...
” If we can arrange to
break up the number 769 into three single-digit numbers, 7, 6, and 9, then converting it to a
string is simple
...
Knowing what our base is suggests that the overall algorithm will involve three components:
1
...
2
...
3
...
The next step is to figure out how to change state and make progress toward the base case
...
The most likely candidates are division and subtraction
...
Integer division with remainders gives
us a clear direction
...
Using integer division to divide 769 by 10, we get 76 with a remainder of 9
...
First, the remainder is a number less than our base that can be converted to a
string immediately by lookup
...
Now our job is to
convert 76 to its string representation
...
Finally, we have reduced the problem to converting 7, which we
can do easily since it satisfies the base case condition of 𝑛
3
...
4
...
What is Recursion?
121
Problem Solving with Algorithms and Data Structures, Release 3
...
3: Converting an Integer to a String in Base 10
The code below shows the Python code that implements the algorithm outlined above for any
base between 2 and 16
...
When we detect the base case, we stop recursing and simply return the string from the
convertString sequence
...
Let us trace the algorithm again; this time we will convert the number 10 to its base 2 string
representation (“1010”)
...
4 shows that we get the results we are looking for, but it looks like the digits are in
the wrong order
...
If we reversed returning the
convertString lookup and returning the toStr call, the resulting string would be backward! But
by delaying the concatenation operation until after the recursive call has returned, we get the
result in the proper order
...
Self Check
Write a function that takes a string as a parameter and returns a new string that is the reverse of
the old string
...
Recursion
Problem Solving with Algorithms and Data Structures, Release 3
...
4: Converting an Integer to a String in Base 10
Write a function that takes a string as a parameter and returns True if the string is a palindrome,
False otherwise
...
for example: radar is a palindrome
...
for example:
madam i’m adam is a palindrome
...
• Able was I ere I saw Elba
• Kanakanak – a town in Alaska
• Wassamassaw – a town in South Dakota
4
...
The code for this modified algorithm is shown in the code below
...
3
...
0
r_stack = Stack()
def to_str(n, base):
convert_string = "0123456789ABCDEF"
while n > 0:
if n < base:
r_stack
...
push(convert_string[n % base])
n = n // base
res = ""
while not r_stack
...
pop())
return res
print(to_str(1453, 16))
Each time we make a call to toStr, we push a character on the stack
...
5
...
”
Figure 4
...
When a function is called in Python, a stack frame is allocated to handle the local variables of the function
...
Figure 4
...
Notice that the call to toStr(2//2, 2) leaves a return value of “” on the stack
...
In this way, the Python call stack takes
the place of the stack we used explicitly earlier
...
The stack frames also provide a scope for the variables used by the function
...
124
Chapter 4
...
0
Figure 4
...
4
...
This can make recursion difficult for people to grasp
...
As you
watch these pictures take shape you will get some new insight into the recursive process that
may be helpful in cementing your understanding of recursion
...
The
turtle module is standard with all versions of Python and is very easy to use
...
You can create a turtle and the turtle can move forward, backward, turn left, turn
right, etc
...
When the turtle’s tail is down and the turtle
moves it draws a line as it moves
...
Here is a simple example to illustrate some turtle graphics basics
...
The code below shows how it is done
...
When the turtle is created it also creates a window for itself
to draw in
...
The base case for this simple function
is when the length of the line we want to draw, as given by the len parameter, is reduced to
zero or less
...
The recursive step is when we call drawSpiral
again with a reduced length
...
exitonclick(), this is a handy little method of the window that puts the turtle
into a wait mode until you click inside the window, after which the program cleans up and exits
...
4
...
0
import turtle
my_turtle = turtle
...
Screen()
def draw_spiral(my_turtle, line_len):
if lineLen > 0:
my_turtle
...
right(90)
draw_spiral(my_turtle, line_len - 5)
draw_spiral(my_turtle, 100)
my_win
...
For our next program we are going to draw a fractal tree
...
The definition of a
fractal is that when you look at it the fractal has the same basic shape no matter how much you
magnify it
...
The fractal nature of many of these natural phenomenon makes it
possible for programmers to generate very realistic looking scenery for computer generated
movies
...
To understand how this is going to work it is helpful to think of how we might describe a tree
using a fractal vocabulary
...
If we translate this to trees and shrubs we
might say that even a small twig has the same shape and characteristics as a whole tree
...
If you think of this definition recursively it means that we will
apply the recursive definition of a tree to both of the smaller left and right trees
...
The code below shows how we can use our turtle
to generate a fractal tree
...
You will see that on lines 5
and 7 we are making a recursive call
...
Then in line 7 the turtle
makes another recursive call, but this time after turning left by 40 degrees
...
Also notice
that each time we make a recursive call to tree we subtract some amount from the branchLen
parameter; this is to make sure that the recursive trees get smaller and smaller
...
def tree(branch_len, t):
if branch_len > 5:
t
...
right(20)
tree(branch_len - 15, t)
126
Chapter 4
...
0
t
...
right(20)
t
...
Before you run the code think
about how you expect to see the tree take shape
...
Will it be drawn symmetrically with the right and left halves of the
tree taking shape simultaneously? Will it be drawn right side first then left side?
import turtle
def tree(branch_len, t):
if branch_len > 5:
t
...
right(20)
tree(branch_len - 15, t)
t
...
right(20)
t
...
Turtle()
my_win = turtle
...
left(90)
t
...
backward(100)
t
...
color("green")
tree(75, t)
my_win
...
You can see this in Figure 4
...
Now, notice how the program works its way back up the trunk until the entire right side of the
tree is drawn
...
8
...
Rather, once again the entire right
side of the left tree is drawn until we finally make our way out to the smallest twig on the left
...
The exercises at the end of the chapter will give you some ideas for how to explore some
interesting options to make your tree look more realistic
...
4
...
0
Figure 4
...
Recursion
Problem Solving with Algorithms and Data Structures, Release 3
...
8: The First Half of the Tree
4
...
Visualising Recursion
129
Problem Solving with Algorithms and Data Structures, Release 3
...
• Modify the color of the branches so that as the branchLen gets very short it is colored
like a leaf
...
For example choose the angle between 15 and 45
degrees
...
• Modify the branchLen recursively so that instead of always subtracting the same amount
you subtract a random amount in some range
...
4
...
An example is shown in Figure 4
...
The Sierpinski triangle illustrates a three-way recursive algorithm
...
Start with a single large
triangle
...
Ignoring the middle triangle that you just created, apply the same procedure to each of
the three corner triangles
...
You can continue to apply this procedure
indefinitely if you have a sharp enough pencil
...
Since we can continue to apply the algorithm indefinitely, what is the base case? We will
see that the base case is set arbitrarily as the number of times we want to divide the triangle
into pieces
...
Each time we make a
recursive call, we subtract 1 from the degree until we reach 0
...
The code that generated the Sierpinski Triangle is shown below
...
fillcolor(color)
my_turtle
...
goto(points[0][0],points[0][1])
my_turtle
...
begin_fill()
my_turtle
...
goto(points[2][0], points[2][1])
my_turtle
...
end_fill()
def get_mid(p1, p2):
return ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)
130
Chapter 4
...
0
Figure 4
...
Turtle()
my_win = turtle
...
exitonclick()
4
...
Visualising Recursion
131
Problem Solving with Algorithms and Data Structures, Release 3
...
10: Building a Sierpinski Triangle
main()
The first thing sierpinski does is draw the outer triangle
...
Once again we
make use of the standard turtle module that comes with Python
...
Look at the code and think about the order in which the triangles will be drawn
...
Because of the way the sierpinski function calls
itself, sierpinski works its way to the smallest allowed triangle in the lower-left corner, and
then begins to fill out the rest of the triangles working back
...
Finally, it fills in the lower-right
corner, working its way toward the smallest triangle in the lower right
...
Figure 4
...
The active
functions are outlined in black, and the inactive function calls are in gray
...
10, the smaller the triangles
...
The sierpinski function relies heavily on the getMid function
...
In addition, the code has a function that
draws a filled triangle using the begin_fill and end_fill turtle methods
...
132
Chapter 4
...
0
Figure 4
...
5 Complex Recursive Problems
In the previous sections we looked at some problems that are relatively easy to solve and some
graphically interesting problems that can help us gain a mental model of what is happening in
a recursive algorithm
...
We will finish up by looking at a deceptive problem that at first looks like it has an
elegant recursive solution but in fact does not
...
5
...
He was inspired by a legend that tells of a Hindu temple where the puzzle was presented to
young priests
...
Their assignment was to transfer
all 64 disks from one of the three poles to another, with two important constraints
...
The priests worked very efficiently, day and night, moving one disk every second
...
Although the legend is interesting, you need not worry about the world ending any time
soon
...
At a rate of one move per second, that is 584, 942, 417, 355
years! Clearly there is more to this puzzle than meets the eye
...
11 shows an example of a configuration of disks in the middle of a move from the
first peg to the third
...
If you have not tried to solve this puzzle
before, you should try it now
...
How do we go about solving this problem recursively? How would you go about solving this
problem at all? What is our base case? Let’s think about this problem from the bottom up
...
If you already knew how to
move a tower of four disks to peg two, you could then easily move the bottom disk to peg three,
4
...
Complex Recursive Problems
133
Problem Solving with Algorithms and Data Structures, Release 3
...
But what if you do not know how
to move a tower of height four? Suppose that you knew how to move a tower of height three
to peg three; then it would be easy to move the fourth disk to peg two and move the three from
peg three on top of it
...
This sounds like a base case in the making
...
Move a tower of height-1 to an intermediate pole, using the final pole
...
Move the remaining disk to the final pole
...
Move the tower of height-1 from the intermediate pole to the final pole using the original
pole
...
The only thing missing from the outline above is the identification of a base case
...
In this case, we need move only
a single disk to its final destination
...
In addition, the
steps outlined above move us toward the base case by reducing the height of the tower in steps
1 and 3
...
The key to the simplicity of the
algorithm is that we make two different recursive calls, one on line 3 and a second on line 5
...
The next
line simply moves the bottom disk to its final resting place
...
The base case is detected when the
tower height is 0; in this case there is nothing to do, so the moveTower function simply returns
...
The function moveDisk, shown below, is very simple
...
If you type in and run the moveTower program you can see
that it gives you a very efficient solution to the puzzle
...
134
Chapter 4
...
0
def move_tower(height, from_pole, to_pole, with_pole):
if height >= 1:
move_tower(height - 1, from_pole, with_pole, to_pole)
move_disk(from_pole, to_pole)
move_tower(height - 1, with_pole, to_pole, from_pole)
def move_disk(fp,tp):
print("moving disk from",fp,"to",tp
move_tower(3, "A", "B", "C")
Now that you have seen the code for both moveTower and moveDisk, you may be wondering
why we do not have a data structure that explicitly keeps track of what disks are on what poles
...
The answer is that Python provides the stacks that we
need implicitly through the call stack
...
6 Exploring a Maze
In this section we will look at a problem that has relevance to the expanding world of robotics:
How do you find your way out of a maze? If you have a Roomba vacuum cleaner for your
dorm room (don’t all college students?) you will wish that you could reprogram it using what
you have learned in this section
...
The maze problem has roots as deep as the Greek myth about Theseus
who was sent into a maze to kill the minotaur
...
In our problem we will assume that
our turtle is dropped down somewhere into the middle of the maze and must find its way out
...
12 to get an idea of where we are going in this section
...
” Each square
of the maze is either open or occupied by a section of wall
...
If the turtle bumps into a wall it must try a different direction
...
Here is the procedure:
• From our starting position we will first try going North one square and then recursively
try our procedure from there
...
• If South does not work then we will try a step to the West as our first step and recursively
apply our procedure
...
• If none of these directions works then there is no way to get out of the maze and we fail
...
Suppose we
take our first recursive step by going North
...
6
...
0
Figure 4
...
Recursion
Problem Solving with Algorithms and Data Structures, Release 3
...
But if the North is blocked by a wall we must look at the next step of the
procedure and try going to the South
...
If we apply the recursive procedure from there we will just go
back one step to the North and be in an infinite loop
...
In this case we will assume that we have a bag of bread crumbs we can
drop along our way
...
As we will see when we look at the code for this algorithm, backing up is as
simple as returning from a recursive function call
...
Some of them you may
already have guessed based on the description in the previous paragraph
...
The turtle has run into a wall
...
2
...
We do not want to continue
exploring from this position or we will get into a loop
...
We have found an outside edge, not occupied by a wall
...
4
...
For our program to work we will need to have a way to represent the maze
...
The maze object will provide the following methods for us to
use in writing our search algorithm:
• __init__ Reads in a data file representing a maze, initializes the internal representation
of the maze, and finds the starting position for the turtle
...
• update_position Updates the internal representation of the maze and changes the position
of the turtle in the window
...
The Maze class also overloads the index operator [] so that our algorithm can easily access the
status of any particular square
...
The code is shown
below
...
This is important because as a recursive function the search logically starts
again with each recursive call
...
update_position(start_row, start_column)
# Check for base cases:
# 1
...
6
...
0
7
8
9
10
11
12
13
14
# 2
...
Success, an outside edge not occupied by an obstacle
if maze
...
update_position(start_row, start_column, PART_OF_PATH)
return True
maze
...
update_position(start_row, start_column, PART_OF_PATH)
else:
maze
...
This is simply to help you visualize the algorithm so that you can watch
exactly how the turtle explores its way through the maze
...
You will notice that in the recursive step there are four recursive calls to searchFrom
...
If the first call to searchFrom returns True then none of the last three calls would
be needed
...
If there is not a good path
leading out of the maze to the North then the next recursive call is tried, this one to the South
...
If all four recursive calls return false then we have
found a dead end
...
The code for the Maze class is shown below
...
This file is a text file that represents a maze by using “+” characters for
walls, spaces for open squares, and the letter “S” to indicate the starting position
...
Recursion
Problem Solving with Algorithms and Data Structures, Release 3
...
Each row of the maze_list instance
variable is also a list
...
For the data file shown above the internal representation looks like the
following:
[ ['+','+','+','+',
...
,' ',' ',' ','+',' ',' ',' '],
['+',' ','+',' ',
...
,' ',' ',' ','+',' ','+','+'],
['+','+','+',' ',
...
,'+','+',' ',' ',' ',' ','+'],
['+','+','+','+',
...
,'+','+',' ',' ','+',' ','+'],
['+',' ','+','+',
...
,' ',' ','+',' ','+','+','+'],
['+','+','+','+',
...
It also updates the internal representation with a “
...
In addition,
the update_position method uses two helper methods, move_turtle and drop_bread_crumb, to
update the view on the screen
...
An exit condition is whenever the turtle has navigated to the edge of the maze, either row zero
or column zero, or the far right column or the bottom row
...
maze_list = []
maze_file = open(maze_file_name,'r')
rows_in_maze = 0
for line in maze_file:
row_list = []
col = 0
for ch in line[:-1]:
row_list
...
start_row = rows_in_maze
self
...
maze_list
...
6
...
0
columns_in_maze = len(row_list)
self
...
columns_in_maze = columns_in_maze
self
...
y_translate = rows_in_maze / 2
self
...
5,
- (rows_in_maze - 1) / 2 -
...
5,
(rows_in_maze - 1) / 2 +
...
rows_in_maze):
for x in range(self
...
maze_list[y][x] == OBSTACLE:
self
...
x_translate,
- y + self
...
t
...
t
...
t
...
5,y-
...
t
...
t
...
t
...
t
...
t
...
t
...
t
...
t
...
t
...
t
...
x_translate,
- y + self
...
t
...
x_translate, - y + self
...
t
...
maze_list[row][col] = val
self
...
Recursion
Problem Solving with Algorithms and Data Structures, Release 3
...
drop_bread_crumb(color)
def is_exit(self, row, col):
return (row == 0 or
row == self
...
columns_in_maze - 1)
def __getitem__(self, idx):
return self
...
This program uses the data file maze2
...
# Completed maze program
# Takes maze2
...
'
OBSTACLE = '+'
DEAD_END = '-'
4
...
Exploring a Maze
141
Problem Solving with Algorithms and Data Structures, Release 3
...
maze_list = []
maze_file = open(maze_file_name,'r')
rows_in_maze = 0
for line in maze_file:
row_list = []
col = 0
for ch in line[: -1]:
row_list
...
start_row = rows_in_maze
self
...
maze_list
...
rows_in_maze = rows_in_maze
self
...
x_translate = - columns_in_maze / 2
self
...
t = turtle
...
t
...
wn = turtle
...
wn
...
5,
- (rows_in_maze - 1) / 2 -
...
5,
(rows_in_maze - 1) / 2 +
...
t
...
rows_in_maze):
for x in range(self
...
maze_list[y][x] == OBSTACLE:
self
...
x_translate,
- y + self
...
t
...
t
...
t
...
t
...
5, y -
...
t
...
t
...
t
...
t
...
t
...
Recursion
Problem Solving with Algorithms and Data Structures, Release 3
...
t
...
t
...
t
...
t
...
t
...
t
...
x_translate,
- y + self
...
t
...
x_translate, - y + self
...
t
...
maze_list[row][col] = val
self
...
drop_bread_crumb(color)
def is_exit(self, row, col):
return (row == 0 or
row == self
...
columns_in_maze - 1)
def __getitem__(self,idx):
return self
...
# base Case return values:
# 1
...
update_position(start_row, start_column)
if maze[start_row][start_column] == OBSTACLE :
return False
# 2
...
6
...
0
if maze[start_row][start_column] == TRIED or
maze[start_row][start_column] == DEAD_END:
return False
# 3
...
is_exit(start_row,start_column):
maze
...
update_position(start_row, start_column, TRIED)
# Otherwise, use logical short circuiting to try each direction
# in turn (if needed)
found = search_from(maze, start_row-1, start_column) or \
search_from(maze, start_row+1, start_column) or \
search_from(maze, start_row, start_column-1) or \
search_from(maze, start_row, start_column+1)
if found:
maze
...
update_position(start_row, start_column, DEAD_END)
return found
my_maze = Maze('maze2
...
draw_maze()
my_maze
...
start_row, my_maze
...
start_row, my_maze
...
Watch
the program run
...
7 Summary
In this chapter we have looked at examples of several recursive algorithms
...
The key points to remember from this chapter are as follows:
• All recursive algorithms must have a base case
...
• A recursive algorithm must call itself (recursively)
...
• Recursive algorithms often map very naturally to a formal expression of the problem you
are trying to solve
...
Recursion
Problem Solving with Algorithms and Data Structures, Release 3
...
Sometimes a recursive solution may be more computationally expensive than an alternative algorithm
...
4
...
9 Discussion Questions
1
...
Assume that you start with a stack of
three disks
...
Using the recursive rules as described, draw a Sierpinski triangle using paper and pencil
...
10 Programming Exercises
• Write a recursive function to compute the factorial of a number
...
• Modify the recursive tree program using one or all of the following ideas:
– Modify the thickness of the branches so that as the branchLen gets smaller, the line
gets thinner
...
– Modify the angle used in turning the turtle so that at each branch point the angle is
selected at random in some range
...
Play around to see what looks good
...
If you implement all of the above ideas you will have a very realistic looking tree
...
Hint: One approach to this
uses triangles again
...
How does the performance of the recursive function compare to that of an iterative version?
• Implement a solution to the Tower of Hanoi using three stacks to keep track of the disks
...
• Using the turtle graphics module, write a recursive program to display a Koch snowflake
...
8
...
0
• Write a program to solve the following problem: You have two jugs: a 4-gallon jug and
a 3-gallon jug
...
There is a pump that can
be used to fill the jugs with water
...
• Write a program that solves the following problem: Three missionaries and three cannibals come to a river and find a boat that holds two people
...
However, if the cannibals ever outnumber the missionaries on either bank, the missionaries will be eaten
...
• Modify the Tower of Hanoi program using turtle graphics to animate the movement of
the disks
...
• Pascal’s triangle is a number triangle with numbers arranged in staggered rows such that
𝑛!
𝑎 𝑛 𝑟 = 𝑟!(𝑛−𝑟)! This equation is the equation for a binomial coefficient
...
An example of Pascal’s triangle is shown below
...
Your program should accept a parameter
that tells how many rows of the triangle to print
...
Recursion
CHAPTER
FIVE
SORTING AND SEARCHING
5
...
• To be able to explain and implement selection sort, bubble sort, merge sort, quick sort,
insertion sort, and shell sort
...
• To introduce the map abstract data type
...
5
...
In this section we will study searching
...
Searching is the algorithmic process of finding a particular item in
a collection of items
...
On occasion it may be modified to return where the item is found
...
In Python, there is a very easy way to ask whether an item is in a list of items
...
>>> 15 in [3,5,2,4,1]
False
>>> 3 in [3,5,2,4,1]
True
>>>
Even though this is easy to write, an underlying process must be carried out to answer the
question
...
What we are
interested in here is how these algorithms work and how they compare to one another
...
0
Figure 5
...
2
...
Each data item is stored in a position relative to the others
...
Since these index
values are ordered, it is possible for us to visit them in sequence
...
Figure 5
...
Starting at the first item in the list, we simply move
from item to item, following the underlying sequential ordering until we either find what we
are looking for or run out of items
...
The Python implementation for this algorithm is shown below
...
The boolean
variable found is initialized to False and is assigned the value True if we discover the item in
the list
...
Recall
that this is typically the common step that must be repeated in order to solve the problem
...
Each comparison
may or may not discover the item we are looking for
...
The list of items is not ordered in any way
...
Sorting and Searching
Problem Solving with Algorithms and Data Structures, Release 3
...
1: Comparisons Used in a Sequential Search of an Unordered List
Figure 5
...
In other words, the probability that the item we are looking for is in any particular position
is exactly the same for each position of the list
...
If there are 𝑛 items, then the sequential search requires 𝑛 comparisons to discover that the item
is not there
...
There are actually three different scenarios that can occur
...
We will need only one comparison
...
What about the average case? On average, we will find the item about halfway into the list;
that is, we will compare against 2𝑛 items
...
Table 5
...
We assumed earlier that the items in our collection had been randomly placed so that there is
no relative order between the items
...
If the item we are looking for is present in the list, the chance of it being in
any one of the 𝑛 positions is still the same as before
...
However, if the item is not present there is a slight advantage
...
2 shows this process as the algorithm looks for the item 50
...
At this point, however, we know something extra
...
In this case, the algorithm does not have to continue looking through all of the
items to report that the item was not found
...
def ordered_sequential_search(a_list, item):
pos = 0
found = False
stop = False
while pos < len(a_list) and not found and not stop:
if a_list[pos] == item:
found = True
else:
5
...
Searching
149
Problem Solving with Algorithms and Data Structures, Release 3
...
2: Comparisons Used in Sequential Search of an Ordered List
if a_list[pos] > item:
stop = True
else:
pos = pos+1
return found
test_list = [0, 1, 2, 8, 13, 17, 19, 32, 42,]
print(ordered_sequential_search(test_list, 3))
print(ordered_sequential_search(test_list, 13))
Table 5
...
Note that in the best case we might discover that the item is
not in the list by looking at only one item
...
However, this technique is still 𝑂(𝑛)
...
Self Check
Suppose you are doing a sequential search of the list [15, 18, 2, 19, 18, 0, 8, 14, 19, 14]
...
5
2
...
4
4
...
How many comparisons would you need to do in order to find the key 13?
1
...
5
3
...
6
5
...
2 The Binary Search
It is possible to take greater advantage of the ordered list if we are clever with our comparisons
...
Instead of searching the
150
Chapter 5
...
0
Figure 5
...
If that item is the one
we are searching for, we are done
...
If the item we are searching for is greater than
the middle item, we know that the entire lower half of the list as well as the middle item can be
eliminated from further consideration
...
We can then repeat the process with the upper half
...
Again, we either find it or split the list in half, therefore
eliminating another large part of our possible search space
...
3 shows how this
algorithm can quickly find the value 54
...
Divide and conquer means that we divide the problem into
smaller pieces, solve the smaller pieces in some way, and then reassemble the whole problem
to get the result
...
If the item we are searching for is less than the middle item, we can simply perform a binary
search of the left half of the original list
...
Either way, this is a recursive call to the binary search function
passing a smaller list
...
2
...
0
Comparisons
1
2
3
···
𝑖
Approximate Number Of Items Left
𝑛
2
𝑛
4
𝑛
8
···
𝑛
2𝑖
Table 5
...
What is the maximum number of comparisons
this algorithm will require to check the entire list? If we start with 𝑛 items, about 2𝑛 items will
be left after the first comparison
...
Then 8𝑛 ,
𝑛
, and so on
...
3 helps us to see the answer
...
Either that is
the item we are looking for or it is not
...
The number of comparisons
necessary to get to this point is 𝑖 where 2𝑛𝑖 = 1
...
The maximum
number of comparisons is logarithmic with respect to the number of items in the list
...
One additional analysis issue needs to be addressed
...
uses the slice operator to create the left half of the list that is then passed to the next invocation
(similarly for the right half as well)
...
However, we know that the slice operator in Python is actually
𝑂(𝑘)
...
Luckily this can be remedied by passing the list along with the starting and ending indices
...
Sorting and Searching
Problem Solving with Algorithms and Data Structures, Release 3
...
In fact, we
should always consider whether it is cost effective to take on the extra work of sorting to gain
searching benefits
...
However, for large lists, sorting even once can be so expensive that simply
performing a sequential search from the start may be the best choice
...
Which group of numbers correctly shows the sequence of
comparisons used to find the key 8
...
11, 5, 6, 8
2
...
3, 5, 6, 8
4
...
Which group of numbers correctly shows the sequence of
comparisons used to search for the key 16?
1
...
18, 17, 15
3
...
12, 17, 15
5
...
3 Hashing
In previous sections we were able to make improvements in our search algorithms by taking
advantage of information about where items are stored in the collection with respect to one
another
...
In this section we will attempt to go one step further by building a data
structure that can be searched in 𝑂(1) time
...
In order to do this, we will need to know even more about where the items might be when we
go to look for them in the collection
...
We will see, however, that this is
typically not the case
...
Each position of the hash table, often called a slot, can hold an item and is named
by an integer value starting at 0
...
Initially, the hash table contains no items so every slot is empty
...
Figure 5
...
In other words, there are m slots in
the table, named 0 through 10
...
2
...
0
Figure 5
...
4: Simple Hash Function Using Remainders
The mapping between an item and the slot where that item belongs in the hash table is called
the hash function
...
Assume that we have the set of integer items
54, 26, 93, 17, 77, and 31
...
Table 5
...
Note that this remainder method (modulo arithmetic) will typically be present in some form in
all hash functions, since the result must be in the range of slot names
...
5
...
This
is referred to as the load factor, and is commonly denoted by 𝜆 = number_of_items
...
Now when we want to search for an item, we simply use the hash function to compute the slot
name for the item and then check the hash table to see if it is present
...
If everything is where it should be, we have found a constant
time search algorithm
...
For example, if the item 44 had been the next item in our
collection, it would have a hash value of 0 (44%11 == 0)
...
5: Hash Table with Six Items
154
Chapter 5
...
0
0, we would have a problem
...
This is referred to as a collision (it may also be called a “clash”)
...
We will discuss them in detail later
...
If we know the items and the collection will never change, then it is
possible to construct a perfect hash function (refer to the exercises for more about perfect hash
functions)
...
Luckily, we do not need the hash function to be perfect to
still gain performance efficiency
...
This guarantees that each item
will have a unique slot
...
For example, if the items were nine-digit Social
Security numbers, this method would require almost one billion slots
...
Our goal is to create a hash function that minimizes the number of collisions, is easy to compute,
and evenly distributes the items in the hash table
...
We will consider a few of them here
...
These pieces are then added together
to give the resulting hash value
...
After the
addition, 43 + 65 + 55 + 46 + 01, we get 210
...
In this case
210%11 is 1, so the phone number 436-555-4601 hashes to slot 1
...
For the above example, we
get 43 + 56 + 55 + 64 + 01 = 219 which gives 219%11 = 10
...
We first square the item, and then extract some portion of the resulting digits
...
By extracting the middle two
digits, 93, and performing the remainder step, we get 5 (93%11)
...
5 shows items under
both the remainder method and the mid-square method
...
We can also create hash functions for character-based items such as strings
...
>>> ord('c')
99
>>> ord('a')
97
>>> ord('t')
116
5
...
Searching
155
Problem Solving with Algorithms and Data Structures, Release 3
...
5: Comparisons of Remainder and Mid-Square Methods
Figure 5
...
6)
...
def hash(a_string, table_size):
sum = 0
for pos in range(len(a_string)):
sum = sum + ord(a_string[pos])
return sum % table_size
It is interesting to note that when using this hash function, anagrams will always be given
the same hash value
...
Figure 5
...
The
modification to the hash function is left as an exercise
...
The important thing to remember is that the hash function has to be efficient
so that it does not become the dominant part of the storage and search process
...
This would quickly defeat
the purpose of hashing
...
When two items hash to the same slot, we must
have a systematic method for placing the second item in the hash table
...
As we stated earlier, if the hash function is perfect, collisions will never
156
Chapter 5
...
0
Figure 5
...
8: Collision Resolution with Linear Probing
occur
...
One method for resolving collisions looks into the hash table and tries to find another open slot
to hold the item that caused the collision
...
Note that we may need to go back to the first slot (circularly) to
cover the entire hash table
...
By systematically visiting
each slot one at a time, we are performing an open addressing technique called linear probing
...
8 shows an extended set of integer items under the simple remainder method
hash function (54, 26, 93, 17, 77, 31, 44, 55, 20)
...
4 above shows the hash values for the
original items
...
5 shows the original contents
...
Under linear probing, we look sequentially, slot by slot, until we find an
open position
...
Again, 55 should go in slot 0 but must be placed in slot 2 since it is the next open position
...
Since slot 9 is full, we begin to do linear probing
...
Once we have built a hash table using open addressing and linear probing, it is essential that we
utilize the same methods to search for items
...
When
we compute the hash value, we get 5
...
What if we are looking for 20? Now the hash value is 9, and slot 9 is currently holding 31
...
We are now
forced to do a sequential search, starting at position 10, looking until either we find the item 20
or we find an empty slot
...
2
...
0
Figure 5
...
10: Collision Resolution Using “Plus 3”
A disadvantage to linear probing is the tendency for clustering; items become clustered in the
table
...
This will have an impact on other items
that are being inserted, as we saw when we tried to add the item 20 above
...
This cluster is shown in
Figure 5
...
One way to deal with clustering is to extend the linear probing technique so that instead of
looking sequentially for the next open slot, we skip slots, thereby more evenly distributing
the items that have caused collisions
...
Figure 5
...
This
means that once a collision occurs, we will look at every third slot until we find one that is
empty
...
With
simple linear probing, the rehash function is
new_hash_value = rehash(old_hash_value)
where
rehash(pos) = (pos + 1)%size_of_table
...
In general,
rehash(pos) = (pos + skip)%sizeoftable
...
Otherwise, part of the table will be unused
...
This is the reason we have been using 11 in
our examples
...
Instead of using a constant
“skip” value, we use a rehash function that increments the hash value by 1, 3, 5, 7, 9, and so on
...
Sorting and Searching
Problem Solving with Algorithms and Data Structures, Release 3
...
11: Collision Resolution with Quadratic Probing
Figure 5
...
In other words, quadratic probing uses a skip consisting of successive perfect squares
...
11 shows our example values after they are placed using this technique
...
Chaining allows many items to exist at the same location
in the hash table
...
As more and more items hash to the same location, the difficulty of searching for the
item in the collection increases
...
12 shows the items as they are added to a hash table
that uses chaining to resolve collisions
...
Since each slot holds a collection, we use a searching technique to decide
whether the item is present
...
We will look at the analysis
for hashing at the end of this section
...
1, 10
2
...
2
...
0
3
...
2, 3
Suppose you are given the following set of keys to insert into a hash table that holds exactly 11
values: 113, 117, 97, 100, 114, 108, 116, 105, 99
...
100, __, __, 113, 114, 105, 116, 117, 97, 108, 99
2
...
100, 113, 117, 97, 14, 108, 116, 105, 99, __, __
4
...
Recall that a dictionary is an associative data type where you can store key-data pairs
...
We often refer to this idea as a map
...
The structure is an unordered collection of
associations between a key and a data value
...
The operations are given below
...
It returns an empty map collection
...
If the key is already in the map
then replace the old value with the new value
...
• del Delete the key-value pair from the map using a statement of the form del map[key]
...
• in Return True for a statement of the form key in map, if the given key is in the map,
False otherwise
...
In order to provide this fast look up capability, we need an
implementation that supports an efficient search
...
Below we use two lists to create a HashTable class that implements the Map abstract data
type
...
When we look up a key, the corresponding position in the data list will hold
the associated data value
...
Note that the initial size for the hash table has been chosen to be 11
...
160
Chapter 5
...
0
class HashTable:
def __init__(self):
self
...
slots = [None] * self
...
data = [None] * self
...
The collision resolution technique is linear probing with a “plus 1” rehash function
...
1)
assumes that there will eventually be an empty slot unless the key is already present in the
self
...
It computes the original hash value and if that slot is not empty, iterates the
rehash function until an empty slot occurs
...
Listing 5
...
hash_function(key,len(self
...
slots[hash_value] == None:
self
...
data[hash_value] = data
else:
if self
...
data[hash_value] = data #replace
else:
next_slot = self
...
slots))
while self
...
slots[next_slot] != key:
next_slot = self
...
slots))
15
16
17
18
19
20
if self
...
slots[next_slot] = key
self
...
data[next_slot] = data #replace
21
22
23
def hash_function(self, key, size):
return key % size
24
25
26
def rehash(self, old_hash, size):
return (old_hash + 1) % size
Likewise, the get function begins by computing the initial hash value
...
Notice that line 15 guarantees
that the search will terminate by checking to make sure that we have not returned to the initial
slot
...
The final methods of the HashTable class provide additional dictionary functionality
...
” This
means that once a HashTable has been created, the familiar index operator will be available
...
2
...
0
We leave the remaining methods as exercises
...
hash_function(key, len(self
...
slots[position] != None and \
not found and not stop:
if self
...
data[position]
else:
position=self
...
slots))
if position == start_slot:
stop = True
return data
def __getitem__(self, key):
return self
...
put(key, data)
The following session shows the HashTable class in action
...
>>> h=HashTable()
>>> h[54]="cat"
>>> h[26]="dog"
>>> h[93]="lion"
>>> h[17]="tiger"
>>> h[77]="bird"
>>> h[31]="cow"
>>> h[44]="goat"
>>> h[55]="pig"
>>> h[20]="chicken"
>>> h
...
data
['bird', 'goat', 'pig', 'chicken', 'dog', 'lion',
'tiger', None, None, 'cow', 'cat']
>>>
Next we will access and modify some items in the hash table
...
162
Chapter 5
...
0
>>> h[20]
'chicken'
>>> h[17]
'tiger'
>>> h[20]='duck'
>>> h[20]
'duck'
>>> h
...
However, due to collisions, the number of comparisons is typically not so simple
...
The most important piece of information we need to analyze the use of a hash table is the load
factor, 𝜆
...
If 𝜆 is large, meaning that the table is
filling up, then there are more and more collisions
...
With chaining, increased collisions
means an increased number of items on each chain
...
For a successful search using open addressing with linear probing, the average number of comparisons
(︁
(︀
)︀
(︀ 1 )︀2 )︁
1
1
If we are using
is approximately 2 1 + 1−𝜆 and an unsuccessful search gives 1 1 + 1−𝜆
2
chaining, the average number of comparisons is 1 +
comparisons if the search is unsuccessful
...
3 Sorting
Sorting is the process of placing elements from a collection in some kind of order
...
A list of cities could be sorted by
population, by area, or by zip code
...
There are many, many sorting algorithms that have been developed and analyzed
...
Sorting a large number of items
can take a substantial amount of computing resources
...
For small collections, a
complex sorting method may be more trouble than it is worth
...
5
...
Sorting
163
Problem Solving with Algorithms and Data Structures, Release 3
...
In this section we will discuss several sorting techniques and compare them with
respect to their running time
...
First, it will be necessary to compare two values to see which is
smaller (or larger)
...
The total number of comparisons will be
the most common way to measure a sort procedure
...
This exchange is
a costly operation and the total number of exchanges will also be important for evaluating the
overall efficiency of the algorithm
...
3
...
It compares adjacent items and exchanges those that are out of order
...
In essence, each item “bubbles” up to the location where it belongs
...
13 shows the first pass of a bubble sort
...
If there are 𝑛 items in the list, then there are 𝑛 − 1 pairs of items that
need to be compared on the first pass
...
At the start of the second pass, the largest value is now in place
...
Since each pass places the next largest value in
place, the total number of passes necessary will be 𝑛 − 1
...
The
code below shows the complete bubble_sort function
...
def bubble_sort(a_list):
for pass_num in range(len(a_list) - 1, 0, -1):
for i in range(pass_num):
if a_list[i] > a_list[i + 1]:
temp = a_list[i]
a_list[i] = a_list[i + 1]
a_list[i + 1] = temp
a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
bubble_sort(a_list)
print(a_list)
The exchange operation, sometimes called a “swap,” is slightly different in Python than in
most other programming languages
...
A code fragment such as
temp = a_list[i]
a_list[i] = a_list[j]
164
Chapter 5
...
0
Figure 5
...
Without the temporary storage, one of the values
would be overwritten
...
The statement a, b = b, a
will result in two assignment statements being done at the same time (see Figure 5
...
Using
simultaneous assignment, the exchange operation can be done in one statement
...
Note that we could also have used the simultaneous
assignment to swap the items
...
Table 5
...
The total number of comparisons is the sum of the first 𝑛 − 1
1
integers
...
The sum of the first 𝑛 − 1
2
1 2
1
1 2
1
integers is 2 𝑛 + 2 𝑛 − 𝑛, which is 2 𝑛 − 2 𝑛
...
In the best case,
if the list is already ordered, no exchanges will be made
...
On average, we exchange half of the time
...
These “wasted” exchange operations are very costly
...
3
...
0
Figure 5
...
𝑛−1
Comparisons
𝑛−1
𝑛−2
𝑛−3
...
6: Comparisons for Each Pass of Bubble Sort
it has the capability to do something most sorting algorithms cannot
...
A bubble sort can
be modified to stop early if it finds that the list has become sorted
...
The code below shows this modification, which is often referred to as the short bubble
...
Sorting and Searching
Problem Solving with Algorithms and Data Structures, Release 3
...
[1, 9, 19, 7, 3, 10, 13, 15, 8, 12]
2
...
[1, 7, 3, 9, 10, 13, 8, 12, 15, 19]
4
...
3
...
In order to do this, a selection sort looks for the largest value as it makes a
pass and, after completing the pass, places it in the proper location
...
After the second pass, the next largest is
in place
...
Figure 5
...
On each pass, the largest remaining item is
selected and then placed in its proper location
...
The function is shown below
...
However, due to the reduction in the number of exchanges, the
5
...
Sorting
167
Problem Solving with Algorithms and Data Structures, Release 3
...
15: Selection Sort
168
Chapter 5
...
0
selection sort typically executes faster in benchmark studies
...
Self Check
Suppose you have the following list of numbers to sort: [11, 7, 12, 14, 19, 1, 6, 18, 8, 20] which
list represents the partially sorted list after three complete passes of selection sort?
1
...
[7, 11, 12, 14, 19, 1, 6, 18, 8, 20]
3
...
[11, 7, 12, 14, 8, 1, 6, 18, 19, 20]
5
...
3 The Insertion Sort
The insertion sort, although still 𝑂(𝑛2 ), works in a slightly different way
...
Each new item is then “inserted” back into the
previous sublist such that the sorted sublist is one item larger
...
16 shows the insertion sorting process
...
We begin by assuming that a list with one item (position 0) is already sorted
...
As we look back into the already sorted sublist, we shift those items that are greater
to the right
...
Figure 5
...
At this point in the algorithm, a sorted sublist of
five items consisting of 17, 26, 54, 77, and 93 exists
...
The first comparison against 93 causes 93 to be shifted to the right
...
When the item 26 is encountered, the shifting process stops and 31 is placed in the
open position
...
The implementation of insertion_sort shows that there are again 𝑛 − 1 passes to sort 𝑛 items
...
Line 8 performs the shift operation that moves
a value up one position in the list, making room behind it for the insertion
...
The maximum number of comparisons for an insertion sort is the sum of the first 𝑛−1 integers
...
However, in the best case, only one comparison needs to be done on each
pass
...
One note about shifting versus exchanging is also important
...
In benchmark studies, insertion sort will show very good performance
...
3
...
0
Figure 5
...
Sorting and Searching
Problem Solving with Algorithms and Data Structures, Release 3
...
17: Insertion Sort: Fifth Pass of the Sort
1
...
[15, 5, 4, 10, 12, 8, 14, 18, 19, 20]
3
...
[15, 5, 4, 18, 12, 19, 14, 8, 10, 20]
5
...
4 Shell Sort
The shell sort, sometimes called the “diminishing increment sort,” improves on the insertion
sort by breaking the original list into a number of smaller sublists, each of which is sorted using
an insertion sort
...
Instead of breaking the list into sublists of contiguous items, the shell sort uses an increment i,
sometimes called the gap, to create a sublist by choosing all items that are i items apart
...
18
...
If we use an increment of three,
there are three sublists, each of which can be sorted by an insertion sort
...
19
...
By sorting the sublists, we have moved the items
closer to where they actually belong
...
20 shows a final insertion sort using an increment of one; in other words, a standard
insertion sort
...
For this case, we need
only four more shifts to complete the process
...
The function shell_sort shown below uses a different set of increments
...
On the next pass, 4𝑛 sublists are sorted
...
3
...
0
Figure 5
...
19: A Shell Sort after Sorting Each Sublist
is sorted with the basic insertion sort
...
21 shows the first sublists for our example using
this increment
...
def shell_sort(a_list):
sublist_count = len(a_list) // 2
while sublist_count > 0:
for start_position in range(sublist_count):
gap_insertion_sort(a_list, start_position, sublist_count)
print("After increments of size", sublist_count, "The list is",
a_list)
sublist_count = sublist_count // 2
def gap_insertion_sort(a_list, start, gap):
for i in range(start + gap, len(a_list), gap):
current_value = a_list[i]
position = i
172
Chapter 5
...
0
Figure 5
...
21: Initial Sublists for a Shell Sort
while position >= gap and a_list[position - gap] >
current_value:
a_list[position] = a_list[position - gap]
position = position - gap
a_list[position] = current_value
a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
shell_sort(a_list)
print(a_list)
At first glance you may think that a shell sort cannot be better than an insertion sort, since it
does a complete insertion sort as the last step
...
In other words, each pass produces a
list that is “more sorted” than the previous one
...
5
...
Sorting
173
Problem Solving with Algorithms and Data Structures, Release 3
...
By changing the increment, for example using 2 𝑘 − 1 (1, 3, 7, 15, 31, and so on), a shell
3
sort can perform at 𝑂(𝑛 2 )
...
[5, 3, 8, 7, 16, 19, 9, 17, 20, 12]
2
...
[3, 5, 7, 8, 9, 12, 16, 17, 19, 20]
4
...
3
...
The first algorithm we will study is the merge sort
...
If the list is empty or has one
item, it is sorted by definition (the base case)
...
Once the two halves are sorted, the
fundamental operation, called a merge, is performed
...
Figure 5
...
Figure 5
...
The merge_sort function shown below begins by asking the base case question
...
If, on the other hand, the length is greater than one, then we use the Python
slice operation to extract the left and right halves
...
That does not matter, as the lengths will differ by at most one
...
Sorting and Searching
Problem Solving with Algorithms and Data Structures, Release 3
...
22: Splitting the List in a Merge Sort
Figure 5
...
3
...
0
while i < len(left_half) and j < len(right_half):
if left_half[i] < right_half[j]:
a_list[k] = left_half[i]
i = i + 1
else:
a_list[k] = right_half[j]
j = j + 1
k = k + 1
14
15
16
17
18
19
20
21
22
while i < len(left_half):
a_list[k] = left_half[i]
i = i + 1
k = k + 1
23
24
25
26
27
28
29
30
31
32
while j < len(right_half):
a_list[k] = right_half[j]
j = j + 1
k = k + 1
print("Merging ", a_list)
33
34
35
36
a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
merge_sort(a_list)
print(a_list)
Once the merge_sort function is invoked on the left half and the right half (lines 8–9), it is
assumed they are sorted
...
Notice that the merge operation places the items
back into the original list (a_list) one at a time by repeatedly taking the smallest item from
the sorted lists
...
There is also a print statement
(line 32) to show the merging process
...
Note that the list with 44, 55, and 20 will not divide evenly
...
It is easy to see how the splitting process eventually
yields a list that can be immediately merged with other sorted lists
...
First, the list is split into halves
...
The
second process is the merge
...
So the merge operation which results in a list of size 𝑛 requires 𝑛 operations
...
A merge sort is an 𝑂(𝑛 log 𝑛) algorithm
...
In order to guarantee
that merge_sort will be 𝑂(𝑛 log 𝑛) we will need to remove the slice operator
...
We leave this as an exercise
...
Sorting and Searching
Problem Solving with Algorithms and Data Structures, Release 3
...
This additional space can be a critical
factor if the list is large and can make this sort problematic when working on large data sets
...
[16, 49, 39, 27, 43, 34, 46, 40]
2
...
[21, 1, 26, 45]
4
...
[21, 1] and [26, 45]
2
...
[21] and [1]
4
...
3
...
As a trade-off, however, it is possible that the list may not be
divided in half
...
A quick sort first selects a value, which is called the pivot value
...
The role
of the pivot value is to assist with splitting the list
...
Figure 5
...
Since we have looked at this
example a few times already, we know that 54 will eventually end up in the position currently
holding 31
...
It will find the split point and at the same
time move other items to the appropriate side of the list, either less than or greater than the
pivot value
...
24: The First Pivot Value for a Quick Sort
5
...
Sorting
177
Problem Solving with Algorithms and Data Structures, Release 3
...
25)
...
Figure 5
...
We begin by incrementing left_mark until we locate a value that is greater than the pivot
value
...
At this point we have discovered two items that are out of place with respect to the eventual
split point
...
Now we can exchange these two items
and then repeat the process again
...
The position of
right_mark is now the split point
...
26)
...
The list can now be divided at the split point and the quick sort
can be invoked recursively on the two halves
...
quick_sort_helper begins with the same base case as the merge sort
...
If it is greater, then it can be partitioned and
recursively sorted
...
def quick_sort(a_list):
quick_sort_helper(a_list, 0, len(a_list) - 1)
def quick_sort_helper(a_list, first, last):
if first < last:
split_point = partition(a_list, first, last)
quick_sort_helper(a_list, first, split_point - 1)
quick_sort_helper(a_list, split_point + 1, last)
def partition(a_list, first, last):
pivot_value = a_list[first]
left_mark = first + 1
right_mark = last
done = False
while not done:
while left_mark <= right_mark and \
a_list[left_mark] <= pivot_value:
left_mark = left_mark + 1
while a_list[right_mark] >= pivot_value and \
right_mark >= left_mark:
178
Chapter 5
...
0
Figure 5
...
3
...
0
Figure 5
...
In order to find the split
point, each of the 𝑛 items needs to be checked against the pivot value
...
In
addition, there is no need for additional memory as in the merge sort process
...
In this case, sorting a list of 𝑛
items divides into sorting a list of 0 items and a list of 𝑛 − 1 items
...
The result is an 𝑂(𝑛2 ) sort with
all of the overhead that recursion requires
...
In particular, we
can attempt to alleviate some of the potential for an uneven division by using a technique called
median of three
...
In our example, those are 54, 77, and 20
...
The idea is that in the case where the the first item in the list does not belong toward the middle
180
Chapter 5
...
0
of the list, the median of three will choose a better “middle” value
...
We leave the implementation of
this pivot value selection as an exercise
...
[9, 3, 10, 13, 12]
2
...
[9, 3, 10, 13, 12, 14, 17, 16, 15, 19]
4
...
1
2
...
16
4
...
Shell Sort
2
...
Merge Sort
4
...
4 Summary
• A sequential search is𝑂(𝑛) for ordered and unordered lists
...
• Hash tables can provide constant time searching
...
• A shell sort improves on the insertion sort by sorting incremental sublists
...
• A merge sort is 𝑂(𝑛 log 𝑛), but requires additional space for the merging process
...
It does not require additional space
5
...
Summary
181
Problem Solving with Algorithms and Data Structures, Release 3
...
5 Key Terms
binary search
clustering
folding method
hash table
linear probing
median of three
mid-square method
perfect hash function
quick sort
sequential search
slot
bubble sort
collision
gap
hashing
load factor
merge
open addressing
pivot value
rehashing
shell sort
split point
chaining
collision resolution
hash function
insertion sort
map
merge sort
partition
quadratic probing
selection sort
short bubble
5
...
Using the hash table performance formulas given in the chapter, compute the average
number of comparisons necessary when the table is
• 10% full
• 25% full
• 50% full
• 75% full
• 90% full
• 99% full
At what point do you think the hash table is too small? Explain
...
Modify the hash function for strings to use positional weightings
...
We used a hash function for strings that weighted the characters by position
...
What are the biases that exist with these functions?
4
...
Using a list of names (classmates, family members,
etc
...
5
...
Show how this list is sorted by the following algorithms
• bubble sort
• selection sort
• insertion sort
• shell sort (you decide on the increments)
• merge sort
182
Chapter 5
...
0
• quick sort (you decide on the pivot value)
6
...
Show how this list is
sorted by the following algorithms:
• bubble sort
• selection sort
• insertion sort
• shell sort (you decide on the increments)
• merge sort
• quick sort (you decide on the pivot value)
7
...
Show how this list is
sorted by the following algorithms:
• bubble sort
• selection sort
• insertion sort
• shell sort (you decide on the increments)
• merge sort
• quick sort (you decide on the pivot value)
8
...
Show how this list is sorted
using the following algorithms:
• bubble sort
• selection sort
• insertion sort
• shell sort (you decide on the increments)
• merge sort
• quick sort (you decide on the pivot value)
9
...
For example, pick
the middle item
...
Under what criteria does your new strategy perform better or worse than the strategy
from this chapter?
5
...
Set up a random experiment to test the difference between a sequential search and a
binary search on a list of integers
...
7
...
0
(recursive and iterative)
...
What are your results? Can you explain them?
2
...
Recall that you
will need to pass the list along with the starting and ending index values for the sublist
...
3
...
4
...
5
...
6
...
If the
table gets full, this needs to be increased
...
7
...
8
...
Perform a benchmark
analysis using some of the sorting algorithms from this chapter
...
Implement the bubble sort using simultaneous assignment
...
A bubble sort can be modified to “bubble” in both directions
...
” This alternating pattern continues
until no more passes are necessary
...
11
...
12
...
13
...
14
...
Why does this make sense? Re-implement the quick
sort and use it to sort a random list of integers
...
15
...
Run an experiment to compare the two techniques
...
Sorting and Searching
CHAPTER
SIX
TREES AND TREE ALGORITHMS
6
...
• To see how trees can be used to implement a map data structure
...
• To implement trees using classes and references
...
• To implement a priority queue using a heap
...
2 Examples Of Trees
Now that we have studied linear data structures like stacks and queues and have some experience with recursion, we will look at a common data structure called the tree
...
Tree data structures have many things in common with their botanical
cousins
...
The difference between a tree in
nature and a tree in computer science is that a tree data structure has its root at the top and its
leaves on the bottom
...
At each level of the tree we might ask ourselves a question and
then follow the path that agrees with our answer
...
When
we are at the Mammal level we ask, “Is this Mammal a Primate or a Carnivore?” We can keep
following paths until we get to the very bottom of the tree where we have the common name
...
For example, the Genus Felis has the children Domestica and Leo
...
This means that we can change the node that is the child of Musca
without affecting the child of Felis
...
0
Figure 6
...
Trees and Tree Algorithms
Problem Solving with Algorithms and Data Structures, Release 3
...
2: A Small Part of the Unix File System Hierarchy
A third property is that each leaf node is unique
...
Another example of a tree structure that you probably use every day is a file system
...
Figure 6
...
The file system tree has much in common with the biological classification tree
...
That path will uniquely identify that subdirectory (and
all the files in it)
...
For example, we could take the entire
subtree staring with /etc/, detach etc/ from the root and reattach it under usr/
...
A final example of a tree is a web page
...
Figure 6
...
content="text/html; charset=utf-8" />
A simple web page
- List item one
- List item two