Learn C programming | More Info | Notesale | Buy and Sell Study Notes Online | Extra Student Income | University Notes

Search for notes by fellow students, in your own course and all over the country.

Browse our notes for titles which look like what you need, you can preview any of the notes via a sample of the contents. After you're happy these are the notes you're after simply pop them into your shopping cart.

My Basket

Buy These Notes

You have nothing in your shopping cart yet.

Title: Learn C programming
Description: Learn C language from hear. Any one can learn from 1 year to 4 year 1 Introduction 1 1.1 Programming and Programming Languages . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 The C Programming Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 A First Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.4 Variants of HelloWorld . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.5 A Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.6 Another Version of the Conversion Table Example . . . . . . . . . . . . . . . . . . . 6 1.7 Organisation of the Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Types, Operators, and Expressions 8 2.1 Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.3 Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.4 Symbolic Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.5 printf Conversion Specifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.6 Declarations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.7 Arithmetic Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.8 Relational and Logical Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.9 Bitwise Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.10 Assignment Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.11 Type Conversions and Casts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3 Branching and Iteration 17 3.1 If-Else . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.2 ?: Conditional Expression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.3 Switch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.4 While Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.5 Do-While Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 3.6 For Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 3.7 Break and Continue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 3.8 Goto . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 4 Functions 25 4.1 Function Prototypes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 4.2 Function Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 4.3 Benefits of Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 4.4 Designing For Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 4.5 Interface Design. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 4.6 The Standard Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 5 Scope and Extent 33 5.1 Local Scope and Automatic Extent . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 5.2 External Scope and Static Extent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 5.3 The static Storage Class Specifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 5.4 Scope Resolution and Name Hiding . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 5.5 Summary of Scope and Extent Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 5.6 Header Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 5.7 Modular Programming: Multiple File Programs . . . . . . . . . . . . . . . . . . . . . 39 6 Software Design 41 6.1 Requirements and Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 6.2 Program Flow and Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 6.3 Top-down and Bottom-up Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 6.4 Pseudocode Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 6.5 Case Study: A Tic-Tac-Toe Game . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 6.5.1 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 6.5.2 Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 6.5.3 Program Flow and Data Structures . . . . . . . . . . . . . . . . . . . . . . . . 45 6.5.4 Bottom-Up Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 6.5.5 Top-Down Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 6.5.6 Benefits of Modular Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 7 Pointers 49 7.1 What is a Pointer? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 7.2 Pointer Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 7.3 Pass By Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 7.4 Pointers and Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 7.5 Pointer Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 7.6 Return Values and Pointers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 7.7 Pointers to Pointers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 7.8 Function Pointers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 8 Arrays and Strings 59 8.1 Array Initialisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 8.2 Character Arrays and Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 8.3 Strings and the Standard Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 8.4 Arrays of Pointers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 8.5 Multi-dimensional Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 9 Dynamic Memory 68 9.1 DifferentMemory Areas in C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 9.2 Standard Memory Allocation Functions . . . . . . . . . . . . . . . . . . . . . . . . . 69 9.3 Dynamic Memory Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 9.4 Example: Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 9.5 Example: An Expandable Array . . . . . . . . . . . . . . . . . . . . . . . . . More topics are ther

Buy These Notes Preview

Document Preview

Extracts from the notes are below, to see the PDF you'll receive please use the links above

An Introduction to the C Programming Language
and Software Design
Tim Bailey

Preface
This textbook began as a set of lecture notes for a ﬁrst-year undergraduate software engineering
course in 2003
...
The intention
of this text is to cover topics on the C programming language and introductory software design in
sequence as a 20 lecture course, with the material in Chapters 2, 7, 8, 11, and 13 well served by
two lectures apiece
...
In particular, for the practicing programmer,
the best available tutorial and reference is Kernighan and Ritchie [KR88] and the best in-depth
reference is Harbison and Steele [HS95, HS02]
...

What sets this book apart from most introductory C-programming texts is its strong emphasis
on software design
...
5), ﬁle modularity
(Section 5
...
6)
...
4)
...
Chapter 14 shows how to write generic software (i
...
, code designed to work with a
variety of diﬀerent data types)
...

The course for which this textbook was originally written was prerequisite to an embedded systems
course, and hence required an introduction to bitwise manipulations suitable for embedded systems
programming
...

The full source code for all signiﬁcant programs in this text can be found on the web at the
address www
...
usyd
...
au/homepages/academic/tbailey/index
...
Given the volatile
nature of the web, this link may change in subsequent years
...
usyd
...
au and I will attempt to rectify the problem
...
No
doubt there are errors and inconsistencies—both technical and grammatical—although hopefully
nothing too seriously misleading
...
Also, any interesting or clever code snippets that might be incorporated
in future editions are most welcome
...

Draft 0
...
1 Programming and Programming Languages
...
2 The C Programming Language
...
3 A First Program
...
4 Variants of Hello World
...
5 A Numerical Example
...
6 Another Version of the Conversion Table Example
1
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...
1 Identiﬁers
...
2 Types
...
3 Constants
...
4 Symbolic Constants
...
5 printf Conversion Speciﬁers
...
6 Declarations
...
7 Arithmetic Operations
...
8 Relational and Logical Operations
2
...

2
...

2
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

8
8
8
10
11
12
13
13
14
15
15
16

3 Branching and Iteration
3
...

3
...
3 Switch
...
4 While Loops
...
5 Do-While Loops
...
6 For Loops
...
7 Break and Continue
...
8 Goto
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...
1 Function Prototypes
4
...

4
...
4 Designing For Errors

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

25
25
25
28
29

...

...

...

...

...

...

ii

4
...
6

Interface Design
...

31
32

5 Scope and Extent
5
...

5
...

5
...

5
...

5
...

5
...

5
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

33
33
34
35
36
38
38
39

6 Software Design
6
...

6
...

6
...

6
...

6
...

6
...
1 Requirements
...
5
...

6
...
3 Program Flow and Data Structures
...
5
...

6
...
5 Top-Down Design
...
5
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...
1 What is a Pointer?
...
2 Pointer Syntax
...
3 Pass By Reference
...
4 Pointers and Arrays
...
5 Pointer Arithmetic
...
6 Return Values and Pointers
7
...

7
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

49
49
50
52
53
54
56
57
57

8 Arrays and Strings
8
...

8
...

8
...
4 Arrays of Pointers
...
5 Multi-dimensional Arrays
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

59
59
60
62
63
65

9 Dynamic Memory
9
...

9
...
3 Dynamic Memory Management
...
4 Example: Matrices
...
5 Example: An Expandable Array
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

68
68
69
70
72
75

...

...

...

...

...

...

...

...

iii

10 The
10
...
2
10
...

Symbolic Constants
...

10
...
1 Macro Basics
...
3
...

10
...
3 More Complex Macros
10
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

79
79
79
80
81
82
83
84

11 Structures and Unions
11
...

11
...

11
...

11
...

11
...

11
...
7 Expandable Array Revisited
...
8 Unions
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...
1 Binary Representations
...
2 Bitwise Operators
...
2
...
2
...

12
...
3 Operator Precedence
...
3 Common Bitwise Operations
...
4 Bit-ﬁelds
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

99
99
100
100
101
102
102
103

13 Input and Output
13
...

13
...
1 Formatted Output: printf()
...
1
...

13
...
3 String Formatting
...
2 File IO
...
2
...

13
...
2 Standard IO
...
2
...

13
...
4 Random Access File Operations
13
...

13
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...
1 Basic Generic Design: Typedefs, Macros, and Unions
14
...
1 Typedefs
...
1
...

14
...
3 Unions
...
2 Advanced Generic Design: void *
...
2
...

14
...
2 Type Speciﬁc Wrapper Functions
...
2
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

115
115
115
116
116
117
117
121
123

...

...

...

...

...

...

...

...

...

...

...

...

...

...

iv

15 Data Structures
15
...
2 Arrays
...
3 Linked Lists
...
4 Circular Buﬀers
...
5 Stacks
...
6 Queues
...
7 Binary Trees
...
8 Hash Tables
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...
1 Further ISO C Topics
...
2 Traditional C
...
3 Make Files
...
4 Beyond the C Standard Library
...
5 Interfacing With Libraries
...
6 Mixed Language Programming
...
7 Memory Interactions
...
8 Advanced Algorithms and Data Structures

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

138
138
139
139
139
140
140
140
141

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

A Collected Style Rules and Common Errors
142
A
...
142
A
...
142
B The Compilation Process

143

Bibliography

144

Index

146

v

Chapter 1

Introduction
This textbook was written with two primary objectives
...
C is a practical and still-current software tool; it remains one of the most popular
programming languages in existence, particularly in areas such as embedded systems
...
Also, there is an enormous code-base of C programs developed
over the last 30 years, and many systems that will need to be maintained and extended for many
years to come
...
At one-level this
is C-speciﬁc: to learn to design, code and debug complete C programs
...
This involves
learning to decompose large problems into manageable systems of modules; to use modularity and
clean interfaces to design for correctness, clarity and ﬂexibility
...
1

Programming and Programming Languages

The native language of a computer is binary—ones and zeros—and all instructions and data must
be provided to it in this form
...
The earliest digital
electronic computers were programmed directly in binary, typically via punched cards, plug-boards,
or front-panel switches
...
Developing correct programs in machine language is tedious and
complex, and practical only for very small programs
...
These languages have simple mnemonic instructions that directly map to a sequence of machine language
operations
...
Programs written in assembly language are translated to
machine code using an assembler program
...
Furthermore,
since each processor provides its own assembler dialect, assembly language programs tend to be
non-portable; a program must be rewritten to run on a diﬀerent machine
...

These languages provide mechanisms, such as subroutines and conditional looping constructs, which
greatly enhance the structure of a program, making it easier to express the progression of instruction
execution; that is, easier to visualise program ﬂow
...

Thus, ideally, a program written in a high-level language may be ported to a diﬀerent machine and

1

run without change
...

Compiled code is not the only way to execute a high-level program
...
g
...
Given a text-ﬁle
containing a high-level program, the interpreter reads a high-level instruction and then executes the
necessary set of low-level operations
...

Another alternative, intermediate between compiled and interpreted code, is provided by a virtual
machine (e
...
, the Java virtual machine), which behaves as an abstract-machine layer on top of a
real machine
...
Interpreting byte
code is usually much faster than interpreting high-level code directly
...

The primary purpose of a high-level language is to permit more direct expression of a programmer’s design
...
High-level code modules can be designed to “plug” together
piece-by-piece, allowing large programs to be built out of small, comprehensible parts
...
Thus, a programmer’s focus should be on modularity and
readability rather than speed
...
1

1
...
It is a
small language, with just 32 keywords (see [HS95, page 23])
...

Since C is relatively small, it can be described in a small space, and learned quickly
...

C achieves its compact size by providing spartan services within the language proper, foregoing
many of the higher-level features commonly built-in to other languages
...
There are no memory
management facilities apart from static deﬁnition and stack-allocation of local variables
...

Much of the functionality of C is provided by way of software routines called functions
...
For example, the standard function printf() prints text to the screen (or, more
precisely, to standard output—which is typically the screen)
...

1 Of course, eﬃciency is also the programmer’s responsibility, but it should not be to the detriment of clarity, see
Section 15
...

2

1
...
A function contains
statements that specify the computing operations to be done, and variables store values
used during the computation [KR88, page 6]
...

1
2
3
4
5
6
7

1

/* First C program: Hello World */
#include ...
They can span multiple lines and are not
nestable
...
*/

2

Inclusion of a standard library header-ﬁle
...
Headerﬁles contain the information necessary to use these libraries, such as function declarations and
macros
...
This function comes in two forms:
int main(void)
int main(int argc, char *argv[])
The ﬁrst takes no arguments, and the second receives command-line arguments from the environment
in which the program was executed—typically a command-shell
...
4
...
e
...
2

5 and 7

6

The braces { and } delineate the extent of the function block
...
In the case of main(), the program terminates and control
returns to the environment in which the program was executed
...

This program contains just one statement: a function call to the standard library function printf(),
which prints a character string to standard output (usually the screen)
...
h)
...
In this case, the printf() function takes one argument (or input parameter): the string
constant "Hello World!\n"
...
Escape characters provide a mechanism for representing hard-to-type or invisible characters
(e
...
, \t for tab, \b for backspace, \" for double quotes)
...
C is a free-form language, with program meaning unaﬀected by whitespace in most
circumstances
...

2 You may notice in the example program above, that main() says it returns int in its interface declaration, but
in fact does not return anything; the function body (lines 5–7) contains no return statement
...

3

1
...
It shows that a new line
is not automatic with each call to printf(), and subsequent strings are simply abutted together
until a \n escape character occurs
...
h>
int main(void)
{
printf("Hello ");
printf("World!");
printf("\n");
}

The next program also prints “Hello World!” but, rather than printing the whole string in one
go, it prints it one character at a time
...

This may seem a lot, but don’t worry—you don’t have to understand it all now, and all will be
explained in subsequent chapters
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14

6–7

/* Hello World version 3 */
#include ...
They must be declared at the top of a block
before any statements; (a block is a section of code enclosed in brackets { and })
...

6

The variable with identiﬁer i is of type int, an integer, initialised to zero
...
In this case,
str refers to the characters in a string constant
...
The loop
executes while ever the expression (str[i] != ’\0’) is non-zero
...
) The operator != means NOT EQUAL TO
...
All string constants are implicitly appended with a
NUL character, speciﬁed by the escape character ’\0’
...
In this
case, the printf() takes two arguments—a format string "%c" and a parameter str[i++]—and
prints the i-th character of str
...

4

13

Unlike the previous versions of this program, this one includes an explicit return statement for the
program’s exit status
...
Throughout this text take notice of the formatting style used in the example code,
particularly indentation
...
The
compiler does not care about indentation, but it makes the program easier to read for programmers
...
5
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22

6–7

9–10
11–13

A Numerical Example

/* Fahrenheit to Celcius conversion table (K&R page 12) */
#include ...
0/9
...
0);
printf("%3
...
1f\n", fahr, celsius);
fahr += step;
}

This program uses several variables
...
Variables are speciﬁed types, which are int and float in this example
...

These ﬁrst three statements in the program initialise the three integer variables
...
Notice that the two variables are of diﬀerent type (int
and ﬂoat)
...

17–21

The while-loop executes while ever the expression (fahr <= upper) is TRUE
...
This loop executes a compound statement enclosed in braces—
these are the three statements on lines 18–20
...

19

The printf() statement here consists of a format string and two variables fahr and celcius
...
0f and %6
...
(The conversion speciﬁer %6
...
See Section 13
...
1 for
more information on printf() and conversion speciﬁers
...

5

Style note
...
They should explain
intent and point-out algorithm subtleties
...
Careful choice
of identiﬁers (i
...
, variable names, etc) can greatly reduce the number of comments required to
produce readable code
...
6

Another Version of the Conversion Table Example

This variant of the conversion table example produces identical output to the ﬁrst, but serves to
introduce symbolic constants and the for-loop
...
h>
#deﬁne LOWER 0 /* lower limit of temp
...
1f\n", fahr, (5
...
0) * (fahr−32
...
These are speciﬁed by #define,
and mean that we can avoid littering our code with numbers
...
(There are rare exceptions where a literal
constant is okay; the most common example is the number 0 to begin a loop over an array
...
The ﬁrst initialises the loop,
the second tests the condition (identical to the while-loop), and the third is an expression executed
after each loop iteration
...

Style note
...
Symbolic constants should always be UPPERCASE to
distinguish them from variables
...
7

Organisation of the Text

This text is organised in a sequential fashion—from fundamentals to higher-level constructs and
software design issues
...
(The material
required to understand the examples in this chapter is covered in Chapters 2 and 3, and Sections
7
...
2, and 8
...
)
Throughout the text, design techniques and good programming practice are emphasised to encourage a coding style conducive to building large-scale software systems
...
The basic process of program design is presented
in Chapter 6
...
Chapter 14 discusses generic programming, which

6

is the design of functions that can operate on a variety of diﬀerent data types
...

Chapter 16 provides a context for the book by describing how the ISO C language ﬁts into
the wider world of programming
...
Chapter 16 gives a taste of some of the issues
...
Declarations
list the variables to be used, and state what type they have and perhaps what their initial
values are
...
Expressions combine variables
and constants to produce new values
...

2
...
e
...
The ﬁrst character of an identiﬁer must be a letter, which includes underscore (_)
...
Furthermore, it is a good idea to avoid redeﬁning identiﬁers used by the C standard
library (such as standard function names, etc)
...
Use lowercase for variable names and uppercase for symbolic constants
...
Variable names
can begin with an underscore (_), but this should be avoided as such names, by convention, are
reserved for library implementations
...
2

Types

C is a typed language
...
By forcing
the programmer to explicitly deﬁne a type for all variables and interfaces, the type system enables
the compiler to catch type-mismatch errors, thereby preventing a signiﬁcant source of bugs
...

The numerical types come in several of sizes
...
1 shows a list of C types and their typical
1 The ISO standard states that identiﬁers for internal names (i
...
, names with ﬁle-scope or less, see Chapter 5)
must recognise at least the ﬁrst 31 characters as signiﬁcant—including letter case
...
e
...
2) must consider at least the ﬁrst 6 characters as signiﬁcant, and
these case insensitive
...
In practice, most
implementations recognise far more characters of an external identifer than the standard minimum
...
(Notice is also given that future versions of
the standard could increase this limit
...

8

char
int
short int
long int
float
double
long double

C Data Types
usually 8-bits (1 byte)
usually the natural word size for a
machine or OS (e
...
, 16, 32, 64 bits)
at least 16-bits
at least 32-bits
usually 32-bits
usually 64-bits
usually at least 64-bits

Table 2
...

sizes, although the sizes may vary from platform to platform
...
The size of an int generally represents the
natural word-size of a machine; the native size with which the CPU handles instructions and data
...

A program to print the range of values for certain data types is shown below
...
h and float
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14

#include ...
h> /* integer speciﬁcations */
#include <ﬂoat
...
The size of a type in number of characters (which is usually equivalent to number of bytes)
can be found using the sizeof operator
...
It returns an unsigned integer of type size_t, which is deﬁned in header-ﬁle
stddef
...

1
2
3
4
5
6
7
8

#include ...
(The qualiﬁer long may also be applied to type double
...
Other type qualiﬁers2 are signed,
unsigned, const, and volatile
...
A signed type may represent negative values; the most-signiﬁcant-bit (MSB) of the
number is its sign-bit, and the value is typically encoded in 2’s-complement binary
...
For example, a 16-bit signed short
can represent the numbers −32768 to 32767 (i
...
, −215 to 215 − 1), while a 16-bit unsigned short
can represent the numbers 0 to 65535 (i
...
, 0 to 216 −1)
...
1
...
Integer types are signed by default (e
...
, writing short is equivalent to writing signed
short int)
...

The qualiﬁer const means that the variable to which it refers cannot be changed
...
This is useful for, say, multi-threaded programming or interfacing to hardware; topics which are beyond the scope of this text
...

Finally, there is a type called void, which speciﬁes a “no value” type
...

2
...
This section presents various constant types
by example
...
An constant of type long int is suﬃxed
by an L, 1234L; (integer constants too big for int are implicitly taken as long)
...

Integer constants may also be speciﬁed by octal (base 8) or hexadecimal (base 16) values, rather
than decimal (base 10)
...
Thus, 1234 in decimal
is equivalent to 02322 and 0x4D2
...
For example, the following code
int x = 1234, y = 02322, z = 0x4D2;
printf("%d\t%o\t%x\n", x, x, x);
printf("%d\t%d\t%d\n", x, y, z);
2 To be strictly correct, only const and volatile are actually type qualiﬁers
...
However,
they are actually type speciﬁers; (the basic types int, double, char, etc are also type speciﬁers)
...
However, the hex form is very useful
in practice as it breaks down binary into blocks of four bits (see Section 12
...

Floating-point constants are speciﬁed by a decimal point after a number
...
and
1
...
14f and 2
...
L is of type long double
...
65e-2 (which is equivalent to
0
...
Constant expressions, such as 3+7+9
...
2
...

Character constants, such as ’a’, ’\n’, ’7’, are speciﬁed by single quotes
...
Thus, sizeof(’Z’) will equal
4 on a 32-bit machine, not one
...
g
...
Tables of the ASCII character set are readily found (see, for example, [HS95,
page 421])
...
It is important to recognise that these escape characters still represent single
characters
...

String constants, such as "This is a string" are delimited by quotes (note, the quotes are
not actually part of the string constant)
...
Thus, in memory, the above string constant would comprise the following character
sequence: This is a string\0
...
It is important to diﬀerentiate between a character constant (e
...
, ’X’) and a NUL terminated
string constant (e
...
, "X")
...
Note also that
sizeof(’X’) is four (on a 32-bit machine) while sizeof("X") is two
...
4

Symbolic Constants

Symbolic constants represent constant values, from the set of constant types mentioned above, by a
symbolic name
...
7183

Wherever a symbolic constant appears in the code, it is equivalent to direct text-replacement with
the constant it deﬁnes
...
The reason for using symbolic constants rather than constant values
directly, is that it prevents the proliferation of “magic numbers”—numerical constants scattered
throughout the code
...
Symbolic constants keep constants
together in one place so that making changes is easy and safe
...
5 and 1
...
The ﬁrst example uses magic
numbers, while the second uses symbolic constants
...
The #define symbol, like the #include symbol for ﬁle inclusion, is a preprocessor command
(see Section 10
...
As such, it is subject to diﬀerent rules than the core C language
...

Another form of symbolic constant is an enumeration, which is a list of constant integer values
...

enum Boolean x = FALSE;
If an enumeration list is deﬁned without an explicit tag, it assumes the type int
...
g
...
List members can also be given explicit integer values,
and non-speciﬁed members are each one greater than the previous member (e
...
, RED is 2, GREEN
is 3, BLUE is 4, YELLOW is 4, and BLACK is 5)
...
Symbolic constants and enumerations are by convention given uppercase names
...
Variables qualiﬁed by const behave like constants5 and so should also
be identiﬁed with uppercase names, or with the ﬁrst letter uppercase
...
5

printf Conversion Speciﬁers

The standard function printf() facilitates formatted text output
...

printf("Character values %c %c %c\n", ’a’, ’b’, ’c’);
printf("Some floating-point values %f %f %f\n", 3
...
1f);
printf("Scientific notation %e %e %e\n", 3
...
1f);
printf("%15
...
1
...

Important
...
If they
are not, the program will either print garbage or crash
...
For example, int j = TRUE; is valid, as is enum Boolean k = -4;
...
2
...
6

Declarations

All variables must be declared before they are used
...
They may be initialised by a
constant or an expression when declared
...

{ /* bracket signifies top of a block */
int lower, upper, step; /* 3 uninitialised ints */
char tab = ’\t’;
/* a char initialised with ’\t’ */
char buf[10];
/* an uninitialised array of chars */
int m = 2+3+4;
/* constant expression: 9 */
int n = m + 5;
/* initialised with 9+5 = 14 */
float limit = 9
...
1416;
The general form of a declaration6 is
= , = ,
...
5)
...
7

Arithmetic Operations

The arithmetic (or numerical) operators come in two varieties: unary and binary
...
The ﬁrst four
operators can be used on integer or ﬂoating-point types, although it is important to notice that
integer division truncates any fractional part (e
...
, 17/5 is equal to 3)
...
g
...
g
...

Note
...
e
...
A portable
work-around for this is shown in Section 10
...
2
...

int ispositive = +34;
double isnegative = -56
...
It exists only for symmetry
with the unary - operator
...
These
operators add 1 to a variable and subtract 1 from a variable, respectively
...
An unusual quality of ++ and -- is that they may be used preﬁx ++x or
postﬁx x++ with diﬀerent characteristics
...
2;
double y = ++x;
double z = x++;
6 A variable deﬁnition is usually synonymous with its declaration
...
2
...
2 and then assigned to y, which
then also equals 4
...
In the second case, called postincrement, the value of x is ﬁrst assigned to z,
and subsequently increased by 1; so, z equals 4
...
2
...

int a=2, b=7, c=5, d=9;
printf("a*b + c*d = %d\n", a*b + c*d); /* prints a*b + c*d = 59 */
Two common errors can occur with numerical operations: divide-by-zero and overﬂow
...
Divide-by-zero errors can also occur with the modulus operator if the second
operand is 0
...
For example,
int z = x + 1;
will overﬂow if the value of x is the largest representable value of type int
...

2
...
Relational expressions evaluate to 1 if they
are TRUE and 0 if they are FALSE
...
1 < 7 evaluates to one, and x != x evaluates
to zero
...
A very common programming error is to mistakenly type = (assignment) for == (equality)
...
If it is written as
while (x = 3) {
/* various statements here */
}
then x will be assigned the value 3 and this value will be the loop conditional, which is always
non-zero (and therefore TRUE) resulting in an inﬁnite loop
...
All the relational and logical
operators are binary except the !, which is unary
...
They can be used to chain together multiple expressions, as in
the following example where, given the integer values a=1, b=2, c=3, d=3,
(a < b && b < c && c < d)
/* FALSE */
(a < b && b < c && c <= d) /* TRUE */
((a < b && b < c) || c < d) /* TRUE */
The order of evaluation of && and || is left-to-right, and evaluation stops as soon as the truth
or falsehood of the result is known—leaving the remaining expressions unevaluated
...
For example, given an array of length SIZE, it is
incorrect to evaluate array[SIZE], which is one-beyond the end of the array
...

The unary operator ! simply converts a non-zero expression to zero and vice-versa
...
The unary ! tends to be used infrequently
as it can lead to obscure code, and typically == or != provide a more readable alternative
...
Of the others, >, <, >=,
and <= have highest precedence; followed by == and !=; then &&; and ﬁnally, ||
...
C has precedence rules for all its operators (e
...
, see the precedence tables in [KR88,
page 53])
...
g
...

The following example is a segment of code where the intuitive precedence is not correct, and
the code is faulty
...

while (s[i] = t[i] != ’\0’)
++i;
However, the != has higher precedence than the =, and so s[i] will not be assigned t[i] but the
result of t[i] != ’\0’, which is 1 except for the ﬁnal iteration when it will be 0
...

while ((s[i] = t[i]) != ’\0’)
++i;

2
...
e
...
These are essential for low-level programming, such as controlling hardware
...

The operators are the bitwise AND &, bitwise OR |, bitwise exclusive OR ^, left shift <<, right
shift >>, and one’s complement operator ~
...
The purpose and usage of the logical and bitwise operators
are quite disparate and may not be used interchangeably
...
10

Assignment Operators

Expressions involving the arithmetic or bitwise operators often involve the assignment operator =
(for example, z = x + y)
...
g
...
These types of expression can be written in the compressed form x += y, where the operator += is called an assignment operator
...
Thus, we can write x *= y + 1 rather than x = x * (y + 1)
...
We
return to the bitwise operators in Chapter 12
...
11

Type Conversions and Casts

When an operator has operands of diﬀerent types, they are converted to a common type
according to a small number of rules [KR88, page 42]
...

• Otherwise, if either operand is double, convert the other to double
...

• Otherwise, convert char and short to int, and, if either operand is long, convert the other
to long
...

A simple example of type promotion is shown in the following code
...
1f;
double d = c + a*b;
Here the multiply is performed ﬁrst, so a is promoted to int and multiplied with b
...
This result is then promoted to
double and assigned to d
...
The promotion from char to int is implementation-dependent, since whether a plain char
is signed or unsigned depends on the compiler
...

Assignment to a “narrower” operand is possible, although information may be lost
...
Conversion from a larger integer
to a smaller one results in truncation of the higher-order bits, and conversion from ﬂoating-point to
integer causes truncation of any fractional part
...
5 + 3/5
...
0 is promoted to type double so that the ﬁnal summation equals 1
...
The result
then is truncated to 1 in the assignment to iresult
...

Narrowing conversions should be avoided
...
For example,
int iresult = (int)(0
...
0);
Casts can also be used to coerce a conversion, such as going against the promotion rules speciﬁed
above
...
0 + 3
...

16

Chapter 3

Branching and Iteration
The C language provides three types of decision-making constructs: if-else, the conditional expression ?:, and the switch statement
...
And it has the infamous goto, which is capable of both non-conditional branching and
looping
...
1

If-Else

The basic if statement tests a conditional expression and, if it is non-zero (i
...
, TRUE), executes
the subsequent statement
...
The else statement deals with the alternative
case where the conditional expression is 0 (i
...
, FALSE)
...
Statements so grouped are called a compound statement, or block, and they are syntactically equivalent
to a single statement
...
On the other hand, if the conditional is FALSE, the next
if-conditional is tested
...
(Note, the ﬁnal else is optional and, if it is missing, the default action is no
action
...
This code segment performs integer division on the
ﬁrst k elements of an array of integers num[SIZE]
...
Notice that the else is a compound statement, and
that a variable (int i) is declared there; variables may be declared at the top of any block, and
their scope is local to that block
...
\n");
else if (denom == 0)
printf("Error: Denominator is zero
...
A common mistake with if-else blocks is the “dangling else problem”
...

if (a < b)
if (m != 0)
b = a;
else
a = m;
The intention of the programmer, as indicated by the indentation, is that the else corresponds to
the outer if statement
...
Desired association
can be enforced by brackets
...
Rather, it intends only to show a basic if-else chain
...
2

?: Conditional Expression

The conditional expression is a ternary operator; that is, it takes three operands
...
e
...
Thus, the result of the ternary expression will be the result of either the second
or third expressions, respectively
...

3
...

The general form of the switch statement is as follows
...
Otherwise
it jumps to the default label and executes its statements
...

Note
...
While this is good practice in
general, it is not mandatory, and case labels may appear below default
...
However, if a break is not encountered, execution
will ﬂow on through to the next cases until the end of the block
...
Fall through is rarely used because it is diﬃcult to code correctly;
it should be used with caution
...
It is generally good practice to have a default label even when it is not necessary,
even if it just contains an assert to catch logical errors (i
...
, program bugs)
...
Finally, it is wise to put a break
after the last case in the block, even though it is not logically necessary
...

2 Generally speaking, program execution will ﬂow through, past the lower case labels, unless a branch out of the
switch is encountered, such as break, return or goto
...
The switchstatement is no exception, and the statements following a case label may include a switch or other
control structure
...

if (expression)
while (expression)
switch(integer expression) {
case A1:
switch(integer expression) {
case B1: statements
case B2: statements
case B3: statements
}
case A2: statements
default: statements
}
The following example converts the value of a double variable angle to normalised radians (i
...
,
−π ≤ angle ≤ π)
...

switch (angletype)
{
case DEG:
angle *= PI / 180
...
0*PI;
< -PI)
2
...
4

While Loops

The while loop has the general form
while (expression)
statement;
If the conditional expression is TRUE, then the while will execute the following statement, after
which it will reevaluate the conditional
...
As with the if-else statement, the while loop can execute multiple statements as a block
by enclosing them in braces
...
e
...
The loop iterates
until the value of n becomes 0, at which point the GCD is the value of m
...
5

Do-While Loops

The do-while loop has the general form
do
statement;
while (expression);
Its behaviour is virtually the same as the while loop except that it always executes the statement
at least once
...
Thus, the body of a while loop is executed zero or more times, and
the body of a do-while loop is executed one or more times
...
It is good form to always put braces around the do-while body, even when it consists
of only one statement
...

The following code example takes a non-negative integer val and prints it in reverse order
...

do
{
printf("%d", val % 10);
val /= 10;
} while (val != 0);

3
...

expr1;
while (expr2) {
statement;
expr3;
}
In the for loop, expressions 1, 2, and 3 are optional, although the semicolons must remain
...
If expression 2 is omitted, then the conditional is always TRUE, and an inﬁnite
loop results
...
It is possible to stack several expressions in the various parts of the for loop using the
comma operator
...
However, it should be used sparingly, and is most
suited for situations like the following example
...

The ﬁrst loop ﬁnds the end of the string, and the second loop performs the reversing operation by
swapping characters
...
7

Break and Continue

As seen previously, break can be used to branch out of a switch-statement
...
Thus, a break may be used to terminate the execution
of a switch, while, do-while, or for
...

For example, consider a nested while-loop,
while (expression) {
while (expression) {
if (expression)
break;
statements
}
statements
}
Here the break will terminate the inner while-loop and proceed to execute the statements of the
outer while-loop
...
A break is used to terminate the inﬁnite outer while-loop
...
For while
and do-while it causes transfer of program-control immediately to the conditional test, which is
reevaluated for the next iteration
...
e
...
Note that, as with break, continue acts on the inner-most
enclosing block of a nested loop
...

The following example shows the outline of a code-segment that performs operations on the
positive elements of an array but skips the negative elements
...

for (i = 0; iif (array[i] < 0) /* skip -ve elements */
continue;
/* process +ve elements */
}
Note
...
A common misconception is
that break can be used to jump out of an if compound statement
...

3
...
It is almost never necessary to use one, and they should be avoided in general
...
A goto statement provides the ability to jump to a
named-label anywhere within the same function
...
A break statement cannot do this as it can only break out
of one level at a time
...

1
2
3
4
5
6
7

#include ...
h> /* for rand() */
#include ...
This example is contrived, but the idea is to
* ﬁnd a common element in two arrays
...
*/
{
char a[SIZE], b[SIZE];
int i, j;
/* Initialise arrays so they are diﬀerent from each other */
memset(a, VAL1, SIZE);
memset(b, VAL2, SIZE);
/* Set a random element in each array to VALUE */
a[rand()%SIZE] = VAL3;
b[rand()%SIZE] = VAL3;
/* Search for location of common elements */
for (i=0; ifor (j=0; jif (a[i] == b[j])
goto found;
/* Error: match not found */
printf("Did not find any common elements!!\n");
return 0;
found: /* Results on success */
printf("a[%d] = %c and b[%d] = %c\n", i, a[i], j, b[j]);
}

The standard function memset() ﬁlls the ﬁrst SIZE characters of array a with the value ’a’
...
h
...
A named-label must always be followed by a statement, even if it is just a null statement
...
Appropriate functions hide
details of operation from parts of the program that don’t need to know them, thus clarifying
the whole, and easing the pain of making changes [KR88, page 67]
...
They provide the basic mechanism for enclosing low-level source code, hiding algorithmic details and presenting instead an interface that
describes more intuitively what the code actually does
...
When combined with
ﬁle-modular design (see Sections 5
...
6), functions make it possible to build and maintain
large-scale software systems without being overwhelmed by complexity
...
1

Function Prototypes

A function must be declared before it is used
...

A function declaration, or prototype, is an interface speciﬁcation
...
It enables the compiler
to perform type-checking—to ensure the argument types being passed to the function match the
interface deﬁnition—which catches many coding errors at compile time
...

void some_procedure(void);
int string_length(char *str);
double point_distance(double, double, double, double);
Notice that the variable names are optional in the declaration, only the types matter
...

4
...
The function is passed a number of input parameters (or arguments) and
may return a value, as speciﬁed by its interface deﬁnition
...
This means that the
function receives a local copy of each input variable, not the variable itself
...
For example,
25

int myfunc(int x, int y)
/* This function takes two int arguments, and returns an int
...
However, the values of a and b are unaﬀected and
d = 1+2 = 3
...
The calling function is free to
ignore the return value,1 but it is good practice to make this explicit by putting a (void) cast in
front of the call
...
*/
void caller_func(void)
{
int a=1, b=2, c;
c = an_algorithm(a,b);
/* use return value */
an_algorithm(a,b);
/* ignore return value (implicitly) */
(void)an_algorithm(a,b); /* ignore return value (explicitly) */
}
int an_algorithm(int x, int y)
{
return x*2 + x/y;
}
The return value can be of any type, but there is a limitation that any function may only have at
most one return value
...
These methods are discussed in Sections 11
...
3, respectively
...

These deﬁne multiple exit points from the function, from which program-control returns to the next
statement of the calling function
...
But, if a function does not return a value, then an
empty return; suﬃces, and this may be omitted altogether for a no-value return occurring at the
end of the function block
...

It returns an int value, which is the number of characters printed, or a negative error value if the print fails
...
Recursion tends to
be less eﬃcient than iterative code (i
...
, code that uses loop-constructs), but in some cases may
facilitate more elegant, easier to read code
...
The ﬁrst calculates the greatest common divisor
of two positive integers m and n, and the second computes the factorial of a non-negative integer n
...
*/
int gcd (int m, int n)
{
while (n) {
int tmp= n;
n= m%n;
m= tmp;
}
return m;
}
/* Recursive GCD */
int gcdr (int m, int n)
{
if (n==0) return m;
return gcdr(n, m%n);
}

Notice that the factorial functions below incorporate argument checking via the standard library
macro assert, which causes the program to terminate with an error message if the conditional
expression is FALSE
...
3

Beneﬁts of Functions

Novice programmers tend to pack all their code into main(), which soon becomes unmanageable
...
Functions are the key to enabling such a division and separation of concerns
...

• Functions allow a program to be split into a set of subproblems which, in turn, may be further
split into smaller subproblems
...

• Functions can wrap-up diﬃcult algorithms in a simple and intuitive interface, hiding the implementation details, and enabling a higher-level view of the algorithm’s purpose and use
...
If a particular segment of code is required in several places,
a function provides a tidy means for writing the code only once
...
2
Consider the following examples
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30

int toupper (int c)
/* Convert lowercase letters to uppercase, leaving all other characters unchanged
...
*/
{
if (c >= ’a’ && c <= ’z’)
c += ’A’−’a’;
return c;
}
int isdigit(int c)
/* Return 1 if c represents an integer character (’0’ to ’9’)
...
*/
{
return c >= ’0’ && c <= ’9’;
}
void strcpy (char *s, char *t)
/* Copy character-by-character the string t to the character array s
...
*/
{
int i=0;
while ((s[i] = t[i]) != ’\0’)
++i;
}
double asinh(double x)
/* Compute the inverse hyperbolic sine of an angle x, where x is in radians and -PI <= x <= PI
...
0));
}
2 In practice, function interfaces tend to be much more stable (i
...
, less subject to change) than code internals
...
The adage “code duplication is an error” is true and well worth bearing
in mind
...
3 This function reads a
line of characters from standard-input (usually the keyboard) and stores it in a character buﬀer
...
The relative simplicity of the function interface of getline()
compared to its deﬁnition is immediately apparent
...
Return the length of the line
...
*/
int getline(char s[ ], int len)
{
int i=0, c;
/* Loop until: (i) buﬀer full, (ii) no more input available, or (iii) the end-of-line is reached (marked
* by newline character)
...
4

Designing For Errors

When writing programs, and especially when designing functions, it is important to perform appropriate error-checking
...

The standard library macro assert() is used to catch logical errors (i
...
, coding bugs, errors
that cannot happen in a bug-free program)
...
The form of assert()
is as follows,
assert(expression);
where the expression is a conditional test with non-zero being TRUE and zero being FALSE
...
For example, the expression
assert(idx>=0 && idxwill terminate the program if idx is outside the speciﬁed bounds
...
For example,4
1
2
3
4

int isprime(int val)
/* Brute-force algorithm to check for primeness */
{
int i;
3 The function getline() is similar to the standard library function gets(), but improves on gets() by including
an argument for the maximum-capacity of the character buﬀer
...
(The worm overwrote the stack of a network-querying program called ﬁnger, which
enabled it to plant back-door code on the remote machine
...

4 Notice the use of the numerical constant 2 in the function isprime()
...
The value 2 is, in fact, not an arbitrary magic number, but is intrinsic to the
algorithm, and to use a symbolic constant would actually detract from the code readability
...

switch (expression)
{
case label1: statements; break;
case label2: statements; break;
default: assert(0); /* can’t happen */
}
Being a macro, assert() is processed by the C preprocessor, which performs text-replacement on
the source code before it is parsed by the compiler
...
But, if the build is in release-mode (i
...
, the non-debug version of the
program), then the preprocessor transforms the assert() into nothing—the assertion is ignored
...

Note
...
Use them liberally
...
g
...
g
...
6
int* mem = (int*) malloc(50 * sizeof(int));
if (mem == NULL)
exit(1);
The form of exit() is
void exit(int status);
where status is the exit-status of the program, which is returned to the calling environment
...
(Also, the standard deﬁnes two symbolic constants EXIT_SUCCESS and EXIT_FAILURE for this
purpose
...
For example, requesting dynamic memory or opening a ﬁle,
5 The

C preprocessor and macros are discussed in detail in Chapter 10
...

6 Operations

30

FILE* pfile = fopen("myfile
...
As such, exit() statements will remain in release-version code
...
e
...
Functions designed for reuse should return an error ﬂag to allow the calling function to
determine an appropriate action
...

Note
...
g
...
A stronger form of termination function is abort(), which kills the program without any cleanup; abort() should be avoided
in general
...
The return value
might either be used exclusively as a status value,
int function_returns_status (arguments)
{
statements
if (success) return 0;
return 1;
}
or might return a certain range of values in normal circumstances, and a special value in the case of
an error
...
For example, an appropriate
action might be to ignore bad input, or to print a message and continue, or, in the worst case, to
terminate the program
...
It is common practice
in toy programs to ignore function return values, but production code should always check and
respond suitably
...
Standard functions that
use errno will typically return a value indicating an error has occurred, and the calling function
should check errno to determine the type of error
...
5

Interface Design

Good design of function interfaces is a somewhat nebulous topic, but there are some fundamental
principles that are generally applicable
...
It is usually
bad practice to expose function internals
...
Functions are an abstraction mechanism that
allow code to be understood at a higher level
...
That is, it is desirable to minimise the
eﬀect that changing one function will have upon another
...

• A function should perform a single speciﬁc task
...
Wrapper functions are useful for ensuring that a set of related
functions are called in a speciﬁc sequence
...
It should have only the arguments necessary for its
speciﬁc task, and should avoid extraneous “bells and whistles” features
...

4
...
These functions exist on all standard-conforming systems; they are
portable and correct, so use them before writing implementations of your own
...
Note the use of short, descriptive
function names, and intuitive, minimal interfaces
...
Learn what functions are available and
their various purposes
...

• Mathematical functions
...

• Manipulating characters
...

• Manipulating strings
...

• Formatted input and output
...

• File input and output
...

• Error handling
...

• Time and date functions
...

• Sort and search
...

• Low-level memory operations
...

7 For example, the standard library implementation of toupper() will be correct for the character set of the machine
on which the compiler resides
...
3, which is incorrect
for machines using, say, the EBCDIC character set
...
That is,
it describes the visibility of an identiﬁer within the program
...

The rules of scope and extent aﬀect the way functions and data interact, and are central to
the design of C programs
...
The focus is on the way in which control of scope and extent facilitate the writing of
modular programs, and particularly the implementation of multiple-ﬁle programs
...
1

Local Scope and Automatic Extent

A variable declared within a function has local scope by default
...
Function
arguments also have local scope
...
The visibility of a local variable is the block in
which it is deﬁned
...

A local variable has automatic extent, which means that its lifetime is from the point it is deﬁned
until the end of its block
...
If the variable is not explicitly initialised, then it
will hold an undeﬁned value (e
...
, in the above, val has an arbitrary value, while val2 is initialised
to 5)
...
At the end of the
block, the variable is destroyed and the memory recovered; the variable is said to go “out-of-scope”
...

33

5
...
Functions themselves are always external, because C does not allow
functions to be deﬁned inside other functions [KR88, page 73]
...
External variables
and functions are visible over the entire (possibly multi-ﬁle) program; they have external scope (also
called program scope)
...
However, it is necessary to
ﬁrst declare a variable or function in each ﬁle before it is used
...
Function prototypes may also be preceded by extern, but this is not essential
as they are external by default
...
A declaration refers to the speciﬁcation of a variable or function, in particular its name
and type
...

A variable or function may be declared multiple times in a program (provided the declarations are
non-conﬂicting) but may be deﬁned only once
...

File one
...
c:
double myvariable = 3
...

}
Note
...
e
...
c) is compiled to form an object
module
...
The identiﬁers of external variables and functions are visible to the linker, allowing them to be shared across
separate object modules, and are said to have “external linkage”
...

External variables and functions have static extent
...
External variables that are not initialised explicitly are given the default
value of zero; (this is diﬀerent to local variables, which have arbitrary initial values by default)
...

External variables are sometimes used as a convenient mechanism for avoiding long argument
lists
...
They may also permit more natural semantics if two functions operate on the
same data, but neither calls the other
...

34

The following example (from [KR88, page 79]) shows a situation where global variables might be
convenient
...
However, suppose you read in some characters and decide
you are not yet ready to process them, and wish to push them back onto the input stream for a later
time
...

#define BUFSIZE 100
char buffer[BUFSIZE]; /* buffer for pushed-back characters */
int bufidx = 0; /* buffer index */
int getch(void)
/* Get a character from stdin
...
Two functions are said to be “tightly coupled” if
changes made to one function forces changes on the other
...
A further problem
with external variables is that, since their scope is over the entire multi-ﬁle program, it is easy to
write code where the same identiﬁer is used to deﬁne two diﬀerent external variables
...

File one
...
c:
extern int myvariable;
char myname(int c);
int myfunc(int idx);
As a rule, external variables are easy to overuse and should be avoided where possible
...
3

The static Storage Class Speciﬁer

The keyword static is a storage class speciﬁer, but it is perhaps better viewed as a storage class
qualiﬁer as it imparts diﬀerent properties depending on whether an object is a local variable, an
35

external variable, or a function
...

They are initialised to zero by default and retain their values between function calls
...
*/
}
External variables and functions that are qualiﬁed as static obtain ﬁle scope, which means their
visibility is limited to a single source ﬁle
...
3 This prevents unwanted access by code in other parts
of the program and reduces the risk of naming conﬂicts
...

File one
...
c:
static int myvariable;
/* no conflict with file one
...
The two functions would remain extern as they might be called
from functions in other ﬁles, but the variables buffer and bufidx only require ﬁle-scope
...
As a rule, and
where possible, static functions are preferred over external functions, static variables are preferred
over external variables, and local variables are preferred over static variables
...
Many programs today operate using multiple threads of control, meaning that various parts
of the program operate concurrently, or over a timeslice so as to appear concurrent
...
Temporal
dependencies on the value of external variables may be violated as the diﬀerent threads are switched
in and out
...

5
...
In such situations, one variable will hide the other
...
In
other words, the variable that is “more local” has dominant visibility
...

1
2
3
4
5
6
7

#include ...

36

8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27

{
int x, z=0; /* local scope 1 */
x = y + 1;
while (x++ < 10) {
int x, y = 0; /* local scope 2 */
x = z % 5;
printf("In loop \tx= %d\ty= %d\tz= %d\n", x, y++, z++);
}
printf("Before modify()\tx= %d\ty= %d\tz= %d\n", x, y, z);
z = modify(x, y);
printf("After modify()\tx= %d\ty= %d\tz= %d\n", x, y, z);
}
int modify(int a, int b)
{
z += a + b;
return z;
}

This program produces the following output
...
A brief discussion of the above example may
help clarify their properties
...

12

This line represents one of the more diﬃcult instances of name-hiding: does x refer to the inner local
block or the outer one? In fact, x refers to x-local-scope-1, not to x-local-scope-2, which is declared
within the compound statement following the while, and is not in scope for the while conditional
itself
...

18–20

Identiﬁers x and z refer to the outer local-scope-1, and y refers to the global-scope
...

The problem of name hiding doesn’t end there
...
For example, consider the following code
segment
...
f;
{
int j = i;
int i = 0;

...
However, it
turns out that int i does not exist until it is declared below j, so that j is initialised with the only
i in scope at its declaration: float i = 5
...

Name-hiding issues are easily avoided by deﬁning appropriate variable names
...

Avoid same-name identiﬁers for functions or variables that might share the same scope
...
5

Summary of Scope and Extent Rules

Functions are extern by default, as are variables deﬁned outside of any function
...
Functions and external variables that are declared static have ﬁle scope, which
means their visibility is limited to the source ﬁle in which they are deﬁned
...

Functions and external variables have static extent, meaning that they are created before program
execution and exist until the program terminates
...
Non-static local variables have local or automatic extent and are destroyed when they go outof-scope
...

In general, a variable can be deﬁned with a storage class extern, static, or auto
...
For example,
static const double LightSpeed = 2
...
It is possible to deﬁne variables with dynamic extent
such that their lifetime is managed explicitly by the programmer
...

Some other instances of scope are mentioned here for identiﬁers that are not functions or variables
...
Named labels, such as used by goto (see Section 3
...

5
...
Rather than typing declarations
explicitly in each source ﬁle that uses them, it is generally convenient to collect common declarations
in header ﬁles, and include the relevant headers in the source ﬁles as required
...

The #include preprocessor command causes the entire contents of a speciﬁed source text
ﬁle to be processed as if those contents had appeared in place of the #include command
[HS95, page 53]
...
Collecting these
declarations in header ﬁles avoids code duplication, and so avoids possible typographical errors and

38

declaration mismatches
...

Header ﬁle names are suﬃxed with
...
The standard library headers are included
using angle brackets to enclose the ﬁlename as follows
...
h>
Angle brackets indicate that the header ﬁle is located in some “standard” place according to the
compiler-implementation search rules
...
Header ﬁles from other libraries may also be included using the angle bracket syntax if they
too reside on the compiler search path
...
h"
which indicate that the header ﬁle is located in some “local” place, usually the current directory
...
The general intent of the " " form is to denote headers written by the
application programmer, while the < > form indicates (usually standard) library headers
...
7

Modular Programming: Multiple File Programs
The functions and external variables that make up a C program need not all be compiled at
the same time; the source text of the program may be kept in several ﬁles, and previously
compiled routines may be loaded from libraries [KR88, page 80]
...
Grouping code by source ﬁle is central to C’s compilation model, which
compiles each ﬁle separately to produce individual object modules, and these are later linked to
form the complete program
...

This code organisation strategy works as follows
...
The declarations of functions and variables (and constants and data-types)
to be shared with other modules are stored in an associated header ﬁle; this is called the public
interface
...

Functions deﬁned in a module that are called only by other functions within that module are
declared static
...
Similarly, external variables used only
within the module are declared static
...

The advantages of modular programming are as follows
...
This leads to more intuitive
use of a library of code than just a disorganised set of functions
...

• Implementation details are hidden behind a public interface
...
Changes to the implementation can later
be made without aﬀecting client code (e
...
, using a diﬀerent algorithm, or porting platformspeciﬁc code to another platform)
...
e
...
This minimises the
chances incorrect use
...

• Modules facilitate team program development where individuals can each work on diﬀerent
modules that make up the program
...
First, it is desirable to minimise dependencies between
modules
...
Second, the
public interface should be minimal; it should only contain functions required to use the module, not
functions that are just part of the internal implementation
...
Finally, it is good practice to restrict scope as much
as possible, such that local variables are preferred over static variables which, in turn, are preferred
over external variables, and static functions are preferred over external functions
...
A well designed program is ﬂexible, extensible and maintainable, and the key
to such design is modularity
...
Thus, large-scale systems may be built without an overwhelming growth
of complexity induced by dependencies between subtasks
...
A typical progression is as follows
...
The required program operation is described at a general and
then a detailed level
...
The ﬂow of steps, decisions, and loops are planned, usually in the form of a
diagram
...
That is, it deﬁnes the
sequence of operations, and the requirements for communication of data
...
The format of variable types for passing data between functions must be
chosen in order to design function interfaces
...
The structure and components of the program have to
be designed
...

• Coding
...
The process of coding often uncovers ﬂaws in the original design,
or suggests improvements or additional features
...

• Testing and debugging
...
Methods for systematic
testing and debugging are beyond the scope of this text (for more information see, for example,
[KP99, Ben00])
...
Software design is
in part principles and formal methodologies, and in part an artform requiring experience and taste
...
1

Requirements and Speciﬁcation

In order to write a program, it is ﬁrst necessary to know what function the program is to perform
...
For small projects, this might be suﬃcient and programming can commence immediately
...

A program speciﬁcation is a more formal and detailed description of the program’s operation
...
A speciﬁcation may evolve during the program’s implementation, as
diﬃculties and new possibilities come to light
...

6
...
Various formalisms exist for describing
program ﬂow, such as ﬂow diagrams and state-transition diagrams
...

Having deﬁned dependencies, the variable types used to communicate data between diﬀerent
parts of the program should be speciﬁed
...
1 Following the deﬁnition of key data-structures, it becomes
possible to start designing function interfaces
...
3

Top-down and Bottom-up Design

The top-down approach to design is to start with a set of high level tasks that will be called from
main(), and recursively splitting each task into subtasks until a level of complexity is attained that
permits the deﬁnition of reasonably simple function modules
...
1, which shows
a hierarchy where the higher-level functions are implemented in terms of the lower-level functions
...
Usually, the functions are ﬁrst implemented as interfaces only, with do-nothing
internals (also called “dummy” internals)
...
Thus,
the entire program shell is laid out at the start, and the function internals are subsequently built
and debugged one-at-a-time
...
e
...
They tend to be non-reusable, and the top-down design of
each new project must start from scratch
...

Assuming the appropriate argument types are known, the function interfaces can be designed, after
which the internal implementations are usually straight-forward
...
They tend to be less tied to the program at hand and more
amenable to reuse
...
2 Such
1 Data-structures

are discussed further in Chapters 11 and 15
...
These functions simply accept the interface required by the program and convert their arguments to the
form required by the component function and then invoke this function
...
1: The top-down design approach
...
For example, the task of function f3() is implemented in
part by the functions f3a() and f3b(), which each perform a particular subtask
...
The C standard library is a good example of reusable bottom-up design
...
It does not present the “big picture”, the overall
structure of the program with its dependencies and function-call hierarchy
...
These drivers are small test programs that allow one to check the response of a
component to various inputs
...

Top-down and bottom-up design are complementary, and practical design tends to use both
strategies, working at both ends until they meet in the middle
...

6
...

This form of function-level design concentrates on the algorithm structure without getting bogged
down in syntactical details
...
For example, consider the following pseudocode to get values from the user and
compute their factorial
...

In sections where the code intent is not obvious, it is good practice to leave the original pseudocode
in place as a comment
...

43

6
...
This program is simple enough to
describe in a reasonable period of time, and complex enough to demonstrate the concepts required
for larger-scale programs
...
5
...
To begin, the user
is welcomed to the game and asked if he wishes to have ﬁrst go
...
After
each turn, the result is printed in ASCII and, if there is a winner or a draw, the user is asked if he
wishes to play again
...
5
...

Welcome to TIC-TAC-TOE
...
A line may
be across, down, or diagonal
...

Do you wish to go first (1-Yes, 0-No) ?
Each time the user is to make a move, a request is made for which location he wishes to choose,
designated by a number between 1 and 9
...
For
example, after 5 turns, the board might look like
X| |
----X|O|
----|X|O
The game will terminate with a winner or a draw, and a comment is to be printed on account of the
user’s win, loss or draw, respectively
...
Congratulations!!
You lose
...

Its a draw
...

Finally, the user is asked if he wishes to play again and, if so, the game returns to the request of
whether he wishes to have ﬁrst go
...

Do you wish to play again (1-Yes, 0-No) ?

6
...
3

Program Flow and Data Structures

A rough sketch of the ﬂow and dependencies of the program is shown in Figure 6
...
Such diagrams
are very useful for getting an initial impression of modular composition and interaction
...
Figure 6
...
The game loop takes each player’s decision in turn,
prints the resulting game-board, and determines if the game continues or is over
...

The internals of this program are very simple, but it is necessary to specify the main variable
types as they will aﬀect the design of the function interfaces
...
The ﬁrst is the
current state of the game for each of the nine locations on the Tic-Tac-Toe board
...
It was decided to make the game states an
integer array, and the possible states enumerated constants
...
These variables were chosen
to be enumerated types, and their possible states given by enumerated constants
...
5
...
Consider the following examples
...
Therefore it is worthwhile
to write a function that extracts an integer from the keyboard input, and does appropriate input
validation and error handling
...

int getint_from_user(char* message);
In the case of bad input, an error message would be printed and the prompt message reapplied
...
This function would have the following interface
...
2: Initial sketch of program ﬂow for the Tic-Tac-Toe game
...

46

void plot_state(int state[]);
We assume the number of states (9 for a 3x3 game) is known and deﬁned by a symbolic constant,
so it need not be passed as a function argument
...
This function encapsulates the algorithm for choosing the computer’s next move
...
Because the algorithm implementation is
hidden, diﬀerent algorithms can be trialed and debugged without change to the rest of the program
...
This function requires only the current game state, but may be made more eﬃcient if given the most-recent
move as well
...

enum Result compute_result(int state[], enum Turn turn);

6
...
5

Top-Down Design

The structure of the program is reﬁned by a top-down process, starting from the highest level
functions called from main()
...
The functions all specify high-level operations and no lowlevel details are present to confuse the intent of the program
...
Implementation of these functions begins by writing all the
deﬁnitions without any internals (except a return value, if required)
...

The top-down heirarchy of this program is shown in Figure 6
...
All of the top-level functions are
quite trivial, except for play_game(), which is split recursively into smaller subtasks
...

47

Figure 6
...

6
...
6

Beneﬁts of Modular Design

This design encloses each operation within a function
...
The beneﬁt of this modularity is that
the code is ﬂexible; changes and extensions are simple to implement
...
One, it is possible to change the welcome and goodbye messages
easily
...
And three, if the game is ported to an environment with a
graphical interface, only the input-output functions need to be revised
...
To get the computer to make good choices is not trivial
...
Once the rest of the program was fully tested, it was straightforward to write
cleverer decision-making code
...

48

Chapter 7

Pointers
A pointer is a variable that contains the address of a variable [KR88, page 93]
...
They are arguably the most
powerful, and the most dangerous, feature of the C programming language
...
But with a little experience, they can be used to produce concise and eﬃcient code
...
This is certainly true when they are used carelessly, and it is
easy to create pointers that point somewhere unexpected
...

7
...
Figure 7
...
A typical machine has an array of consecutively numbered memory cells
...
Each cell consists of a set of bits, and the cell bit-pattern is the cell’s value
...
Thus, the variable has a value
and an address for where that value resides
...
Consider an example
...
) Let x be deﬁned and initialised to the value 3
...
A pointer px is subsequently deﬁned, assume it is
stored at address 25, and initialised with the address of x as follows
...
Notice that a pointer is just another type of variable; it, also, has
an address and may in turn be pointed-to by a pointer-to-a-pointer variable
...
On most machines,
a cell is 8-bits long (i
...
, one-byte)
...
Each type (e
...
, a double) has an associated pointer type (e
...
, a double *), which is
aware of the number of cells that the type occupies, and enables the compiler to behave appropriately
with sequences of a particular type (e
...
, an array of doubles)
...
However, it is
unwise to make assumptions about the size of a pointer type as such code is likely to be non-portable
...
1: A simple memory model, where memory is an array of cells with consecutive addresses
...
e
...
A
variable x is deﬁned that refers to the cell at address 62, and holds the value 3
...

Aside
...
The void* pointer type is
a special generic object pointer; it is designed to facilitate advanced techniques, which we examine
in Chapter 14
...
For example, a 16-bit pointer can only handle addresses
between 0 and 216 − 1 (i
...
, 65535)
...
e
...

7
...
For example,
int i;
int *j = &i;
deﬁnes a pointer-to-int variable j and initialises it with the address of i
...
For example, in the following,
int* i, j, * k;
i and k are pointers-to-int, while j is a plain int
...

int *i, *k, j;
The value of the variable to which a pointer points can be obtained using the indirection or
dereferencing operator *
...
*/
int x = *j; /* x is assigned the value of i (that is, 2)
...
The
declaration *, meaning “is a pointer-type variable” occurs only in variable or function declarations,
and in all other circumstances the * means dereference or “access the pointed-to object”
...

50

char c = ’A’;
char *pc = &c; /* pc points to c */
double d = 5
...
*/
pd1 points to d */
pd2 and pd1 now both point to d
...
0; /* Equivalent to d = d * 2
...
It is
an error to assign a pointer to an object of a diﬀerent type without an explicit cast
...
f;
unsigned long *p1 = &i; /* Error: type mismatch, won’t compile
...
*/
The exception to this rule is the void* pointer which may be assigned to a pointer of any type
without a cast
...
If a
pointer is supposed to point nowhere, it should do so explicitly via the NULL pointer
...
h and stddef
...
It is usually deﬁned as
#define NULL

((void*) 0)

The constant values 0 or 0L may be used in place of NULL to specify a null-pointer value, but the
symbolic constant is usually the more readable option
...
The ﬁrst, and most
common, is to declare the pointer const so that the object to which it points cannot be changed
...
Cannot change i via p
...

int i = 5, j = 6;
const int *p = &i;
p = &j; /* Valid
...
*/
*p = i; /* Invalid
...
*/
The second form of const declaration speciﬁes a pointer that may only refer to one ﬁxed address
...

int i = 5, j = 6;
int * const p = &i;
*p = j; /* Valid
...
p must always point to i
...

int i = 5, j = 6;
const int * const p = &i;
*p = j; /* Invalid
...
*/
p = &j; /* Invalid
...
*/
1 Casting a pointer from one type to another (say int * to char *) is an operation that should only be performed
if you know what you are doing
...
On the
other hand, converting from void* to another pointer type (and vice-versa) is a common and useful operation; when
doing so, it is not necessary to use an explicit cast, but a cast may be useful for expressing intent
...
3

Pass By Reference

When a variable is passed to a function, it is always passed by value
...
As a result, any changes made to the local
variables within the function will not aﬀect the variables of the calling function
...

swap(a, b); /* Pass values of a and b, respectively
...
*/
{
int tmp = x; /* The variable x is unrelated to the variable a */
x = y;
/* so this operation does not affect a
...

The desired eﬀect of this function can be achieved by using pointers
...

swap(&a, &b); /* Pass pointers to a and b, respectively
...
*/
{
int tmp = *px; /* The value of px is still the address of a */
*px = *py;
/* so this dereferencing operation is equivalent to a = b
...
This is why the * operator is called the indirection
operator
...

Pass by reference semantics is useful for implementing functions, such as swap() above, that
require multiple return values
...
(Arrays are a good example2 of this and, in C, arrays are passed by reference by default
...
) It is possible to prevent unwanted change to a pass-by-reference argument by
declaring the parameter const
...
2; /* Invalid
...
*/
array[i] = 5
...
*/
printf("%f ", array[i]); /* Valid
...

52

A const-pointer declaration has two purposes
...
e
...
e
...

7
...
This is frequently the case, but not always
...
For example,
unsigned buffer[256];
unsigned *pbuff1 = buffer;
/* Buffer converted to pointer, & not required
...
*/
Here pbuff1 points to element 0 of the array, and pbuff2 points to element 5
...
Thus, in the following example,
pdouble and darray are equivalent; they are both pointers
...
Consider the following example, where, within each commented group, the
statements perform exactly the same operation
...
*/
char *pc2 = &letters;
char *pc3 = &letters[0];
letters[4] = ’e’; /* Equivalent indexes
...
*/
&pc1[10];
letters + 10;
pc2 + 10;

Notice that, for an array, its name (e
...
, letters) when used in an expression is equivalent to its
address (e
...
, &letters), which is equal to the address of its ﬁrst element (e
...
, &letters[0])
...
g
...
g
...
And the address of an array element can be obtained using the
address-of operator (e
...
, &letters[10]) or directly from the pointer oﬀset (e
...
, letters + 10)
...

First, an array is not a variable; its value cannot be changed
...
*/
53

a1++;
pa++;

/* Error: won’t compile
...
*/

Second, an array name always refers to the beginning of a section of allocated memory, while a
pointer may point anywhere at all (e
...
, allocated memory, free memory, NULL)
...

double
double
size_t
size_t

7
...
This allows the compiler to automatically
calculate the byte-oﬀset required for indexing an array of that type
...
The address now held by pd is 8-bytes beyond the original address, so that it points to
the next double
...

So, in general, if p is a pointer to some element of an array, then p++ increments p to point to the
next element, and p += n increments it point n elements beyond where it did originally
...

float fval, array[10];
float *p1, *p2, *p3 = &array[5];
int i=2, j;
p1 = NULL;
/* Assignment to NULL (or to 0 or 0L)
...
*/
p1 = p2;
/* Assignment to another pointer (of same type)
...
*/
p2 += i;
/* Another pointer-offset expression
...
*/
i = p2 < p3; /* Relational operations <, >, ==, !=, <=, >= */
At the end of the sequence of operations above, p1 points to the address of fval, p2 points to the
fourth element of array, p3 points to the sixth element of array, i equals one, and j equals 2
...
e
...
Relational comparisons (i
...
, ==, <, >=, etc) are for
determining whether the position of an array element is before or after another
...
3
Note
...
Their behaviour is undeﬁned if applied to pointers from diﬀerent arrays
...
Addition or subtraction
by a ﬂoating-point value, and multiplication or division by a value of any type
...
However,
the standard actually speciﬁes a special type, ptr diff, for the result of pointer subtraction, to cater for the case
where the number of elements in an array is too large to be stored in an int
...
h
...
Also, while subtraction
of two pointers is valid, addition of two pointers is not
...
The ﬁrst implementation uses array indexing
...
g
...
The next implementation increments the pointers directly
...
The ﬁnal implementation below is a common variant
of the second one
...
The loop will now terminate
when the value of *s is 0
...
Learning the idioms of experienced C programmers is part of the process of
becoming a proﬁcient C programmer
...
The eﬃciency gains of pointers over array indexes is largely irrelevant with modern
optimising compilers
...
It is usually bad practice to write obscure pointer-based code
solely for the sake of eﬃciency; pointer arithmetic is best used when it makes code simpler and more
readable
...
First, an index can be negative so long
as it still refers to an element within the array
...

55

int array[10];
int *pa, *end = &array[10];
for (pa = array; pa != end; ++pa)
*pa = end - pa; /* Values of array elements will be: 10,9,8,7,6,
...
A pointer one-before-the-beginning of an array may
not, in general, be used for pointer arithmetic; it is non-standard but works on many (perhaps most)
machines
...
*/

Important
...
For exam-

int array[10];
int *pa = array + 10;
int val = *++pa; /* Invalid, but will compile OK
...
The problem
with faulty pointer arithmetic is that the error might not have an obvious eﬀect
...
More likely, the program will continue to run,
may produce occasional weird results, and will crash unexpectedly at some critical moment (such as
while demonstrating the program to a customer)
...

7
...
For example,
int* func_returns_pointer(void);
However, in many circumstances returning a pointer is a mistake and will not do what is expected
...
The local
object will be destroyed when the function ﬁnishes, and the returned address will point to an invalid
location
...
In the following example,
int* misguided(void)
{
int array[10], i; /* array has local extent: destroyed at end-of-block
...

Returning pointers from functions is ﬁne provided the object pointed-to remains in existence
...

double* geometric_growth(void)
{
static double grows = 0
...
*/
grows *= 1
...
e
...
Pointers may also be returned if they refer to a function input argument
which is itself a pointer
...

char* find_first(char* str, char c)
/* Return pointer to first occurrence of c in str
...
*/
{
while(*str++ != ’\0’)
if (*str == c) return str;
return NULL;
}

7
...
This can be done using a pointer-to-a-pointer as in the following
(rather contrived) example
...
But such constructions are rarely necessary
...

More frequently they appear in relation to arrays of pointers, which are addressed in Section 8
...

7
...

Function pointers are a very useful mechanism for selecting, substituting or grouping together
functions of a particular form
...
Or, they may be collected into an array of function pointers as a “dispatch table”,
where a certain function is invoked based on an array index
...
For example, the following declaration is a pointer to a function that
has a double and an int argument and returns a double
...

Notice the use of parentheses around the identiﬁer (*pf)
...

int (*pf)(char); /* pointer-to-function: input char, return int */
int *f(char); /* function: input char, return pointer-to-int */
In a function declaration, it is possible to declare a function pointer without using an identiﬁer; (this
is the case also with any other function argument)
...

void myfunc(char *message, int nloops, unsigned (*convert)(int));
The following example program uses function pointers to pass functions as arguments to another
function
...
(Notice that, like arrays, function names are automatically converted to pointers
without using the address-of operator &
...
h>
#include ...
0); return a / b; }

void execute operation(double (*f)(double,double), double x, double y)
{
double result = f(x,y);
printf("Result of operation on %3
...
2f is %7
...
3, val2=5
...
This allows diﬀerent sub-algorithms to be swapped in or out at run
time
...

void generic_sort(int *array, int len, int (*compare)(int, int));
int greater_than(int a, int b) { return a > b; } /* Possible sorting criterion
...
*/
generic_sort(values, 20, less_than); /* Using the sort algorithm
...
In
C, array elements are numbered from 0, so that an array of size N is indexed from 0 to N − 1
...

double empty[0]; /* Invalid
...
*/

8
...
Arrays with
static extent have their elements initialised to zero by default, but arrays with local extent are not
initialised by default, so their elements have arbitrary values
...
This is
a list of values of the appropriate type enclosed in braces and separated by commas
...
Thus, to initialise the elements of an array with local extent to
zero, it is suﬃcient to write
int localarray[SIZE] = {0};
It is an error to have more initialisers than the size of the array
...
For example,
int days[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
the size of this array will be twelve
...
A common C idiom is
to use sizeof to determine the number of elements in an array as in the following example
...
\n", i+1, days[i]);

59

This idiom is invariant to changes in the array size, and computes the correct number of elements
even if the type of the array changes
...

Note
...
An array
name is automatically converted to a pointer in an expression, so that any other reference to the
array will not be an array name but a pointer
...

int count_days(int days[], int len)
{
int total=0;
/* assert will fail: sizeof(days) equals sizeof(int *) and len equals 12 */
assert(sizeof(days) / sizeof(days[0]) == len);
while(len--)
total += days[len];
return total;
}

8
...
They have certain initialisation properties not shared with other array
types because of their relationship with strings
...

char letters[] = { ’a’, ’b’, ’c’, ’d’, ’e’ };
But they may also be initialised using a string constant, as follows
...
It is equivalent to writing,
char letters[] = { ’a’, ’b’, ’c’, ’d’, ’e’, ’\0’ };
Thus, writing
char letters[5] = "abcde"; /* OK but bad style
...
That is, the value of
the expression is known at compile time, and evaluated by the compiler as a constant
...

60

while not an error, is very poor style, as the size of the array is too small for its initialiser list
...
For constants of any other type, it is not
possible to assign a pointer because these constants are not stored in memory and do not have an
address
...

double *pval = 9
...
Won’t compile
...
Won’t compile
...

char *str = "Hello World!\n"; /* Correct
...
*/
This is because a string constant has static extent—memory is allocated for the array before the
program begins execution, and exists until program termination—and a string constant expression
returns a pointer to the beginning of this array
...
A string constant is a constant array; the memory of the array is read-only
...
For example,
char *str = "This is a string constant";
str[11] = ’p’; /* Undefined behaviour
...

For example, it is legitimate for a function to return a pointer to a string constant; the string is not
destroyed at the end of the function block
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

void print base b (unsigned x, unsigned b)
/* Convert decimal value x to a representation in base b
...
However, for a character
array initialised by a string constant, the result is read-writable
...
Note, the only time a string is copied automatically
by the compiler is when a char array is initialised
...

A collection of valid operations on various array types is shown below
...
0, 2
...
0 };
double *parray = array;
/* OK */
char str[] = "Hello World!\n"; /* Correct
...
*/
str[1] = ’a’;
/* OK */

8
...
h
...

• size_t strlen(const char *s)
...
The special unsigned type size_t is used instead of plain int to
cater for the possibility of arrays that are longer than the maximum representable int
...
Copies the string t into character array s, and
returns a pointer to s
...
Performs a lexicographical2 comparison of
strings s and t, and returns a negative value if s < t, a positive value if s > t, and zero if
s == t
...
Concatenates the string t onto the end of string
s
...

• char *strchr(const char *s, int c)
...
If c is not present, then NULL is returned
...
Performs the same task as strchr() but starting
from the reverse end of s
...
Searches for the ﬁrst occurrence of substring t in string s
...

The functions strncpy(), strncmp(), and strncat() perform the same tasks as their counterparts strcpy(), strcmp(), and strcat(), respectively, but include an extra argument n, which
limits their operations to the ﬁrst n characters of the right-hand string
...
It is a general purpose string formatting function that behaves identically to
printf(), but copies the resulting formatted string to a character array rather than sending it to
stdout
...

2 Strings are compared up to their ﬁrst diﬀering character, and string s is lexicographically less than string t if its
character is of a lower value (i
...
, the value of its character code, say ASCII)
...
e
...

62

Aside
...

However, string constants may be concatenated at compile time by placing them adjacent to one
another
...
Compiletime concatenation is useful for writing long strings, since typing a multi-line string constant like
"this is
a string"
is an error
...
(This is one occasion where white-space matters in a C program
...

8
...
For
example, an array of N pointers to ints has the following syntax
...
They might point
to an object, to NULL, to an illegal memory location, or to an array
...
7;
double array[] = { 3
...
3, 5
...
The dereferenced *pa[0] is equal to 9
...
3, but pa[2] is equal to
NULL and may not be dereferenced
...
Consider the following example
...
*/
/* Pointer-to-a-pointer holds address of beginning of pa
...
*/

/* equivalent operations: val = 6 */
1);
1);
+ 1));

Notice that in an expression pa and pp are equivalent, but the diﬀerence is that pa is an array name
and pp is a pointer
...

Arrays of pointers are useful for grouping related pointers together
...

63

For the ﬁrst two categories,3 most interesting applications of pointer arrays involve the use of dynamic
memory, which is discussed in Chapter 9
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

#include ...
\n", i, months[i]); /* print string */
printf("The letters of the month are: ");
for (j = 0; months[i][j] != ’\0’; ++j) /* access elements using [ ][ ] */
printf("%c ", months[i][j]);
}

A common application for an array of function pointers is the construction of a “dispatch” table,
which is used to invoke a particular function depending on the value of an array index
...
The following example shows the syntax for deﬁning an
array of function pointers
...

The example program below shows the use of an array of function pointers to perform simple
arithmetic operations
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21

#include ...
h>
double
double
double
double

add(double a, double b) { return a + b; }
sub(double a, double b) { return a − b; }
mult(double a, double b) { return a * b; }
div(double a, double b) { assert(b != 0
...
2f and %3
...
2f\n",
i, val1, val2, pf[i](val1, val2));
}
3 The ﬁrst two categories, pointers to large objects and pointers to arrays, are actually of the same kind, where
arrays are just one form of large object
...
A good mechanism for simplifying declarations is to break them up using typedef
...
For example, in the following
declaration,
typedef char * String;
the name String becomes a synonym for char *, and can be used to deﬁne variables of this type
...
";
With regard to complicated function pointer declarations, diﬀerent parts of the declaration can
be given a name using typedef, so that the combined whole is more readable
...
We discuss further uses of typedef in Sections 11
...
1
...
5

Multi-dimensional Arrays
C provides rectangular multi-dimensional arrays, although in practice they are much less
used than arrays of pointers [KR88, page 110]
...

float
{
{
{
};

matrix[3][4] =
2
...
7, 9
...
2, 4
...
1,
7
...
6, 4
...
3 },
8
...
6 }

The braces of the initialisation list are nested such that the inner braces enclose each row of the twodimensional array
...
For example, the layout in memory of matrix is equivalent to the following 1-D array
...
4, 8
...
5, 2
...
2, 4
...
1, 8
...
2, 1
...
4, 3
...

However, only the left-most subscript (i
...
, the number of rows) is free, and the other dimensions
must be given a deﬁnite value
...
*/
{ 2
...
7, 9
...
3 },
{ 6
...
8, 5
...
9 }
};
To access an element of a multi-dimensional array, the correct notation is to enclose each subscript
in a separate pair of square braces
...

float a = matrix[1][2]; /* Correct
...
*/
65

A multi-dimensional array may have any number of dimensions, and higher-dimensional initialiser
lists involve a deeper level of nested braces for each dimension
...

short
{
{
{
{
};

array3d[4][2][3] = {
{ 0, 1, 2 }, { 3, 4, 5 }
{ 6, 7, 8 }, { 9, 10, 11 }
{ 12, 13, 14 }, { 15, 16, 17 }
{ 18, 19, 20 }, { 21, 22, 23 }

},
},
},
}

Note, the above array may have been deﬁned with the left-most subscript unspeciﬁed,
short array3d[][2][3] =
...

An example program using multi-dimensional arrays is given below
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34

#include ...

* The result matrix, r, must be zeroed before-hand
...

30 36 42
66 81 96
102 126 150

66

Notice, on line 5, the function interface for multiply() shows three equivalent declarations:
a pointer to an array, a 2-D array with an unspeciﬁed left subscript, and a fully speciﬁed 2-D
array, respectively
...
In addition, notice the parentheses about the pointer-to-an-array identiﬁer,
int (*r)[SIZE], this is to distinguish it from an array-of-pointers, int *r[SIZE]
...
This
can be a source of confusion to novice C programmers
...

However, multi-dimensional arrays and arrays of pointers are diﬀerent in both their representation
and application
...
Second, a multi-dimensional
array is rectangular—each row is the same length, while an array of pointers may refer to arrays
of diﬀerent length (including nothing, NULL)
...

In summation, arrays of pointers are usually the more ﬂexible (and often more eﬃcient) option,
and so are used far more frequently than multi-dimensional arrays
...
Setting aside “enough”
memory, such as deﬁning a char array of suﬃcient size to hold the largest permitted string, is not
convenient in many situations and may be diﬃcult to manage
...
Memory is requested at runtime as required,
and the lifetime of each memory allocation is controlled directly by the programmer
...
1

Diﬀerent Memory Areas in C

C has four distinct areas of memory: the constant data area, the static-extent data area, the stack,
and the heap
...
This area of memory is read-only, and the results of trying to modify it are undeﬁned
...
These variables are read-writable
...

The memory area known as the stack is used to store local variables (i
...
, variables with automatic
extent)
...
The stack behaves like a last-in-ﬁrst-out
(LIFO) queue
...
And at the end of a block, if several variables go out-of-scope at once, the variables are
destroyed, or “popped oﬀ the stack”, in reverse order to their allocation
...

The heap memory area is for dynamically allocated storage and is managed, not by the compiler, but explicitly by the programmer
...
The ﬂexibility and control provided by heap-allocated
memory comes at the price of added responsibility on behalf of the programmer
...

It is worth noting that stack memory allocation is generally much faster than heap memory
allocation, as stack allocation involves only an increment of the stack pointer, while heap allocation
involves much more complex operations
...
For example, an over-size char array,
allocated on the stack, is often used to perform string operations rather than an exact-size array
created on demand on the heap
...
2

Standard Memory Allocation Functions

The two key standard library functions for dynamic memory allocation and de-allocation, respectively, are malloc() and free()
...
Notice that the returned pointer is of type void*, which speciﬁes a generic
pointer, and can represent a pointer of any type
...
It is common to write
an explicit cast to the desired type so as to clarify intent
...
However, it is possible for the request to fail (e
...
, if the available heapspace is full; a rare event on modern machines, but still possible), and when this happens malloc()
returns a NULL pointer
...

The de-allocation function free() releases memory allocated by malloc(), and returns it to the
heap
...

free(p);
The conversion of a pointer of type int* to a pointer of type void* does not require an explicit cast
(in C or C++), and so casts of this variety never appear in practice
...
The ﬁrst, calloc(), behaves almost the same as malloc() but, where malloc() returns
a block of memory that is uninitialised (i
...
, the cells contain arbitrary values), calloc() initialises
the block with zeros
...
For example, the following statement allocates an array of 10 integers initialised
to zero
...

The ﬁnal memory allocation function, realloc(), is used to change the size of an existing block of
dynamically allocated memory
...
The interface of realloc() is as follows
1 Another reason for an explicit cast is for compatibility with C++, which requires a cast for conversions from
void* to a pointer of another type
...

69

void *realloc(void *p, size_t size)
where p is a pointer to the current block of memory and size is the new requested size
...
Also, if realloc() is
passed a size request of 0, then the memory pointed to by p is released, and realloc() returns
NULL
...

Calling realloc() does not change the existing contents of a memory block
...

If, on the other hand, the block is increased, the old values are retained and the new cells will
contain uninitialised (i
...
, arbitrary) values
...
If this can be done in-place, then realloc() might simply adjust certain bounds records
and return but, if there is insuﬃcient space in the current region within the heap, then realloc()
will allocate a new block of memory of the appropriate size, copy across the values of the previous
block, and release the old block
...

9
...
The programmer must keep track of various quantities such as object
lifetimes, pointers to diﬀerent memory locations, lengths of arrays, and so forth
...

Consider, for example, the following function, which makes a copy of the string passed to it and
returns a pointer to the copy
...
User must remember to free this memory
...
The expression strlen(s)+1 is a common idiom for this operation with +1 to
cater for the ’\0’ character
...
However, this function is ﬂawed because it fails to check the return value of
malloc(), which, if NULL, will crash the program during the copy operation
...

1
2
3
4
5
6
7
8
9
10

char *string duplicate(char *s)
/* Dynamically allocate a copy of a string
...
*/
{
char *p;
p = (char *) malloc(strlen(s) + 1); /* +1 for ’\0’ */
if (p != NULL)
strcpy(p, s);
return p;
}

This version detects NULL and passes it back to the calling function to perform error handling
...

70

char *s;
s = string_duplicate("this is a string");

...
*/
Neglecting the free(s) statement means that the memory is not released even when s goes out-ofscope, after which the memory becomes non-recoverable
...

Some common errors related to dynamic memory management are listed below
...
If a pointer is not initialised when it is deﬁned,
it will contain an arbitrary value, such that it points to an arbitrary memory location
...
g
...
Writing to memory via an invalid pointer is known as
“memory corruption”
...
This is a special case of the previous error
...

• Dereferencing a NULL pointer
...
(For
many compilers, dereferencing a NULL pointer is a simple bug to ﬁnd as it causes the program
to crash immediately, but this behaviour is not standard
...
Passing a previously freed pointer to free() will
cause the function to dereference an invalid address
...
Memory that is not allocated
on the heap, but on the stack or constant data area, say, cannot be released by free()
...

• Failing to free dynamically allocated memory
...
Failing to do so results in a “memory leak”
...
Out-of-bounds errors
occur if arrays are not properly bounds checked
...

Good programming practices exist avoid these memory management errors, and following these
rules will greatly reduce the risk of dynamic memory related bugs
...
To avoid memory leaks and memory
corruption, there should be a one-to-one mapping of calls to malloc() and calls to free()
...
Alternatively, one might write a create() function that
allocates memory for an object and a companion destroy() function that frees it
...
A pointer should never hold an arbitrary value,
but should be initialised with either a valid address or NULL, which explicitly marks a pointer
as “points nowhere”
...
A pointer that has the value NULL cannot
be accidentally freed twice, as free(NULL) has no eﬀect
...

free(p); /* OK, no effect
...
4

Example: Matrices

This section presents a more complex example of dynamic memory, which involves functions for
creating and destroying matrices
...
5, an example program shows the creation and use of
a ﬁxed-size (3 × 3) matrix, and the matrices created in the following example may be used similarly
...

The ﬁrst function allocates memory for a matrix with m rows and n columns
...

1
2
3
4
5
6
7
8
9
10
11
12

double **create matrix(int m, int n)
/* Dynamically allocate an (m x n) matrix
...
This function does not check for memory-allocation errors
...

double **matrix = create_matrix(2,3);
matrix[0][2] = 5
...

The above function will operate correctly unless malloc() returns NULL
...
A function that performs appropriate error checking, and
returns NULL if malloc() fails, is shown below
...
Returns a pointer to the
* beginning of the matrix, and NULL if allocation fails
...
*/
p = (double **)malloc(m * sizeof (double *));
if (p == NULL) return p;
/* Allocate rows
...
*/
failed:
for (−−i; i >= 0; −−i)
free(p[i]);
free(p);
return NULL;
}

7

The assert() checks for logical errors in the program
...

10–11

The ﬁrst memory allocation is the the array of m pointers
...

14–19

The next set of allocations are for each row of the matrix
...
In the event of an error, the program jumps out of the allocation loop to the error-handling
code
...

22–26

The error-handling code ﬁrst loops through each successfully allocated matrix row and frees them
...
It is critical that deallocation is performed in this order as the
row pointers would be invalid if the pointer array was freed ﬁrst
...
This function takes as arguments
a pointer-to-a-pointer reference to the matrix, and the matrix dimensions m and n
...

1
2
3
4
5
6
7
8
9
10
11

void destroy matrix(double **p, int m, int n)
/* Destroy an (m x n) matrix
...
*/
{
int i;
assert(m>0 && n>0);
for (i = 0; i < m; ++i)
free(p[i]);
free(p);
}

Notice that the argument n is not actually required by the function, and is part of the interface
only to match the interface of create_matrix()—to make the function more intuitive to use
...

A neat alternative implementation of create_matrix() and destroy_matrix() is shown below,
which is both simpler and less error-prone than the previous solutions
...

73

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25

1–11

double **create matrix(int m, int n)
/* Dynamically allocate an (m x n) matrix
...
*/
{
double **p, *q;
int i;
assert(m>0 && n>0);
/* Allocate pointer array
...
*/
q = (double *)malloc(m * n * sizeof (double));
if (q == NULL) {
free(p);
return NULL;
}
/* Assign pointers into appropriate parts of matrix
...

14–18

Rather than allocate each row separately, memory for the entire matrix is allocated as a single
block
...
This implementation
completely bypasses the goto and its more complex error-handling code
...
The pointers in the pointer array are assigned to elements in the double array at n element
intervals—thus, deﬁning the matrix
...
First, the size of the matrix is not required, removing the possibility of passing incorrect
dimension values
...
Note, these
lines must occur in the right order as p[0] is invalid if p is freed ﬁrst
...
Notice, due to the method by which this matrix
* was created, the size of the matrix is not required
...

double **m1, **m2;
m1 = create_matrix(2,3);
m2 = create_matrix(3,2);
/* Initialise the matrix elements
...
*/
destroy_matrix(m1);
destroy_matrix(m2);

9
...
realloc() is a versatile function; it can perform the tasks of both
malloc() and free() and, in addition, can adjust the size of an existing dynamically allocated
memory block
...
Even dynamically allocated
arrays have ﬁxed size once they have been created
...
That is, to have an array where
elements may be added to the end one-by-one, and the array grows to match the number of elements
it contains; thus, the array can be built-up without knowing the number of elements in advance
...
The public interface for this array
is exported in a header ﬁle vector
...

1
2
3
4
5
6
7
8
9
10
11
12
13

/* Expandable vector, grows in-place as new elements are added to the back
...
*/
/* Vector access operations
...
*/
int get size(void);
int set size(int size);
int get capacity(void);
int set capacity(int size);

5–7

The vector provides three functions for element access
...
The
get_element() function permits the expandable vector to be accessed via pointer arithmetic like
an ordinary C array
...
The ﬁrst of these, get_size() and set_size(), obtain and set the size of
the vector, respectively
...
These functions are explained in more detail below
...
c
...
h (for realloc()) and assert
...

Arrays by deﬁnition occupy a contiguous region of memory, and this creates an issue for developing an expandable array
...
2 At this point,
2 The heap permits blocks of memory to be allocated and deallocated in any order, and the dynamic allocation
functions internally manage the location and size of each block
...

75

if realloc() is asked to further extend the memory block, it will allocate a completely new block
and copy the data from the old block across to the new one
...
However, with an appropriate allocation scheme, an expandable array can be
made to have constant-time complexity on average
...
This means that the frequency of allocation, and hence the cost of copying data, decreases
with array size at the expense of some wasted space
...
h"
#include ...
h>
static const int StartSize = 1; /* initial vector size */
static const ﬂoat GrowthRate = 1
...
Return index of new element if successful, and -1 if fails
...
*/
if (vectorsize == capacity) {
int newsize = (capacity == 0) ? StartSize : (int)(capacity*GrowthRate + 1
...
*/
data[vectorsize] = item;
return vectorsize++;
}

8–10

The external variables data, vectorsize, and capacity deﬁne the current state of the vector;
data points to the array’s dynamic memory block; vectorsize deﬁnes the current array size; and
capacity records the amount of memory allocated
...
Notice that these variables are all given initial values, even though they
would be initialised to these values by default
...

16

When a new element is added to the end of the array, its size increases by one
...

17

There are two cases where new memory must be allocated
...
In the
ﬁrst case, the amount of memory requested will be some initial value StartSize and, in the second,
the available memory will be increased by a multiplicative factor GrowthRate
...

3 Many expanding vector implementations use a geometric factor of two, thereby doubling the available space each
allocation, but a smaller value may provide a more optimal space-time tradeoﬀ
...

76

18–23

The behaviour of realloc() has a few subtle points
...
If the ﬁrst argument is not NULL, it adjusts
the passed memory block to the speciﬁed size and returns a pointer to the adjusted block
...

For this reason, the return value is assigned to a temporary pointer p, rather than
data = (int *)realloc(data, newsize*sizeof(int));
as, in the latter case, the original memory would be lost if realloc() failed
...

27–28

30
31
32
33
34
35
36
37
38
39
40
41
42

Once suﬃcient space is known to be available, the new element is added to the array and the array
size is incremented
...

int pop back(void)
/* Return element from back of vector, and remove it from the vector
...
*/
{
assert(index >= 0 && index < vectorsize);
return data + index;
}

30–35

The function pop_back() returns the last element of the array and then deletes it
...

37–42

The function get_element() returns a pointer to the element speciﬁed by index
...
However, pointers
returned by this function should be used with caution if the array capacity subsequently changes
...
At this instance, any pointers referring to the old memory block are invalidated and
should not be dereferenced
...

43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61

/* Manual size operations
...
Return 0 if successful, -1 if fails
...

* A size of 0 deletes the array
...
*/
{
if (size != capacity) {
int *p = (int *)realloc(data, size*sizeof (int));
if (p == NULL && size > 0)
return −1;

}

}

capacity = size;
data = p;

if (size < vectorsize)
vectorsize = size;
return 0;

44–45

These two trivial functions return the current state of the vector in terms of its size and available
space, respectively
...
If this size is greater
than the current capacity, then extra memory is allocated to suit
...

63–79

The function set_capacity() provides direct access to the memory allocation of the vector
...
If the requested size is less than the current
vector size, the vector is truncated to ﬁt
...
Notice
that a request of size zero, will cause data to become NULL (line 73), and vectorsize and capacity
to become zero—thus, returning the vector to its original state
...
First, it provides a single instance
of the vector only, and we cannot have more than one in any given program
...
To create a vector for other types would require a new, virtually
identical, implementation for each type
...
The application of these features to writing generic code, is the subject of Chapter
14
...

The preprocessor is controlled by special preprocessor command lines, which are lines of
the source ﬁle beginning with the character #
...

The preprocessor removes all preprocessor command lines from the source ﬁle and makes
additional transformations on the source ﬁle as directed by the commands, such as expanding macro calls that occur within the source program text
...

The syntax of preprocessor commands is completely independent of (though in some ways
similar to) the syntax of the rest of the C language [HS95, page 39]
...
These operations include replacing symbolic names with constants,
expansion of macros, and inclusion of header-ﬁles
...

The operations of the preprocessor are not subject to the syntactical rules of the C language, and
so should be used sparingly and with care
...

10
...

These are usually header ﬁles, but may be any text ﬁle
...
The result of this command after
preprocessing is as if the command were replaced by the text of the speciﬁed ﬁle
...
6
...
2

Symbolic Constants

The preprocessor may be used to deﬁne symbolic constants via the #define command
...
Given this
command, the preprocessor will replace all subsequent occurrences of the token name in the source
ﬁle by the sequence replacement text
...
14159

A name created by #define can be undeﬁned using the directive #undef
...

Style Note
...
The #define command is also used to deﬁne macros, and these names
are also usually uppercase
...
The preprocessor directive #define can create names for constants of any type; the type qualiﬁer const can
deﬁne constant variables of any type; and the keyword enum can deﬁne constants of integer type
...
The #define command is most
powerful, and can be used in any situation where the other two might be used, but it is also the
most dangerous as the preprocessor does not respect C syntax rules
...
1
#define ARRAYSIZE
10
const int ArraySize = 10;
double array1[ARRAYSIZE]; /* Valid
...
*/
An enumeration constant does not suﬀer this limitation and, between them, const and enum can
satisfy all symbolic constant operations for which a #define might be used
...
For example,
#define PI
3
...
14159; /* Preferred */
enum { ARRAYSIZE=10 };
/* Preferred */

10
...
The C preprocessor provides a
more sophisticated type of macro deﬁnition by allowing the macro name to be followed by a set of
arguments enclosed in parentheses
...
The C++ language rectiﬁes this
oversight and, in C++, a const int may deﬁne an array size
...
The preprocessor
replaces the macro name with the deﬁned replacement text, and substitutes the argument variables
in the speciﬁed locations
...
In a macro deﬁnition, the parentheses immediately following the macro name must be
directly adjacent to the name without whitespace
...

Macros are typically used for one of two reasons
...
Macros can perform functionlike operations without the overhead of a function call because the code is expanded inline
...
The second use of
macros is to implement a kind of generic function
...
For example, the macro
MAX will work correctly if a, b, and c were type double, or any other type, where as an equivalent
function
int max(int x, int y)
{
return x > y ? x : y;
}
will only accept integer parameters
...
3
...

#define
#define
#define
#define
#define

SQR(x)
SGN(x)
ABS(x)
ISDIGIT(x)
NELEMS(array)

((x)*(x))
(((x)<0) ? -1 : 1)
(((x)<0) ? -(x) : (x))
((x) >= ’0’ && (x) <= ’9’)
(sizeof(array) / sizeof(array[0]))

SQR calculates the square of its argument, SGN calculates the sign of its argument, ABS converts its
argument to an absolute value, ISDIGIT equals 1 if the argument value is between the character
code for 0 and the character code for 9, and NELEMS computes the number of elements in an array
...
The preprocessor is a powerful but rather blunt instrument,
and it is easy to use macros incorrectly
...
The ﬁrst is that
the passed arguments may have surprising precedence after macro expansion
...

The second danger is that arguments with side-eﬀects may be evaluated multiple times after macro
expansion
...
To avoid these sort of problems,
it is good practice to never use expressions with side-eﬀects2 as macro arguments
...
It permits greater
ﬂexibility, but also prevents the compiler from catching some type-mismatch bugs
...
However, with a little
care, macros can be used without signiﬁcant trouble, when required
...
3
...
The following two examples are simple and clever
...
5) : -(int)(0
...
The second macro, ROUND, rounds a ﬂoating-point value to the nearest integer
...
The truncation by casting to int is straightforward if the value is positive, but
machine dependent if the value is negative (see Section 2
...
ROUND gets around this problem by
subtracting the negative value from 0
...

Another clever macro trick is used to make macros behave more like functions
...

#define SWAP(x,y,tmp)

{ tmp=x; x=y; y=tmp; }

This operation might be used as in the next example
int a=4, b=-1, temp;
SWAP(a, b, temp);
However, this macro will not behave in a function-like manner if used in an if-statement
if (a > b) SWAP(a, b, temp); /* Won’t compile */
else a = b;
as this code will be expanded to incorrect C syntax
...
For example, a++ and b *= n, are expressions with side-eﬀects
...

#define SWAP(x,y,tmp)

do { tmp=x; x=y; y=tmp; } while (0)

An alternative solution is to wrap the macro in an if-else statement
...

#define SWAP(x,y,type)

do { type tmp=x; x=y; y=tmp; } while (0)

This might be used as
SWAP(a, b, double);
Finally, a very tricky bitwise technique allows us to perform the swap operation without any temporary variable at all
...
)
#define SWAP(x,y)

10
...
3

do { x^=y; y^=x; x^=y; } while (0)

More Complex Macros

Normally a macro deﬁnition extends to the end of the line beginning with the command #define
...
For example,
#define ERROR(condition, message) \
if (condition) printf(message)
A more interesting example, adapted from [KP99, page 240], performs a timing loop on a section of
code using the standard library function clock(), which returns processor time in milliseconds
...

TIMELOOP(y = sin(x));
It is possible to convert a token to a string constant by writing a macro that uses the # symbol
in the manner of the following example
...
Thus, reasonable measurements require running the function
multiple times and timing the period of the entire loop
...
For example,
PRINT_DEBUG(x/y);
is expanded to
printf("x/y" " = %g\n", x/y);
Another rather obscure preprocessor operator is ##, which provides a way to concatenate two
tokens
...
These are __LINE__, __FILE__,
__DATE__, __TIME__, __STDC__, and __STDC_VERSION__
...
Determining the meaning of each of these macros is
left as an exercise to the reader
...
g
...
Indeed
a clever and useful debugging macro
...
4

Conditional Compilation

The C preprocessor provides a series of directives for conditional compilation: #if, #elif, #else,
#ifdef, #ifndef, and #endif
...
Conditional compilation
is used for three main purposes: to optionally include debug code, to enclose non-portable code, and
to guard against multiple inclusion of header ﬁles
...
This code should be present in the program during a debug build but
removed for the ﬁnal release build
...
Typically, during debug builds, one deﬁnes a symbolic constant DEBUG so that debug code
is included
...
The diﬀerent code sections for diﬀerent machines are then
enclosed in preprocessor conditions so that only the code for the speciﬁed machine is compiled
...
*/
return WaitForSingleObject(handle, 0) == WAIT_OBJECT_0;
#elif defined(__QNX__) || defined(__linux__) /* Code specific to QNX or Linux
...
Otherwise certain symbols might obtain
multiple deﬁnitions, which would result in a compilation error
...
To prevent the problem of multiple inclusion, the following preprocessor idiom is applied
...
h, which begins and ends with the preprocessor commands below
...
*/
#endif
By preﬁxing the ﬁle with #ifndef A_HEADER_H_, the header is included the ﬁrst time (presuming
A_HEADER_H_ is not deﬁned previously), but A_HEADER_H_ is subsequently deﬁned in the next line
...
This idiom is known as
a “header guard”
...
Structures help to organise
complicated data, particularly in large programs, because they permit a group of related
variables to be treated as a unit instead of as separate entities [KR88, page 127]
...
These types are
aggregates of basic types and other user-deﬁned types, and permit a logical grouping of related
data
...

11
...
Consider the following example, which declares a
structure for representing two-dimensional points
...
By convention, structures should always be named with an uppercase ﬁrst letter
...

The variables x and y are called members of the structure named Point
...
*/
or as subsequent deﬁnitions using the tag “struct Point”
...
*/
When a structure is deﬁned, its members may be initialised using brace notation as follows
...
In general, the values within the
initialiser list correspond to the structure member variables in order of their declaration
...
For example,
struct Point topleft;
topleft
...
y = 0;
is an alternative way to initialise the variable topleft
...

struct Point delta = { pt1
...
x, pt1
...
y };
double distance = sqrt((double)delta
...
x + (double)delta
...
y);
Structures can be nested, such that the deﬁnition of one structure type may be composed, in part, of
another structure type
...

struct Rectangle {
struct Point topleft;
struct Point bottomright;
};
To access the lowest-level members of a variable of type Rectangle, therefore, requires two instances
of the member operator
...
topleft
...
2

Operations on Structures

The operations permitted on structures are a subset of the operations permitted on basic types
(such as int, double, etc)
...

struct Point p1 = { 0, 0 };
struct Point p2;
p2 = p1;
/* Valid
...
*/
if (p1 == p2)
/* Invalid
...
*/
printf("Points are equal\n");
if (p1
...
x && p1
...
y) /* Valid
...
*/
printf("Points are equal\n");
A structure may be passed to a function and may be returned by a function
...
*/
{
p2
...
x;
p2
...
y;
return p2;
}
87

As with any other variable, structures are passed by value
...

struct Point a = {5,10}, b = {20,30}, c;
c = point_difference(a, b); /* c = {15,20}, b is unchanged
...
Deﬁning structure pointers and
obtaining the address of a struct variable is the same as for basic types
...
x = 100; /* pt
...
*/
Note, the parentheses about (*pp)
...
To avoid this rather
complicated dereferencing syntax, an alternative notation is provided to access structure members
via a pointer
...
x to be rewritten more simply as
pp->x
...

rect
...
x = 50;
(*pr)
...
x = 50;
pr->topleft
...
3

Arrays of Structures

As for basic types, it is possible to create an array of structures
...
The example below deﬁnes an array of four elements
...
The compiler knows the size of
the structure type and determines the appropriate address oﬀsets accordingly
...
The size of a structure might not equal the sum of its constituent parts
...
Most real machines require objects to satisfy certain alignment
restrictions (e
...
, integers must be located at even addresses), and the compiler will pad the structure
with unnamed “holes” to ensure each member is properly aligned
...
The sizeof operator returns the correct value
...
4

Self-Referential Structures

A structure deﬁnition may not contain an object of its own type,
struct Node {
int item;
struct Node next; /* Invalid
...
*/
}
but it may refer to a pointer of its own type
...
May define a pointer to an incomplete type
...
These include linked-lists, binary
trees, graphs, hash tables, and many more
...
Linked-lists come in
two main varieties: singly-linked and doubly-linked
...
Each node is deﬁned as follows
...
The set of nodes are connected such that the next pointer for each node points to the
address of the next node in the list as shown in Figure 11
...
The end of the list is marked by a node
whose next pointer is NULL
...
1: Singly linked-list
...
A pointer to the ﬁrst node is called the head
...

1 Several

of these data-structures are presented in Chapter 15
...
The subsequent nodes may be accessed by iterating through
each node in turn
...

struct List *node = head;
while (node->next != NULL)
node = node->next;
Linked-lists are useful for their ability to be grown and shrunk easily without issues of memory
reallocation and data copying
...
The function below demonstrates a list-growing implementation
that adds a new node to the end of the list
...
If the function is passed NULL, it creates a new head node and returns a pointer to
it
...
Return pointer to the new List node
...
*/
struct List *newnode = (struct List *) malloc(sizeof (struct List));
if (newnode == NULL)
return NULL; /* allocation failed */
newnode−>item = item;
newnode−>next = NULL;
/* If list is not empty, ﬁnd end of list and attach new node
...
A pointer to the node
one before the insertion is passed to the function, and its next pointer is used to splice the new node
into the next position in the list, as shown in Figure 11
...

Figure 11
...
Prior to
insertion, the next pointer of node A points to node B
...

1
2
3

struct List *insert after(struct List *node, int item)
/* Add new item to the next position in the list after node
...
*/
struct List *newnode = (struct List *) malloc(sizeof (struct List));
if (newnode == NULL)
return NULL; /* allocation failed */
/* If list is not empty, splice new node into list
...

struct List {
int item;
struct List *next;
struct List *prev;
};
The enables the list to be constructed as shown in Figure 11
...
In particular, doubly linked-lists make the deletion of elements simpler and more
eﬃcient than is possible with a singly linked-list
...
3: Doubly linked-list
...
The ends of the list in both directions are marked by NULL
pointers
...
5

Typedefs

In eﬀect typedef is like #define, except that since it is interpreted by the compiler, it
can cope with textual substitutions beyond the capabilities of the preprocessor [KR88, page
147]
...
It does not create a
new type, but merely adds a new name for some existing type
...
The type Length may then be used wherever the type
int might be used
...
For example, the following declaration

91

typedef struct Point {
int x;
int y;
} Point;
permits subsequent deﬁnitions of type Point to be written as
Point pt1, pt2;
without the preceding struct keyword
...

typedef struct list_t List;
struct list_t {
int item;
List *next;
};
Link-list nodes are subsequently deﬁned as type List
...

There are four main reasons for using typedef
...
This is perhaps best exempliﬁed by the ability of typedef
to break up complicated declarations such as arrays-of-pointers-to-functions (as shown in Section
8
...
The second reason for using typedef is that type-synonyms often provide better in-code
documentation than a plain type
...
The third use of typedef
is to shield a program from portability problems
...
One common situation is to use typedef names
for various integer quantities, then make an appropriate set of choices of short, int,
and long for each host machine
...
[KR88, page 147]
The typedef names localise the deﬁnition of non-portable types to a single location, rather than
having them sprinkled throughout the code
...
In the
linked-list examples above, each node contains an item of type int
...

typedef int ValueType;
typedef struct List {
ValueType item;
struct List *next;
} List;
92

The algorithms that use the link-list must also use the typedefed name
...
2

11
...
The synergy of these features permits a more modular
coding style than is possible with either feature in isolation
...
3 The client knows what a module does
but does not need to know how it does it
...
Structures perform a similar task by hiding data representation
details within a higher-level type deﬁnition
...

This design principle is evident in the previous linked-list example, where the client need only be
aware of a List type object, and various functions for adding elements to the list
...

The following is another simple example of object-oriented program design
...

typedef struct {
double real;
double imag;
} Complex;
The set of functions shown below perform a variety of operations on these complex number variables
...
In each case, the client does not need
to know how the functions perform their tasks, but simply what type of result is expected when
they return
...
*/
{
Complex c;
c
...
imag = i;
return c;
}
Complex add complex(Complex a, Complex b)
/* Add two complex numbers and return the result
...
The algorithm will work with
diﬀerent types, but not at the same time
...
2
...
Often the
modules are a library written by another programmer, and the client incorporates these modules into his own code
...
real = a
...
real;
sum
...
imag + b
...
*/
{
Complex prod;
prod
...
real*b
...
imag*b
...
imag = a
...
imag + a
...
real;
return prod;
}
void addequal complex(Complex* a, const Complex* b)
/* Add two complex numbers and store the result in a
...
*/
{
a−>real = a−>real*b−>real − a−>imag*b−>imag;
a−>imag = a−>real*b−>imag + a−>imag*b−>real;
}
int is equal(Complex a, Complex b)
/* Return TRUE if the values of a and b are equal
...
real == b
...
imag == b
...
7

Expandable Array Revisited

In Section 9
...
The code presented was modular and ﬂexible, but has two major limitations
...
Using structures and typedef, we can overcome these two problems, allowing us to create any
number of vectors, and use the same code to produce vectors of diﬀerent types
...
h for the expandable vector is shown below
...
By simply changing line 4 to
typedef char ValueType;
the vector can be made to contain elements of type char, and similarly for other types
...

It is important to realise that because Vector objects are manipulated exclusively by the associated functions declared in lines 13 to 25, the user does not actually need to know how the Vector
type is composed
...
Access to the Vector internals is managed by functions like get_element()
4 In Section 14
...
1, we show how to hide the composition of Vector by making it an opaque type, which enforces
correct use
...
Because of this decoupling, the internal make up of struct Vector may be
changed and, similarly, the function algorithms may be changed, all without aﬀecting client code
...

*/
typedef double ValueType;
typedef struct {
ValueType *data; /* pointer to vector elements */
int size;
/* current size of vector */
int capacity; /* current reserved memory for vector */
} Vector;
/* Vector creation and destruction
...
*/
int push back(Vector *v, ValueType item);
ValueType pop back(Vector *v);
ValueType* get element(Vector *v, int index);
/* Manual resizing operations
...
Two new functions
Vector create_vector(int capacity);
void release_vector(Vector *v);
are used to initialise the internal data of a vector object and to release the data of a vector object,
respectively
...

The implementation of the expandable vector involves only minor modiﬁcations from the original
implementation
...
The code of source ﬁle vector
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

#include "vector
...
h>
#include ...
6f; /* geometric growth of vector capacity */
Vector create vector(int capacity)
/* Initialise a vector with the speciﬁed capacity
...
size = 0;
v
...
capacity = (v
...
*/

95

20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84

{

}

free(v−>data);
v−>size = 0;
v−>capacity = 0;

int push back(Vector *v, ValueType item)
/* Add element to back of vector
...
*/
{
/* If out-of-space, allocate more
...
0);
ValueType *p = (ValueType *)realloc(v−>data, newsize*sizeof (ValueType));
if (p == NULL)
return −1;

}

v−>capacity = newsize; /* allocate succeeds, update data-structure */
v−>data = p;

/* We have enough room
...
*/
{
assert(v−>size > 0);
return v−>data[−−v−>size];
}
ValueType* get element(Vector *v, int index)
/* Return pointer to the element at the speciﬁed index
...
*/
int get size(Vector *v) { return v−>size; }
int get capacity(Vector *v) { return v−>capacity; }
int set size(Vector *v, int size)
/* Set vector size
...
*/
{
if (size > v−>capacity) {
ValueType *p = (ValueType *)realloc(v−>data, size*sizeof (ValueType));
if (p == NULL)
return −1;

}

}

v−>capacity = size; /* allocate succeeds, update data-structure */
v−>data = p;

v−>size = size;
return 0;

int set capacity(Vector *v, int size)
/* Shrink or grow allocated memory reserve for array
...
Return 0 if successful, -1 if fails
...
These vectors can contain elements of type ValueType which is speciﬁed at compile
time
...

The client is shielded from the data representation and algorithmic details; these aspects may be
changed without aﬀecting client code
...

This code still has one limitation remaining
...
In a given program, all vectors must contain the same type
...
We return to the
topic of generic programming in Chapter 14
...
8

Unions

A union is a variable that may hold (at diﬀerent times) objects of diﬀerent types and sizes,
with the compiler keeping track of size and alignment requirements
...

The declaration of a union type is similar to the declaration of a struct type
...
*/
Accessing members of a union type is also the same as for structures, with the
...

However, the behaviour of union variables is quite diﬀerent to structures
...
In the above example, the compiler will
allocate suﬃcient memory to store the largest of the types int, float, and char *
...
The type retrieved must be the same as the
type most recently stored, and the result of retrieving a diﬀerent type is implementation dependent
...
fval = 56
...
*/
printf("%f\n", x
...
*/
printf("%d\n", x
...
*/
97

Unions are usually used for one of three purposes
...
This can be performed with a reasonable degree of type
safety by wrapping the union within a structure which records the union variable’s current type
...
*/
int ival;
float fval;
} Utype;
enum { INT, FLOAT }; /* Define type tags
...
*/
Utype val; /* Storage for variant type
...
*/
array[0]
...
ival = 56; /* Assign value
...
type = INT;
/* Mark type
...

for (i = 0; i < sizeof(array)/sizeof(array[0]); ++i) /* Print array
...
type == INT)
printf("%d\n", array[i]
...
ival);
else if (array[i]
...
val
...

The second use of a union is to enforce the alignment of a variable to a particular address
boundary
...
And the
third key use of a union is to get “under the hood” of C’s type system to discover something
about the computer’s underlying data representation
...

void print_representation(float f)
/* Print internal representation of a float (adapted from H&S page 145)
...
f, fi
...

98

Chapter 12

Bitwise Operations
C provides operators for direct manipulation of bits and bytes
...
With care, many of these low-level functions can be implemented in a portable
fashion
...
g
...
In these cases, it is critical to
be able to control the individual bits corresponding to the device pins
...
1

Binary Representations

Computers store and manipulate information as bits—ones and zeros
...

In C, the various types each specify a certain number of bytes, and variables of a particular type
are allocated the required amount of memory
...

Consider a 16-bit (two-byte) integer with the decimal value 22425
...

0101 0111 1001 1001
Since the number is in base 2, each digit from right-to-left represents a successive power of two
(20 , 21 ,
...
The right-most bit is called the least-signiﬁcant bit (LSB) and the left-most bit is
the most-signiﬁcant bit (MSB)
...
Most machines store negative values in 2’s-complement form, such that the
value −22425 is represented as
1010 1000 0110 0111
The conversion of a negative number to 2’s-complement form is straightforward
...

1
...

2
...

3
...

For unsigned types, all bits, including the MSB, contribute to the magnitude of a positive number
...

In the example above, the value of the bit-pattern for an unsigned type is 43111
...
For most purposes, decimal numbers are most
convenient, but for bitwise operations, a hexadecimal representation is usually more appropriate
...
With experience, hex tends to be more comprehensible than plain binary
...
2

Bitwise Operators

C provides six bitwise operators: AND &, OR |, exclusive OR (XOR) ^, NOT ~, left-shift <<, and
right-shift >>
...
The &,
|, ^, <<, and >> operators each have corresponding assignment operators &=, |=, ^=, <<=, and >>=
...
For example,
z &= x | y;
is equivalent to
z = z & (x | y);
Important
...
9, the bitwise operators, &, |, and ~, are diﬀerent from
the logical operators AND &&, OR ||, and NOT !, and should not be confused with them
...

12
...
1

AND, OR, XOR, and NOT

The &, |, ^, and ~ operators permit Boolean logic operations on individual bits
...
The other three are binary operators,
which compare two bits according to the rules in the following table
...
We deﬁne three unsigned variables
unsigned char x = 55; /* 55 (dec) = 0x37 (hex) = 0011 0111 (binary)
...
*/
unsigned char z;
and perform a series of operations on them, storing the result in z
...

0011 0111
0001 1001 &
0001 0001
/* result is 17 (dec) = 0x11 (hex) */
The bitwise & (AND) operator sets a bit in z to 1 only if both corresponding bits in x and y are
one
...

The second operation, z = x | y, makes z equal to 63
...
It is typically used to set selected bits to one
...

0011 0111
0001 1001 ^
0010 1110
/* result is 46 (dec) = 0x2e (hex) */
The bitwise ^ (exclusive OR) operator sets a bit to 1 if either, but not both, corresponding bits in
x and y are one
...

The fourth operation, z = ~x, makes z equal to 200
...
It is commonly
used to negate a group of bits, and is often used in conjunction with the shift operators
...
2
...

For a left-shift operation x << n, the bits in x are shifted n bit positions to the left, with the
shifted-out bits being lost and the vacated bits being ﬁlled with zeros
...
*/
/* Equals 0110 0100 binary (100 decimal)
...

The right-shift operator is a little more complicated
...
However, if x is a signed variable, the expression is implementation dependent
...

For example,
signed char x = -75;
/* Equals 1011 0101 binary
...
*/
/* Equals 1110 1101 if arithmetic shift
...
In general, it is usually preferred to use
unsigned types for bitwise operations to avoid non-portable behaviour
...
g
...

Similarly, right-shift by n is equivalent to division by 2n
...
) Bitwise operations are sometimes used to perform
arithmetic expressions if the right-hand operand is a power of two
...

x * 2 is equivalent to x << 1
x / 16 is equivalent to x >> 4
x % 8 is equivalent to x & 7
101

Bitwise expressions tend to be faster than integer arithmetic, but such optimisations are generally
redundant as modern compilers will tend to replace “power of two” arithmetic with bitwise operations
automatically
...
2
...
The left and
right shift operators have greater precedence than the relational operators < and >, but &, |, and ^
have lower precedence
...
All bitwise
operators have greater precedence than the logical operators && and ||
...
Such practice tends to produce obscure and error-prone code
...

12
...
The ﬁrst is to conserve space by
packing several variables into a single byte (e
...
, binary “ﬂags” that need only represent the values 0
or 1)
...
In both cases we want to be able to manipulate and test individual bits
or groups of bits
...
These techniques are collectively known as “masking”
as they enable particular bits to be selected according to a speciﬁed bit-mask
...

enum {
FIRST
SECND
THIRD
FORTH
ALL =
};

= 0x01,
= 0x02,
= 0x04,
= 0x08,
0x0f

/*
/*
/*
/*
/*

0001
0010
0100
1000
1111

binary
binary
binary
binary
binary

*/
*/
*/
*/
*/

The constants are each powers of two, so that just one bit is set and the rest are zeros
...
An equivalent way to deﬁne the above constants
is to use the left-shift operator as follows
...
The key is to realise that ~0 creates a value where all bits are 1’s
...
By combining masks,
we can select speciﬁc groups of bits to set, reset, or toggle, or to test as a conditional expression
...
*/
/* Reset bits 1,3 (1010)
...
*/

if ((flags & (FIRST | FORTH)) == 0) /* TRUE if bits 1 and 4 are off
...
*/
There are several points to note from this example
...
Second, notice the use of the various assignment operators, which apply the masks to the
left-hand operand, flags
...
e
...
e
...
These operations determine the state of the speciﬁed bits
in flags regardless of their current state
...
Finally, notice in the secondlast line how the bitwise operators can be used to create conditional expressions based on the state
of particular bits
...

12
...
This facility is called a bit-ﬁeld, and it uses structure syntax to deﬁne several objects within
a single variable
...
The number following the colon represents the ﬁeld width in bits
...

flags
...
third = flags
...
first = flags
...
third = !flags
...
forth = !flags
...
first == 0 && flags
...
first = flags
...
third = flags
...

To novice C programmers, bit-ﬁelds may seem a more natural representation than the previous
bit-masking operations, but this is a superﬁcial advantage and bit-ﬁelds should be used with caution
...
For example, diﬀerent machines apply diﬀerent
restrictions on ﬁeld widths; some machines impose a maximum width of 16-bits while others permit
32-bits
...
e
...
Byte-order dependencies are another issue
...
They permit direct
control of bit packing and, with a little experience, the masking idioms tend to be more convenient
for managing groups of bits
...
h in the
Snippets source repository, www
...
org)
...
They might be
used as in the examples below
...
*/
flags = BitFlp(flags, THIRD); /* Toggle third bit
...
*/
flags = 0;

104

Chapter 13

Input and Output
The C language provides no direct facilities for input and output (IO), and, instead, these operations
are supplied as functions in the standard library
...
It also discusses the topic of command-shell redirection which, while non-standard, is
widely supported, and the topic of command-line arguments, which enables a program to receive
instructions from its calling environment
...
More complete
information can be found in, for example, [KR88, pages 241–248] and [HS95, pages 343–383]
...
1

Formatted IO

The standard functions for formatted IO, printf() and scanf(), have been mentioned at a cursory
level in previous chapters
...

However, for most common purposes, they are intuitive, ﬂexible and simple to use
...
1
...
The general
interface for printf() is
int printf(const char *format, arg1, arg2,
...
This is followed by
zero or more optional arguments, with the number of arguments, and their type, being determined
by the contents of the format string
...

The format string is composed of ordinary characters and conversion speciﬁcation characters
...
Conversion speciﬁcations are identiﬁed by a % character
followed by a number of optional ﬁelds and terminated by a type conversion character
...
\n", 10, "bottles");
where the ordinary characters “green” and “sitting on a wall
...
The
105

type conversion character must match its associated argument type; in the example, the %d indicates
an integer argument and the %s indicates a string argument
...
Details of these may be found in any C reference text
...

Between the % and the type conversion character there may exist a number of optional ﬁelds
...
Consider, for example, the conversion
speciﬁer %-#012
...

The ﬁrst ﬁeld is a set of ﬂags, which modify the meaning of the conversion operation (e
...
, make
the argument left-justiﬁed, or pad with zeros)
...
The
third ﬁeld is a precision speciﬁcation, which has various diﬀerent meanings for integers, ﬂoating
point values and strings
...

Again, a good reference text will have more information regarding these conversion speciﬁcations
...
3f radians
...
14159, would print
Value = 3
...

where the converted ﬂoating point value is left-justiﬁed (due to the - ﬂag), is padded with suﬃcient
space for 10 characters, and displays 3 digits after the decimal point
...
However, there is no need for
conversion characters for types float and short, etc, as these types are automatically converted to
double and int, respectively, by the usual argument promotion rules
...
The ability of write functions with variable-length argument lists is not restricted to implementers of the standard library
...
The standard header stdarg
...
The declaration of a
variable-length argument list is marked by ellipsis (
...

int varfunc(char *format,
...
The implementation
of such functions using the macros from stdarg
...
2
1 It is permitted, but bad practice, to call a function without it being previously declared
...
As a result, the compiler has diﬀerent type conversion rules for arguments
passed to functions that have explicit prototypes, and functions that don’t
...

2 See [KR88, page 156] for a nice example, which shows an implementation of a simpliﬁed version of printf()
...
1
...
It obtains data
from standard input, which is typically the keyboard
...
);
This is identical to printf() in form, with a format string and a variable argument list, but an
important diﬀerence is that the arguments for scanf() must be pointer types
...
For
example,
double fval;
scanf("%lf", fval); /* Wrong */
scanf("%lf", &fval); /* Correct, store input in fval
...
It stops when it exhausts the format string, or when some input fails to match
a conversion speciﬁcation
...
3 If a conﬂict occurs between the a conversion speciﬁcation and the
actual input, the character causing the conﬂict is left unread and is processed by the next standard
input operation
...
Most of the conversion characters for printf()—d, i, o, x, c, u, f, e, g, s, p, etc—have similar
meanings for scanf(), but there are certain diﬀerences, some subtle
...
Some of these diﬀerences are as follows
...
It has the width and size
modiﬁer ﬁelds but not the ﬂags and precision ﬁelds
...
e
...
4
• An asterisk character (*) may be used in place of the width ﬁeld for both printf() and
scanf(), but with diﬀerent meanings
...

• The conversion character [ is not valid for printf(), but for scanf() it permits a scanset of
characters to be speciﬁed, which allows scanf() to control exactly the characters it reads in
...
For
example, to read a float, one uses the conversion speciﬁer %f
...

Of the above, the third and fourth points are rather advanced features that we will not dwell
on further
...
The conversion speciﬁer and size modiﬁer must match the associated argument type
or the result is undeﬁned
...
For this case, or if there
is an error, scanf() returns EOF
...
For example, the conversion %10c reads 10 characters into a character array
...

107

The scanf() format string consists of conversion speciﬁers, ordinary characters, and white-space
...
For example, the following statement is used to read a date of the form dd/mm/yy
...
Exceptions to this rule arise with the %c and %[ conversion
speciﬁers, which do not skip white-space
...

char s[10], c;
scanf("%s%c", s, &c); /* s = "one", c = ’ ’ */
scanf("%s %c", s, &c); /* s = "one", c = ’t’ */
In the ﬁrst case, the %c reads in the next character after %s leaves oﬀ, which is a space
...

While the many details of scanf() formatting complicates a complete understanding, its basic
use is quite simple
...
A string is read up
to the ﬁrst white-space character unless terminated early by a width ﬁeld
...
To prevent overﬂow,
a string conversion speciﬁcation should always include a width ﬁeld
...

char s1[10], s2[10], s3[10];
scanf("%9s %9s %9s", s1, s2, s3);
Notice the width ﬁelds are one-less than the array sizes to allow room for the terminating \0
...

A few ﬁnal warnings about scanf()
...
Second, when there is a conﬂict between a conversion speciﬁcation and the actual
input, the oﬀending character is left unread
...
Third, while scanf() is a good choice when
the exact format of the input is known, other input techniques may be better suited if the format
may vary
...
The fgets() function reads a line
of characters into a buﬀer, and sscanf() extracts the data, and can pick out diﬀerent parts using
multiple passes if necessary
...
1
...
They present the following interfaces
...
);
int sscanf(const char *buf, const char *format,
...
It returns the number of characters stored (excluding
\0)
...
For example, the
following code segment creates a format string at runtime, which prevents scanf() from overﬂowing
its character buﬀer
...
*/
scanf(format, buf);
/* Get string from stdin
...

sscanf() extracts values from the string buf according to the format string, and stores the results
in the additional argument list
...
An attempt to read beyond the end of string buf for sscanf() is equivalent to
reaching the end-of-ﬁle for scanf()
...

char buf[100];
double dval;
fgets(buf, sizeof(buf), stdin); /* Get a line of input, store in buf
...
*/

13
...
6 Thus, much of the standard C library is modelled on
UNIX facilities, and in particular the way it performs input and output by reading or writing to
ﬁles
...
This
means that a single homogeneous interface handles all communication between a program
and peripheral devices [KR88, page 169]
...
2
...
h
...
All these
6 An

interesting discussion on the history and philosophy of C is given in [Rit93], available online
...
The FILE data type is an
opaque type, only ever referred to by a pointer, and algorithms operating on it are hidden within associated standard
functions
...

7 The

109

implementation details are hidden from users of the standard library via the FILE type-name and
the associated library functions
...
The second is a mode
string, which determines how the ﬁle may be used
...
The ﬁrst opens an existing ﬁle for reading, and fails if the ﬁle does not exist
...
Opening an
existing ﬁle in "w" mode, ﬁrst clears the ﬁle of its existing data (i
...
, overwrites the existing ﬁle)
...

Each of these modes may include an additional “update” speciﬁcation signiﬁed by a + character
(i
...
, "r+", "w+", "a+"), which enables the ﬁle stream to be used for both input and output
...
2
...

Some operating systems treat “binary” ﬁles diﬀerently to “text” ﬁles
...
) The standard C library
caters for this variation by permitting a ﬁle to be explicitly marked as binary with the addition of
a b character to the ﬁle-open mode (e
...
, "rb" opens a binary ﬁle for reading)
...
If there is an error, it
returns NULL (e
...
, attempting to open a ﬁle for reading that does not exist, or attempting to open
a ﬁle without appropriate permissions)
...

To close a ﬁle, the ﬁle pointer is passed to fclose(), which has the interface
int fclose(FILE *fp);
This function breaks the connection with the ﬁle and frees the ﬁle pointer
...
However, fclose() is called automatically
for each open ﬁle when a program terminates
...
2
...
These are
standard input (stdin), standard output (stdout) and standard error (stderr)
...
Their input and output streams are usually buﬀered, which means that
characters are accumulated in a queue and sent in packets, minimising expensive system calls
...
The stderr stream is reserved
for sending error messages
...

13
...
3

Sequential File Operations

Once a ﬁle is opened, operations on the ﬁle—reading or writing—usually negotiate the ﬁle in a
sequential manner, from the beginning to the end
...

The simplest functions process a ﬁle one character at a time
...
The functions putc() and
fputc() are identical, but putc() is typically implemented as a macro for eﬃciency
...
g
...

To read a character, there are the functions
int fgetc(FILE *fp);
int getc(FILE *fp);
int getchar(void);
which are analogous to the character output functions
...
8 These functions return
the next character in the character stream unless either the end-of-ﬁle is reached or an error occurs
...
It is possible to push a character c back onto an input
stream using the function
int ungetc(int c, FILE *fp);
The pushed back character will be read by the next call to getc() (or getchar() or fscanf(), etc)
on that stream
...
The symbolic constant EOF is returned by standard IO functions to signal either end-of-ﬁle
or an IO error
...
Two standard functions, feof() and ferror(), are provided for this task and, respectively,
they return non-zero if the prior EOF was due to end-of-ﬁle or an output error
...
);
int fscanf(FILE *fp, const char *format,
...
) and fscanf(stdin, format,
...

Characters can be read from a ﬁle a line at a time using the function
char *fgets(char *buf, int max, FILE *fp);
which reads at most max-1 characters from the ﬁle pointed to by fp and stores the resulting string
in buf
...
The function returns when
it encounters a \n character (i
...
, a newline), or reaches the end-of-ﬁle, or has read the maximum
number of characters
...

Character strings may be written to a ﬁle using the function
int fputs(const char *str, FILE *fp);
which returns a non-negative value if successful and EOF if there was an error
...

8 While putc() is equivalent to fputc() and getc() is equivalent to fgetc(), it is important to note that the line
IO functions puts() and gets() are not equivalent to their counterparts fputs() and fgets()
...

111

For reading and writing binary ﬁles, a pair of functions are provided that enable objects to be
passed to and from ﬁles directly without ﬁrst converting them to a character string
...
For
example, if a structure called Astruct were deﬁned, then an array of such structures could be
written to ﬁle as follows
...
2
...
The standard library also provides
a means to move back and forth within a ﬁle to any speciﬁed location
...
For binary ﬁles this value is
the number of characters preceding the current position
...
In both cases the value is in a form suitable for the second argument of fseek(), and the
value 0L represents the beginning of the ﬁle
...

This parameter is an oﬀset, which shifts the ﬁle position relative to a given reference location
...
These specify the beginning of the ﬁle, the
current ﬁle position, and the end of ﬁle, respectively
...

For binary ﬁles, fseek() may be used to move the ﬁle position to any chosen location
...

fseek(fp,
fseek(fp,
fseek(fp,
fseek(fp,

0L, SEEK_SET);
0L, SEEK_CUR);
0L, SEEK_END);
pos, SEEK_SET);

/*
/*
/*
/*

Move
Move
Move
Move

to
to
to
to

beginning of file
...
*/
end of file
...
*/

In the last case, the value pos must be a position returned by a previous call to ftell()
...
*/
The program below shows an example of ftell() and fseek() to determine the length of a ﬁle
in bytes
...
9
1
2
3

/* Compute the length of a ﬁle in bytes
...
c) */
long ﬂength(char *fname)
{
9 This code may not be entirely portable, as the ISO standard does not require compilers to “meaningfully” support
the reference SEEK END for binary ﬁles
...
However, in practice, SEEK END is supported by most compilers
...
Calling rewind(fp)
is equivalent to the statement fseek(fp, 0L, SEEK_SET)
...

These perform essentially the same tasks as ftell() and fseek(), respectively, but are able to handle ﬁles too large for their positions to be representable by a long integer
...
3

Command-Shell Redirection

Often programs are executed from a command-interpreter environment (also called a shell)
...
For example, Win32 has a DOS-shell and UNIX-like
systems have various similar shell environments such as the C-shell, the Bourne-shell, the Korn-shell,
etc
...

Redirection is not part of the C language, but an operating system service that supports the C inputoutput model
...
h>
/* Write stdin to stdout */
int main(void)
{
int c;
while ((c = getchar()) != EOF)
putchar(c);
}

Consider the example program above
...
Normally this means the characters typed at the keyboard are echoed on the screen after
the user hits the “enter” key
...

repeat
type some text 123
type some text 123
However, a ﬁle may be substituted for the keyboard by redirection
...
txt
display contents of infile
...
txt to outfile
...

repeat ...
txt
Further redirection commands are >> and |
...
The latter is called a “pipe”,
and it directs the stdout of one program to the stdin of another
...

The stderr stream is not redirected, and so will still print messages to the screen even if stdout
is redirected
...
4

Command-Line Arguments

The C language provides a mechanism for a program to obtain input parameters from the environment in which it executes
...
Until now we have speciﬁed the interface of main() as
int main(void)
which is one of the two interfaces permitted by the ISO C standard
...

Style Note
...

The value of argc is the number of command-line parameters passed to the program upon
execution
...
The length of this array is argc+1 and the end element,
argv[argc] is NULL
...
If argc equals one, then there are no command-line arguments after the
program name
...
This program simply takes the set of command-line strings
referred to by argv and prints them to stdout
...

1
2
3
4
5
6
7
8
9
10

#include ...
*/
int main (int argc, char *argv[ ])
{
int i;
for (i = 1; i < argc; ++i)
printf("%s ", argv[i]);
printf("\n");
}

When a program is invoked from a command-line, each white-space separated string of characters
including the program name becomes a command-line argument
...
This program then prints all arguments except
argv[0]
...
txt two three 123
will print
one two three 123
in the ﬁle named outfile
...

114

Chapter 14

Generic Programming
Writing generic and reusable software is one of the pinnacles of software design
...

The antithesis of generic design is a set of functions that possess virtually identical code, but
diﬀer in terms of the types they operate on
...
Coding for the general case tends to reduce the overall coding eﬀort, reduce the
likelihood of errors, and increase code modularity and reuse
...
This chapter brings
them together under the guise of writing generic programs
...
This chapter examines the use of
void* in generic software design in terms of two examples: an expandable array and the standard
library function qsort()
...

14
...
Typedefs provide type-name synonyms, macros circumvent C type-checking, and unions deﬁne variant types
...
1
...
These type synonyms permit an algorithm to be written such that it can work with diﬀerent types by simply
changing the appropriate typedefs
...
7, where the variable
type contained by the vector is speciﬁed by a typedef
...

115

14
...
2

Macros

Macros can be used with diﬀerent types in a given program because they avoid the type-constraints of
the C compiler
...
For example, given the macro
#define MAX(x,y)

((x)<(y) ? (y) : (x))

and a set of variables
int x = 1, y = 2, z;
double a = 3
...
1, c;
then the following code
z = MAX(x,y);
c = MAX(a / 2
...
7)<(b + 1) ? (b + 1) : (a / 2
...

#define NELEMS(x)

(sizeof(x)/sizeof(x[0]))

This macro works for arrays of any type including arrays of doubles, pointers, structures, and so
forth
...
To compose any
signiﬁcant algorithm as a macro is both notationally inconvenient and error prone
...

14
...
3

Unions

Unions permit the deﬁnition of variant types—types that may represent one of several diﬀerent
types
...
It is also used to create a collection of diﬀerent types (i
...
, to create a
heterogeneous array)
...
8, where we noted that unions are
usually wrapped in a structure, which deﬁnes the type of the value currently stored by the union
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

enum VariantType { INT, FLOAT, STRING };
typedef struct {
enum VariantType type;
union {
int i;
ﬂoat f;
char *s;
} value;
} Variant;
void interpret command(Variant command)
{
switch (command
...
value
...
value
...
value
...
type);

break;
break;
break;
break;

The function interpret_command() might be passed information as follows
...
type = INT;
v
...
i = -5;
w
...
value
...
Notice that the deﬁnition of the structure Variant (lines 3 to 10) contains an unnamed
union type with a single variable value
...

14
...
With these facilities we can implement
algorithms and data-structures that operate on any number of diﬀerent types
...

This discussion is presented in terms of two examples
...
5 and 11
...
The second is the standard library function
qsort(), which is used to sort the elements of an array
...
2
...
It permits any number of Vector objects to be deﬁned, and each of these may contain
elements of a diﬀerent type
...
This increase in ﬂexibility comes at the price of a slight decrease in eﬃciency
and a signiﬁcant loss of compile-time type-safety
...

The ﬁrst thing to notice when reading this source code is how similar it is to the original implementation, which permitted just one vector and could only contain integers (see Section 9
...
A
fully generic implementation is thus possible without any great increase in code complexity
...
h
...
Notice
that the deﬁnition of struct Vector is not exported
...

Particularly, the client never has access to the structure members
...

1 Opaque types, which hide the structure internals from the client, facilitate a valuable form of object-oriented-style
encapsulation that presents a high-level abstraction and hides implementation details
...
e
...
In general,
no design principle should be used blindly, but always as part of an overall plan to improve program functionality,
maintainability, etc
...
h> /* for size t */
typedef struct Vector Vector;
/* Vector creation and destruction
...
*/
int vector push back(Vector *v, const void *item);
void vector pop back(Vector *v, void *item);
void* vector get element(Vector *v, size t index);
void* vector get begin(Vector *v);
void* vector get end(Vector *v);
/* Size operations
...
c
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14

#include
#include
#include
#include

...
h>
...
h"

static const size t StartSize = 5;
/* initial vector capacity */
static const ﬂoat GrowthRate = 1
...
For example,
the structure might one day be changed to
typedef struct Vector {
void *begin;
/* pointer to beginning of allocated memory */
void *end;
/* pointer to one-past-end of allocated memory */
void *last;
/* pointer to one-past-end of last vector item */
size_t itemsize;
/* size of each item */
} Vector;
This structure deﬁnition would permit alternative implementations of the functions described below
that might be slightly more eﬃcient
...

118

The next two functions create and destroy a Vector, respectively
...

The function vector_create() is passed two variables
...
The second is the size of each element which, because the
function is generic, is not known to the compiler and must be stated explicitly
...
*/
{
Vector *v = (Vector *)malloc(sizeof (Vector));
if (v) {
v−>data = malloc(capacity*itemsize);
v−>capacity = (v−>data == NULL) ? 0 : capacity;
v−>size = 0;
v−>itemsize = itemsize;
}
return v;
}
void vector destroy(Vector *v)
/* Release memory owned by vector
...
Of these, the implementation
of vector_push_back() is the most complex, but it is described thoroughly in Section 9
...
The
only new features are the conversions between void* and char* pointer types, and the use of the
standard library function memcpy() (see line 49)
...
Return index of new element if successful, and -1 if fails
...
*/
if (v−>size == v−>capacity) {
size t newsize = (v−>capacity == 0) ? StartSize : (size t)(v−>capacity*GrowthRate + 1
...
*/
memcpy((char*)v−>data + v−>size * v−>itemsize, item, v−>itemsize);
return v−>size++;
}
void vector pop back(Vector *v, void *item)
/* Return element from back of vector, and remove it from the vector
...
*/
{
assert(index >= 0 && index < v−>size);
return (char*)v−>data + index * v−>itemsize;
}
/* Return pointer to beginning of array (ie, to ﬁrst element of array)
...
*/
void* vector get end(Vector *v) { return (char*)v−>data + v−>size*v−>itemsize; }

The conversion to a char* pointer is necessary because it is not possible to perform pointer arithmetic
with a void* pointer
...
Thus, an expression such as ptr + index has no meaning for a
void* pointer as the compiler cannot compute the pointer oﬀset
...

By converting a void* pointer to char*, we are able to perform pointer arithmetic over each byte
of the pointed-to data
...
For example,
consider the statement on line 65, which returns a pointer to the element speciﬁed by index
...
However, if the type of the array elements were known, the array pointer could
be cast to this type and the compiler would compute the oﬀset automatically
...
To copy
bytes of data from one place to another, we use the standard function memcpy(), which has the
following interface
...

The functions vector_get_begin() and vector_get_end() return pointers to the ﬁrst element
in the array and to the one-past-the-end of the array, respectively
...
It permits the implementation of the following
idiom to iterate over an array
...
Their implementations
present no new details not described previously
...
*/
size t vector get item size(const Vector *v) { return v−>itemsize; }
/* Inquire after vector size and capacity */
size t vector get size(const Vector *v) { return v−>size; }
size t vector get capacity(const Vector *v) { return v−>capacity; }
int vector set size(Vector *v, size t size)
/* Set vector size
...
*/
{
if (size > v−>capacity) {
void *p = realloc(v−>data, size*v−>itemsize);
if (p == NULL)
return −1;

}

}

v−>capacity = size; /* allocate succeeds, update data-structure */
v−>data = p;

v−>size = size;
return 0;

int vector set capacity(Vector *v, size t size)
/* Shrink or grow allocated memory reserve for array
...
Return 0 if successful, -1 if fails
...
2
...
The most important is loss of type-safety
...
A related problem is that the compiler will not perform the usual type
conversions when passing variables to these functions, such as converting a float to double for a
Vector of doubles
...

vector_push_back(v, &50);
/* Invalid, cannot take address of a constant
...
*/
A good technique to improve type-safety, and also address the other problems mentioned above,
is to wrap the generic interface in type-speciﬁc “wrapper functions”
...
Consider the following set of wrapper functions for Vectors
of type int
...
h contains the public interface
...
h"
/* Vector creation
...
*/
int vector push back int(Vector *v, int item);
int vector pop back int(Vector *v);
#endif

There are several points to note from this header ﬁle
...

vector_push_back_int(v, 50);
vector_push_back_int(v, i + j);
The second is that items of the wrong type are automatically converted to ints by the usual type
conversion rules
...
*/
Finally, notice that wrappers are not provided for most of the generic public interface
...
For example, vector_destroy(), vector_get_element(), vector_set_size(), etc,
do not rely on type information
...
It is good practice to avoid including header ﬁles in other header ﬁles where possible
...
In the case of vector_int
...
h could be avoided, and replaced with
typedef struct Vector Vector;
as the declarations in vector_int
...
Rather, vector
...
c, and the dependence
between the two headers is reduced to a single declaration
...
h in vector_int
...
We never call vector_create_int() without calling the generic function
vector_destroy()
...

The next set of functions are the contents of the source ﬁle vector_int
...
These functions call
the generic functions to perform the actual operations, but also incorporate some type-checking code
...
They do not check whether
the actual element types are correct, and diﬀerent types of compatible size will not be caught
...
1
...

1

#include "vector_int
...
h>
Vector *vector create int(size t capacity)
{
return vector create(capacity, sizeof (int));
}
int vector push back int(Vector *v, int item)
{
assert(vector get item size(v) == sizeof (int));
return vector push back(v, &item);
}
int vector pop back int(Vector *v)
{
int i;
assert(vector get item size(v) == sizeof (int));
vector pop back(v, &i);
return i;
}

14
...
3

Case Study: qsort()

The standard library function qsort() is declared in header stdlib
...
The function is called qsort() because it is typically implemented using
the algorithm known as “quick-sort”
...
The details and pitfalls of quick-sort
are beyond the scope of this text but are discussed thoroughly in any good algorithms textbook
...

The ability to create specialised sub-algorithms via function pointers greatly enhances the power
of generic code
...

The general interface of qsort() is
void qsort(void *array, size_t nobj, size_t size,
int (*cmp)(const void *, const void *));
where the ﬁrst three arguments are a pointer to the beginning of an array, the number of elements in
the array, and the size of each element, respectively
...
This function is used internally
by qsort(), which passes it pairs of array elements
...

The following example uses qsort() to sort an array of structures of type struct Database
...
The function comp_dbase(), in lines 12 to 23, speciﬁes this
ordering given any two Database variables, and is the custom comparison function for qsort()
...

In main() we ﬁrst create an array of Database elements, and give each element a random key
value (lines 30 to 33)
...

123

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39

#include ...
h>
#include ...
key = rand();
db[i]
...
1f\n", db[i]
...
item);
}

The power of qsort() is that it may be used with arrays of any arbitrary data-type such as,
struct Dictionary {
char *word;
char *defn;
};
and each diﬀerent data type may be compared via its own particular comparison function, as in the
following example
...

qsort(dt, NELEMS(dt), sizeof(dt[0]), comp_dict);

124

Generic code that relies on function pointers typically comes at the price of reduced eﬃciency due
to function-call overhead
...
In C++ the facilities of
templates and inlined functions avoid these runtime penalties and permit the construction of generic
code that is as eﬃcient as custom-built code
...
Data-structures determine how information is stored and exchanged, and
have a signiﬁcant eﬀect on the overall cohesiveness, clarity, and eﬃciency of the program
...
(However, these data-structures are not speciﬁc to C,
and may be implemented similarly in other programming languages
...
For more information, the interested reader is directed to a good algorithms
and data-structures textbook
...

15
...
One is to optimise the
execution of a set of loops and expressions in a process of code tuning
...
The following are
common examples
...

• Replace power-of-two integer multiplication and division with bitwise shift operations
...

• Move non-changing expressions outside of loops to avoid unnecessary re-evaluation
...

The gains for such adjustments are typically small, and often come at the price of considerable loss of
code clarity
...

In fact, manual code tuning may actually result in slower execution than the original (simpler)
code as the more obscure operations might be less amenable to compiler optimisation
...
That is, the program should
be written with a focus on correctness and clarity, and subsequently tested for bottlenecks using a
1 The texts [CLRS01] and [Sed98] are highly recommended
...

For those desiring a rigorous exposition, the seminal three-volume work of Knuth [Knu98a, Knu98b, Knu98c] is the
standard reference
...
A proﬁler is a software tool that measures where a program spends most of its time, and
enables code tuning to be focused where it is actually needed
...

Some programmers pay too much attention to eﬃciency; by worrying too soon about
little “optimizations” they create ruthlessly clever programs that are insidiously diﬃcult to
maintain
...
Good programmers keep eﬃciency in
context: it is just one of many problems in software, but it is sometimes very important
[Ben00, page 87]
...
These decisions tend to have a much greater eﬀect on
program speed, often by orders-of-magnitude
...
Time complexity is usually expressed in terms of O-notation which, in essence, states an
upper-bound on computation to within a constant factor
...
Informally, O(n) may be understood as
meaning “to the order n” or “linear with respect to n”
...

Aside
...
Registers, caches, RAM, and virtual memory all incur diﬀerent access penalties,
with each lower level in the hierarchy often costing several orders-of-magnitude greater runtime
than its predecessor
...
Algorithms that exploit space optimisation, locality of reference, and other
memory-friendly techniques are usually much faster than those that don’t
...

15
...
It is directly supported by the C language and
deﬁnes a contiguous sequence of elements in memory
...
If an array is of ﬁxed size, known at compile-time, it
can be allocated on the stack, otherwise, its size may be determined at run-time and allocated on
the heap via malloc()
...
5, 11
...
2
...

For an array of size n, to access or change the value of an array element is an O(1) operation
...
A linear search of an array is O(n); a binary
search of a sorted array is O(log n)
...
3

Linked Lists

A linked-list is a set of nodes, where each node contains an item of data and a pointer to the next
node in the list
...
The
implementation of singly linked-lists is covered in Section 11
...

127

For simplicity, let the contained item be an integer
...
*/
/* Pointer to next node in list
...
*/

The following functions perform node insertion and deletion operations, which may be applied at
any point along the list
...
Return pointer to new node
...
*/
List *newnode = (List *) malloc(sizeof (List));
if (newnode == NULL)
return NULL; /* allocation failed */
/* If list is not empty, splice new node into list
...
*/
newnode−>next = NULL;
newnode−>prev = NULL;
}
newnode−>item = item;
return newnode;
}
void delete node(List *node)
/* Remove node and reconnect links on either side
...
Existing links are connected to the new node to eﬀectively splice it into the list
...
That is, it creates a new list of length one
...
The if-statements check that the adjacent links were not NULL
...
Thus, linked-lists are a good choice if insertion/removal
operations are frequent into the middle of a sequence
...

128

15
...
The diﬀerence is
that this data-structure has ﬁxed size and its end loops back to its beginning
...
(In the
latter case, the end-node points back to the head-node
...

The basic form of a circular buﬀer is shown in Figure 15
...
The buﬀer is allocated a ﬁxed amount
of space, and may contain no more elements than this maximum
...
The front index points to the empty slot where the
next element will be added
...
Thus, items
are pushed onto the front of the buﬀer and removed from the back
...

(This continuity is represented naturally by a linked-list, with the connection of the end-node back
to the head-node being no diﬀerent to any other link
...
)

Figure 15
...
Items are added to the front and extracted from the
back
...

Circular buﬀers are very common in real-time and embedded systems where there are various
processes that communicate with each other
...
The buﬀer acts as temporary
storage to allow both processes to operate asynchronously, with neither process having to wait for
the other to complete a transaction
...

typedef struct CircBuf_t {
ValueType *array;
/*
int size;
/*
int nelems;
/*
int front;
/*
int back;
/*
} CircBuf;

Pointer to array of items
...
*/
Current number of items in buffer
...
*/
Index to back of buffer
...
Notice
that, once again, the structure CircBuf is an opaque type with its deﬁnition hidden within the
private interface
...

Their implementation is straightforward, with the wrap-around code being the only subtlety
...

* Returns 0 for success, and -1 if buﬀer is full
...

* Returns 0 for success, and -1 if buﬀer is empty
...
Circular buﬀers generally deal with being full in one of two ways
...
2 A second option is to over-write
existing items so that, for a buﬀer of size n, only the most recent n elements are retained
...
If another element is inserted, both
indexes are incremented and the back item is lost
...
A common alternative to this comparison code is the following idiom, which uses
the modulus operator to bound the index
...
A common diﬃculty with all the data-structure implementations presented in this chapter,
and with algorithm design in general, is ensuring correct behaviour for boundary conditions
...
For example, the boundaries
for a circular buﬀer occur when the buﬀer is full or empty, or when the indices wrap around
...
Checking boundary conditions is an important technique for ﬁnding subtle bugs
...
e
...
Concurrent programming, concerned with multiple threads or processes, is beyond
the scope of this text
...
If a piece of code is going to fail, it
will likely fail at a boundary
...

15
...
The basic form is shown in Figure 15
...
The stack grows upwards as items
are pushed onto the top, and shrinks back down as items are popped oﬀ the top
...
If a stack does not have a ﬁxed
maximum, it might be implemented using a dynamic representation such as our expandable vector
...
2: Stack
...

As an example implementation, consider the following data type for a stack with maximum size
MAXSIZE
...
*/
int count;
/* Number of elements in stack
...
(We neglect bounds
checking, for brevity
...
6

Queues

A queue, otherwise known as a ﬁrst-in-ﬁrst-out (FIFO) buﬀer, represents the other side of the coin
to a stack
...
Thus, a queue removes items in the
same order as they were placed onto it
...
3
...
Variable length queues are a little more complicated
...
Another alternative is to use a deque, which is a linked-list of short arrays
and provides certain eﬃciency advantages of both arrays and lists
...
Once again, we neglect error checking for the sake of brevity, but
the basic operations are clear
...
3: Queue
...
Thus, items are extracted in the same order as they were inserted
...
7

Binary Trees

Binary trees are self-referential structures like linked-lists; they consist of nodes that point to other
nodes
...
4
...
If a particular child does not exist, the
pointer is NULL, and if both pointers are NULL, the node is called a leaf node
...

Figure 15
...
The top node is the root node
...
In this ﬁgure, the
bottom and righthand nodes are leaf nodes
...
If the data were to be stored ﬁrst and
sorted second, then an array or list would be the right data-structure for the job
...
For a balanced tree containing n elements,
insertion of a new node in sorted order takes O(log n) time
...
Figure 15
...
) Furthermore, a balanced tree permits O(log n) search for items, and the entire
tree may be examined in sorted order in O(n) operations
...
This structure, and the associated functions, are used to implement a wordcounting algorithm similar to that presented in [KR88, pages 139–143]
...
The Tree structure contains the two data items and
pointers to left and right Tree nodes as follows
...
*/
Count of string appearances
...
*/
Pointer to right-child
...
These perform operations to add new strings, count the occurrences of a speciﬁed
string, print all strings in lexicographic order, and delete the entire tree structure, respectively
...
Its basic operation
is to ﬁrst search for whether the word already exists in the tree
...
Otherwise, it adds the new word to the tree with a count of one
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21

Tree *add item(Tree *node, const char *item)
/* Search for whether item already exists in tree
...
Perform
* recursive descent for search, return pointer to current node
...
*/
cmp = strcmp(item, node−>item);
if (cmp < 0)
node−>left = add item(node−>left, item);
else if (cmp > 0)
node−>right = add item(node−>right, item);
else
++node−>count; /* item already in tree, increment count */
return node;
}

8–9

If the passed pointer is NULL, a new node is created by calling make_node() and a pointer to this
node is returned
...

12

The binary tree stores words in lexicographic order
...

13–14

If the word is less than the node word, we recurse using the node’s left-hand child
...

15–16

If the word is greater than the node word, we recurse using the node’s right-hand child
...
e
...

There are three points to note from this function
...
Second, when a new child node is created, it is attached to its parent
node via the return value (lines 9, 14 and 16)
...

The next two functions perform binary search and in-order visitation, respectively
...
(The word count is the number of times a particular word was sent to add_item()
...
e
...
The
second function, print_inorder() visits every node in the tree and prints the stored word and
word count
...
) The recursive
implementation of print_inorder() causes the nodes to be printed in sorted order
...
*/
{
while (root) {
int cmp = strcmp(item, root−>item);
if (cmp < 0)
root = root−>left;
else if (cmp > 0)
root = root−>right;
else
return root−>count;
}
return 0;
}
void print inorder(Tree *node)
/* Print tree in lexicographical order */
{
if (node == NULL) return;
print inorder(node−>left);
print node(node);
print inorder(node−>right);
}

The basic binary tree, as presented here, is suﬃcient for a great many situations
...
However,
it behaves very ineﬃciently if the data does not arrive in random order and the tree becomes
unbalanced
...

Various solutions exist that resolve this problem
...
Also, a data-structure called a skip list, while entirely
diﬀerent to a binary tree, possesses the same insertion and search properties as for a balanced binary
tree
...
Trees are best suited to tasks where the data is to be in sorted
order during the insertion phase
...
With a good sorting algorithm, the time required to sort an array is less than
the time to insert data into a tree
...
Also, arrays consume less space than trees
...

134

15
...
The
most common form of hash table, and the easiest to explain, is one that combines an array of pointers
with a set of linked-lists
...
5
...
A list may be empty, in
which case the pointer is NULL
...

Figure 15
...
An array of pointers (buckets) point to singly linked-lists
(chains)
...
Each chain may be zero length or greater
...
Given an item of data to be stored in the table, the
hash function computes an index based on the value of this item
...
The selected bucket
points to a linked-list, which is searched to check whether the item is already stored, otherwise the
item is added to the front of the list
...
Essentially it should distribute
items evenly between the diﬀerent buckets and not favour any particular bucket over another
...
Also, the length of the array of pointers should not be arbitrary
but itself a prime number
...

A hash table is useful for implementing very fast lookup tables
...
Provided the hash function distributes items evenly, the chains will be short enough so that
the entire operation may be considered O(1)
...

In the following example we use a hash table to implement a dictionary
...
These word-deﬁnition pairs are stored in the hash table
using the word as a search key
...

1
2
3
4
5
6
7
8

typedef struct Dictionary t Dictionary;
Dictionary *create table(void);
void destroy table(Dictionary *);
int add word(Dictionary *, const char *key, const char *defn);
char *ﬁnd word(const Dictionary *, const char *key);
void delete word(Dictionary *, const char *key);

The next section of code shows part of the private interface
...
e
...
Notice this value is prime
...
Each node in a chain contains a word, its deﬁnition, and a pointer
to the next node
...
The hash function (lines 13 to 22) is a complicated device, and
we will not elaborate on its workings here
...
” Notice that it takes a
string argument and converts it to a bucket index
...
*/
{
const int HashValue = 31;
unsigned h;

}

for (h = 0; *str != ’\0’; ++str)
h = *str + HashValue * h;
return h % HASHSIZE;

The two functions that follow demonstrate the main workings of the hash table algorithm
...
If the word is already stored, the old deﬁnition is replaced with the new one, otherwise, if the
word is not found, a new table entry is added
...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

int add word(Dictionary *dict, const char *key, const char *defn)
/* Add new word to table
...

* Return 0 if successful, and -1 is fails
...
Return NULL if key not found
...
This bucket, in turn,
points to the head of a node chain
...

11–18

If the node pointer is not NULL then a match was found before the end of the chain
...
(Note, the function allocate_string() is part of the module’s
private interface and makes a duplicate of the passed string
...

(Note, the function makenode() is part of the module’s private interface; it creates and initialises a
Nlist node
...

39–41

If the keyword is found, return a pointer to its deﬁnition, otherwise return NULL
...
e
...
Writing a generic version of a hash table is not trivial because diﬀerent data
types typically require diﬀerent hash functions
...
(A default hash function might be
called if the user passes NULL
...
Hash tables provide very fast O(1) add, delete, and ﬁnd operations
on average, if supplied with an eﬀective hash function
...
g
...

137

Chapter 16

C in the Real World
This text has covered most of the core ISO C language and its use
...
This chapter provides a sampling
of the virtually limitless ﬁeld of extensions and related topics with regard to writing C programs in
the real world
...

TODO: complete this chapter
...
1

Further ISO C Topics

There are many details of ISO C and the standard library that are not covered in this text
...
They include:
• Complete rules of operator precedence and order of evaluation
...

• Memory alignment and padding
...
For the most part, this standard to backward compatible with C89, and the older standard currently remains the more important language in
practice
...
These functions are frequently
useful and are worthy of study
...
h), mathematical functions (math
...
h), utilities (stdlib
...
h), ﬂoating-point speciﬁcations
(float
...
h), assertions (assert
...
h), signal handling (signal
...
h), etc
...
A complete and authoritative reference is [HS95, HS02], and is highly recommended for practicing programmers
...
It is worth
noting that many C idioms are not recorded in any textbook and can only be discovered from
practical experience and reading the source code of others
...
Diﬀerent compilers may conform to the standard to diﬀerent extent
...
This is less likely
with modern compilers
...
As a rule, it is wise
to compile code on several diﬀerent compilers to ensure standard conformance
...
2

Traditional C

The C language was created in the early 1970s and, over the next decade or so, grew and evolved
substantially, until ﬁnally being standardised in 1989
...

This version is now called “classic C” or “K&R C”, and has signiﬁcant diﬀerences to ISO C
...

}
would be written as
double func(a, b, c)
int a, b;
char c;
{

...

16
...
Makeﬁles manage compilation dependencies
and linking, and permit partial compilation, so that only those parts of the program that have
changed need to be recompiled
...
They are platform dependent, and some compiler environments (e
...
, Microsoft
Visual Studio) manage project Makeﬁles automatically via the IDE
...

Two examples worth examining are GNU-Make (GMAKE) and CMAKE
...

TODO - add reference to GMAKE - add reference to CMAKE - cross-platform makeﬁle generation

16
...

- but is limited in the capabilities of the standard library
...
Graphics
(OpenGL, VTK), GUI frameworks, threads, interrupts, real-time, hardware, audio, serial comms,
sockets, ﬁle-structure—directories
...

139

16
...
program
...

16
...

- Interfacing C with FORTRAN, assembler, C++, MatLab, etc
...
7

Memory Interactions

Historically, instruction count was a premium
...

Programmers spent a lot of eﬀort ﬁnding ways to minimise instruction count
...

Modern computers, with fast CPUs, are no longer constrained primarily by instruction execution
...
While ever waiting for instructions or data to be fetched
from memory, the CPU is idle and cycles are wasted
...
This hierarchy, from fastest to slowest, consists
of registers, cache, main random access memory (RAM), hard-disk, and magnetic tape
...

Each level in the hierarchy is typically slower than the level above by several orders-of-magnitude
...
Essentially all
modern operating systems manage the transfer of data between RAM and hard-disk, so that the
hard-disk appears as additional, albeit slow, RAM known as virtual memory
...

If the information is already in registers, it can be executed immediately
...
Similarly “lines”
of RAM are moved up into cache, and “pages” of hard-disk memory are moved up to RAM
...
For example, if all cache lines are full and a
new line is required from RAM, the least recently used cache line is returned to RAM
...
Thus, if the next item required was a
neighbour, that item is already in cache and is available for immediate execution
...
An algorithm with
a large instruction count but good locality may perform much faster than another algorithm with
smaller instruction count
...

There are various factors that aﬀect locality of reference
...
There is usually
a tradeoﬀ between size and speed, whereby to use less memory the program requires the execution
of more instructions and vice-versa
...
Another factor is data-structures
...
A third
140

factor is the way an algorithm utilises a data-structure
...

The use of dynamic memory—allocation from the heap—can signiﬁcantly aﬀect program execution speed
...
(Functions like printf() manage this automatically by buﬀering characters and only sending them to the OS when the buﬀer is full or
when explicitly ﬂushed
...
Also, over successive
allocations and deallocations, the heap tends to become fragmented and develops bad locality-ofreference
...

One approach to alleviating heap allocation overhead is to use arenas
...
Clients use the arena to perform local allocations, and obtain portions of the arena cache
...

16
...
However, there exist a vast number of more sophisticated data structures
for more specialised problems
...

The literature on advanced algorithms is also vast
...
There are also special-case
sorting algorithms that have linear-time complexity for problems with appropriate structure
...
For further reading,
the following texts are recommended [Sed98, CLRS01, Knu98a, PTVF92]
...

Good programming practice tends to come with experience - abide by basic rules - apply them
consistently - make code clean, modular, and easy to read and maintain
Small scale organisation: - clean consistent formatting, naming conventions - comments - grouping of related operations, spacing (paragraphs) - conciseness: never write a long program when a
short one will do
...
- functions: names
and interfaces
Large scale organisation: - modularity - private, public interfaces

A
...
3 - comments 1
...
5, 2
...
5, 2
...
2 - structs 11
...
8 - switch
3
...
5 - magic numbers 4
...
2 - pointer syntax 7
...
5 - argv, argc 13
...
2
...
- design for the general case
...

This often results in a more elegant, often simpler solution, and promotes reuse
...
Layered design
...
- use adapter functions
to use components with non matching interfaces
...
2

Common Errors

Errors: - NUL is not NULL - dangling else - scanf needs pointer: & - assert is a macro, so don’t put
necessary code in it - break from if statement - use of error or abort - numerical - divide-by-zero overﬂow

142

Appendix B

The Compilation Process
The following is a brief explanation of how the source code of a program is converted to an executable
binary image
...

There are three key programs that convert the code from text to executable: the compiler, the
assembler and the linker
...
This consists
of a preprocessor, a lexical analyser, a parser, and (optionally) an optimiser
...
It includes
information from header-ﬁles, replaces symbolic constants, and expands macros
...

• The parser takes the string of tokens, and orders them logically into cohesive groups called
expressions
...

• The optimiser is an optional compilation stage that reorders expressions (and maybe substitutes equivalent expressions) to produce faster and/or smaller code
...
Further optimisation may take place after the
code-generation phase below
...

• The compiler back-end converts the expression trees to assembler code
...

• The assembler translates the assembler code to object code
...
The result is a binary
“executable image” of the program
...

143

Bibliography
[Ben00]

J
...
Programming Pearls
...

A unique, interesting and practical book about programming in the real world
...
It covers many
aspects of eﬃciency, particularly space eﬃciency, not found in other texts
...
H
...
E
...
L
...
Stein
...
The
MIT Press, 2nd edition, 2001
...
The
word “introduction” should not be misunderstood; it does not mean a simplistic
book, rather it indicates that this text is an entry-point to the vast and diverse
literature on the subject of computer algorithms
...
P
...
L
...
Steele
...
Prentice-Hall, 4th edition,
1995
...
It is an excellent reference
documenting and cross-referencing every detail of the C language
...
P
...
L
...
Steele
...
Prentice-Hall, 5th edition,
2002
...

[Knu98a] D
...
Knuth
...

Addison-Wesley, 3rd edition, 1998
...
It is a rigorous and authoritative source on
the properties of many classical algorithms and data-structures
...
E
...
The Art of Computer Programming, volume 2: Seminumerical Algorithms
...

[Knu98c]

D
...
Knuth
...

Addison-Wesley, 2nd edition, 1998
...
W
...
Pike
...
Addison-Wesley, 1999
...
It
covers topics like coding-style, program design, debugging, testing, eﬃciency,
and portability
...
W
...
M
...
The C Programming Language
...

One of the best C textbooks available; authoritative and complete
...
For the novice programmer, this text may
move too quickly and its examples may be too large and complex
...

[PTVF92] W
...
Press, S
...
Teukolsky, W
...
Vetterling, and B
...
Flannery
...
Cambridge University Press, 2nd edition, 1992
...

[Rit93]

D
...
Ritchie
...
In ACM History of Programming
Languages, 1993
...
bell-labs
...
pdf
...
A valuable read for those interested in better understanding the philosophy of the C
language
...
Sedgewick
...
Addison-Wesley, 3rd edition, 1998
...
It covers the fundamental datastructures and gives a comprehensive presentation on algorithms for sorting
and searching
...
C Programming FAQs: Frequently Asked Questions
...

http://www
...
com/~scs/C-faq/top
...

A comprehensive, authoratitive and useful FAQ on C language
...
The FAQ was published
as a book in 1995 and is also maintained online
...
h, 3
macro, 3
magic numbers, 11
main(), 3
memory, 2
stack, 2, 33
nested
comments, 3

eﬃciency, 27, 30, 42, 55, 76, 88, 117, 125, 126
enumeration, 12
escape character, 3, 11
expressions, 11

operators

146

arithmetic, 13
assignment, 15
bitwise, 15, 100
logical, 14
relational, 14
pass-by-reference, 26
pass-by-value, 25
portable, 2
postincrement, 14
precedence, 15
preincrement, 14
proﬁler, 126
scope, 18, 33
external, 8
statement, 3
string, 11, 60
truncation, 13
type, 3, 8
opaque, 109
variable, 3
whitespace, 3

147

Buy These Notes Preview

Notesale: Turn your study into money

Already a Member? >

Search for notes by fellow students, in your own course and all over the country.

My Basket

Document Preview