Search for notes by fellow students, in your own course and all over the country.

Browse our notes for titles which look like what you need, you can preview any of the notes via a sample of the contents. After you're happy these are the notes you're after simply pop them into your shopping cart.

My Basket

You have nothing in your shopping cart yet.

Title: Notes on Mathematics - 1021
Description: Lecture is on I Linear Algebra II Ordinary Differential Equation III Laplace Transform IV Numerical Applications

Document Preview

Extracts from the notes are below, to see the PDF you'll receive please use the links above


Notes on Mathematics - 1021
Peeyush Chandra,

1

Supported by a grant from MHRD

A
...
Lal,

V
...
Santhanam

2

Contents
I

Linear Algebra

1 Matrices
1
...

1
...
1 Special Matrices
...
2 Operations on Matrices
...
2
...
3 Some More Special Matrices
...
3
...

1
...
1 Block Matrices
...
4 Matrices over Complex Numbers
2

7

...


...


...


...



...


...


...


...



...


...


...


...



...


...


...


...



...


...


...


...



...


...


...


...



...


...


...


...


Linear System of Equations
2
...

2
...

2
...
1 A Solution Method
...
3 Row Operations and Equivalent Systems
...
3
...

2
...

2
...
1 Gauss-Jordan Elimination
...
4
...

2
...

2
...

2
...
1 Example
...
6
...

2
...
3 Exercises
...
7 Invertible Matrices
...
7
...

2
...
2 Equivalent conditions for Invertibility
2
...
3 Inverse and Gauss-Jordan Method
...
8 Determinant
...
8
...

2
...
2 Cramer’s Rule
...
9 Miscellaneous Exercises
...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...



...


...


...


...



...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


19
19
20
21
21
24
26
27
29
30
33
33
34
35
35
35
37
39
40
43
45
46

3 Finite Dimensional Vector Spaces
49
3
...
49
3
...
1 Definition
...
1
...
51

4

CONTENTS

...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...



...


...


...


53
54
57
58
60
66

4 Linear Transformations
4
...
2 Matrix of a linear transformation
4
...

4
...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...


69
69
72
75
80


...


...



...


...


87
87
92
100
103


...


...


107

...
113

...
121

3
...
3
3
...
1
...

3
...
4 Linear Combinations
Linear Independence
...

3
...
1 Important Results
...



...


...


...



...


...


...


5 Inner Product Spaces
5
...

5
...

5
...

5
...
1 Matrix of the Orthogonal Projection


...


...



...


...


6 Eigenvalues, Eigenvectors and Diagonalization
6
...

6
...

6
...

6
...


II


...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...



...


...


Ordinary Differential Equation

7 Differential Equations
7
...

7
...

7
...
1 Equations Reducible to Separable Form
7
...

7
...
1 Integrating Factors
...
4 Linear Equations
...
5 Miscellaneous Remarks
...
6 Initial Value Problems
...
6
...

7
...


129

...


...


...


...


...



...


...


...


...


...


8 Second Order and Higher Order Equations
8
...

8
...

8
...
1 Wronskian
...
2
...

8
...
4 Non Homogeneous Equations
...
5 Variation of Parameters
...
6 Higher Order Equations with Constant Coefficients


...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...



...


...


...


...



...


...


...


...


...


131

...
134

...
136

...
141

...
145

...
150


...


...


...


...


153

...
156

...
159

...
162

...
166

CONTENTS
8
...
170

9 Solutions Based on Power Series
9
...

9
...
1 Properties of Power Series
...
2 Solutions in terms of Power Series
...
3 Statement of Frobenius Theorem for Regular (Ordinary)
9
...

9
...
1 Introduction
...
4
...


III


...


...


...


...


...


...


...



...


...


...


...


...


...


...



...


...


...


...


...


...


...



...


...


...


...


...


...


...



...


...


...


...


...


...


...



...


...


...


...


...


...


...



...


...


...


...


...


...


...


Laplace Transform

189

10 Laplace Transform
10
...

10
...

10
...
1 Examples
...
3 Properties of Laplace Transform
...
3
...
3
...

10
...

10
...
1 Limiting Theorems
...
5 Application to Differential Equations
...
6 Transform of the Unit-Impulse Function
...


...

Point

...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...
1 Introduction
...
2 Difference Operator
...
2
...

11
...
2 Backward Difference Operator
11
...
3 Central Difference Operator
...
2
...

11
...
5 Averaging Operator
...
3 Relations between Difference operators
11
...

12 Lagrange’s Interpolation Formula
12
...

12
...

12
...
4 Gauss’s and Stirling’s Formulas
...
175

...
179

...
181

...
182


...


...



...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...



...


...



...


...


...


...


...


...


...


...


...


...


...


209

...
209

...
211

...
214

...
214

...


...


...
221

...
224

...
1 Introduction
...
2 Numerical Differentiation
...
3 Numerical Integration
...
3
...
233
13
...
2 Trapezoidal Rule
...
3
...
235

14 Appendix
14
...

14
...

14
...

14
...

14
...
6 Condition for Exactness
...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...


...
239

...
246

...
251

...
1

Definition of a Matrix

Definition 1
...
1 (Matrix) A rectangular array of numbers is called a matrix
...

The horizontal arrays of a matrix are called its rows and the vertical arrays are called its columns
...

A matrix A of order m × n can be represented in the following form:


a11

 a21
A=
...


...


...

am2

···
···

...

···


a1n

a2n 

...


...

In a more concise manner, we also denote the matrix A by [aij ] by suppressing its order
...
1
...


...

am1

a12
a22

...


...


...
 to represent a matrix
...


...
Then a11 = 1, a12 = 3, a13 = 7, a21 = 4, a22 = 5, and a23 = 6
...

Let A =

Whenever a vector is used, it should be understood from the context whether it is
a row vector or a column vector
...
1
...
, m and j = 1, 2,
...

In other words, two matrices are said to be equal if they have the same order and their corresponding
entries are equal
...
MATRICES

Example 1
...
4 The linear system of equations 2x + 3y = 5 and 3x + 2y = 5 can be identified with the
2 3 : 5
matrix

...
1
...
1
...
A matrix in which each entry is zero is called a zero-matrix, denoted by 0
...

0

2
...
Thus,
its order is m × m (for some m) and is represented by m only
...
In a square matrix, A = [aij ], of order n, the entries a11 , a22 ,
...

4
...
In other words, the
4 0
non-zero entries appear only on the principal diagonal
...

A diagonal matrix D of order n with the diagonal entries d1 , d2 ,
...
, dn )
...
, n then the diagonal matrix D is called a scalar matrix
...



1 0 0
1 0


, and I3 = 0 1 0
...
A square matrix A = [aij ] with aij =

The subscript n is suppressed in case the order is clear from the context or if no confusion arises
...
A square matrix A = [aij ] is said to be an upper triangular matrix if aij = 0 for i > j
...

A square matrix A is said to be triangular if it is an upper or a lower triangular matrix
...
An upper triangular matrix will be represented
0 0 −2


a11 a12 · · · a1n


 0 a22 · · · a2n 

...


...


...


...


...


...
2

Operations on Matrices

Definition 1
...
1 (Transpose of a Matrix) The transpose of an m × n matrix A = [aij ] is defined as the
n × m matrix B = [bij ], with bij = aji for 1 ≤ i ≤ m and 1 ≤ j ≤ n
...


1
...
OPERATIONS ON MATRICES

11

That is, by the transpose of an m × n matrix A, we mean a matrix of order n × m having the rows
of A as its columns and the columns of A as its rows
...

0 1 2
5 2
Thus, the transpose of a row vector is a column vector and vice-versa
...
2
...

Proof
...
Then, the definition of transpose gives
cij = bji = aij for all i, j
and the result follows
...
2
...
Then the
sum A + B is defined to be the matrix C = [cij ] with cij = aij + bij
...

Definition 1
...
4 (Multiplying a Scalar to a Matrix) Let A = [aij ] be an m × n matrix
...

For example, if A =

1 4
0 1

5 20 25
5

...
2
...
Then
1
...
(A + B) + C = A + (B + C)

(commutativity)
...


3
...

4
...

Proof
...

Let A = [aij ] and B = [bij ]
...

The reader is required to prove the other parts as all the results follow from the properties of real
numbers
...
2
...
Suppose A + B = A
...


2
...
Then show that B = (−1)A = [−aij ]
...
2
...

1
...
This matrix B is called the additive inverse of A, and
is denoted by −A = (−1)A
...
Also, for the matrix 0m×n , A + 0 = 0 + A = A
...


12

CHAPTER 1
...
2
...
2
...
The product AB is a matrix C = [cij ] of order m × r, with
n

cij =
k=1

aik bkj = ai1 b1j + ai2 b2j + · · · + ain bnj
...


1 2 1
1 2 3


For example, if A =
and B = 0 0 3 then
2 4 1
1 0 4
AB =

1+0+3 2+0+0
2+0+1 4+0+0

1 + 6 + 12
4
=
2 + 12 + 4
3

2 19

...
However, for square
matrices A and B of the same order, both the product AB and BA are defined
...
2
...

Remark 1
...
10
1
...
Also for any d ∈ R,
the matrix dIn commutes with every square matrix of order n
...

2
...
For example, consider the following two
1 0
1 1

...

1

Theorem 1
...
11 Suppose that the matrices A, B and C are so chosen that the matrix multiplications are
defined
...
Then (AB)C = A(BC)
...

2
...

3
...
That is, multiplication distributes over addition
...
If A is an n × n matrix then AIn = In A = A
...
For any square matrix A of order n and D = diag(d1 , d2 ,
...

A similar statement holds for the columns of A when A is multiplied on the right by D
...
Part 1
...
Then
p

(BC)kj =

n

bkℓ cℓj and (AB)iℓ =
ℓ=1

aik bkℓ
...
3
...


bkℓ cℓj
ℓ=1
p

n

k=1 ℓ=1
p
n

=

p

n

aik BC

(AB)C

AB

c
iℓ ℓj

ℓ=1
ij


...
, n, we have
n

(DA)ij =

dik akj = di aij
k=1

as dik = 0 whenever i = k
...

The reader is required to prove the other parts
...
2
...
Let A and B be two matrices
...
Also, if the matrix product AB is defined then prove that (AB)t = B t At
...
Let A = [a1 , a2 ,
...
Compute the matrix products AB and BA
...


...
Let n be a positive integer
...

1

4
...

(a) Suppose that the matrix product AB is defined
...

(b) Suppose that the matrix products AB and BA are defined
...

(c) Suppose that the matrices A and B are square matrices of order n
...


1
...
3
...
A matrix A over R is called symmetric if At = A and skew-symmetric if At = −A
...
A matrix A is said to be orthogonal if AAt = At A = I
...
3
...
Let A = 2 4 −1 and B = −1 0
3 −1 4
−2 3
B is a skew-symmetric matrix
...
Then A is a symmetric matrix and
0

14

CHAPTER 1
...
Let A =



1

3
√
1
 2
1

6

1

3
1
− √2
1

6

1

3




0 
...


2
− √6

3
...
Then An = 0 and Aℓ = 0 for 1 ≤ ℓ ≤

n − 1
...
The least positive integer k for which Ak = 0 is called the order of nilpotency
...
Then A2 = A
...


4
...
3
...
Show that for any square matrix A, S = 1 (A + At ) is symmetric, T = 1 (A − At ) is
2
2
skew-symmetric, and A = S + T
...
Show that the product of two lower triangular matrices is a lower triangular matrix
...

3
...
Show that AB is symmetric if and only if AB = BA
...
Show that the diagonal entries of a skew-symmetric matrix are zero
...
Let A, B be skew-symmetric matrices with AB = BA
...
Let A be a symmetric matrix of order n with A2 = 0
...
Let A be a nilpotent matrix
...


1
...
1

Submatrix of a Matrix

Definition 1
...
4 A matrix obtained by deleting some of the rows and/or columns of a matrix is said to be
a submatrix of the given matrix
...

2

4
1 4
and
are not submatrices of A
...
)
0 2
0

Miscellaneous Exercises
Exercise 1
...
5

1
...
2
...
2
...


cos θ
y1
1 0
x1
and B =
, y=
, A=
sin θ
x2
y2
0 −1
and y = Bx
...
Let x =

3
...

y2 = b21 z1 + b22 z2
x2 = a21 y1 + a22 y2

− sin θ

...
3
...

(b) If xt = [x1 , x2 ], yt = [y1 , y2 ] and zt = [z1 , z2 ] then find matrices A, B and C such that
x = Ay, y = Bz and x = Cz
...
For a square matrix A of order n, we define trace of A, denoted by tr (A) as
tr (A) = a11 + a22 + · · · ann
...

(b) tr (AB) = tr (BA)
...
Show that, there do not exist matrices A and B such that AB − BA = cIn for any c = 0
...
Let A and B be two m × n matrices and let x be an n × 1 column vector
...

(b) Prove that if Ax = Bx for all x, then A = B
...
Let A be an n × n matrix such that AB = BA for all n × n matrices B
...



1 2


8
...
Show that there exist infinitely many matrices B such that BA = I2
...


1
...
1

Block Matrices

Let A be an n × m matrix and B be an m × p matrix
...
Then, we can decompose the
H
; where P has order n × r and H has order r × p
...
Similarly, H and K are submatrices of B and H consists of the first r
rows of B and K consists of the last m − r rows of B
...

Theorem 1
...
6 Let A = [aij ] = [P Q] and B = [bij ] =

H
be defined as above
...

Proof
...
The matrix products P H
and QK are valid as the order of the matrices P, H, Q and K are respectively, n × r, r × p, n × (m − r)
and (m − r) × p
...
Then, for 1 ≤ i ≤ n and 1 ≤ j ≤ p,
we have
m

(AB)ij

=

r

aik bkj =
k=1
r

=

m

aik bkj +
k=1
m

Pik Hkj +
k=1

aik bkj
k=r+1

Qik Kkj
k=r+1

= (P H)ij + (QK)ij = (P H + QK)ij
...
MATRICES

Theorem 1
...
6 is very useful due to the following reasons:
1
...

2
...
In this case, it may be easy to handle the matrix product using the block form
...
Or when we want to prove results using induction, then we may assume the result for r × r
submatrices and then look for (r + 1) × (r + 1) submatrices, etc
...

d
0
2a + 5c 2b + 5d

0 −1

If A =  3
1
−2 5

0 −1

A= 3
1
−2 5

0 −1

A= 3
1
−2 5


2

4  , then A can be decomposed
−3


2
0 −1 2


4  , or A =  3
1
4
−3
−2 5 −3

2

4  and so on
...
3
...
Compute the matrix product AB using the block matrix multiplication for the matrices



as follows:



 , or

m1 m2
s1 s2
Suppose A = n1
and B = r1
P Q
E F
...

Even if A + B is defined, the orders of P and E may not be same and hence, we may not be able
P +E Q+F
to add A and B in the block form
...

R+G S+H
Similarly, if the product AB is defined, the product P E need not be defined
...
And
P E + QG P F + QH
in this case, we have AB =

...




1
 0

A=
 0
0

0
1

0
1

1
1

1
0



1
1
 1
1 


 and B = 
 1
0 
1
−1

2
1

2
2

1
1

1
−1


1
1 


...
If P, Q, R and S are symmetric, what can you say about A? Are P, Q, R and S
R S
symmetric, when A is symmetric?

2
...
4
...
Let A = [aij ] and B = [bij ] be two matrices
...
, an are the rows of A and
b1 , b2 ,
...
If the product AB is defined, then show that


a1 B


 a2 B 

...
, Abp ] = 


...

an B

[That is, left multiplication by A, is same as multiplying each column of B by A
...
]

1
...
All the definitions still hold
...

Definition 1
...
1 (Conjugate Transpose of a Matrix)
1
...
If A = [aij ]
then the Conjugate of A, denoted by A, is the matrix B = [bij ] with bij = aij
...
Then
1
i−2
A=

1
0

4 − 3i
−i

...
Let A be an m × n matrix over C
...

For example, Let A =

1
0

4 + 3i
i

...

−i
−i − 2

3
...


4
...

5
...

6
...

Remark 1
...
2 If A = [aij ] with aij ∈ R, then A∗ = At
...
4
...
Give examples of Hermitian, skew-Hermitian and unitary matrices that have entries
with non-zero imaginary parts
...
Restate the results on transpose in terms of conjugate transpose
...
Show that for any square matrix A, S =
A = S + T
...
Show that if A is a complex triangular matrix and AA∗ = A∗ A then A is a diagonal matrix
...
MATRICES

Chapter 2

Linear System of Equations
2
...

1
...
Consider the system ax = b
...


(b) If a = 0 and
i
...

ii
...

2
...

Consider the equation ax + by = c
...
Thus for the system
a1 x + b1 y = c1 and a2 x + b2 y = c2 ,
the set of solutions is given by the points of intersection of the two lines
...
Each case is illustrated by an example
...
The unique solution is (x, y)t = (1, 0)t
...

(b) Infinite Number of Solutions
x + 2y = 1 and 2x + 4y = 2
...
In other words, both the equations represent the same line
...

(c) No Solution
x + 2y = 1 and 2x + 4y = 3
...

Observe that in this case, a1 b2 − a2 b1 = 0 but a1 c2 − a2 c1 = 0
...
As a last example, consider 3 equations in 3 unknowns
...
As in the
case of 2 equations in 2 unknowns, we have to look at the points of intersection of the given three
planes
...
The three cases are illustrated by examples
...


LINEAR SYSTEM OF EQUATIONS

(a) Unique Solution
Consider the system x+ y + z = 3, x+ 4y + 2z = 7 and 4x+ 10y − z = 13
...
e
...

(b) Infinite Number of Solutions
Consider the system x + y + z = 3, x + 2y + 2z = 5 and 3x + 4y + 4z = 11
...

(c) No Solution
The system x + y + z = 3, x + 2y + 2z = 5 and 3x + 4y + 4z = 13 has no solution
...

The readers are advised to supply the proof
...
2

Definition and a Solution Method

Definition 2
...
1 (Linear System) A linear system of m equations in n unknowns x1 , x2 ,
...


...


=

b2

...


...
2
...
Linear System (2
...
1) is called homogeneous if
b1 = 0 = b2 = · · · = bm and non-homogeneous otherwise
...


...
 , x = 
...


...


...


...


...


...


...

bm
xn
am1 am2 · · · amn
The matrix A is called the coefficient matrix and the block matrix [A b] , is the augmented
matrix of the linear system (2
...
1)
...
2
...
That
is, for 1 ≤ i ≤ m and 1 ≤ j ≤ n, the entry aij of the coefficient matrix A corresponds to the ith equation
and j th variable xj
...

Definition 2
...
3 (Solution of a Linear System) A solution of the linear system Ax = b is a column vector
y with entries y1 , y2 ,
...
2
...

That is, if yt = [y1 , y2 ,
...

Note: The zero n-tuple x = 0 is always a solution of the system Ax = 0, and is called the trivial
solution
...


2
...
ROW OPERATIONS AND EQUIVALENT SYSTEMS

2
...
1

21

A Solution Method

Example 2
...
4 Let us solve the linear system x + 7y + 3z = 11, x + y + z = 3, and 4x + 10y − z = 13
...
The above linear system and the linear system
x+y+z

=3

x + 7y + 3z

= 11

4x + 10y − z

Interchange the first two equations
...
2
...
(why?)
2
...
)

6y − 5z

=1

(obtained by subtracting 4 times the first equation
from the third equation
...
2
...
2
...
(why?)
3
...
2
...


(2
...
4)

which has the same set of solution as the system (2
...
3)
...
The system (2
...
4) and system
x+y+z

=3

3y + z

=4

divide the second equation by 2

z

=1

divide the second equation by 2

(2
...
5)

has the same set of solution
...
Or in terms of a vector, the set of solution
3
t
is { (x, y, z) : (x, y, z) = (1, 1, 1)}
...
Now, z = 1 implies y =

2
...
3
...

1
...
2
...
)

22

CHAPTER 2
...
multiply a non-zero constant throughout an equation, say “multiply the k th equation by c = 0”;
(compare the system (2
...
5) and the system (2
...
4)
...
replace an equation by itself plus a constant multiple of another equation, say “replace the k th equation
by k th equation plus c times the j th equation”
...
2
...
2
...
2
...
2
...
)
Observations:
1
...
2
...

2
...
This means the operation at Step 1, has an inverse operation
...

It will be a useful exercise for the reader to identify the inverse operations at each step in
Example 2
...
4
...
2
...
That is, after applying a finite number of
elementary operations, a simpler linear system is obtained which can be easily solved
...
“interchange the ith and j th equations”,
2
...
“replace the k th equation by k th equation minus c times the j th equation”
...
2
...

Definition 2
...
2 (Equivalent Linear Systems) Two linear systems are said to be equivalent if one can be
obtained from the other by a finite number of elementary operations
...
2
...

Lemma 2
...
3 Let Cx = d be the linear system obtained from the linear system Ax = b by a single
elementary operation
...

Proof
...
” The reader is advised to prove the result for other elementary operations
...
Let (α1 , α2 ,
...
Then substituting for αi ’s in place of xi ’s in the k th and j th
equations, we get
ak1 α1 + ak2 α2 + · · · akn αn = bk , and aj1 α1 + aj2 α2 + · · · ajn αn = bj
...


(2
...
1)

(ak1 + caj1 )x1 + (ak2 + caj2 )x2 + · · · + (akn + cajn )xn = bk + cbj
...
3
...
3
...
3
...
, αn ) is also a solution for the k th Equation (2
...
2)
...
, βn ) is a solution of the linear system Cx = d then
it is also a solution of the linear system Ax = b
...

Lemma 2
...
3 is now used as an induction step to prove the main result of this section (Theorem
2
...
4)
...
3
...

Proof
...
We prove
the theorem by induction on n
...
3
...
If n > 1, assume that the theorem is true for n = m
...
Apply the Lemma 2
...
3 again at the “last step” (that is, at the (m + 1)th step
from the mth step) to get the required result using induction
...
3
...
For solving a linear system of equations, we applied elementary operations to equations
...
The variables x1 , x2 ,
...
Therefore, in place of looking at the system
of equations as a whole, we just need to work with the coefficients
...


Definition 2
...
5 (Elementary Row Operations) The elementary row operations are defined as:
1
...
multiply a non-zero constant throughout a row, say “multiply the k th row by c = 0”, denoted Rk (c);
3
...


Exercise 2
...
6 Find the inverse row operations corresponding to the elementary row operations that have
been defined just above
...
3
...


Example 2
...
8 The three matrices given below are row equivalent
...

1 1 1 3
1 1 1 3
1 1 1 3


0 1 1 2


Whereas the matrix 2 0 3 5 is not row equivalent to the matrix
1 1 1 3


1

0
1

0 1
2 3
1 1


2

5
...


2
...
1

LINEAR SYSTEM OF EQUATIONS

Gauss Elimination Method

Definition 2
...
9 (Forward/Gauss Elimination Method) Gaussian elimination is a method of solving a
linear system Ax = b (consisting of m equations in n unknowns) by bringing the augmented matrix


a11 a12 · · · a1n b1


 a21 a22 · · · a2n b2 

...


...


...


...


...


...


...

am1 am2 · · · amn bm
to an upper triangular form








c11
0

...


...


...


...

0

···

c1n
c2n

...


...


...

dm






...

The following examples illustrate the Gauss elimination procedure
...
3
...

y+z

=

2

2x + 3z

=

5

x+y+z =

0 1

Solution: In this case, the augmented matrix is 2 0
1 1
lowing steps
...
The method proceeds along the fol3

1
...

2x + 3z
y+z
x+y+z

=5
=2
=3

2
...

x + 3z
2
y+z
x+y+z

= 5
2
=2
=3


2

0
1


0
1
1

1 0

0 1
1 1


3 5

1 2
...
Add −1 times the 1st equation to the 3rd equation (or R31 (−1))
...
Add −1 times the 2nd equation to the 3rd equation (or R32 (−1))
...

3
5
2




2
...

3
−2

2
...
ROW OPERATIONS AND EQUIVALENT SYSTEMS
5
...

3

x + 3z
2
y+z
z



5
= 2
=2
=1

3
2

1 0

0 1
0 0

1
1

5
2




2
...
Finally the first equation gives
x = 1
...

Example 2
...
11 Solve the linear system by Gauss elimination method
...
Add −1 times the first equation to the second equation
...
Add −3 times the first equation to the third equation
...
Add −1 times the second equation to the third equation
x+y+z
y+z

=3
=2



1 1

0 1
3 4


1 3

1 2 
...

1 2


1

0
0

1
1
0


1 3

1 2
...
In other
words, the system has infinite number of solutions
...
3
...

x+y+z

= 3

x + 2y + 2z

= 5

3x + 4y + 4z = 12


1 1 1 3


Solution: In this case, the augmented matrix is 1 2 2 5  and the method proceeds as follows:
3 4 4 12
1
...

x+y+z
y+z
3x + 4y + 4z

=3
=2
= 12



1 1

0 1
3 4


1 3

1 2 
...


LINEAR SYSTEM OF EQUATIONS

2
...

x+y+z
y+z
y+z


1

0
0

1
1
1


1 3

1 2
...

0 1

3
...

This can never hold for any value of x, y, z
...

Remark 2
...
13 Note that to solve a linear system, Ax = b, one needs to apply only the elementary
row operations to the augmented matrix [A b]
...
4

Row Reduced Echelon Form of a Matrix

Definition 2
...
1 (Row Reduced Form of a Matrix) A matrix C is said to be in the row reduced form if
1
...
the column containing this 1 has all its other entries zero
...

Example 2
...
2
1
...
Recall that the (i, j)th entry of the identity matrix is

1 if i = j
Iij = δij =

...

δij is usually referred to as the Kronecker delta function
...
The matrices 
0
0


1
0

3
...

0
0


5
2

 is not in the row reduced form
...
4
...
The columns containing the leading terms are called the leading
columns
...
4
...
4
...
Let [C d] be the row-reduced matrix obtained by applying the Gauss elimination method to the
augmented matrix [A b]
...
The variables which are not basic are called free variables
...

Observation: In Example 2
...
11, the solution set was given by
(x, y, z)t = (1, 2 − z, z)t = (1, 2, 0)t + z(0, −1, 1)t, with z arbitrary
...

Remark 2
...
5 It is very important to observe that if there are r non-zero rows in the row-reduced form
of the matrix then there will be r leading terms
...
Therefore,
if there are r leading terms and n variables, then there will be r basic variables and
n − r free variables
...
4
...
3
...
But this
time, we start with the 3rd row
...
Add −1 times the third equation to the second equation (or

3
5
x + 2z = 2
1

y
=2
0
z
=1
0
II
...


3
0 2 5
2

1 0 1
...


0 0 1

1 0 1
...
From the above matrix, we directly have the set of solution as (x, y, z)t = (1, 1, 1)t
...
4
...
C is already in the row reduced form;
2
...
the leading terms appear from left to right in successive rows
...
Then i1 < i2 < · · · < ik
...
4
...

0 1



0 1 0 2
1 1



corresponding matrices in the row reduced echelon form are respectively, 0 0 1 1 and 0 0
0 0 0 0
0 0
1
0
0



0 2
0


0 0 and B = 1
1 1
0

0 0
1 0
0 0

Then the
0 0
0 1
0 0


0

0
...


LINEAR SYSTEM OF EQUATIONS

Definition 2
...
8 (Row Reduced Echelon Matrix) A matrix which is in the row reduced echelon form is
also called a row reduced echelon matrix
...
4
...
3
...
3
...

The elimination process applied to obtain the row reduced echelon form of the augmented matrix is called
the Gauss-Jordan elimination
...

Method to get the row-reduced echelon form of a given matrix A
Let A be an m × n matrix
...

Step 1: Consider the first column of the matrix A
...

Else, find a row, say ith row, which contains a non-zero entry in the first column
...
Suppose the non-zero entry in the (1, 1)-position is α = 0
...
Now, use the 1 to make all the
entries below this 1 equal to 0
...

Else, forget the first row and first column
...

Step 3: Keep repeating this process till we reach a stage where all the entries below a particular row,
say r, are zero
...
Then C has the following
form:
1
...
These 1’s are the leading terms of C
and the columns containing these leading terms are the leading columns
...
the entries of C below the leading term are all zero
...

Step 5: Next, use the leading term in the (r − 1)th row to make all entries in the (r − 1)th leading
column equal to zero and continue till we come to the first leading term or column
...

Remark 2
...
10 Note that the row reduction involves only row operations and proceeds from left to
right
...

The proof of the following theorem is beyond the scope of this book and is omitted
...
4
...

Exercise 2
...
12

1
...


2
...
ROW REDUCED ECHELON FORM OF A MATRIX

29

(a) x + y + z + w = 0, x − y + z + w = 0 and −x + y + 3z + 3w = 0
...

(c) x + y + z = 3, x + y − z = 1 and x + y + 7z = 6
...


(e) x + y + z = 3, x + y − z = 1, x + y + 4z = 6 and x + y − 4z = −1
...
Find the row-reduced echelon form of

−1 1
3
1
3
5

1
...
4
...



5
10
2
7


2
...
4
...

Remark 2
...
14 There are three types of elementary matrices
...
Eij , which is obtained by the application of the elementary

1


th entry of E is (E )
matrix, In
...

otherwise

2
...
The (i, j)

...
Eij (c), which is obtained by the application of the elementary row operation Rij (c) to the identity

1 if k = ℓ


th entry of E (c) is (E )
matrix, In
...

ij
ij (k,ℓ)



0 otherwise
In particular,

E23

Example 2
...
15


1

= 0
0



0 0
c 0


0 1 , E1 (c) = 0 1
1 0
0 0



1 2 3 0


1
...

3 4 5 6



1 2 3 0
1


 −→ 
2 0 3 4 R23 3
3 4 5 6
2



0
1 0


0 , and E23 (c) = 0 1
1
0 0


0

c
...

1 0

That is, interchanging the two rows of the matrix A is same as multiplying on the left by the corresponding elementary matrix
...


30

CHAPTER 2
...
Consider the augmented matrix [A b] = 2 0
1 1
same as the matrix product

LINEAR SYSTEM OF EQUATIONS


1 2

3 5
...


0

2
1

1
0
1

1
3
1


2

5
3

−→

R13

−−→
−−
R32 (2)

−−−
−−→
R23 (−1)


1

2
0

1

0
0

1

0
0

1
0
1

1
3
1

1
1
0

1
1
3

0
1
0

0
0
1






3
1
1
1
3
1
1
1
3
−−− 
−−→


 −→ 

5 R21 (−2) 0 −2 1 −1 R23 0
1
1
2
2
0
1
1
2
0 −2 1 −1





3
1 1 1 3
1 0 0 1
−− − 
− −→
−−− 
−−→



2 R3 (1/3) 0 1 1 2 R12 (−1) 0 1 1 2
3
0 0 1 1
0 0 1 1

1

1
1

Now, consider an m × n matrix A and an elementary matrix E of order n
...
Therefore, for each
elementary matrix, there is a corresponding column transformation
...
4
...


3

3 and consider the elementary column operation f which interchanges
5




1 0 0
1 3 2




the second and the third column of A
...

0 1 0
3 5 4

1 2

Example 2
...
17 Let A = 2 0
3 4

Exercise 2
...
18
1
...
That is, E is the matrix obtained from I by applying the elementary row operation e
...

2
...

Does the Gauss-Jordan method also corresponds to multiplying by elementary matrices on the left?
Give reasons
...
Let A and B be two m × n matrices
...
When is this P unique?

2
...
In the examples considered, we have encountered three possibilities, namely
1
...
existence of an infinite number of solutions, and

2
...
RANK OF A MATRIX

31

3
...

Based on the above possibilities, we have the following definition
...
5
...

The question arises, as to whether there are conditions under which the linear system Ax = b is
consistent
...
To proceed further, we need a few definitions
and remarks
...
Also, note that the number of non-zero rows in either the row reduced form
or the row reduced echelon form of a matrix are same
...
5
...

By the very definition, it is clear that row-equivalent matrices have the same row-rank
...


1 2

Example 2
...
3
1
...

0 −1 1
1 1 2




1 2 1
1 2
1
− − − − −→

 −− − − − − 

(b) 0 −1 −1 R2 (−1), R32 (1) 0 1 1
...

0 0 1
0 0 2




1 0 −1
1 0 0
−−−−−→

− − − − − − 

(d) 0 1 1  R23 (−1), R13 (1) 0 1 0
0 0 1
0 0 1


1

1
...


The last matrix in Step 1d is the row reduced form of A which has 3 non-zero rows
...

This result can also be easily deduced from the last matrix in Step 1b
...
Determine the row-rank of A = 2 3 1
...

1 1 0
0 −1 −1




1 2
1
1 2 1
− − − − −→

 −− − − − − 

(b) 0 −1 −1 R2 (−1), R32 (1) 0 1 1
...


LINEAR SYSTEM OF EQUATIONS

From the last matrix in Step 2b, we deduce row-rank(A) = 2
...
5
...
Then the row-reduced
echelon form of A agrees with the first n columns of [A b], and hence
row-rank(A) ≤ row-rank([A b])
...

Remark 2
...
5 Consider a matrix A
...
4
...
The first nonzero entry in each column is 1
...
A column containing only 0’s comes after all columns with at least one non-zero entry
...
The first non-zero entry (the leading term) in each non-zero column moves down in successive
columns
...
It will be
proved later that
row-rank(A) = column-rank(A)
...

Definition 2
...
6 The number of non-zero rows in the row reduced form of a matrix A is called the rank of
A, denoted rank (A)
...
5
...
Then there exist elementary matrices E1 , E2 ,
...
, Fℓ such that
Ir 0

...
Es A F1 F2
...
Let C be the row reduced echelon matrix obtained by applying elementary row operations to
the given matrix A
...
So by
Remark 2
...
5, C will have r leading columns, say i1 , i2 ,
...
Note that, for 1 ≤ s ≤ r, the ith column
s
will have 1 in the sth row and zero elsewhere
...
Let D be the matrix obtained from C by successively interchanging the sth and ith column of C for 1 ≤ s ≤ r
...
As the (1, 1) block of D is an identity matrix,
0 0
the block (1, 2) can be made the zero matrix by application of column operations to D
...


Exercise 2
...
8
1
...
4
...

2
...

3
...
Then prove that A is row-equivalent to In
...
6
...
6

33

Existence of Solution of Ax = b

We try to understand the properties of the set of solutions of a linear system through an example, using
the Gauss-Jordan method
...
This example is more or less a motivation
...
6
...

4


0
0

For this particular matrix [C d], we want to see the set of solutions
...

Observations:
1
...
This number is also equal to the number of non-zero rows
in [C d]
...
The first non-zero entry in the non-zero rows appear in columns 1, 2, 5 and 6
...
Thus, the respective variables x1 , x2 , x5 and x6 are the basic variables
...
The remaining variables, x3 , x4 and x7 are free variables
...
We assign arbitrary constants k1 , k2 and k3 to the free variables x3 , x4 and x7 , respectively
...


LINEAR SYSTEM OF EQUATIONS

where k1 , k2 and k3 are arbitrary
...

Let u0 =  
 
 
 
 
 
 
 
2
0
0
1
 
 
 
 
0
−1
4
0
0
1
0
0
Then it can easily be verified that Cu0 = d, and for 1 ≤ i ≤ 3, Cui = 0
...
The interested readers can
read the proof in Appendix 14
...


2
...
2

Main Theorem

Theorem 2
...
1 [Existence and Non-existence] Consider a linear system Ax = b, where A is a m × n matrix,
and x, b are vectors with orders n×1, and m×1, respectively
...

Then exactly one of the following statement holds:
1
...
, un−r are n × 1 vectors satisfying Au0 = b and Aui = 0 for 1 ≤ i ≤ n − r
...
if ra = r = n, the solution set of the linear system has a unique n × 1 vector x0 satisfying Ax0 = b
...
If r < ra , the linear system has no solution
...
6
...
Then by Theorem
2
...
1, we see that the linear system Ax = b is consistent if and only if
rank (A) = rank([A b])
...
6
...

Corollary 2
...
3 Let A be an m × n matrix
...

Proof
...
That is, Ax0 = 0 and x0 = 0
...
On the contrary, assume that rank(A) = n
...

Also A0 = 0 implies that 0 is a solution of the linear system Ax = 0
...
6
...
A contradiction to the fact
that x0 was a given non-trivial solution
...
Then
ra = rank [A 0] = rank(A) < n
...
6
...
From this infinite set, we can choose any vector x0 that is different from 0
...
That is, we have obtained a non-trivial solution x0
...
6
...
6
...


2
...
INVERTIBLE MATRICES

35

Proposition 2
...
4 Consider the linear system Ax = b
...

1
...

2
...

Remark 2
...
5
1
...
Then k1 x1 + k2 x2 is also a solution
of Ax = 0 for any k1 , k2 ∈ R
...
If u, v are two solutions of Ax = b then u − v is a solution of the system Ax = 0
...
That is, any two solutions of Ax = b differ by a
solution of the associated homogeneous system Ax = 0
...


2
...
3

Exercises

Exercise 2
...
6
1
...


ii) a unique

(a) x + y + z = 3, x + 2y + cz = 4, 2x + 3y + 2cz = k
...

(c) x + y + 2z = 3, x + 2y + cz = 5, x + 2y + 4z = k
...

(e) x + 2y − z = 1, 2x + 3y + kz = 3, x + ky + 3z = 2
...

2
...

3
...
If the system A2 x = 0 has a non trivial solution then show that Ax = 0
also has a non trivial solution
...
7

Invertible Matrices

2
...
1

Inverse of a Matrix

Definition 2
...
1 (Inverse of a Matrix) Let A be a square matrix of order n
...
A square matrix B is said to be a left inverse of A if BA = In
...
A square matrix C is called a right inverse of A, if AC = In
...
A matrix A is said to be invertible (or is said to have an inverse) if there exists a matrix B such
that AB = BA = In
...
7
...
Suppose that there exist n × n matrices B and C such that
AB = In and CA = In , then B = C
...


LINEAR SYSTEM OF EQUATIONS

Proof
...


Remark 2
...
3
is unique
...
From the above lemma, we observe that if a matrix A is invertible, then the inverse

2
...
That is, AA−1 = A−1 A = I
...
7
...
Then
1
...

2
...

3
...

Proof
...

By definition AA−1 = A−1 A = I
...
This again
by definition, implies B −1 = A, or equivalently (A−1 )−1 = A
...

Verify that (AB)(B −1 A−1 ) = I = (B −1 A−1 )(AB)
...

Proof of Part 3
...
Taking transpose, we get
(AA−1 )t = (A−1 A)t = I t ⇐⇒ (A−1 )t At = At (A−1 )t = I
...

Exercise 2
...
5

1
...
Show that every elementary matrix is invertible
...
Let A1 , A2 ,
...
Prove that the product A1 A2 · · · Ar is also an invertible
matrix
...
If P and Q are invertible matrices and P AQ is defined then show that rank (P AQ) = rank (A)
...
Find matrices P and Q which are product of elementary matrices such that B = P AQ where A =
2 4 8
1 0 0
and B =

...
Let A and B be two matrices
...

7
...
Then show that there exists invertible matrices Bi , Ci such that
R1 R2
S1 0
A1 0
Ir 0
B1 A =
, AC1 =
, B2 AC2 =
, and B3 AC3 =

...


2
...
INVERTIBLE MATRICES

37

8
...
Then A can be written as A = BC, where both B and C have
rank r and B is a matrix of size m × r and C is a matrix of size r × n
...
Let A and B be two matrices such that AB is defined and rank (A) = rank (AB)
...
Similarly, if BA is defined and rank (A) = rank (BA), then A = Y BA
for some matrix Y
...
Define X = R
0
0

A1
0

0
0

and

0
Q−1
...
Let A = [aij ] be an invertible matrix and let B = [pi−j aij ] for some nonzero real number p
...

11
...

−B −1 AC −1

12
...
Partition A and B as follows:
A=

A11
A21

B11
A12
, B=
B21
A22

B12

...


2
...
2

Equivalent conditions for Invertibility

Definition 2
...
6 A square matrix A or order n is said to be of full rank if rank (A) = n
...
7
...

1
...

2
...

3
...

4
...

Proof
...
Then there exists an invertible matrix P (a product of elementary
B1 B2
C1
matrices) such that P A =
, where B1 is an r×r matrix
...
Then
P = P In = P (AA−1 ) = (P A)A−1 =

B1
0

B2
0

C1
B1 C1 + B2 C2
=

...
7
...
Hence, P cannot be invertible
...
Thus, A is of full rank
...


LINEAR SYSTEM OF EQUATIONS

Suppose A is of full rank
...

But A has as many columns as rows and therefore, the last row of the row reduced echelon form of A
will be (0, 0,
...
Hence, the row reduced echelon form of A is the identity matrix
...
, Ek such
that A = E1 E2 · · · Ek In
...

4 =⇒ 1
Suppose A = E1 E2 · · · Ek ; where the Ei ’s are elementary matrices
...

The ideas of Theorem 2
...
7 will be used in the next subsection to find the inverse of an invertible
matrix
...
We
repeat the proof for the sake of clarity
...
7
...

1
...
Then A−1 exists
...
Suppose there exists a matrix C such that CA = In
...

Proof
...
We will prove that the matrix A is of full rank
...

Let if possible, rank(A) = r < n
...
Then

...

=
0
B2

(2
...
2)

Thus the matrix P has n − r rows as zero rows
...
A contradiction to P being
a product of invertible matrices
...
That is, A is of full rank
...
7
...
That is, BA = In as well
...
Hence
AC = In = CA
...


Remark 2
...
9 This theorem implies the following: “if we want to show that a square matrix A of order
n is invertible, it is enough to show the existence of
1
...
or a matrix C such that CA = In
...
7
...

1
...

2
...

3
...


2
...
INVERTIBLE MATRICES

39

Proof
...
7
...
That is, for the linear system Ax = 0, the
number of unknowns is equal to the rank of the matrix A
...
6
...

2 =⇒ 1
Let if possible A be non-invertible
...
7
...
Thus
by Corollary 2
...
3, the linear system Ax = 0 has infinite number of solutions
...

1 =⇒ 3
Since A is invertible, for every b, the system Ax = b has a unique solution x = A−1 b
...
, 0,
1
, 0,
...


ith position
By assumption, this system has a solution xi for each i, 1 ≤ i ≤ n
...
, xn ]
...
Then

AB = A[x1 , x2
...
, Axn ] = [e1 , e2
...

Therefore, by Theorem 2
...
8, the matrix A is invertible
...
7
...


1
...
Let A be a 1 × 2 matrix and B be a 2 × 1 matrix having positive entries
...

3
...
Prove that the matrix I − BA is invertible if
and only if the matrix I − AB is invertible
...
7
...
7
...

Corollary 2
...
12 Let A be an invertible n× n matrix
...
Then the same sequence of elementary row-operations when applied to the
identity matrix yields A−1
...
Let A be a square matrix of order n
...
, Ek be a sequence of elementary row
operations such that E1 E2 · · · Ek A = In
...
This implies A−1 = E1 E2 · · · Ek
...
Apply the Gauss-Jordan method to the matrix [A In ]
...
If B = In , then A−1 = C or else
A is not invertible
...
7
...

1 1 2


2 1 1 1 0 0


Solution: Consider the matrix 1 2 1 0 1 0
...


1
...


3
...


5
...
0

0


1
7
...

4 
3
4


3/4 −1/4 −1/4


8
...

−1/4 −1/4 3/4

Exercise 2
...
14 Find the



1 2 3
1



(i) 1 3 2 , (ii) 2
2 4 7
2

2
...



3 3
2 −1 3



3 2 , (iii) −1 3 −2
...




1 2

Example 2
...
1 Consider a matrix A = 1 3
2 4
A(1, 2|1, 3) = [4]
...
Then A(1|2) =
7

1
2

2
, A(1|3) =
7

1 3
, and
2 4

Definition 2
...
2 (Determinant of a Square Matrix) Let A be a square matrix of order n
...


j=1

2
...
DETERMINANT

41

Definition 2
...
3 (Minor, Cofactor of a Matrix) The number det (A(i|j)) is called the (i, j)th minor of
A
...
The (i, j)th cofactor of A, denoted Cij , is the number (−1)i+j Aij
...
8
...
Let A =

For example, for A =

a11

2
...
Then, det(A) = |A| = a11 A11 − a12 A12 = a11 a22 − a12 a21
...



a13

a23 
...
8
...
8
...

2

1
...




1
1 a a2

5


 , iii) 1 b b2 
...
Show that the determinant of a triangular matrix is the product of its diagonal entries
...
8
...
It is called non-singular if
det(A) = 0
...
The interested reader is advised to go through Appendix
14
...

Theorem 2
...
7 Let A be an n × n matrix
...
if B is obtained from A by interchanging two rows, then det(B) = − det(A),
2
...
if all the elements of one row or column are 0 then det(A) = 0,
4
...
if A is a square matrix having two rows equal then det(A) = 0
...


LINEAR SYSTEM OF EQUATIONS

Remark 2
...
8
1
...
” It turns out that the
way we have defined determinant is usually called the expansion of the determinant along
the first row
...
Part 1 of Lemma 2
...
7 implies that “one can also calculate the determinant by expanding along
any row
...


det(A) =
j=1

Remark 2
...
9
1
...
Then consider the parallelogram, P QRS, formed by the vertices {P = (0, 0)t , Q = u, S = v, R = u + v}
...

u2 v2

Recall that the dot product, u • v = u1 v1 + u2 v2 , and u • u = (u2 + u2 ), is the length of the
1
2
vector u
...
With the above notation, if θ is the angle between the
vectors u and v, then
u•v
cos(θ) =

...


u•v
ℓ(u)ℓ(v)

2

(u1 v2 − u2 v1 )2

Hence, the claim holds
...

2
...
Recall that the
cross product of two vectors in R3 is,
u × v = (u2 v3 − u3 v2 , u3 v1 − u1 v3 , u1 v2 − u2 v1 )
...

w3

Let P be the parallelopiped formed with (0, 0, 0) as a vertex and the vectors u, v, w as adjacent
vertices
...
So, to compute the volume of the parallelopiped P, we
need to look at cos(θ), where θ is the angle between the vector w and the normal vector to the
parallelogram formed by u and v
...

Hence, | det(A)| = volume (P )
...
Let u1 , u2 ,
...
, un ] be an n × n matrix
...
, un as adjacent vertices:

2
...
DETERMINANT

43

(a) If u1 = (1, 0,
...
, 0)t ,
...
, 0, 1)t , then det(A) = 1
...

(b) If we replace the vector ui by αui , for some α ∈ R, then the determinant of the new matrix
is α · det(A)
...

(c) If u1 = ui for some i, 2 ≤ i ≤ n, then the vectors u1 , u2 ,
...
So, this parallelopiped lies on an (n − 1)-dimensional hyperplane
...
Also, | det(A)| = |0| = 0
...
The actual proof is beyond the scope of this book
...
8
...

Definition 2
...
10 (Adjoint of a Matrix) Let A be an n × n matrix
...



1

Example 2
...
11 Let A = 2
1
as C11 = (−1)1+1 A11 = 4, C12




4
2 −7
2 3



3 1
...


Theorem 2
...
12 Let A be an n × n matrix
...
for 1 ≤ i ≤ n,

n

j=1

n

2
...
A(Adj(A)) = det(A)In
...

det(A)

(2
...
2)

Proof
...

By the construction of B, two rows (ith and ℓth ) are equal
...
8
...
By
construction again, det A(ℓ|j) = det B(ℓ|j) for 1 ≤ j ≤ n
...
8
...

j=1

44

CHAPTER 2
...
Since, det(A) = 0, A

aik Cjk
k=1

if i = j
if i = j

1
Adj(A) = In
...
Hence, by Theorem 2
...
8 A has an inverse and
A−1 =

1
Adj(A)
...
8
...
Then
1 2 1



−1 1 −1


Adj(A) =  1
1 −1
−1 −3 1

and det(A) = −2
...
8
...
3, A−1




1/2 −1/2 1/2


= −1/2 −1/2 1/2 
...
8
...
7
...

Corollary 2
...
14 If A is a non-singular matrix, then
n
det(A)
Adj(A) A = det(A)In and
aij Cik =
0
i=1

if j = k

...
8
...
Then det(AB) = det(A) det(B)
...
Step 1
...

This means, A is invertible
...
7
...
So, let E1 , E2 ,
...

Then, by using Parts 1, 2 and 4 of Lemma 2
...
7 repeatedly, we get
det(AB)

=

det(E1 E2 · · · Ek B) = det(E1 ) det(E2 · · · Ek B)

=

det(E1 ) det(E2 ) det(E3 · · · Ek B)

=

det(E1 E2 ) det(E3 · · · Ek B)

...


...


Thus, we get the required result in case A is non-singular
...
8
...
Suppose det(A) = 0
...
Hence, there exists an invertible matrix P such that P A = C, where C =

C1

...


C1 B
0

as P −1 is non-singular

Thus, the proof of the theorem is complete
...
8
...
Then A is non-singular if and only if A has an inverse
...
Suppose A is non-singular
...
Thus, A
det(A)

has an inverse
...
Then there exists a matrix B such that AB = I = BA
...

This implies that det(A) = 0
...

Theorem 2
...
17 Let A be a square matrix
...

Proof
...
8
...

If A is singular, then det(A) = 0
...
8
...
Theret
fore, At also doesn’t have an inverse (for if At has an inverse then A−1 = (At )−1 )
...
8
...
Therefore, we again have det(A) = 0 = det(At )
...


2
...
2

Cramer’s Rule

Recall the following:
• The linear system Ax = b has a unique solution for every b if and only if A−1 exists
...

Thus, Ax = b has a unique solution for every b if and only if det(A) = 0
...

Theorem 2
...
18 (Cramer’s Rule) Let Ax = b be a linear system with n equations in n unknowns
...
, n,

where Aj is the matrix obtained from A by replacing the jth column of A by the column vector b
...


Proof
...
Thus, the linear system Ax = b has the solution
det(A)

1
Adj(A)b
...

det(A)
det(A)

The theorem implies that
b1
b2
1
x1 =

...


...


...

an2

···
···

...

···

a1n
a2n

...


...


...

a1n

···
···

...

···

a1j−1
a2j−1

...


...


...

bn

a1j+1
a2j+1

...


...


...


...

ann

for j = 2, 3,
...


1

Example 2
...
19 Suppose that A = 2
1
that Ax = b
...
Use Cramer’s rule to find a vector x such
2 2
1

1 2
Solution: Check that det(A) = 1
...
9

1 3
1 2
1 1 = 1, and x3 = 2 3
1 2
1 2

3
1 = −1,
2

1
1 = 0
...

1

Miscellaneous Exercises

Exercise 2
...
1

1
...
Show that det A = ±1
...
If A and B are two n × n non-singular matrices, are the matrices A + B and A − B non-singular?
Justify your answer
...
For an n × n matrix A, prove that the following conditions are equivalent:
(a) A is singular (A−1 doesn’t exist)
...

(c) det(A) = 0
...

(e) Ax = 0 has a non-trivial solution for x
...
e
...


2
...
MISCELLANEOUS EXERCISES


2 0 6 0

5 3 2 2

4
...
Does

47


4

7

5
...
Let A = [aij ]n×n where aij = xj−1
...
[The matrix A is usually

called the Van-dermonde matrix
...
Let A = [aij ] with aij = max{i, j} be an n × n matrix
...

7
...
Show that A is invertible
...
Solve the following system of equations by Cramer’s rule
...

ii) x − y + z − w = 1, x + y − z + w = 2, 2x + y − z − w = 7, x − y − z + w = 3
...
Suppose A = [aij ] and B = [bij ] are two n × n matrices such that bij = pi−j aij for 1 ≤ i, j ≤ n for
some non-zero real number p
...

10
...

Show that
(a) If all the entries in odd positions are multiplied with −1 then the value of the determinant doesn’t
change
...
does not change if the matrix is of even order
...
is multiplied by −1 if the matrix is of odd order
...
Let A be an n × n Hermitian matrix, that is, A∗ = A
...
[A is a matrix
with complex entries and A∗ = At
...
Let A be an n × n matrix
...

13
...
Prove that Adj(AB) = Adj(B)Adj(A)
...
Then show
C D
that rank (P ) = n if and only if D = CA−1 B
...
Let P =

48

CHAPTER 2
...

Let V be the set of points of intersection of the two planes
...
The point (0, 0, 0, 0) is an element of V
...
For the points (−1, 0, 1, 1) and (−5, 1, 7, 0) which belong to V ; the point (−6, 1, 8, 1) = (−1, 0, 1, 1)+
(−5, 1, 7, 0) ∈ V
...
Let α ∈ R
...

Similarly, for an m × n real matrix A, consider the set V, of solutions of the homogeneous linear
system Ax = 0
...
If Ax = 0 and Ay = 0, then x, y ∈ V
...
Also,
x + y = y + x
...
It is clear that if x, y, z ∈ V then (x + y) + z = x + (y + z)
...
The vector 0 ∈ V as A0 = 0
...
If Ax = 0 then A(−x) = −Ax = 0
...

5
...
Then αx ∈ V as A(αx) = αAx = 0
...


3
...
1
...
1
...
Vector Addition: To every pair u, v ∈ V there corresponds a unique element u ⊕ v in V such that
(a) u ⊕ v = v ⊕ u (Commutative law)
...

(c) There is a unique element 0 in V (the zero vector) such that u ⊕ 0 = u, for every u ∈ V (called
the additive identity)
...
FINITE DIMENSIONAL VECTOR SPACES
(d) For every u ∈ V there is a unique element −u ∈ V such that u ⊕ (−u) = 0 (called the additive
inverse)
...

2
...

(b) 1 ⊙ u = u for every u ∈ V, where 1 ∈ R
...
Distributive Laws: relating vector addition with scalar multiplication
For any α, β ∈ F and u, v ∈ V, the following distributive laws hold:
(a) α ⊙ (u ⊕ v) = (α ⊙ u) ⊕ (α ⊙ v)
...

Note: the number 0 is the element of F whereas 0 is the zero vector
...
1
...
If F = R, the
vector space is called a real vector space
...

We may sometimes write V for a vector space if F is understood from the context
...
1
...
Intuitively, these
results seem to be obvious but for better understanding of the axioms it is desirable to go through the
proof
...
1
...
Then
1
...

2
...

3
...

Proof
...

For u ∈ V, by Axiom 1d there exists −u ∈ V such that −u ⊕ u = 0
...

Proof of Part 2
...

Thus, for any α ∈ F, the first part implies α ⊙ 0 = 0
...

Hence, using the first part, one has 0 ⊙ u = 0 for any u ∈ V
...
If α = 0 then the proof is over
...
1
...

Thus we have shown that if α = 0 and α ⊙ u = 0 then u = 0
...

We have 0 = 0u = (1 + (−1))u = u + (−1)u and hence (−1)u = −u
...
1
...
1
...
The set R of real numbers, with the usual addition and multiplication (i
...
, ⊕ ≡ +
and ⊙ ≡ ·) forms a vector space over R
...
Consider the set R2 = {(x1 , x2 ) : x1 , x2 ∈ R}
...

Then R2 is a real vector space
...
Let Rn = {(a1 , a2 ,
...
For
u = (a1 ,
...
, bn ) in V and α ∈ R, we define
u ⊕ v = (a1 + b1 ,
...
, αan )
(called component wise or coordinate wise operations)
...
This vector space is denoted by Rn , called the real vector
space of n-tuples
...
Let V = R+ (the set of positive real numbers)
...
We now define a new vector addition and scalar multiplication
as
v1 ⊕ v2 = v1 · v2 and α ⊙ v = vα
for all v1 , v2 , v ∈ R+ and α ∈ R
...

5
...
Define (x1 , x2 ) ⊕ (y1 , y2 ) = (x1 + y1 + 1, x2 + y2 − 3), α ⊙ (x1 , x2 ) = (αx1 + α −
1, αx2 − 3α + 3) for (x1 , x2 ), (y1 , y2 ) ∈ R2 and α ∈ R
...


Recall


−1 is denoted i
...
Consider the set C = {x + iy : x, y ∈ R} of complex numbers
...


Then C is a real vector space
...

Then C forms a complex vector space
...
FINITE DIMENSIONAL VECTOR SPACES
7
...
, zn ) : zi ∈ C for 1 ≤ i ≤ n}
...
, zn ), (w1 ,
...
, zn ) ⊕ (w1 ,
...
, zn )

= (z1 + w1 ,
...
, αzn )
...

(b) If the set F is the set R of real numbers, then Cn is a real vector space having n-tuple of complex
numbers as its vectors
...
1
...

Whereas, in Example 7b, the scalars are Real Numbers and hence we cannot write i(1, 0) =
(i, 0)
...
Fix a positive integer n and let Mn (R) denote the set of all n × n matrices with real entries
...


9
...
Consider the set, Pn (R), of all polynomials of degree ≤ n with coefficients
from R in the indeterminate x
...

Let f (x), g(x) ∈ Pn (R)
...
It can be verified that Pn (R) is a real vector space with the
addition and scalar multiplication defined by:
f (x) ⊕ g(x)

=

α ⊙ f (x)

=

(a0 + b0 ) + (a1 + b1 )x + · · · + (an + bn )xn , and
αa0 + αa1 x + · · · + αan xn for α ∈ R
...
Consider the set P(R), of all polynomials with real coefficients
...
Observe that
a polynomial of the form a0 + a1 x + · · · + am xm can be written as a0 + a1 x + · · · + am xm + 0 ·
xm+1 + · · · + 0 · xp for any p > m
...

We now define the vector addition and scalar multiplication as
f (x) ⊕ g(x) =
α ⊙ f (x) =

(a0 + b0 ) + (a1 + b1 )x + · · · + (ap + bp )xp , and
αa0 + αa1 x + · · · + αap xp for α ∈ R
...

11
...
For f, g ∈
C([−1, 1]) and α ∈ R, define
(f ⊕ g)(x)

(α ⊙ f )(x)

= f (x) + g(x), and
= αf (x), for all x ∈ [−1, 1]
...
The operations defined above are called point wise
addition and scalar multiplication
...
1
...
Let V and W be real vector spaces with binary operations (+, •) and (⊕, ⊙), respectively
...


On the right hand side, we write x1 + x2 to mean the addition in V, while y1 ⊕ y2 is the addition in
W
...
With the
above definitions, V × W also forms a real vector space
...

From now on, we will use ‘u + v’ in place of ‘u ⊕ v’ and ‘α · u or αu’ in place of ‘α ⊙ u’
...
Definition 3.1.7 (Vector Subspace) Let S be a non-empty subset of V. S(F) is said to be a subspace of V (F) if αu + βv ∈ S whenever α, β ∈ F and u, v ∈ S; here the vector addition and scalar multiplication are the same as those of V (F).

Example 3.1.8

1. Let V (F) be a vector space. Then
(a) S = {0}, the set consisting of the zero vector 0, and
(b) S = V
are vector subspaces of V. These are called the trivial subspaces.

2
...
Then S is a subspace of R3
...
)
3
...
Then S is not a subspace of R3
...
)
4
...
Then S is a subspace of R3
...
The vector space Pn (R) is a subspace of the vector space P(R)
...
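The subspace criterion αu + βv ∈ S is also easy to test numerically. The sketch below (Python/numpy) uses the illustrative plane S = {(x, y, z) ∈ R3 : x + y + z = 0}, an assumed example rather than one of the sets above, and checks closure for random choices of vectors and scalars.

    import numpy as np

    def in_S(v):
        # Membership test for S = {(x, y, z) : x + y + z = 0}.
        return np.isclose(v.sum(), 0.0)

    rng = np.random.default_rng(0)
    for _ in range(100):
        u = rng.normal(size=3); u[2] = -(u[0] + u[1])   # u in S
        v = rng.normal(size=3); v[2] = -(v[0] + v[1])   # v in S
        a, b = rng.normal(size=2)
        assert in_S(a * u + b * v)                       # closure under alpha*u + beta*v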
1
...
Which of the following are correct statements?

(a) Let S = {(x, y, z) ∈ R3 : z = x2 }
...

(b) Let V (F) be a vector space
...
Then the set {αx : α ∈ F} forms a vector subspace of V
...
Then W is a subspace of the real vector space,
C([−1, 1])
...
Which of the following are subspaces of Rn (R)?
(a) {(x1 , x2 ,
...

(b) {(x1 , x2 ,
...

(c) {(x1 , x2 ,
...


(d) {(x1 , x2 ,
...

3

54

CHAPTER 3
...
, xn ) : either x1 or x2 or both is0}
...
, xn ) : |x1 | ≤ 1}
...
Which of the following are subspaces of i)Cn (R) ii)Cn (C)?
(a) {(z1 , z2 ,
...

(b) {(z1 , z2 ,
...

(c) {(z1 , z2 ,
...


3.1.4 Linear Combinations

Definition 3.1.10 (Linear Span) Let V (F) be a vector space and let S = {u1, u2, . . . , un} be a non-empty subset of V. A vector u ∈ V is called a linear combination of the vectors in S if u = α1u1 + α2u2 + · · · + αnun for some scalars αi ∈ F. The linear span of S is the set defined by

L(S) = {α1u1 + α2u2 + · · · + αnun : αi ∈ F, 1 ≤ i ≤ n}.

If S is an empty set we define L(S) = {0}.

Example 3.1.11

1. Note that (4, 5, 5) is a linear combination of (1, 0, 0), (1, 1, 0), and (1, 1, 1), as (4, 5, 5) = 5(1, 1, 1) − 1(1, 0, 0) + 0(1, 1, 0).

2. Is (4, 5, 5) a linear combination of (1, 2, 3), (−1, 1, 4) and (3, 3, 2)? Check that 3(1, 2, 3) + (−1)(−1, 1, 4) + 0(3, 3, 2) = (4, 5, 5).

3. The linear span of S = {(1, 1, 1), (2, 1, 3)} over R is

L(S) = {α(1, 1, 1) + β(2, 1, 3) : α, β ∈ R}
     = {(α + 2β, α + β, α + 3β) : α, β ∈ R}
     = {(x, y, z) ∈ R3 : 2x − y = z}.
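Membership in a linear span is a question about the consistency of a linear system: (x, y, z) ∈ L(S) if and only if α(1, 1, 1) + β(2, 1, 3) = (x, y, z) has a solution. A small Python/numpy check (the helper in_span is an ad hoc name, not from the notes):

    import numpy as np

    S = np.array([[1., 2.],
                  [1., 1.],
                  [1., 3.]])          # columns are (1,1,1) and (2,1,3)

    def in_span(v):
        # v is in L(S) iff appending v to the columns does not raise the rank.
        return np.linalg.matrix_rank(np.column_stack([S, v])) == np.linalg.matrix_rank(S)

    v = np.array([4., 3., 5.])        # satisfies 2x - y = z
    w = np.array([1., 0., 0.])        # violates 2x - y = z
    print(in_span(v), in_span(w))     # True False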

Lemma 3.1.12 (Linear Span is a Subspace) Let V (F) be a vector space and let S be a non-empty subset of V. Then L(S) is a subspace of V (F).

Proof. By definition, S ⊂ L(S) and hence L(S) is a non-empty subset of V. Let u, v ∈ L(S). Then there exist vectors w1, w2, . . . , wn ∈ S and scalars αi, βi ∈ F such that u = α1w1 + · · · + αnwn and v = β1w1 + · · · + βnwn. Hence,

u + v = (α1 + β1)w1 + · · · + (αn + βn)wn ∈ L(S).

Similarly, for any α ∈ F, αu = (αα1)w1 + · · · + (ααn)wn ∈ L(S). Hence L(S) is a subspace of V (F).

Remark 3.1.13 Let V (F) be a vector space and W ⊂ V be a subspace. If S ⊂ W, then L(S) ⊆ W, as W is a vector space in its own right.

Theorem 3.1.14 Let S be a non-empty subset of a vector space V. Then L(S) is the smallest subspace of V containing S.

Proof. For every u ∈ S, u = 1 · u ∈ L(S), and therefore S ⊆ L(S). To show that L(S) is the smallest subspace of V containing S, consider any subspace W of V containing S. Then by Remark 3.1.13, L(S) ⊆ W and hence the result follows.
1
...
Definition 3.1.15 Let A be an m × n real matrix. Then using the rows a1ᵗ, a2ᵗ, . . . , amᵗ ∈ Rⁿ and the columns b1, b2, . . . , bn ∈ Rᵐ, we define
1. RowSpace(A) = L(a1, a2, . . . , am),
2. ColumnSpace(A) = L(b1, b2, . . . , bn),
3. NullSpace(A) = {x ∈ Rⁿ : Ax = 0}, and
4. Range(A) = {b ∈ Rᵐ : Ax = b for some x ∈ Rⁿ}.

Note that the “column space” of a matrix A consists of all b such that Ax = b has a solution.

Lemma 3
...
16 Let A be a real m × n matrix
...
Then
Row Space(A) = Row Space(B)
...
We prove the result for the elementary matrix Eij (c), where c = 0 and i < j
...
, at
1 2
m
be the rows of the matrix A
...
, ai−1 , ai + caj ,
...
, ai−1 , ai ,
...
1
...
Then
1
...
the non-zero row vectors of a matrix in row-reduced form, forms a basis for the row-space
...


Part 1) can be easily proved
...
For part 2), let D be the row-reduced
form of A with non-zero rows dt , dt ,
...
Then B = Ek Ek−1 · · · E2 E1 A for some elementary matrices
1
2
r
E1 , E2 ,
...
Then, a repeated application of Lemma 3
...
16 implies Row Space(A) = Row Space(B)
...
, at , then
1
2
m
L(a1 , a2 ,
...
, br )
...

Exercise 3
...
18
1
...
Give examples
to show that the column space of two row-equivalent matrices need not be same
...
Find all the vector subspaces of R2
...
Let P and Q be two subspaces of a vector space V
...
Also show
that P ∪ Q need not be a subspace of V
...
Let P and Q be two subspaces of a vector space V
...
Show
that P + Q is a subspace of V
...

5
...

Determine all xi such that L(S) = L(S \ {xi })
...
Let C([−1, 1]) be the set of all continuous functions on the interval [−1, 1] (cf
...
1
...
11)
...
2) = 0}, and
1
{f ∈ C([−1, 1]) : f ′ ( )exists }
...
Let V = {(x, y) : x, y ∈ R} over R
...

Show that V is not a vector space over R
...
Recall that Mn (R) is the real vector space of all n × n real matrices
...

(a) sln = {A ∈ Mn (R) : trace(A) = 0}

(b) Symn = {A ∈ Mn (R) : A = At }

(c) Skewn = {A ∈ Mn (R) : A + At = 0}

9
...
Define x ⊕ y = x − y and α ⊙ x = −αx
...
Hence, one can start with
any finite collection of vectors and obtain their span
...
Therefore, the following questions arise:
1
...
Is it possible to find/choose vectors so that the linear span of the chosen vectors is the whole vector
space itself?
3
...
Can we find
the minimum number of such vectors?
We try to answer these questions in the subsequent sections
...
2
...
2

57

Linear Independence

Definition 3
...
1 (Linear Independence and Dependence) Let S = {u1 , u2 ,
...
If there exist some non-zero αi ’s 1 ≤ i ≤ m, such that
α1 u1 + α2 u2 + · · · + αm um = 0,
then the set S is called a linearly dependent set
...

Example 3
...
2
1
...
Then check that 1(1, 2, 1)+1(2, 1, 4)+(−1)(3, 3, 5) =
(0, 0, 0)
...
2
...

2
...
Suppose there exists α, β, γ ∈ R such that α(1, 1, 1)+β(1, 1, 0)+
γ(1, 0, 1) = (0, 0, 0)
...

In other words, if S = {u1 , u2 ,
...


(3
...
1)

In case α1 = α2 = · · · = αm = 0 is the only solution of (3
...
1), the set S becomes a linearly
independent subset of V
...

Proposition 3
...
3 Let V be a vector space
...
Then the zero-vector cannot belong to a linearly independent set
...
If S is a linearly independent subset of V, then every subset of S is also linearly independent
...
If S is a linearly dependent subset of V then every set containing S is also linearly dependent
...
We give the proof of the first part
...

Let S = {0 = u1, u2, . . . , un} be a set containing the zero vector. Then for any γ ≠ 0, γu1 + 0u2 + · · · + 0un = 0.
...
Therefore, the set S is linearly dependent
...
2
...
, vp } be a linearly independent subset of a vector space V
...
, vp , vp+1 } is linearly dependent, then vp+1 is a linear
combination of v1 , v2 ,
...

Proof
...
, vp , vp+1 } is linearly dependent, there exist scalars α1 , α2 ,
...


(3
...
2)

Claim: αp+1 = 0
...
Then equation (3
...
2) gives α1 v1 + α2 v2 + · · · + αp vp = 0 with not all
αi , 1 ≤ i ≤ p zero
...
, vp } is linearly
dependent which is contradictory to our hypothesis
...

αp+1

58

CHAPTER 3
...
Hence the result follows
...
We don’t give their proofs as they are
easy consequence of the above theorem
...
2
...
, un } be a linearly dependent subset of a vector space V
...
, uk ) = L(u1 , u2 ,
...

The next corollary follows immediately from Theorem 3
...
4 and Corollary 3
...
5
...
2
...
, vp } be a linearly independent subset of a vector space V
...
, vp )
...
, vp , v} is also linearly
independent subset of V
...
2
...
Consider the vector space R2
...
Find all choices for the vector u2 such
that the set {u1, u2} is a linearly independent subset of R2
...
If none of the elements appearing along the principal diagonal of a lower triangular matrix is zero, show
that the row vectors are linearly independent in Rn
...

3
...
Determine whether or not the vector (1, 1, 2, 1) ∈
L(S)?
4
...

5
...
In general if {f1 , f2 , f3 }
is a linearly independent set then {f1 , f1 + f2 , f1 + f2 + f3 } is also a linearly independent set
...
In R3 , give an example of 3 vectors u, v and w such that {u, v, w} is linearly dependent but any set
of 2 vectors from u, v, w is linearly independent
...
What is the maximum number of linearly independent vectors in R3 ?
8
...

9
...
Under what conditions on α are the vectors (1 + α, 1 − α) and (α − 1, 1 + α) in C2 (R) linearly
independent?
11
...
Further, let K be the subspace spanned by M and u and H
be the subspace spanned by M and v
...


3
...
3
...
A non-empty subset B of a vector space V is called a

(a) B is a linearly independent set, and
(b) L(B) = V, i
...
, every vector in V can be expressed as a linear combination of the elements of B
...
3
...
A vector in B is called a basis vector
...
3
...
, vp } be a basis of a vector space V (F)
...
, vp
...

But then the set {v1 , v2 ,
...
Hence, for 1 ≤ i ≤ p, αi = βi and we have the uniqueness
...
Hence, the empty set is a basis of the vector
space {0}
...
3
...
Check that if V = {(x, y, 0) : x, y ∈ R} ⊂ R3 , then B = {(1, 0, 0), (0, 1, 0)} or
B = {(1, 0, 0), (1, 1, 0)} or B = {(2, 0, 0), (1, 3, 0)} or · · · are bases of V
...
For 1 ≤ i ≤ n, let ei = (0,
...
, 0) ∈ Rn
...
, en } forms

i th place
a basis of Rn
...


That is, if n = 3, then the set {(1, 0, 0), (0, 1, 0), (0, 0, 1)} forms the standard basis of R3
...
Let V = {(x, y, z) : x+y−z = 0, x, y, z ∈ R} be a vector subspace of R3
...
It can be easily verified that the vector (3, 2, 5) ∈ V and
(3, 2, 5) = (1, 1, 2) + (2, 1, 3) = 4(1, 1, 2) − (1, 2, 3)
...
3
...

A basis of V can be obtained by the following method:
The condition x + y − z = 0 is equivalent to z = x + y
...

Hence, {(1, 0, 1), (0, 1, 1)} forms a basis of V
...
Let V = {a + ib : a, b ∈ R} and F = C
...
Note that any element
a + ib ∈ V can be written as a + ib = (a + ib)1
...

5
...
That is, V is a real vector space
...
Hence a basis of V is {1, i}
...
Also, i ∈ R and hence i · (1 + 0 · i) is not defined
...
Recall the vector space P(R), the vector space of all polynomials with real coefficients
...
, xn ,
...

Definition 3
...
4 (Finite Dimensional Vector Space) A vector space V is said to be finite dimensional if
there exists a basis consisting of finite number of elements
...

In Example 3
...
3, the vector space of all polynomials is an example of an infinite dimensional vector
space
...


60

CHAPTER 3
...
3
...
Then the set {v1 } is linearly independent
...
Else there exists a vector, say, v2 ∈ V such that
v2 ∈ L(v1 )
...
2
...

Step 3: If V = L(v1 , v2 ), then {v1 , v2 } is a basis of V
...
So, by Corollary 3
...
6, the set {v1 , v2 , v3 } is linearly independent
...
, vi ), or L(v1 , v2 ,
...

In the first case, we have {v1 , v2 ,
...

In the second case, L(v1 , v2 ,
...
So, we choose a vector, say, vi+1 ∈ V such that vi+1 ∈
L(v1 , v2 ,
...
Therefore, by Corollary 3
...
6, the set {v1 , v2 ,
...

This process will finally end as V is a finite dimensional vector space
...
3
...
Let S = {v1 , v2 ,
...
Suppose L(S) = V but
S is not a linearly independent set
...

2
...

3
...
Then show that the r non-zero rows in the row-reduced echelon form of
A are linearly independent and they form a basis of the row space of A
...
3
...
3
...
, vn } be a basis of a given vector space V
...
, wm } is a set of
vectors from V with m > n, then this set is linearly dependent.

Proof. Since we want to find whether the set {w1, w2, . . . , wm} is linearly independent or not, we consider the linear system

α1w1 + α2w2 + · · · + αmwm = 0                              (3.3.1)

with α1, α2, . . . , αm as the m unknowns. If the only solution is α1 = α2 = · · · = αm = 0, the set is linearly independent; otherwise it is linearly dependent.

As {v1, v2, . . . , vn} is a basis of V, there exist scalars aij ∈ F such that

w1 = a11v1 + a21v2 + · · · + an1vn,
w2 = a12v1 + a22v2 + · · · + an2vn,
. . .
wm = a1mv1 + a2mv2 + · · · + anmvn.

The set of equations (3.3.1) can be rewritten as

α1 (a11v1 + a21v2 + · · · + an1vn) + α2 (a12v1 + a22v2 + · · · + an2vn) + · · · + αm (a1mv1 + a2mv2 + · · · + anmvn) = 0,

i.e.,

(α1a11 + α2a12 + · · · + αma1m) v1 + (α1a21 + α2a22 + · · · + αma2m) v2 + · · · + (α1an1 + α2an2 + · · · + αmanm) vn = 0.

Since the set {v1, v2, . . . , vn} is linearly independent, each of these coefficients must be zero. That is, finding the αi's satisfying (3.3.1) reduces to solving the homogeneous linear system Aα = 0, where αᵗ = (α1, α2, . . . , αm) and

    A =  a11  a12  · · ·  a1m
         a21  a22  · · ·  a2m
          .    .           .
         an1  an2  · · ·  anm .

This system has n equations in m unknowns, and n < m. By the theory of homogeneous linear systems developed in Chapter 2, a homogeneous system with fewer equations than unknowns has a non-trivial solution. Therefore, the equation (3.3.1) has a solution with not all αi, 1 ≤ i ≤ m, zero. Hence, the set {w1, w2, . . . , wm} is a linearly dependent set.
3
...
Remark 3.3.8 Let V be a vector space and let S = {u1, u2, . . . , uk} ⊂ V with L(S) = V. We give a method of finding a basis of V from S:
1. Construct a matrix A whose rows are the vectors in S.
2. Use only the elementary row operations Ri(c) and Rij(c) to get the row-reduced form B of A (in fact we just need to make as many zero rows as possible).
3. Let B be the set of vectors in S corresponding to the non-zero rows of B.
Then the set B is a basis of L(S) = V.

Example 3.3.9 Let S = {(1, 1, 1, 1), (1, 1, −1, 1), (1, 1, 0, 1), (1, −1, 1, 1)} be a subset of R4. Find a basis of L(S).

Solution: Here

    A =  1  1  1  1
         1  1 −1  1
         1  1  0  1
         1 −1  1  1 .

Applying the row operations R12(−1), R13(−1), R14(−1), and then R23(−2) to create a zero row, we get

     1  1  1  1              1  1  1  1
     0  0 −2  0              0  0  0  0
     0  0 −1  0     −→       0  0 −1  0
     0 −2  0  0              0 −2  0  0 .

Observe that rows 1, 3 and 4 of the final matrix are non-zero. Thus, B = {(1, 1, 1, 1), (1, 1, 0, 1), (1, −1, 1, 1)} is a basis of L(S). Had we instead used the operation R32(−1/2) to make the third row zero, the non-zero rows would have been the first, second and fourth. In this case, we get {(1, 1, 1, 1), (1, 1, −1, 1), (1, −1, 1, 1)} as a basis of L(S).
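The same computation can be delegated to a computer algebra system. A sketch using sympy (assuming it is installed): row-reducing the transpose reports pivot positions, which identify a subset of S forming a basis of L(S); here it selects the first, second and fourth vectors, matching the second basis found above.

    from sympy import Matrix

    A = Matrix([[1, 1, 1, 1],
                [1, 1, -1, 1],
                [1, 1, 0, 1],
                [1, -1, 1, 1]])       # rows are the vectors of S

    _, pivots = A.T.rref()            # pivot columns of A^t correspond to rows of A
    print(pivots)                     # (0, 1, 3)
    basis = [A.row(i) for i in pivots]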
3
...
Then any two bases of V have the same
number of vectors
...
Let {u1 , u2 ,
...
, vm } be two bases of V with m > n
...
, vm } is linearly dependent if we take {u1 , u2 ,
...
This
contradicts the assumption that {v1 , v2 ,
...
Hence, we get m = n
...
3
...


62

CHAPTER 3
...
2
...

Example 3
...
12

1
...
Then,
(a + ib, c + id) = (a + ib)(1, 0) + (c + id)(0, 1)
...

2
...
In this case, any vector
(a + ib, c + id) = a(1, 0) + b(i, 0) + c(0, 1) + d(0, i)
...

Remark 3
...
13 It is important to note that the dimension of a vector space may change if the underlying field (the set of scalars) is changed
...
3
...
For f, g ∈ V, and t ∈ R, define
(f ⊕ g)(x)
(t ⊙ f )(x)

=

f (x) + g(x) and

=

f (tx)
...

For 1 ≤ i ≤ n, consider the functions
ei (x) = ei (x1 , x2 ,
...

Then it can be easily verified that the set {e1 , e2 ,
...

The next theorem follows directly from Corollary 3
...
6 and Theorem 3
...
7
...

Theorem 3
...
15 Let S be a linearly independent subset of a finite dimensional vector space V
...

Theorem 3
...
15 is equivalent to the following statement:
Let V be a vector space of dimension n
...
, vr } ⊂ V
...
, vn in V such that {v1 , v2 ,
...

Corollary 3
...
16 Let V be a vector space of dimension n
...
Also, every set of m vectors, m > n, is linearly dependent
...
3
...
Find bases of V and W containing a basis of V ∩ W
...
The solution set of the linear equations
v + x − 3y + z = 0, w − x − z = 0 and v = y
is given by
(v, w, x, y, z)t = (y, 2y, x, y, 2y − x)t = y(1, 2, 0, 1, 2)t + x(0, 0, 1, 0, −1)t
...


To find a basis of W containing a basis of V ∩ W, we can proceed as follows:

3
...
BASES

63

1
...

2
...

Now use Remark 3
...
8 to get the required basis
...
Substituting y = 1, x = 1, and z = 0 in
(y, x + z, x, y, z) gives us the vector (1, 1, 1, 1, 0) ∈ W
...

Similarly, a vector of V has the form (v, w, x, y, 3y −v−x) for v, w, x, y ∈ R
...
Also, substituting v = 0, w = 1, x = 1 and y = 1, gives another
vector (0, 1, 1, 1, 2) ∈ V
...

Recall that for two vector subspaces M and N of a vector space V (F), the vector subspace M + N
is defined by
M + N = {u + v : u ∈ M, v ∈ N }
...
4
...

Theorem 3
...
18 Let V (F) be a finite dimensional vector space and let M and N be two subspaces of V
...

(3
...
2)
Exercise 3
...
19
1
...
Also, find dim(Pn (R))
...
Consider the real vector space, C([0, 2π]), of all real valued continuous functions
...
Prove that the collection of vectors {en : 1 ≤ n < ∞} is a
linearly independent set
...
Then we have a finite set of vectors,
say {ek1 , ek2 ,
...
That is, there exist scalars αi ∈ R for 1 ≤ i ≤ ℓ not all
zero such that
α1 sin(k1 x) + α2 sin(k2 x) + · · · + αℓ sin(kℓ x) = 0 for all x ∈ [0, 2π]
...
]

3
...
Is it a basis of C3 (R) also?
4
...
Find its basis and dimension
...
Let V = {(x, y, z, w) ∈ R4 : x + y − z + w = 0, x + y + z + w = 0} and W = {(x, y, z, w) ∈ R4 :
x − y − z + w = 0, x + 2y − w = 0} be two subspaces of R4
...

6
...
Find its basis and dimension
...
FINITE DIMENSIONAL VECTOR SPACES
7
...

8
...
Show that P + Q = R3
and P ∩ Q = {0}
...

Is it necessary that uP and uQ are unique?
9
...
Prove that
there exists an (n − k)-dimensional subspace W2 of V such that W1 ∩ W2 = {0} and W1 + W2 = V
...
Let P and Q be subspaces of Rn such that P + Q = Rn and P ∩ Q = {0}
...

11
...
Show that
P + Q = R3 and P ∩ Q = {0}
...

12
...
Is the set,
W = {p(x) ∈ P4 (R) : p(−1) = p(1) = 0}
a subspace of P4 (R)? If yes, find its dimension
...
Let V be the set of all 2 × 2 matrices with complex entries and a11 + a22 = 0
...
Find its basis
...
Show W is a vector subspace of V,
and find its dimension
...
Let A = 
 , and B = 
 be two matrices
...

(b) the matrices P1 and P2 such that P1 A and P2 B are in row-reduced form
...

(d) a basis each for the range spaces of A and B
...

(f) the dimensions of all the vector subspaces so obtained
...
Let M (n, R) denote the space of all n × n real matrices
...

(a) sl(n, R) = {A ∈ M (n, R) : tr(A) = 0}, where recall that tr(A) stands for trace of A
...


(c) A(n, R) = {A ∈ M (n, R) : A + At = 0}
...


3
...
BASES

65

Proposition 3
...
20 Let A be an m × n real matrix
...

Proof
...
, Rm be the rows of A and C1 , C2 ,
...
Note that
Row rank(A) = r means that

dim L(R1, R2, . . . , Rm) = r.

Hence, there exist vectors
u1 = (u11, . . . , u1n), u2 = (u21, . . . , u2n), . . . , ur = (ur1, . . . , urn) ∈ Rⁿ
with
Ri ∈ L(u1, u2, . . . , ur) for all i, 1 ≤ i ≤ m.

Therefore, there exist real numbers αij, 1 ≤ i ≤ m, 1 ≤ j ≤ r, such that (writing Σi for the sum over i = 1, . . . , r)

R1 = α11u1 + α12u2 + · · · + α1rur = ( Σi α1i ui1, Σi α1i ui2, . . . , Σi α1i uin ),
R2 = α21u1 + α22u2 + · · · + α2rur = ( Σi α2i ui1, Σi α2i ui2, . . . , Σi α2i uin ),
and so on, till
Rm = αm1u1 + · · · + αmrur = ( Σi αmi ui1, Σi αmi ui2, . . . , Σi αmi uin ).

So, for 1 ≤ j ≤ n,

Cj = ( Σi α1i uij, Σi α2i uij, . . . , Σi αmi uij )ᵗ
   = u1j (α11, α21, . . . , αm1)ᵗ + u2j (α12, α22, . . . , αm2)ᵗ + · · · + urj (α1r, α2r, . . . , αmr)ᵗ.

Therefore, we observe that the columns C1, C2, . . . , Cn are linear combinations of the r vectors
(α11, α21, . . . , αm1)ᵗ, (α12, α22, . . . , αm2)ᵗ, . . . , (α1r, α2r, . . . , αmr)ᵗ.
Therefore,

Column rank(A) = dim L(C1, C2, . . . , Cn) ≤ r = Row rank(A).

A similar argument gives

Row rank(A) ≤ Column rank(A).

Hence Row rank(A) = Column rank(A).


66

CHAPTER 3
...
4

Ordered Bases

Let B = {u1 , u2 ,
...
As B is a set, there is no ordering of its
elements
...

Definition 3
...
1 (Ordered Basis) An ordered basis for a vector space V (F) of dimension n, is a basis {u1 , u2 ,
...
, un } and
{1, 2, 3,
...

If the ordered basis has u1 as the first vector, u2 as the second vector and so on, then we denote this
ordered basis by
(u1 , u2 ,
...

Example 3.4.2 Consider P2(R), the vector space of all polynomials of degree less than or equal to 2 with coefficients from R. The set {1 − x, 1 + x, x2} is a basis of P2(R). For any element a0 + a1x + a2x2 ∈ P2(R), we have

a0 + a1x + a2x2 = ((a0 − a1)/2) (1 − x) + ((a0 + a1)/2) (1 + x) + a2x2.

If we take (1 + x, 1 − x, x2) as an ordered basis, then (a0 + a1)/2 is the first component, (a0 − a1)/2 is the second component, and a2 is the third component of the vector a0 + a1x + a2x2.

Thus, the ordered bases (u1, u2, . . . , un), (u2, u3, . . . , un, u1), . . . , (un, u1, u2, . . . , un−1) are different, even though they have the same set of vectors as elements.
4
...
, vn ) be an ordered basis of a vector space
V (F) and let v ∈ V
...
, βn ) is called the coordinate of the vector v with respect to the ordered basis B
...
, βn )t , a column vector
...
, un ) and B2 = (un , u1 , u2 ,
...
Then for
any x ∈ V there exists unique scalars α1 , α2 ,
...

Therefore,
[x]B1 = (α1 , α2 ,
...
, αn−1 )t
...

Suppose that the ordered basis B1 is changed to the ordered basis B3 = (u2 , u1 , u3 ,
...
Then
[x]B3 = (α2 , α1 , α3 ,
...
So, the coordinates of a vector depend on the ordered basis chosen
...
4
...
Consider the ordered bases
B1 = ( (1, 0, 0), (0, 1, 0), (0, 0, 1) ), B2 = ( (1, 0, 0), (1, 1, 0), (1, 1, 1) ) and B3 = ( (1, 1, 1), (1, 1, 0), (1, 0, 0) )
of V = R3. Then

(1, −1, 1) = 1 · (1, 0, 0) + (−1) · (0, 1, 0) + 1 · (0, 0, 1)
           = 2 · (1, 0, 0) + (−2) · (1, 1, 0) + 1 · (1, 1, 1)
           = 1 · (1, 1, 1) + (−2) · (1, 1, 0) + 2 · (1, 0, 0).

Therefore, if we write u = (1, −1, 1), then

[u]B1 = (1, −1, 1)ᵗ, [u]B2 = (2, −2, 1)ᵗ, [u]B3 = (1, −2, 2)ᵗ.
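Computing [u]B amounts to solving the linear system whose coefficient matrix has the basis vectors as columns. A quick Python/numpy verification of the coordinates above:

    import numpy as np

    u = np.array([1., -1., 1.])
    B2 = np.array([[1., 1., 1.],
                   [0., 1., 1.],
                   [0., 0., 1.]])     # columns: (1,0,0), (1,1,0), (1,1,1)
    B3 = B2[:, ::-1]                  # columns: (1,1,1), (1,1,0), (1,0,0)

    print(np.linalg.solve(B2, u))     # [ 2. -2.  1.] = [u]_B2
    print(np.linalg.solve(B3, u))     # [ 1. -2.  2.] = [u]_B3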
, un ) and
B2 = (v1 , v2 ,
...
Since, B1 is a basis of V, there exists unique scalars aij , 1 ≤ i, j ≤ n such that
n

vi =

for 1 ≤ i ≤ n
...
, ani )t
...
, αn )t
...
, vn ), we have


n

v=

n

i=1

n

n

αi vi =

i=1

αi 

j=1

aji uj  =

n

j=1

i=1

aji αi

uj
...
So,
n

[v]B1

n

=

a1i αi ,
i=1

=

=

a2i αi ,
...


...


...

an1

···

A[v]B2
...


...

αn
ann

Note that the ith column of the matrix A is equal to [vi ]B1 , i
...
, the ith column of A is the coordinate
of the ith vector vi of B2 with respect to the ordered basis B1
...

Theorem 3
...
5 Let V be an n-dimensional vector space with ordered bases B1 = (u1 , u2 ,
...
, vn )
...
, [vn ]B1 ]
...


Example 3
...
6 Consider two bases B1 = (1, 0, 0), (1, 1, 0), (1, 1, 1) and B2 = (1, 1, 1), (1, −1, 1), (1, 1, 0)
of R3
...
Then
[(x, y, z)]B1

=
=

(x − y) · (1, 0, 0) + (y − z) · (1, 1, 0) + z · (1, 1, 1)
(x − y, y − z, z)t

and
[(x, y, z)]B2

=

=

y−x
x−y
+ z) · (1, 1, 1) +
· (1, −1, 1)
2
2
+(x − z) · (1, 1, 0)
y−x
x−y
(
+ z,
, x − z)t
...
FINITE DIMENSIONAL VECTOR SPACES



2. The change of basis matrix A = I[B2, B1] is

    A =  0  2  0
         0 −2  1
         1  1  0 .

The columns of the matrix A are obtained by the following rule:
[(1, 1, 1)]B1 = 0 · (1, 0, 0) + 0 · (1, 1, 0) + 1 · (1, 1, 1) = (0, 0, 1)ᵗ,
[(1, −1, 1)]B1 = 2 · (1, 0, 0) + (−2) · (1, 1, 0) + 1 · (1, 1, 1) = (2, −2, 1)ᵗ, and
[(1, 1, 0)]B1 = 0 · (1, 0, 0) + 1 · (1, 1, 0) + 0 · (1, 1, 1) = (0, 1, 0)ᵗ.

3. Verify that [(x, y, z)]B1 = A [(x, y, z)]B2 for every (x, y, z) ∈ R3.

In the next chapter, we try to understand Theorem 3.4.5 again using the ideas of ‘linear transformations / functions’.
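The matrix A = I[B2, B1] can also be produced in one stroke: with the basis vectors as the columns of matrices B1 and B2, solving B1X = B2 gives the B1-coordinates of every vector of B2. A Python/numpy check of Example 3.4.6:

    import numpy as np

    B1 = np.array([[1., 1., 1.],
                   [0., 1., 1.],
                   [0., 0., 1.]])     # columns: (1,0,0), (1,1,0), (1,1,1)
    B2 = np.array([[1., 1., 1.],
                   [1., -1., 1.],
                   [1., 1., 0.]])     # columns: (1,1,1), (1,-1,1), (1,1,0)

    A = np.linalg.solve(B1, B2)       # column i of A is [v_i]_B1
    print(A)                          # [[0, 2, 0], [0, -2, 1], [1, 1, 0]]

    v = np.array([3., 1., 2.])
    assert np.allclose(np.linalg.solve(B1, v), A @ np.linalg.solve(B2, v))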
4
...
Determine the coordinates of the vectors (1, 2, 1) and (4, −2, 2) with respect to the
basis B = (2, 1, 0), (2, 1, 1), (2, 2, 1) of R3
...
Consider the vector space P3 (R)
...


(b) Find the coordinates of the vector u = 1 + x + x2 + x3 with respect to the ordered basis B1 and
B2
...

(d) Let v = a0 + a1 x + a2 x2 + a3 x3
...


0

0 0

−a3

Chapter 4

Linear Transformations
4
...

Definition 4
...
1 (Linear Transformation) Let V and W be vector spaces over F
...


We now give a few examples of linear transformations
...
1
...
Define T : R−→R2 by T (x) = (x, 3x) for all x ∈ R
...


2
...
Let x = (x1 , x2 ,
...

n

(a) Define T (x) =

xi
...

(c) For a fixed vector a = (a1 , a2 ,
...
Note that examples (a)
i=1

and (b) can be obtained by assigning particular values for the vector a
...
Define T : R2 −→R3 by T ((x, y)) = (x + y, 2x − y, x + 3y)
...

4
...
Define a map TA : Rn −→Rm by
TA (x) = Ax for every xt = (x1 , x2 ,
...

Then TA is a linear transformation
...

5
...

Define T : Rn+1 −→Pn (R) by
T ((a1 , a2 ,
...
, an+1 ) ∈ Rn+1
...


70

CHAPTER 4
...
1
...
Suppose that 0V is the zero vector in V and
0W is the zero vector of W
...

Proof
...

So, T (0V ) = 0W as T (0V ) ∈ W
...

Definition 4
...
4 (Zero Transformation) Let V be a vector space and let T : V −→W be the map defined
by
T (v) = 0 for every v ∈ V
...
Such a linear transformation is called the zero transformation and is
denoted by 0
...
1
...

Then T is a linear transformation
...

We now prove a result that relates a linear transformation T with its value on a basis of the domain
space
...
1
...
, un ) be an ordered basis
of V
...
, T (un )
...
, T (un )
...
Since B is a basis of V, for any x ∈ V, there exist scalars α1 , α2 ,
...
So, by the definition of a linear transformation
T (x) = T (α1 u1 + · · · + αn un ) = α1 T (u1 ) + · · · + αn T (un )
...
, αn
...
, T (un ) in W
...
, αn ) of x with respect to
the ordered basis B and the vectors T (u1 ), T (u2 ),
...

Exercise 4
...
7

1
...


(a) Let V = R2 and W = R3 with T (x, y)
(b) Let V = W = R2 with T (x, y)
(c) Let V = W = R2 with T (x, y)

= (x + y + 1, 2x − y, x + 3y)

= (x − y, x2 − y 2 )
= (x − y, |x|)

(d) Let V = R2 and W = −→R4 with T (x, y)
(e) Let V = W = R4 with T (x, y, z, w)

= (x + y, x − y, 2x + y, 3x − 4y)

= (z, x, w, y)

2
...
Then, which of the following are
linear transformations T : M2 (R)−→M2 (R)?

4
...
DEFINITIONS AND BASIC PROPERTIES
(a) T (A) = At

(b) T (A) = I + A

71
(c) T (A) = A2

(d) T (A) = BAB −1 , where B is some fixed 2 × 2 matrix
...
Let T : R −→ R be a map
...

4
...
Consider the linear transformation
TA (x) = Ax for every x ∈ Rn
...
In general, for k ∈ N, prove that T k (x) = Ak x
...
Use the ideas of matrices to give examples of linear transformations T, S : R3 −→R3 that satisfy:
(a) T = 0, T 2 = 0, T 3 = 0
...

(c) S 2 = T 2 , S = T
...

6
...
Let x ∈ Rn such
that T (x) = 0
...
In general, if T k = 0
for 1 ≤ k ≤ p and T p+1 = 0, then for any vector x ∈ Rn with T p (x) = 0 prove that the set
{x, T (x),
...

7
...
Consider the sets
S = {x ∈ Rn : T (x) = y} and N = {x ∈ Rn : T (x) = 0}
...

8
...
Is T linear on
(a) C over R (b) C over C
...
Find all functions f : R2 −→ R2 that satisfy the conditions
(a) f ( (x, x) ) = (x, x) and
(b) f ( (x, y) ) = (y, x) for all (x, y) ∈ R2
...

Is this function a linear transformation? Justify your answer
...
1
...
For w ∈ W, define the set
T −1 (w) = {v ∈ V : T (v) = w}
...

1
...

2
...

is a linear transformation
...
LINEAR TRANSFORMATIONS

Proof
...
So, the set
T −1 (w) is non-empty
...
But by assumption, T is one-one
and therefore v1 = v2
...

We now show that T −1 as defined above is a linear transformation
...
Then by Part 1,
there exist unique vectors v1 , v2 ∈ V such that T −1 (w1 ) = v1 and T −1 (w2 ) = v2
...
So, for any α1 , α2 ∈ F, we have T (α1 v1 + α2 v2 ) = α1 w1 + α2 w2
...

Hence T −1 : W −→V, defined as above, is a linear transformation
...
1
...
If the map
T is one-one and onto, then the map T −1 : W −→V defined by
T −1 (w) = v whenever T (v) = w
is called the inverse of the linear transformation T
...
1
...
Define T : R2 −→R2 by T ((x, y)) = (x + y, x − y)
...

2
2

Note that
T ◦ T −1 ((x, y))

=
=
=

x+y x−y
,
))
2
2
x+y x−y x+y x−y
(
+
,

)
2
2
2
2
(x, y)
...
Verify that T −1 ◦ T = I
...

2
...
, an+1 )) = a1 + a2 x + · · · + an+1 xn
for (a1 , a2 ,
...
Then T −1 : Pn (R)−→Rn+1 is defined as
T −1 (a1 + a2 x + · · · + an+1 xn ) = (a1 , a2 ,
...
Verify that T ◦ T −1 = T −1 ◦ T = I
...


4
...
For
this, we ask the reader to recall the results on ordered basis, studied in Section 3
...

Let V and W be finite dimensional vector spaces over the set F with respective dimensions m and n
...
Suppose B1 = (v1 , v2 ,
...
2
...
In the last section, we saw that a linear transformation is determined by its image on a basis of the
domain space
...

Now for each j, 1 ≤ j ≤ n, the vectors T (vj ) ∈ W
...
, wm ) of W
...
, amj ∈ F such that
a11 w1 + a21 w2 + · · · + am1 wm

T (v1 ) =
T (v2 ) =

...


...


m

Or in short, T (vj ) =
i=1

aij wi for 1 ≤ j ≤ n
...
, amj ]t
...

[T (vj )]B2 = 


...

amj
Let [x]B1 = [x1 , x2 ,
...
Then
n

n

T (x) = T (

xj vj ) =
j=1

n

=

m

xj (
j=1
m

=

xj T (vj )
j=1

aij wi )
i=1

n

(

aij xj )wi
...


...


...


...


...
Then the coordinates of the vector T (x) with

...

amn

 
a1j xj
a11
 
a2j xj   a21
=
...

 
...

 
...

n
am1
amj xj
j=1
j=1
n
j=1

a12
a22

...


...


···
···

...

···

 
a1n
x1
 
a2n   x2 

...


...

We thus have the following theorem
...
2
...

Let T : V −→W be a linear transformation
...


74

CHAPTER 4
...
2
...
, vn ) be an ordered basis of V and B2 = (w1 , w2 ,
...
Let T : V −→ W be a linear transformation with A = T [B1 , B2 ]
...
In general, the ith column of A is the
coordinate of the vector T (vi ) in the basis B2
...

Example 4.2.3

1. Let T : R2 −→ R2 be the linear transformation defined by T ( (x, y) ) = (x + y, x − y). We obtain T [B1, B2], the matrix of the linear transformation T with respect to the ordered bases
B1 = ( (1, 0), (0, 1) ) and B2 = ( (1, 1), (1, −1) ) of R2.
By definition of the linear transformation T, we have
T ( (1, 0) ) = (1, 1) = 1 · (1, 1) + 0 · (1, −1), so [T ( (1, 0) )]B2 = (1, 0)ᵗ, and
T ( (0, 1) ) = (1, −1) = 0 · (1, 1) + 1 · (1, −1). That is, [T ( (0, 1) )]B2 = (0, 1)ᵗ.
So

    T [B1, B2] =  1  0
                  0  1 .

Observe that in this case, since (x + y, x − y) = x(1, 1) + y(1, −1), we have
[T ( (x, y) )]B2 = [(x + y, x − y)]B2 = (x, y)ᵗ, and
T [B1, B2] [(x, y)]B1 = (x, y)ᵗ = [T ( (x, y) )]B2.
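In the same spirit, T [B1, B2] can be assembled column by column on a computer: apply T to each vector of B1 and solve for its B2-coordinates. A Python/numpy sketch for the transformation above:

    import numpy as np

    def T(v):
        x, y = v
        return np.array([x + y, x - y])

    B1 = np.eye(2)                          # columns (1,0), (0,1)
    B2 = np.array([[1., 1.],
                   [1., -1.]])              # columns (1,1), (1,-1)

    M = np.column_stack([np.linalg.solve(B2, T(B1[:, j])) for j in range(2)])
    print(M)                                # the identity matrix, as computed above

    v = np.array([3., -2.])
    assert np.allclose(M @ np.linalg.solve(B1, v), np.linalg.solve(B2, T(v)))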
Let B1 = (1, 0, 0), (0, 1, 0), (0, 0, 1) , B2 = (1, 0, 0), (1, 1, 0), (1, 1, 1) be two ordered bases of R3
...

Then
T ((1, 0, 0)) =

1 · (1, 0, 0) + 0 · (1, 1, 0) + 0 · (1, 1, 1),

T ((0, 1, 0)) =

−1 · (1, 0, 0) + 1 · (1, 1, 0) + 0 · (1, 1, 1), and

T ((0, 0, 1)) =

0 · (1, 0, 0) + (−1) · (1, 1, 0) + 1 · (1, 1, 1)
...

0 0
1


1 0 0


Similarly check that T [B1 , B1 ] = 0 1 0
...
3
...
Let T : R3 −→ R2 be defined by T ((x, y, z)) = (x + y − z, x + z)
...
Then
T [B1 , B2 ] =

1
1

1 −1

...

Exercise 4
...
4 Recall the space Pn (R) ( the vector space of all polynomials of degree less than or equal to
n)
...

Find the matrix of the linear transformation D
...

Remark 4
...
5

1
...
, [T (vn )]B2 ]
...
It is important to note that
[T (x)]B2 = T [B1 , B2 ] [x]B1
...

3
...

We sometimes write A for TA
...
Then observe that
T [B1 , B2 ] = A
...
3

Rank-Nullity Theorem

Definition 4
...
1 (Range and Null Space) Let V, W be finite dimensional vector spaces over the same set
of scalars and T : V −→W be a linear transformation
...
R(T ) = {T (x) : x ∈ V }, and
2
...

Proposition 4
...
2 Let V and W be finite dimensional vector spaces and let T : V −→W be a linear transformation
...
, vn ) is an ordered basis of V
...
(a) R(T ) is a subspace of W
...
, T (vn ))
...

2
...

(b) dim(N (T )) ≤ dim(V )
...
LINEAR TRANSFORMATIONS
3
...


N (T ) = {0} is the zero subspace of V ⇐⇒

{T (ui ) : 1 ≤ i ≤ n} is a basis of

4
...

Proof
...
We thus leave the proof for the
readers
...
We need to show that N (T ) = {0}
...
Then by definition, T (u) = 0
...
1
...
Thus T (u) = T (0)
...
That is, N (T ) = {0}
...
We need to show that T is one-one
...
Then, by linearity of T, T (u − v) = 0
...
This in turn
implies u = v
...

The other parts can be similarly proved
...
3
...
The space R(T ) is called the range space of T and N (T ) is called the null
space of T
...
We write ρ(T ) = dim(R(T )) and ν(T ) = dim(N (T ))
...
ρ(T ) is called the rank of the linear transformation T and ν(T ) is called the nullity of T
...
3
...

Example 4.3.4 Let T : R3 −→ R4 be the linear transformation defined by
T (x, y, z) = (x − y + z, y − z, x, 2x − 5y + 5z).
Determine R(T ) and N (T ).

Solution: By definition,
R(T ) = L( T (1, 0, 0), T (0, 1, 0), T (0, 0, 1) ) = L( (1, 0, 1, 2), (−1, 1, 0, −5), (1, −1, 0, 5) ).

Also, by definition,

N (T ) = {(x, y, z) ∈ R3 : T (x, y, z) = 0}
       = {(x, y, z) ∈ R3 : (x − y + z, y − z, x, 2x − 5y + 5z) = 0}
       = {(x, y, z) ∈ R3 : x − y + z = 0, y − z = 0, x = 0, 2x − 5y + 5z = 0}
       = {(x, y, z) ∈ R3 : y − z = 0, x = 0}
       = {(x, y, z) ∈ R3 : y = z, x = 0}
       = {(0, y, y) ∈ R3 : y arbitrary}
       = L( (0, 1, 1) ).
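With respect to the standard bases, T is multiplication by a 4 × 3 matrix, so ρ(T ) and ν(T ) can be read off numerically; the Python/numpy computation below confirms ρ(T ) = 2 and ν(T ) = 1, in line with the rank-nullity theorem proved in this section.

    import numpy as np

    # Matrix of T(x,y,z) = (x-y+z, y-z, x, 2x-5y+5z) in the standard bases.
    A = np.array([[1., -1., 1.],
                  [0., 1., -1.],
                  [1., 0., 0.],
                  [2., -5., 5.]])

    rho = np.linalg.matrix_rank(A)          # dim R(T)
    nu = A.shape[1] - rho                   # dim N(T) by rank-nullity
    print(rho, nu)                          # 2 1
    assert np.allclose(A @ np.array([0., 1., 1.]), 0)   # (0,1,1) spans N(T)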

Exercise 4
...
5
1
...
, T (vn )} be
linearly independent in R(T )
...
, vn } ⊂ V is linearly independent
...
3
...
Let T : R2 −→R3 be defined by
T (1, 0) = (1, 0, 0), T (0, 1) = (1, 0, 0)
...

3
...
Recall the vector space Pn (R)
...

Describe the null space and range space of D
...

5
...

(a) Find T (x, y, z) for x, y, z ∈ R,
(b) Find R(T ) and N (T )
...

(c) Show that T 3 = T and find the matrix of the linear transformation with respect to the standard
basis
...
Let T : R2 −→ R2 be a linear transformation with
T ((3, 4)) = (0, 1), T ((−1, 1)) = (2, 3)
...

7
...

8
...

A −→ B1 −→ B1 −→ B2 · · · −→ Bk−1 −→ Bk −→ B
...

We now state and prove the rank-nullity Theorem
...
3
...

Theorem 4
...
6 (Rank Nullity Theorem) Let T : V −→W be a linear transformation and V be a finite
dimensional vector space
...


78

CHAPTER 4
...
Let dim(V ) = n and dim(N (T )) = r
...
, ur } is a basis of N (T )
...
, ur } is a linearly independent set in V, we can extend it to form a basis of V (see Corollary
3
...
15)
...
, un } such that {u1 ,
...
, un } is a basis of V
...
3
...
, T (un ))
= L(0,
...
, T (un ))
= L(T (ur+1 ), T (ur+2 ),
...

We now prove that the set {T (ur+1), T (ur+2 ),
...
Suppose the set is
not linearly independent
...
, αn , not all zero such that
αr+1 T (ur+1 ) + αr+2 T (ur+2 ) + · · · + αn T (un ) = 0
...

So, by definition of N (T ),
αr+1 ur+1 + αr+2 ur+2 + · · · + αn un ∈ N (T ) = L(u1 ,
...

Hence, there exists scalars αi , 1 ≤ i ≤ r such that
αr+1 ur+1 + αr+2 ur+2 + · · · + αn un = α1 u1 + α2 u2 + · · · + αr ur
...

But the set {u1 , u2 ,
...
Thus by definition of linear
independence
αi = 0 for all i, 1 ≤ i ≤ n
...
, T (un )} is a basis of R(T )
...


Using the Rank-nullity theorem, we give a short proof of the following result
...
3
...
Then
T is one-one ⇐⇒ T is onto ⇐⇒ T is invertible
...
By Proposition 4
...
2, T is one-one if and only if N (T ) = {0}
...
3
...
Or equivalently T is onto
...
But we have shown that T is one-one if and
only if T is onto
...

Remark 4
...
8 Let V be a finite dimensional vector space and let T : V −→V be a linear transformation
...

The following are some of the consequences of the rank-nullity theorem
...


4
...
RANK-NULLITY THEOREM

79

Corollary 4
...
9 The following are equivalent for an m × n real matrix A
...
Rank (A) = k
...
There exist exactly k rows of A that are linearly independent
...
There exist exactly k columns of A that are linearly independent
...
There is a k × k submatrix of A with non-zero determinant and every (k + 1) × (k + 1) submatrix of
A has zero determinant
...
The dimension of the range space of A is k
...
There is a subset of Rm consisting of exactly k linearly independent vectors b1 , b2 ,
...

7
...

Exercise 4
...
10

1
...


(a) If V is finite dimensional then show that the null space and the range space of T are also finite
dimensional
...
if dim(V ) < dim(W ) then T is onto
...
if dim(V ) > dim(W ) then T is not one-one
...
Let A be an m × n real matrix
...
, bm )t such that the system Ax = b
does not have any solution
...
Let A be an m × n matrix
...

[Hint: Define TA : Rn −→Rm by TA (v) = Av for all v ∈ Rn
...
Use Theorem
2
...
1 to show, Ax = 0 has n − r linearly independent solutions
...

Now observe that R(TA ) is the linear span of columns of A and use the rank-nullity Theorem 4
...
6
to get the required result
...
Prove Theorem 2
...
1
...
Define a linear transformation T : Rn −→Rm by T (v) = Av
...
Note that ρ(A) = column rank(A) = dim(R(T )) = ℓ(say)
...

i) Let Ci1 , Ci2 ,
...
Then rank(A) < rank([A b])
implies that {Ci1 , Ci2 ,
...
Hence b ∈ L(Ci1 , Ci2 ,
...
Hence,
the system doesn’t have any solution
...
]
5
...


80

CHAPTER 4
...
Deduce that ρ(T + S) ≤ ρ(T ) + ρ(S)
...

(b) Use the above and the rank-nullity Theorem 4
...
6 to prove ν(T + S) ≥ ν(T ) + ν(S) − n
...
Let V be the complex vector space of all complex polynomials of degree at most n
...
, zk , we define a linear transformation
T : V −→ Ck by T P (z) = P (z1 ), P (z2 ),
...

For each k ≥ 1, determine the dimension of the range space of T
...
Let A be an n × n real matrix with A2 = A
...
Prove that
(a) TA ◦ TA = TA (use the condition A2 = A)
...

Hint: Let x ∈ N (TA ) ∩ R(TA )
...
So,
x = TA (y) = (TA ◦ TA )(y) = TA TA (y) = TA (x) = 0
...

Hint: Let {v1 ,
...
Extend it to get a basis {v1 ,
...
, vn }
of Rn
...
3
...
, TA (vn )} is a basis of R(TA )
...
4

Similarity of Matrices

In the last few sections, the following has been discussed in detail:
Given a finite dimensional vector space V of dimension n, we fixed an ordered basis B
...
Also, for any linear transformation T : V −→V, we got an n × n matrix T [B, B], the matrix of T with
respect to the ordered basis B
...

In this section, we understand the matrix representation of T in terms of different bases B1 and
B2 of V
...
We start with the following
important theorem
...

Theorem 4
...
1 (Composition of Linear Transformations) Let V, W and Z be finite dimensional vector spaces with ordered bases B1 , B2 , B3 , respectively
...
Then the composition map S ◦ T : V −→Z is a linear transformation and
(S ◦ T ) [B1 , B3 ] = S[B2 , B3 ] T [B1 , B2 ]
...
Let B1 = (u1 , u2 ,
...
, vm ) and B3 = (w1 , w2 ,
...
Then
(S ◦ T ) [B1 , B3 ] = [[S ◦ T (u1 )]B3 , [S ◦ T (u2 )]B3 ,
...


4
...
SIMILARITY OF MATRICES

81

Now for 1 ≤ t ≤ n,
m

(S ◦ T ) (ut ) = S(T (ut )) = S

j=1

j=1

(T [B1 , B2 ])jt

p

=

=
j=1

(T [B1 , B2 ])jt S(vj )

p

m

=

m

(T [B1 , B2 ])jt vj
(S[B2 , B3 ])kj wk

k=1

m

(
k=1 j=1

(S[B2 , B3 ])kj (T [B1 , B2 ])jt )wk

p

=

(S[B2 , B3 ] T [B1 , B2 ])kt wk
...
, (S[B2 , B3 ] T [B1 , B2 ])pt )t
...
, [(S ◦ T ) (un )]B3 = S[B2 , B3 ] T [B1 , B2 ]
...

Proposition 4
...
2 Let V be a finite dimensional vector space and let T, S : V −→V be a linear transformations
...

Proof
...

Suppose that v ∈ N (S)
...
So, N (S) ⊂ N (T ◦ S)
...

Suppose dim(V ) = n
...

So, to complete the proof of the second inequality, we need to show that R(T ◦ S) ⊂ R(T )
...

We now prove the first inequality
...
, vk } be a basis of N (S)
...
, vk } ⊂ N (T ◦ S) as
T (0) = 0
...
, vk , u1 , u2 ,
...

Claim: The set {S(u1 ), S(u2 ),
...

As u1 , u2 ,
...
, S(uℓ )} is a subset of N (T )
...
Then there exist non-zero scalars c1 , c2 ,
...



So, the vector
i=1

ci ui ∈ N (S) and is a linear combination of the basis vectors v1 , v2 ,
...


Therefore, there exist scalars α1 , α2 , αk such that


k

c i ui =
i=1

αi vi
...

i=1

82

CHAPTER 4
...
, vk , u1 , u2 ,
...
A contradiction
...
, S(uℓ )} is a linearly independent subset of N (T ) and so ν(T ) ≥ ℓ
...

Recall from Theorem 4
...
8 that if T is an invertible linear Transformation, then T −1 : V −→V is a
linear transformation defined by T −1 (u) = v whenever T (v) = u
...
The reader is required to supply the proof (use Theorem 4
...
1)
...
4
...
Also let T : V −→V be an invertible linear transformation
...

Exercise 4
...
4 For the linear transformations given below, find the matrix T [B, B]
...
Let B = (1, 1, 1), (1, −1, 1), (1, 1, −1) be an ordered basis of R3
...
Is T an invertible linear transformation? Give reasons
...
Let B = (1, x, x2, x3) be an ordered basis of P3 (R)
...

Prove that T is an invertible linear transformation
...

Let V be a vector space with dim(V ) = n
...
, un ) and B2 = (v1 , v2 ,
...
Recall from Definition 4
...
5 that I : V −→V is the identity linear transformation
defined by I(x) = x for every x ∈ V
...
, αn )t and [x]B2 =
(β1 , β2 ,
...

We now express each vector in B2 as a linear combination of the vectors from B1
...


Hence, [I(vi )]B1 = [vi ]B1 = (a1i , a2i , · · · , ani )t and
I[B2 , B1 ] =
=

[[I(v1 )]B1 , [I(v2 )]B1 ,
...


...


...


...


...


...

an1 an2 · · · ann

Thus, we have proved the following result
...
4
...
, un } and B2 = (v1 , v2 ,
...
Suppose x ∈ V with [x]B1 = (α1 , α2 ,
...
, βn )t
...


4
...
SIMILARITY OF MATRICES
Equivalently,

83



 
α1
a11
  
 α2   a21

...


...


...

an2

···
···

...

···


a1n

a2n 

...


...


...

βn

Note: Observe that the identity linear transformation I : V −→V defined by I(x) = x for every
x ∈ V is invertible and
I[B2 , B1 ]−1 = I −1 [B1 , B2 ] = I[B1 , B2 ]
...

Let V be a finite dimensional vector space and let B1 and B2 be two ordered bases of V
...
We are now in a position to relate the two matrices T [B1 , B1 ] and T [B2 , B2 ]
...
4
...
, un ) and B2 =
(v1 , v2 ,
...
Let T : V −→V be a linear transformation with B = T [B1 , B1 ]
and C = T [B2 , B2 ] as matrix representations of T in bases B1 and B2
...
Then BA = AC
...

Proof
...
Using Theorem 4
...
1, the first expression is
[T (x)]B2 = T [B2 , B2 ] [x]B2
...
4
...
4
...


(4
...
2)

Hence, using (4
...
1) and (4
...
2), we see that for every x ∈ V,
I[B1 , B2 ] T [B1, B1 ] I[B2 , B1 ] [x]B2 = T [B2 , B2 ] [x]B2
...

That is, A−1 BA = C or equivalently ACA−1 = B
...
Then for 1 ≤ i ≤ n,
n

n

T (ui ) =

bji uj and T (vi ) =
j=1

cji vj
...
4
...
LINEAR TRANSFORMATIONS

and therefore,



[T (vj )]B1



n

 
 k=1 b1k akj 
a1j
 n



 

b2k akj 
 a2j 


=  k=1
 = B 
...




...


...

 n

anj


bnk akj
k=1

Hence T [B2 , B1 ] = BA
...


...


...




...
We thus have AC = T [B2 , B1 ] = BA
...
Then for
each ordered basis B of V, we get an n × n matrix T [B, B]
...
So, as we change an ordered basis, the matrix of
the linear transformation changes
...
4
...

Now, let A and B be two n × n matrices such that P −1 AP = B for some invertible matrix P
...
Then we have seen that
if the standard basis of Rn is the ordered basis B, then A = TA [B, B]
...
Then
note that B = TA [B1 , B1 ]
...

Remark 4
...
7 The identity (4
...
3) shows how the matrix representation of a linear transformation T
changes if the ordered basis used to compute the matrix representation is changed
...

Definition 4
...
8 (Similar Matrices) Two square matrices B and C of the same order are said to be similar
if there exists a non-singular matrix P such that B = P CP −1 or equivalently BP = P C
...
4
...
Therefore, similar matrices are just
different matrix representations of a single linear transformation
...
4
...
Consider P2 (R), with ordered bases

Example 4
...
10

B1 = 1, 1 + x, 1 + x + x2

and B2 = 1 + x − x2 , 1 + 2x + x2 , 2 + x + x2
...


Therefore,
I[B2 , B1 ]

[[I(1 + x − x2 )]B1 , [I(1 + 2x + x2 )]B1 , [I(2 + x + x2 )]B1 ]

=

[[1 + x − x2 ]B1 , [1 + 2x + x2 ]B1 , [2 + x + x2 ]B1 ]


0
−1 1


1
0
...
Also verify that

T [B2 , B2 ] = I[B1 , B2 ] T [B1 , B1 ] I[B2 , B1 ]

= I −1 [B2 , B1 ] T [B1 , B1 ] I[B2 , B1 ]
...
Consider two bases B1 = (1, 0, 0), (1, 1, 0), (1, 1, 1) and B2 = (1, 1, −1), (1, 2, 1), (2, 1, 1) of R3
...

Then



0 0

T [B1 , B1 ] = 1 1
0 1

Find I[B1 , B2 ] and verify,
Check that,


−2

4 ,
0


−4/5 1 8/5


and T [B2 , B2 ] = −2/5 2 9/5 
...



2 −2 −2


T [B1 , B1 ] I[B2 , B1 ] = I[B2 , B1 ] T [B2 , B2 ] = −2 4
5 
...
4
...
Let V be an n-dimensional vector space and let T : V −→V be a linear transformation
...

(a) Then prove that there exists a vector u ∈ V such that the set
{u, T (u),
...

(b) Let B = (u, T (u),
...
Then prove

0
1


T [B, B] = 0


...


...


0


...


...



...


...
LINEAR TRANSFORMATIONS
(c) Let A be an n × n matrix with the property that An−1 = 0 but An = 0
...

2
...

Let B be the standard basis and B1 = (1, 1, 1), (1, −1, 1), (1, 1, 2) be another ordered basis
...

(b) Find the matrix P such that P −1 T [B, B] P = T [B1 , B1 ]
...
Let T : R3 −→R3 be a linear transformation given by
T ((x, y, z)) = (x, x + y, x + y + z)
...

(a) Find the matrices T [B, B] and T [B1 , B1 ]
...

4
...

(a) Find the change of basis matrix P from B1 to B2
...

(c) Verify that P Q = I = QP
...
What do you notice?

Chapter 5

Inner Product Spaces
We had learned that given vectors i and j (which are at an angle of 90◦ ) in a plane, any vector in the
plane is a linear combination of the vectors i and j
...
To do this, we start by defining a notion of inner
product (dot product) in a vector space
...


5
...
Note
that for any x, y, z ∈ R2 and α ∈ R, this inner product satisfies the conditions
x · (y + αz) = x · y + αx · z, x · y = y · x, and x · x ≥ 0
and x · x = 0 if and only if x = 0
...

Definition 5.1.1 (Inner Product) Let V (F) be a vector space over F. An inner product over V (F), denoted by ⟨ , ⟩, is a map ⟨ , ⟩ : V × V −→ F satisfying, for all u, v, w ∈ V and a, b ∈ F:
1. ⟨au + bv, w⟩ = a⟨u, w⟩ + b⟨v, w⟩,
2. ⟨u, v⟩ equals the complex conjugate of ⟨v, u⟩ (so for F = R, ⟨u, v⟩ = ⟨v, u⟩), and
3. ⟨u, u⟩ ≥ 0 for all u ∈ V, and equality holds if and only if u = 0.
Definition 5.1.2 (Inner Product Space) Let V be a vector space with an inner product ⟨ , ⟩. Then (V, ⟨ , ⟩) is called an inner product space, in short denoted by ips.

Example 5.1.3 The first two examples below are called the standard inner product on Rn and Cn, respectively.

1. Let V = Rn be the real vector space of dimension n. Given two vectors u = (u1, u2, . . . , un) and v = (v1, v2, . . . , vn) of V, we define
⟨u, v⟩ = u1v1 + u2v2 + · · · + unvn = uvᵗ.
Verify that ⟨ , ⟩ is indeed an inner product.

2. Let V = Cn be a complex vector space of dimension n. Given two vectors u = (u1, u2, . . . , un) and v = (v1, v2, . . . , vn) of V, we define
⟨u, v⟩ = u1v1* + u2v2* + · · · + unvn* = uv*,
where * denotes the complex conjugate (the conjugate transpose for the vector v).
3. Let V = R2 and let

    A =  4 −1
        −1  2 .

Define ⟨x, y⟩ = xAyᵗ. Check that ⟨ , ⟩ is an inner product.
Hint: Note that xAyᵗ = 4x1y1 − x1y2 − x2y1 + 2x2y2. In particular, xAxᵗ = 4x1² − 2x1x2 + 2x2² = 3x1² + (x1 − x2)² + x2² ≥ 0, with equality if and only if x = 0.
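Such matrix-defined products can also be screened numerically: linearity and symmetry are identities, and positivity follows since the eigenvalues of A are 3 ± √2 > 0. A Python/numpy spot check:

    import numpy as np

    A = np.array([[4., -1.],
                  [-1., 2.]])

    def ip(x, y):
        return x @ A @ y                    # <x, y> = x A y^t for row vectors

    rng = np.random.default_rng(1)
    x, y, z = rng.normal(size=(3, 2))
    a = 2.7
    assert np.isclose(ip(x + a * z, y), ip(x, y) + a * ip(z, y))  # linearity
    assert np.isclose(ip(x, y), ip(y, x))                          # symmetry, as A = A^t
    assert ip(x, x) > 0                                            # positivity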
let x = (x1 , x2 , x3 ), y = (y1 , y2 , y3 ) ∈ R3
...

5
...
In this example, we define three products that satisfy two conditions
out of the three conditions for an inner product
...

(a) Define x, y = (x1 , x2 ), (y1 , y2 ) = x1 y1
...

2
2
(b) Define x, y = (x1 , x2 ), (y1 , y2 ) = x2 + y1 + x2 + y2
...

3
3
(c) Define x, y = (x1 , x2 ), (y1 , y2 ) = x1 y1 + x2 y2
...


Remark 5
...
4 Note that in parts 1 and 2 of Example 5
...
3, the inner products are uvt and uv∗ ,
respectively
...
In general, u and v are taken as
column vectors and hence one uses the notation ut v or u∗ v
...
1
...
1
...

Definition 5
...
6 (Length/Norm of a Vector) For u ∈ V, we define the length (norm) of u, denoted u ,
by u =
u, u , the positive square root
...
The next theorem gives the statement and a proof of this inequality.

Theorem 5.1.7 (Cauchy-Schwarz Inequality) Let V (F) be an inner product space. Then for any u, v ∈ V,

|⟨u, v⟩| ≤ ‖u‖ ‖v‖.

Equality holds if and only if the vectors u and v are linearly dependent. Further, if u ≠ 0, then equality holds if and only if v = ⟨v, u/‖u‖⟩ (u/‖u‖).

Proof. If u = 0, then the inequality holds trivially. So, let u ≠ 0. Note that ⟨λu + v, λu + v⟩ ≥ 0 for all λ ∈ F. In particular, for λ = −⟨v, u⟩/‖u‖², we get

0 ≤ ⟨λu + v, λu + v⟩ = ‖v‖² − |⟨v, u⟩|² / ‖u‖².

Or, in other words
| v, u |2 ≤ u

2

v

2

and the proof of the inequality is over
...
That is, u
u 2

and v are linearly dependent
...

u

Definition 5.1.8 (Angle between two vectors) Let V be a real vector space. By the Cauchy-Schwarz inequality, for every pair of non-zero vectors u, v ∈ V,

−1 ≤ ⟨u, v⟩ / (‖u‖ ‖v‖) ≤ 1.

We know that cos : [0, π] −→ [−1, 1] is a one-one and onto function. Therefore:

1. There is a unique θ ∈ [0, π] with cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖); this θ is called the angle between u and v.
2. The vectors u and v in V are said to be orthogonal if ⟨u, v⟩ = 0.
3. A set of vectors {u1, u2, . . . , un} is called mutually orthogonal if ⟨ui, uj⟩ = 0 for all 1 ≤ i ≠ j ≤ n.
1
...
Let {e1 , e2 ,
...
Then prove that with respect to the
standard inner product on Rn , the vectors ei satisfy the following:
(a)

ei = 1 for 1 ≤ i ≤ n
...

2
...

(a) Find the angle between the vectors e1 = (1, 0)t and e2 = (0, 1)t
...
Find v ∈ R2 such that v, u = 0
...

3
...
]

a
b

and

(1, 2), (2, −1) = 0
...
Define x, y = yt Ax and solve a system of 3
c

90

CHAPTER 5
...
Let V be a complex vector space with dim(V ) = n
...
, un )
...
, an )t and [v]B = (b1 , b2 ,
...
Show that the above defined map is
indeed an inner product
...
Let x = (x1 , x2 , x3 ), y = (y1 , y2 , y3 ) ∈ R3
...
With respect to this inner product, find the angle between the vectors
(1, 1, 1) and (2, −5, 2)
...
Consider the set Mn×n (R) of all real square matrices of order n
...
Then
A + B, C = tr (A + B)C t = tr(AC t ) + tr(BC t ) = A, C + B, C
...

Let A = (aij )
...
So, it is clear that A, B is an inner product on
Mn×n (R)
...
Let V be the real vector space of all continuous functions with domain [−2π, 2π]
...
Then show that V is an inner product space with inner product −1 f (x)g(x)dx
...

8
...
Prove that
u+v ≤ u + v

for every u, v ∈ V
...

9
...
, zn ∈ C
...


When does the equality hold?
10
...
Observe that x, y = y, x
...


x = y ⇐⇒ x + y, x − y = 0, (x and y form adjacent sides of a rhombus as the diagonals
x + y and x − y are orthogonal)
...


2

(This is called the polarisation identity)
...
1
...
Suppose the norm of a vector is given
...


5
...
DEFINITION AND BASIC PROPERTIES

91

ii
...
The above equality tells us that the lengths of the two diagonals are equal
...
Let x, y ∈ Cn (C)
...


+ ix 2 , even though x, ix = 0
...

,
...
Let V be an n-dimensional inner product space, with an inner product
vector with u = 1
...


(a) Let S ⊥ = {v ∈ V : v, u = 0}
...


(b) Let 0 = α ∈ F and let S = {v ∈ V : v, u = α}
...

(c) For any v ∈ S, there exists a vector v0 ∈ S ⊥ , such that v = v0 + αu
...
1
...
Let {u1 , u2 ,
...

1
...
, un } is linearly independent
...


αi ui
i=1

2

n

=
i=1

|αi |2 ui 2 ;

3
...
, n
...

i=1

In particular, v, ui = 0 for all i = 1, 2,
...

Proof
...
, un }
...
, cn not all zero, such that
c1 u1 + c2 u2 + · · · + cn un = 0
...
This gives a contradiction to our assumption that some of
the ci ’s are non-zero
...

0
if i = j
For the second part, using ui , uj =
for 1 ≤ i, j ≤ n, we have
2
ui
if i = j
n

n

αi ui

2

n

=

i=1

αi ui ,
i=1
n

=

i=1
n

=
i=1

n

αi ui ,
i=1
n

αj ui , uj =

αi
i=1
n

n

αi ui =

j=1

|αi |2 ui 2
...
INNER PRODUCT SPACES

For the third part, observe from the first part, the linear independence of the non-zero mutually
orthogonal vectors u1 , u2 ,
...
Since dim(V ) = n, they form a basis of V
...
Hence,
n

v, uj =

n

αi ui , uj =
i=1

αi ui , uj = αj
...


Definition 5
...
12 (Orthonormal Set) Let V be an inner product space
...
, vn } in V is called an orthonormal set if vi = 1 for i = 1, 2,
...

If the set {v1 , v2 ,
...
, vn } is called an
orthonormal basis of V
...
Consider the vector space R2 with the standard inner product
...
Also, the basis B1 = √ (1, 1), √ (1, −1)
2
2
is an orthonormal set
...
1
...
Let Rn be endowed with the standard inner product
...
1
...
1, the standard ordered
basis (e1 , e2 ,
...

In view of Theorem 5
...
11, we inquire into the question of extracting an orthonormal basis from
a given basis
...

Remark 5
...
14 The last part of the above theorem can be rephrased as “suppose {v1 , v2 ,
...
Then for each u ∈ V the numbers u, vi for 1 ≤ i ≤ n
are the coordinates of u with respect to the above basis”
...
, vn ) be an ordered basis
...
, u, vn )t
...
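For an orthonormal basis these coordinates are just inner products, with no linear system to solve. A short Python/numpy illustration with the orthonormal basis of R2 from Example 5.1.13:

    import numpy as np

    v1 = np.array([1., 1.]) / np.sqrt(2)
    v2 = np.array([1., -1.]) / np.sqrt(2)

    u = np.array([3., -2.])
    c = np.array([u @ v1, u @ v2])          # [u]_B = (<u,v1>, <u,v2>)^t
    assert np.allclose(c[0] * v1 + c[1] * v2, u)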
2

Gram-Schmidt Orthogonalisation Process

Let V be a finite dimensional inner product space
...
, un is a linearly independent subset
of V
...
, un to construct
new vectors v1 , v2 ,
...
, ui } =
Span {v1 , v2 ,
...
, n
...

Suppose we are given two vectors u and v in a plane. Let θ be the angle between the vectors u and v, and let z = u/‖u‖, the unit vector in the direction of u. Define α = ‖v‖ cos(θ) = ⟨u, v⟩/‖u‖ = ⟨z, v⟩, the length of the projection of v on u, and put w = v − ⟨z, v⟩ z, the component of v orthogonal to u. So, the vectors that we are interested in are z and y = w/‖w‖.


5
...
GRAM-SCHMIDT ORTHOGONALISATION PROCESS

93

v
v


u
|| u ||
u

Figure 5
...
2
...
Suppose
{u1 , u2 ,
...
Then there exists a set {v1 , v2 ,
...


2
...
L(v1 , v2 ,
...
, ui ) for 1 ≤ i ≤ n
...
We successively define the vectors v1 , v2 ,
...

v1 =

u1

...

w2

Obtain w3 = u3 − u3 , v1 v1 − u3 , v2 v2 , and let v3 =

w3

...
, vi−1 are already obtained, we compute
wi = ui − ui , v1 v1 − ui , v2 v2 − · · · − ui , vi−1 vi−1 ,
and define
vi =

(5
...
1)

wi

...

u1
For n = 1, we have v1 =

...

u1 2

Hence, the result holds for n = 1
...
That is, suppose we are given any set of k, 1 ≤ k ≤ n − 1
linearly independent vectors {u1 , u2 ,
...
Then by the inductive assumption, there exists a set
{v1 , v2 ,
...


vi = 1 for 1 ≤ i ≤ k,

2
...
INNER PRODUCT SPACES
3
...
, vi ) = L(u1 , u2 ,
...


Now, let us assume that we are given a set of n linearly independent vectors {u1 , u2 ,
...

Then by the inductive assumption, we already have vectors v1 , v2 ,
...


vi = 1 for 1 ≤ i ≤ n − 1,

2
...
L(v1 , v2 ,
...
, ui ) for 1 ≤ i ≤ n − 1
...
2
...

We first show that wn ∈ L(v1 , v2 ,
...
This will also imply that wn = 0 and hence vn =

(5
...
2)
wn
wn

is well defined
...
, vn−1 )
...
, αn−1
such that
wn = α1 v1 + α2 v2 + · · · + αn−1 vn−1
...
2
...

Thus, by the third induction assumption,
un ∈ L(v1 , v2 ,
...
, un−1 )
...
, un } is linear
independent
...
Define vn =

...
Also, it can be easily verified that vn , vi = 0 for
wn
1 ≤ i ≤ n − 1
...

We illustrate the Gram-Schmidt process by the following example.

Example 5.2.2 Let {(1, −1, 1, 1), (1, 0, 1, 0), (0, 1, 0, 1)} be a linearly independent subset of R4. Find an orthonormal set {v1, v2, v3} such that L( (1, −1, 1, 1), (1, 0, 1, 0), (0, 1, 0, 1) ) = L(v1, v2, v3).

Solution: We may apply the process to the vectors in any order, as reordering does not change the linear span; take u1 = (1, 0, 1, 0), u2 = (0, 1, 0, 1) and u3 = (1, −1, 1, 1). Define v1 = u1/‖u1‖ = (1/√2)(1, 0, 1, 0). Then

w2 = (0, 1, 0, 1) − ⟨(0, 1, 0, 1), v1⟩ v1 = (0, 1, 0, 1).

Hence, v2 = (1/√2)(0, 1, 0, 1). Then

w3 = (1, −1, 1, 1) − ⟨(1, −1, 1, 1), v1⟩ v1 − ⟨(1, −1, 1, 1), v2⟩ v2
   = (1, −1, 1, 1) − (1, 0, 1, 0) − 0 = (0, −1, 0, 1),

and v3 = w3/‖w3‖ = (1/√2)(0, −1, 0, 1).
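The procedure in the proof translates directly into code. A minimal Python/numpy sketch (the helper gram_schmidt is our own name, and it assumes its input vectors are linearly independent), reproducing the computation of Example 5.2.2:

    import numpy as np

    def gram_schmidt(vectors):
        # Orthonormalise linearly independent vectors, as in (5.2.1).
        basis = []
        for u in vectors:
            w = u - sum(np.dot(u, v) * v for v in basis)   # subtract projections
            basis.append(w / np.linalg.norm(w))
        return basis

    u1, u2, u3 = map(np.array, [(1., 0., 1., 0.),
                                (0., 1., 0., 1.),
                                (1., -1., 1., 1.)])
    for v in gram_schmidt([u1, u2, u3]):
        print(np.round(v * np.sqrt(2), 10))   # (1,0,1,0), (0,1,0,1), (0,-1,0,1)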

Remark 5
...
3
1
...
, uk } be any basis of a k-dimensional subspace W of Rn
...
, vk } ⊂ Rn with
W = L(v1 , v2 ,
...
, vi ) = L(u1 , u2 ,
...

2
...
, un } of V that are linearly dependent
...
2
...
, uk ) = L(u1 , u2 ,
...

We claim that in this case, wk = 0
...
, ui ) = L(u1 , u2 ,
...
, uk−1 } is linearly independent (use Corollary 3
...
5)
...
2
...
, vk−1 } such that
L(u1 , u2 ,
...
, vk−1 )
...
, vk−1 ), by Remark 5
...
14
uk = uk , v1 v1 + uk , v2 v2 + · · · + uk , vk−1 vn−1
...

Therefore, in this case, we can continue with the Gram-Schmidt process by replacing uk by uk+1
...
Let S be a countably infinite set of linearly independent vectors
...

4
...
, vk } be an orthonormal subset of Rn
...
, en ) be the standard
ordered basis of Rn
...
, αni )t
...
, vk ]
...


...

αn1

···
···

...


α12
α22

...


...


...

αnk

is an n × k matrix
...

s=1



α2 , 
ji 




(5
...
3)

96

CHAPTER 5
...
, vk ]
v1 2
 vt 
 v ,v
 2
 2 1

...


...


...

t
vk
vk , v1


1 0 ··· 0
0 1 · · · 0



...
 = Ik
...


...


...


...

vk , v2

···
···

...

···

v1 , vk
v2 , vk

...


...
2
...


...

α1k

α21
α22

...


...


...

αnk
αn1

···
···

...

···

α12
α22

...


...


...
 = Ik
...


...
Such matrices are called
orthogonal matrices and they have a special role to play
...
2
...

It is worthwhile to solve the following exercises
...
2
...
Let A and B be two n × n orthogonal matrices
...

2
...
Then prove that
(a) the rows of A form an orthonormal basis of Rn
...

(c) for any two vectors x, y ∈ Rn×1 , Ax, Ay = x, y
...


3
...
, un } be an orthonormal basis of Rn
...
Construct an n × n matrix A by

a11

 a21
A = [u1 , u2 ,
...


...

an1
where

B = (e1 , e2 ,
...


...

an2

···
···

...

···


a1n

a2n 

...


...


Prove that At A = In
...

4
...
If A is also an orthogonal matrix, then prove that A = In
...
2
...
2
...
Then there exist matrices Q
and R such that Q is orthogonal and R is upper triangular with A = QR
...
Also, in this case, the
decomposition is unique
...
We prove the theorem when A is non-singular
...

Let the columns of A be x1 , x2 ,
...
The Gram-Schmidt orthogonalisation process applied to the
vectors x1 , x2 ,
...
, un satisfying
L(u1 , u2 ,
...
, xi ),
ui = 1, ui , uj = 0,

for 1 ≤ i = j ≤ n
...
2
...
, un )
...
2
...
, ui ) =
L(x1 , x2 ,
...
So, we can find scalars αji , 1 ≤ j ≤ i such that
xi = α1i u1 + α2i u2 + · · · + αii ui = (α1i ,
...
, 0)t
Let Q = [u1 , u2 ,
...
Then by Exercise 5
...
5
...


...


...


...


(5
...
5)

is an orthogonal matrix
...


...
2
...
, un ] 
...


...


...

αnn
α12
α22

...


...


...
,


α1n

α2n 

...


...
, xn ] = A
...
2
...
4) and R is an upper
triangular matrix
...
But this can be achieved by replacing
the vector ui by −ui whenever αii is negative
...
Observe the following properties of
2
upper triangular matrices
...
The inverse of an upper triangular matrix is also an upper triangular matrix, and
2
...

−1
Thus the matrix R2 R1 is an upper triangular matrix
...
2
...
1, the matrix Q−1 Q1 is
2
−1
an orthogonal matrix
...
2
...
4, R2 R1 = In
...


Suppose we have matrix A = [x1 , x2 ,
...
Then by Remark
5
...
3
...
, ur } of

98

CHAPTER 5
...
In this case, for each i, 1 ≤ i ≤ r, we have
L(u1 , u2 ,
...
, xj ), for some j, i ≤ j ≤ k
...

Theorem 5
...
7 (Generalised QR Decomposition) Let A be an n × k matrix of rank r
...
Q is an n × r matrix with Qt Q = Ir
...
If Q = [u1 , u2 ,
...
, ur ) = L(x1 , x2 ,
...
R is an r × k matrix with rank (R) = r
...
2
...
Let A = 

...

Solution: From Example 5.2.2, we know that

v1 = (1/√2)(1, 0, 1, 0), v2 = (1/√2)(0, 1, 0, 1), v3 = (1/√2)(0, −1, 0, 1).

If we denote u4 = (2, 1, 1, 1)ᵗ, then by the Gram-Schmidt process,

w4 = u4 − ⟨u4, v1⟩ v1 − ⟨u4, v2⟩ v2 − ⟨u4, v3⟩ v3 = ½ (1, 0, −1, 0)ᵗ,

so v4 = (1/√2)(1, 0, −1, 0)ᵗ. Hence, with Q = [v1, v2, v3, v4],

    R =  √2   0   √2   3/√2
          0  √2    0    √2
          0   0   √2    0
          0   0    0   1/√2 .

The readers are advised to check that A = QR is indeed correct.
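numpy provides this factorisation directly through numpy.linalg.qr. The check below assumes the columns of A are the vectors of Example 5.2.2 together with u4 (our reading of the matrix lost in extraction); note that Q is unique only up to the sign convention discussed in the proof.

    import numpy as np

    A = np.array([[1., 0., 1., 2.],
                  [0., 1., -1., 1.],
                  [1., 0., 1., 1.],
                  [0., 1., 1., 1.]])
    Q, R = np.linalg.qr(A)
    assert np.allclose(Q @ R, A)                 # A = QR
    assert np.allclose(Q.T @ Q, np.eye(4))       # Q is orthogonal
    assert np.allclose(R, np.triu(R))            # R is upper triangular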
Let A = 

...

Solution: Let us apply the Gram Schmidt orthogonalisation to the columns of A
...
So, we need to apply the process to the subset {(1, −1, 1, 1), (1, 0, 1, 0), (1, −2, 1, 2), (0, 1, 0, 1)}
of R4
...
2
...
Define v1 =

u1

...
Then
2

w2 = (1, 0, 1, 0) − u2 , v1 v1 = (1, 0, 1, 0) − v1 =
Hence, v2 =

99

1
(1, 1, 1, −1)
...
Let u3 = (1, −2, 1, 2)
...


So, we again take u3 = (0, 1, 0, 1)
...

So, v3 =

(0, 1, 0, 1)


...


0 0
2

(a) rank (A) = 3,
(b) A = QR with Qt Q = I3 , and
(c) R a 3 × 4 upper triangular matrix with rank (R) = 3
...
2
...
Determine an orthonormal basis of R4 containing the vectors (1, −2, 1, 3) and (2, 1, −3, 1)
...
Prove that the polynomials 1, x, 3 x2 − 1 , 5 x3 − 3 x form an orthogonal set of functions in the in2
2 2
2
1
ner product space C[−1, 1] with the inner product f, g = −1 f (t)g(t)dt
...

3
...
Find
an orthonormal basis for the subspace spanned by x, sin x and sin(x + 1)
...
Let M be a subspace of Rn and dim M = m
...

(a) How many linearly independent vectors can be orthogonal to M ?
(b) If M = {(x1 , x2 , x3 ) ∈ R3 : x1 + x2 + x3 = 0}, determine a maximal set of linearly independent
vectors orthogonal to M in R3
...
Determine an orthogonal basis of vector subspace spanned by
{(1, 1, 0, 1), (−1, 1, 1, −1), (0, 2, 1, 0), (1, 0, 0, 0)} in R4
...
Let S = {(1, 1, 1, 1), (1, 2, 0, 1), (2, 2, 4, 0)}
...

7
...
Suppose we have a vector xt = (x1 , x2 ,
...
Then prove the following:
(a) the set {x} can always be extended to form an orthonormal basis of Rn
...
, xn }
...
, en ) is the standard basis of Rn
...
, [xn ]B
...

8
...
Prove that there exists an orthogonal matrix A such that
Av = w
...


100

5
...
INNER PRODUCT SPACES

Orthogonal Projections and Applications

Recall that given a k-dimensional vector subspace of a vector space V of dimension n, one can always
find an (n − k)-dimensional vector subspace W0 of V (see Exercise 3
...
19
...


The subspace W0 is called the complementary subspace of W in V
...

Definition 5
...
1 (Projection Operator) Let V be an n-dimensional vector space and let W be a kdimensional subspace of V
...
Then we define a map PW : V −→ V
by
PW (v) = w, whenever v = w + w0 , w ∈ W, w0 ∈ W0
...

Remark 5
...
2 The map P is well defined due to the following reasons:
1
...

2
...

The next proposition states that the map defined above is a linear transformation from V to V
...

Proposition 5
...
3 The map PW : V −→ V defined above is a linear transformation
...
3
...

1
...
Then W ∩ W0 = {0} and W + W0 = R3
...

So, by definition,
 
0 −1 1 x
 

PW ((x, y, z)) = (z − y, 2z − 2x − y, 3z − 2x − 2y) = −2 −1 2 y 
...
Let W0 = L( (1, 1, 1) )
...
Also, for any vector (x, y, z) ∈ R3 ,
note that (x, y, z) = w + w0 , where
w = (z − y, z − x, 2z − x − y), and w0 = (x + y − z)(1, 1, 1)
...

−1 −1 2
z
Remark 5
...
5

1
...


2
...


5
...
ORTHOGONAL PROJECTIONS AND APPLICATIONS

101

3
...

We now prove some basic properties about projection maps
...
3
...
Let PW : V −→ V be a
projection operator of V onto W along W0
...
the null space of PW , N (PW ) = {v ∈ V : PW (v) = 0} = W0
...
the range space of PW , R(PW ) = {PW (v) : v ∈ V } = W
...
PW = PW
...


Proof
...

Let w0 ∈ W0
...
So, by definition, P (w0 ) = 0
...

Also, for any v ∈ V, let PW (v) = 0 with v = w + w0 for some w0 ∈ W0 and w ∈ W
...
That is, w = 0 and v = w0
...
Hence N (PW ) = W0
...
3
...
Let A be an n × n real matrix with A2 = A
...
Prove that
(a) TA ◦ TA = TA (use the condition A2 = A)
...

Hint: Let x ∈ N (TA ) ∩ R(TA )
...
So,
x = TA (y) = (TA ◦ TA )(y) = TA TA (y) = TA (x) = 0
...

Hint: Let {v1 ,
...
Extend it to get a basis {v1 ,
...
, vn }
of Rn
...
3
...
, TA (vn )} is a basis of R(TA )
...
Then TA is a projection operator of Rn onto W along
W0
...
3
...
7
...
Find all 2 × 2 real matrices A such that A2 = A
...

The next result uses the Gram-Schmidt orthogonalisation process to get the complementary subspace
in such a way that the vectors in different subspaces are orthogonal
...
3
...
Let S be a non-empty
subset of V
...

Example 5
...
9 Let V = R
...
S = {0}
...

2
...

3
...
Then S ⊥ = {0}
...
INNER PRODUCT SPACES

Theorem 5
...
10 Let S be a subset of a finite dimensional inner product space V, with inner product
Then

,
...
S ⊥ is a subspace of V
...
Let S be equal to a subspace W
...
Moreover, if
w ∈ W and u ∈ W ⊥ , then u, w = 0 and V = W + W ⊥
...
We leave the prove of the first part for the reader
...
Let {w1 , w2 ,
...
By Gram-Schmidt orthogonalisation process, we get an orthonormal basis, say, {v1 , v2 ,
...
Then, for any v ∈ V,
k

v−

i=1

v, vi vi ∈ W ⊥
...
Also, for any v ∈ W ∩ W ⊥ , by definition of W ⊥ , 0 = v, v = v 2
...
That
is, W ∩ W ⊥ = {0}
...
3
...
The subspace
W ⊥ is called the orthogonal complement of W in V
...
3
...
Let W = {(x, y, z) ∈ R3 : x + y + z = 0}
...

2
...
Prove that (W ⊥ )⊥ = W
...
Let V be the vector space of all n × n real matrices
...
1
...
6 shows that V is a real
inner product space with the inner product given by A, B = tr(AB t )
...

Definition 5
...
13 (Orthogonal Projection) Let W be a subspace of a finite dimensional inner product
space V, with inner product ,
...
Define PW : V −→ V
by
PW (v) = w where v = w + u, with w ∈ W, and u ∈ W ⊥
...

Definition 5
...
14 (Self-Adjoint Transformation/Operator) Let V be an inner product space with inner
product ,
...

Example 5
...
15
1
...
That is, At = A
...

Solution: By definition, for every xt , yt ∈ Rn ,
TA (x), y = (y)t Ax = (y)t At x = (Ay)t x = x, TA (y)
...

2
...
Then the linear transformation TA : Cn −→ Cn
defined by TA (z) = Az for every zt ∈ Cn is a self-adjoint operator
...
3
...
By Proposition 5
...
3, the map PW defined above is a linear transformation
...
3
...
PW = PW , (I − PW )PW = 0 = PW (I − PW )
...
Let u, v ∈ V with u = u1 + u2 and v = v1 + v2 for some u1 , v1 ∈ W and u2 , v2 ∈ W ⊥
...
Therefore, for every u, v ∈ V,
PW (u), v

=

u1 , v = u1 , v1 + v2 = u1 , v1 = u1 + u2 , v1

=

u, PW (v)
...

4
...
Then PW (w) = w for all w ∈ W
...
3
...
2 and
5
...
16
...

5
...
Thus, v − PW (v), PW (v) − w′ = 0,
for every w′ ∈ W
...


Therefore,
v − w ≥ v − PW (v)
and the equality holds if and only if w = PW (v)
...


That is, PW (v) is the vector nearest to v ∈ W
...
3
...


Matrix of the Orthogonal Projection

The minimization problem stated above arises in lot of applications
...

To this end, let W be a k-dimensional subspace of Rn with W ⊥ as its orthogonal complement
...
Suppose, we are given an orthonormal
basis B = (v1 , v2 ,
...
Under the assumption that B is known, we explicitly give the matrix of
PW with respect to an extended ordered basis of Rn
...
, vk , vk+1
...
Then by Theorem 5
...
11, for any v ∈ Rn , v =

n

v, vi vi
...
Let A = [v1 , v2 ,
...
Consider the standard orthogonal
i=1

104

CHAPTER 5
...
, en ) of Rn
...


...

an1

a12
a22

...


...


...
 , [v]B2

...

ank

···

and



[PW (v)]B2

n

aji ej , for 1 ≤ i ≤ k, then

j=1

n




a1i v, vi 
 i=1
 n




a2i v, vi 
 i=1

=


...




...

=  i=1



...




...
2
...
4, A A = Ik
...


(5
...
1)

Thus, using the associativity of matrix product and (5
...
1), we get

(AAt )(v)

=



a11
a
 12
A
...


...


...

a2k

···
···

...

···

ank

n

n

asi v, vi

as1

s=1
 n


as2

A s=1



 n
ask

i=1
n

asi v, vi
i=1


...


...






n

i=1

 n







= A i=1







 n


asi v, vi
i=1
s=1

k
a1i v, vi

 
 i=1
v, v1
 k
 v, v  
a2i v, vi
2 


A 
...
 

...
 

...

 k
v, vk

ani v, vi
i=1

=


an1
an2 


...


...




...




 n
ani v, vi


i=1

n

as1 asi

s=1
n

as2 asi
s=1


...


...
Thus, we have proved the following theorem
...
3
...
Suppose, B = (v1 , v2 ,
...

Define an n× k matrix A = [v1 , v2 ,
...
Then the matrix of the linear transformation PW in the standard
orthogonal ordered basis (e1 , e2 ,
...


5
...
ORTHOGONAL PROJECTIONS AND APPLICATIONS

105

Example 5
...
18 Let W = {(x, y, z, w) ∈ R4 : x = y, z = w} be a subspace of W
...

2
2

Therefore, if PW : R4 −→ R4 is an orthogonal projection of R4 onto W along W ⊥ , then the corresponding
matrix A is given by
 1


0
2
 1

√
0
 2

...

1

2
1
2

1
...
PW [B, B]2 = PW [B, B], and
3
...

Also, for any (x, y, z, w) ∈ R4 , we have
[(x, y, z, w)]B =
Thus, PW (x, y, z, w) =
vector (x, y, z, w) ∈ R4
...
3
...


z+w
x+y
(1, 1, 0, 0) +
(0, 0, 1, 1) is the closest vector to the subspace W for any
2
2

1
...


2
...
Let B be an orthonormal ordered basis of V
...

3
...
Consider the associated linear transformation
TA : Rn −→ Rn defined by TA (v) = Av for all vt ∈ Rn
...

4
...
Let PW1 and PW2


be the corresponding orthogonal projection operators of V along W1 and W2 , respectively
...


106

CHAPTER 5
...
Let W be an (n− 1)-dimensional vector subspace of Rn and let W ⊥ be its orthogonal complement
...
, vn−1 , vn ) be an orthogonal ordered basis of Rn with (v1 , v2 ,
...
Define a map
T : Rn −→ Rn by T (v) = w0 − w
whenever v = w + w0 for some w ∈ W and w0 ∈ W ⊥
...

T is called the reflection along W ⊥
...
1

Introduction and Definitions

In this chapter, the linear transformations are from a given finite dimensional vector space V to itself
...
So, in this chapter,
all the matrices are square matrices and a vector x means x = (x1 , x2 ,
...

Example 6
...
1 Let A be a real symmetric matrix
...

To solve this, consider the Lagrangian
n

L(x, λ) = xt Ax − λ(xt x − 1) =

n

i=1 j=1

n

aij xi xj − λ(

i=1

x2 − 1)
...

∂xn

Therefore, to get the points of extrema, we solve for
(0, 0,
...
,
) =
= 2(Ax − λx)
...

Example 6
...
2 Consider a system of n ordinary differential equations of the form
d y(t)
= Ay, t ≥ 0;
dt

(6
...
1)

108

CHAPTER 6
...

To get a solution, let us assume that
y(t) = ceλt

(6
...
2)

is a solution of (6
...
1) and look into what λ and c has to satisfy, i
...
, we are investigating for a necessary
condition on λ and c so that (6
...
2) is a solution of (6
...
1)
...
1
...
Differentiating (6
...
2) with respect to t and
substituting in (6
...
1), leads to
λeλt c = Aeλt c or equivalently (A − λI)c = 0
...
1
...
1
...
1
...

That is, given an n × n matrix A, we are this lead to find a pair (λ, c) such that c = 0 and (6
...
3) is satisfied
...
In general, we ask the question:
For what values of λ ∈ F, there exist a non-zero vector x ∈ Fn such that
Ax = λx?

(6
...
4)

Here, Fn stands for either the vector space Rn over R or Cn over C
...
1
...

By Theorem 2
...
1, this system of linear equations has a non-zero solution, if
rank (A − λI) < n,

or equivalently

det(A − λI) = 0
...
1
...
Observe
that det(A − λI) is a polynomial in λ of degree n
...

Definition 6
...
3 (characteristic Polynomial) Let A be a matrix of order n
...
The equation p(λ) = 0 is called the
characteristic equation of A
...

Some books use the term eigenvalue in place of characteristic value
...
1
...
Suppose λ = λ0 ∈ F is a root of the characteristic
equation
...

Proof
...
This shows that the matrix
A − λ0 I is singular and therefore by Theorem 2
...
1 the linear system
(A − λ0 In )x = 0
has a non-zero solution
...
1
...
So, we
consider only those x ∈ Fn that are non-zero and are solutions of the linear system Ax = λx
...
1
...
λ ∈ F is called an eigenvalue of A,

6
...
INTRODUCTION AND DEFINITIONS

109

2
...
the tuple (λ, x) is called an eigenpair
...
1
...

0 1
Consider the matrix A =

...

Given the matrix A, recall the linear transformation TA : F2 −→F2 defined by
TA (x) = Ax for every x ∈ F2
...
If F = C, that is, if A is considered a complex matrix, then the roots of p(λ) = 0 in C are ±i
...

2
...
Therefore,
if F = R, then A has no eigenvalue but it has ±i as characteristic values
...
1
...
Similarly, if x1 , x2 ,
...
, cr ) ∈ Fr , it is easily seen that if
r

r

ci xi = 0, then
i=1

ci xi is also an eigenvector of A corresponding to the eigenvalue λ
...

Suppose λ0 ∈ F is a root of the characteristic equation det(A − λ0 I) = 0
...
Suppose rank (A − λ0 I) = r < n
...
3
...
That is, A has n − r linearly independent
eigenvectors corresponding to the eigenvalue λ0 whenever rank (A − λ0 I) = r < n
...
1
...
Let A = diag(d1 , d2 ,
...
Then p(λ) =
is the characteristic equation
...
, 0)t ), (d2 , (0, 1, 0,
...
, (dn , (0,
...

1 1

...
Hence, the characteristic equation has roots 1, 1
...
Now check that the equation (A − I2 )x = 0 for x = (x1 , x2 )t
is equivalent to the equation x2 = 0
...
Hence, from the above
remark, (1, 0)t is a representative for the eigenvector
...


2
...
Then det(A − λI2 ) = (1 − λ)2
...
Here, the
0 1
matrix that we have is I2 and we know that I2 x = x for every xt ∈ R2 and we can choose any two
linearly independent vectors xt , yt from R2 to get (1, x) and (1, y) as the two eigenpairs
...
Let A =

In general, if x1 , x2 ,
...
, (1, xn )
are eigenpairs for the identity matrix, In
...
EIGENVALUES, EIGENVECTORS AND DIAGONALIZATION
1 2

...
The characteristic equation has roots 3, −1
...
In this case, we have two distinct
eigenvalues and the corresponding eigenvectors are also linearly independent
...


4
...
Then det(A − λI2 ) = λ2 − 2λ + 2
...

1 1
Hence, over R, the matrix A has no eigenvalue
...


5
...
1
...
Find the eigenvalues of a triangular matrix
...
Find eigenpairs over C, for each of the following matrices:
1 0
1
1+i
i
1+i
cos θ − sin θ
,
,
,
, and
0 0
1−i
1
−1 + i
i
sin θ
cos θ

cos θ
sin θ

sin θ

...
Let A and B be similar matrices
...

(b) Let (λ, x) be an eigenpair for A and (λ, y) be an eigenpair for B
...
]
n

4
...
Suppose that for all i, 1 ≤ i ≤ n,

aij = a
...
What is the corresponding eigenvector?
5
...
Construct a 2 × 2 matrix A such
that the eigenvectors of A and At are different
...
Let A be a matrix such that A2 = A (A is called an idempotent matrix)
...

7
...

Then prove that its eigenvalues are all 0
...
1
...
, λn , not necessarily distinct
...

i=1

Proof
...
, λn are the n eigenvalues of A, by definition,
det(A − λIn ) = p(λ) = (−1)n (λ − λ1 )(λ − λ2 ) · · · (λ − λn )
...
1
...
Therefore, by substituting λ = 0 in (6
...
5), we get
n
n

n

det(A) = (−1) (−1)

n

λi =
i=1

λi
...
1
...
1
...


...

an1

a12
a22 − λ

...


...



...

ann − λ

···
···

...

···

a0 − λa1 + λ a2 + · · ·

+(−1)n−1 λn−1 an−1 + (−1)n λn

(6
...
6)

(6
...
7)

for some a0 , a1 ,
...
Note that an−1 , the coefficient of (−1)n−1 λn−1 , comes from the product
(a11 − λ)(a22 − λ) · · · (ann − λ)
...

i=1

But , from (6
...
5) and (6
...
7), we get
a0 − λa1 + λ2 a2 + · · · + (−1)n−1 λn−1 an−1 + (−1)n λn

=

(−1)n (λ − λ1 )(λ − λ2 ) · · · (λ − λn )
...
1
...

i=1

Hence, we get the required result
...
1
...


1
...
Then prove that 0 is an eigenvalue

2
...
If det(A) = 1, then prove that there exists a non-zero
vector v ∈ R3 such that Av = v
...
Then in the proof of the above theorem, we observed that the characteristic equation det(A − λI) = 0 is a polynomial equation of degree n in λ
...
, an−1 ∈ F, it has the form
λn + an−1 λn−1 + an−2 λ2 + · · · a1 λ + a0 = 0
...
Thus, we can only substitute λ by
elements of F
...
This is a celebrated theorem called the Cayley Hamilton Theorem
...

Theorem 6
...
13 (Cayley Hamilton Theorem) Let A be a square matrix of order n
...
That is,
An + an−1 An−1 + an−2 A2 + · · · a1 A + a0 I = 0
holds true as a matrix identity
...
EIGENVALUES, EIGENVECTORS AND DIAGONALIZATION

Some of the implications of Cayley Hamilton Theorem are as follows
...
Then its characteristic polynomial is p(λ) = λ2
...
This shows that the condition f (λ) = 0 for
each eigenvalue λ of A does not imply that f (A) = 0
...
1
...
Let A =

2
...
Then we can use the division algorithm to find numbers α0 , α1 ,
...


Hence, by the Cayley Hamilton Theorem,
Aℓ = α0 I + α1 A + · · · + αn−1 An−1
...

In the language of graph theory, it says the following:
“Let G be a graph on n vertices
...
Then there is no path from v to u of any length
...


3
...
Then note that an = det(A) = 0 and
A−1 =

−1 n−1
[A
+ an−1 An−2 + · · · + a1 I]
...

Note that the vector A−1 (as an element of the vector space of all n × n matrices) is a linear combination
of the vectors I, A,
...


Exercise 6
...
15 Find inverse

2

i) 5
1

of the following matrices


3 4
−1 −1


ii)  1 −1
6 7
1 2
0
1

by using the Cayley Hamilton Theorem



1
1 −2 −1



iii) −2 1 −1
...
1
...
, λk are distinct eigenvalues of a matrix A with corresponding eigenvectors
x1 , x2 ,
...
, xk } is linearly independent
...
The proof is by induction on the number m of eigenvalues
...

Let the result be true for m, 1 ≤ m < k
...
We consider the equation
c1 x1 + c2 x2 + · · · + cm+1 xm+1 = 0

(6
...
9)

for the unknowns c1 , c2 ,
...
We have
0 = A0

= A(c1 x1 + c2 x2 + · · · + cm+1 xm+1 )

= c1 Ax1 + c2 Ax2 + · · · + cm+1 Axm+1
= c1 λ1 x1 + c2 λ2 x2 + · · · + cm+1 λm+1 xm+1
...
1
...
2
...
1
...
1
...

This is an equation in m eigenvectors
...

But the eigenvalues are distinct implies λi − λ1 = 0 for 2 ≤ i ≤ m + 1
...
Also, x1 = 0 and therefore (6
...
9) gives c1 = 0
...

We are thus lead to the following important corollary
...
1
...

Exercise 6
...
18

1
...


(a) A and At have the same set of eigenvalues
...

λ
(c) If λ is an eigenvalue of A then λk is an eigenvalue of Ak for any positive integer k
...

In each case, what can you say about the eigenvectors?
2
...

(a) Do A and B have the same set of eigenvalues?
(b) Give examples to show that the matrices A and B need not be similar
...
Let (λ1 , u) be an eigenpair for a matrix A and let (λ2 , u) be an eigenpair for another matrix B
...

(b) Give an example to show that if λ1 , λ2 are respectively the eigenvalues of A and B, then λ1 + λ2
need not be an eigenvalue of A + B
...
Let λi , 1 ≤ i ≤ n be distinct non-zero eigenvalues of an n × n matrix A
...
Then show that B = {u1 , u2 ,
...
If
[b]B = (c1 , c2 ,
...
2

c2
cn
c1
u1 + u2 + · · · +
un
...

In this section, we ask the question “does there exist a basis B of Fn such that TA [B, B], the matrix of
the linear transformation TA , is in the simplest possible form
...
In
this section, we show that for a certain class of matrices A, we can find a basis B such that TA [B, B] is
a diagonal matrix, consisting of the eigenvalues of A
...
To show the above, we need the following definition
...
EIGENVALUES, EIGENVECTORS AND DIAGONALIZATION

Definition 6
...
1 (Matrix Diagonalization) A matrix A is said to be diagonalizable if there exists a nonsingular matrix P such that P −1 AP is a diagonal matrix
...
2
...
, λn
...
Observe that D = diag(λ1 , λ2 ,
...

Example 6
...
3 Let A =

0 1
−1 0


...
Let V = R2
...
1
...
Hence, there does not exist any non-singular 2 × 2 real matrix P such that
P −1 AP is a diagonal matrix
...
In case, V = C2 (C), the two complex eigenvalues of A are −i, i and the corresponding eigenvectors
are (i, 1)t and (−i, 1)t , respectively
...
Define
i −i
1

...


Theorem 6
...
4 let A be an n×n matrix
...

Proof
...
Then there exist matrices P and D such that
P −1 AP = D = diag(λ1 , λ2 ,
...

Or equivalently, AP = P D
...
, un ]
...

Since ui ’s are the columns of a non-singular matrix P, they are non-zero and so for 1 ≤ i ≤ n, we get
the eigenpairs (di , ui ) of A
...
3
...
, un are linearly independent
...

Conversely, suppose A has n linearly independent eigenvectors ui , 1 ≤ i ≤ n with eigenvalues λi
...
Let P = [u1 , u2 ,
...
Since u1 , u2 ,
...
3
...
Also,
AP

= [Au1 , Au2 ,
...
, λn un ]


λ1 0
0


 0 λ2 0 

...
, un ] 

...


...


...


...


Corollary 6
...
5 let A be an n × n matrix
...
Then A is
diagonalizable
...
2
...
As A is an n×n matrix, it has n eigenvalues
...
1
...
Hence, by Theorem 6
...
4, A is diagonalizable
...
2
...
, λk as its distinct eigenvalues and p(λ) as its
characteristic polynomial
...
Then
A is diagonalizable if and only if dim ker(A − λi I) = mi for each i, 1 ≤ i ≤ k
...

Proof
...
2
...
Also,

k
i=1

mi = n as deg(p(λ)) = n
...
Thus, for each i, 1 ≤ i ≤ k, the homogeneous linear system (A − λi I)x = 0
has exactly mi linearly independent vectors in its solution set
...

Indeed dim ker(A − λi I) = mi for 1 ≤ i ≤ k follows from a simple counting argument
...
Then for each i, 1 ≤ i ≤ k, we can
choose mi linearly independent eigenvectors
...
1
...
Hence A has n =
Hence by Theorem 6
...
4, A is diagonalizable
...

i=1


2 1 1


Example 6
...
7
1
...
Then det(A − λI) = (2 − λ)2 (1 − λ)
...
It is easily seen that 1, (1, 0, −1)t and ( 2, (1, 1, −1)t are the only eigenpairs
...
Hence,
by Theorem 6
...
4, the matrix A is not diagonalizable
...
Let A =  1 2 1 
...
Hence, A has eigenvalues 1, 1, 4
...
Note that the set {(1, −1, 0)t, (1, 0, −1)t } consisting of eigenvectors
corresponding to the eigenvalue 1 are not orthogonal
...
Also, the set {(1, 1, 1), (1, 0, −1), (1, −2, 1)} forms a basis of
 1
1
1 



R
...
2
...
Also, if U = 
3



corresponding unitary matrix then U AU = diag(4, 1, 1)
...
In this case, the eigenvectors are mutually orthogonal
...
This result will be proved later
...
2
...
By finding the eigenvalues of the following matrices, justify whether or not A = P DP −1
for some real non-singular matrix P and a real diagonal matrix D
...

− sin θ cos θ
sin θ − cos θ

116

CHAPTER 6
...
Let A be an n × n matrix and B an m × m matrix
...
Then show that C is
0 B

diagonalizable if and only if both A and B are diagonalizable
...
Let T : R5 −→ R5 be a linear transformation with rank (T − I) = 3 and
N (T ) = {(x1 , x2 , x3 , x4 , x5 ) ∈ R5 | x1 + x4 + x5 = 0, x2 + x3 = 0}
...

4
...
Show that A cannot be diagonalized
...
2
...
]
5
...
3

following matrices diagonalizable?



3 2 1
1 0 −1

2 3 1


 , ii) 0 1 0  ,
0 −1 1
0 0 2
0 0 4


1

iii) 0
0


−3 3

−5 6
...
We
will also be dealing with matrices having complex entries and hence for a matrix A = [aij ], recall the
following definitions
...
3
...


1
...

2
...

(b) a unitary matrix if A A∗ = A∗ A = In
...


(d) a normal matrix if A∗ A = AA∗
...
A square matrix A with real entries is called
(a) a symmetric matrix if At = A
...

(c) a skew-symmetric matrix if At = −A
...
Each of these matrices are normal
...

Example 6
...
2

1
...
Then B is skew-Hermitian
...
3
...
Let A = √2
and B =
i 1
−1

that 2A is also a normal matrix
...
Then A is a unitary matrix and B is a normal matrix
...
3
...
They are called unitarily
equivalent if there exists a unitary matrix U such that A = U ∗ BU
...
3
...
Let A be any matrix
...

2

1
2 (A

+ A∗ ) is the

2
...

3
...

4
...

0 0
3
0 0 3
Proposition 6
...
5 Let A be an n × n Hermitian matrix
...

Proof
...
Then Ax = λx and A = A∗ implies
x∗ A = x∗ A∗ = (Ax)∗ = (λx)∗ = λx∗
...

But x is an eigenvector and hence x = 0 and so the real number x
λ = λ
...


2

= x∗ x is non-zero as well
...
3
...
Then A is unitarily diagonalizable
...

In other words, the eigenvectors of A form an orthonormal basis of Cn
...
We will prove the result by induction on the size of the matrix
...
Let the result be true for n = k − 1
...
So, let A be a k × k
matrix and let (λ1 , x) be an eigenpair of A with x = 1
...
, uk } (using Gram-Schmidt Orthogonalisation) of Ck
...
, uk } is an orthonormal set,

u∗ x = 0 for all i = 2, 3,
...

i
Therefore, observe that for all i, 2 ≤ i ≤ k,
(Aui )∗ x = (ui ∗ A∗ )x = u∗ (A∗ x) = u∗ (Ax) = u∗ (λ1 x) = λ1 (u∗ x) = 0
...
EIGENVALUES, EIGENVECTORS AND DIAGONALIZATION

Hence, we also have x∗ (Aui ) = 0 for 2 ≤ i ≤ k
...
, uk as
columns of U1 )
...
 [λ1 x Au2 · · · Auk ] = 

...



...



...

,
= 


...

0

···
···

...

···


x∗ Auk

u∗ (Auk )
2


...



...
As the matrix U1 is unitary, U1 = U1
...
This condition, together with the fact that λ1 is a real number (use Proposition 6
...
5), implies that B ∗ = B
...
Therefore, by induction
hypothesis there exists a (k − 1) × (k − 1) unitary matrix U2 such that
−1
U2 BU2 = D2 = diag(λ2 ,
...


Recall that , the entries λi , for 2 ≤ i ≤ k are the eigenvalues of the matrix B
...
Hence, the eigenvalues of A are λ1 , λ2 ,
...
Define
1 0
U = U1

...

D2

Thus, U −1 AU is a diagonal matrix with diagonal entries λ1 , λ2 ,
...
Hence,
the result follows
...
3
...
Then
1
...
the corresponding eigenvectors can be chosen to have real entries, and
3
...

Proof
...
Hence, by Proposition 6
...
5, the eigenvalues
of A are all real
...
Suppose xt ∈ Cn
...
So,
Ax = λx =⇒ A(y + iz) = λ(y + iz)
...
3
...
Thus, we can choose the
eigenvectors to have real entries
...
3
...

Exercise 6
...
8
1
...
Then all the eigenvalues of A are either zero or
purely imaginary
...

[Hint: Carefully study the proof of Theorem 6
...
6
...
Let A be an n × n unitary matrix
...

(b) the columns of A form an orthonormal basis of Cn
...


(d) for any vector x ∈ Cn×1 ,

Ax = x
...

(f) the eigenvectors x, y corresponding to distinct eigenvalues λ and µ satisfy x, y = 0
...

3
...
Then, show that if (λ, x) is an eigenpair for A then (λ, x) is an eigenpair
for A∗
...
Is it possible to find a unitary
0 4
−4 −2
matrix U such that A = U ∗ BU ?

4
...
Let A be a 2 × 2 orthogonal matrix
...

cos θ

(b) if det A = −1, then there exists a basis of R2 in which the matrix of A looks like

1 0

...
Describe all 2 × 2 orthogonal matrices
...
Let A = 1 2 1
...

1 1 2

8
...
Then prove the following:
(a) if det(A) = 1, then A is a rotation about a fixed axis, in the sense that A has an eigenpair (1, x)
such that the restriction of A to the plane x⊥ is a two dimensional rotation of x⊥
...

10
9
4 4
and B =
−4 −2
0 4
are similar but not unitarily equivalent, whereas unitary equivalence implies similarity equivalence as
U ∗ = U −1
...
The main reasons being:
Remark 6
...
9 In the previous exercise, we saw that the matrices A =

120

CHAPTER 6
...
Exercise 6
...
8
...

2
...

3
...

We next prove the Schur’s Lemma and use it to show that normal matrices are unitarily diagonalizable
...
3
...

Proof
...
The result is clearly true
if n = 1
...
we will prove the result in case n = k
...
Now the linearly independent set {x} is
extended, using the Gram-Schmidt Orthogonalisation, to get an orthonormal basis {x, u2 , u3 ,
...

Then U1 = [x u2 · · · uk ] (with x, u2 ,
...
 [λ1 x Au2 · · · Auk ] = 
...


...


...
By induction hypothesis there exists a (k − 1) × (k − 1) unitary
−1
matrix U2 such that U2 BU2 is an upper triangular matrix with diagonal entries λ2 ,
...
Observe that since the eigenvalues of B are λ2 ,
...
, λk
...
Then check that U is a unitary matrix and U −1 AU is an upper
0 U2
triangular matrix with diagonal entries λ1 , λ2 ,
...
Hence, the result
follows
...
3
...
Let A be an n × n real invertible matrix
...




√ 
2
1 1 1
2 −1




2
...
Hence, conclude that the upper triangular matrix obtained in the

0 0
2
”Schur’s Lemma” need not be unique
...
Show that the normal matrices are diagonalizable
...

Remark 6
...
12 (The Spectral Theorem for Normal Matrices) Let A be an n × n normal
matrix
...
, xn } of
Cn (C) such that Axi = λi xi for 1 ≤ i ≤ n
...
4
...
Let A be a normal matrix
...

We end this chapter with an application of the theory of diagonalization to the study of conic sections
in analytic geometry and the study of maxima and minima in analysis
...
4

Sylvester’s Law of Inertia and Applications

Definition 6
...
1 (Bilinear Form) Let A be a n × n matrix with real entries
...
, xn )t , y = (y1 , y2 ,
...

i,j=1

Observe that if A = I (the identity matrix) then the bilinear form reduces to the standard real inner
product
...
, n
...

Definition 6
...
2 (Sesquilinear Form) Let A be a n × n matrix with complex entries
...
, xn )t , y = (y1 , y2 ,
...

i,j=1

Note that if A = I (the identity matrix) then the sesquilinear form reduces to the standard complex
inner product
...
Also, if we want H(x, y) = H(y, x) then the matrix A need to be an
Hermitian matrix
...

The expression Q(x, x) is called the quadratic form and H(x, x) the Hermitian form
...
It can be easily shown that for any
choice of x, the Hermitian form H(x) is a real number
...
4
...
, xn )t , and A = [aij ]
...
Then check that A is an Hermitian matrix and for x = (x1 , x2 )t ,
2

the Hermitian form
1
2−i
2+i
2

x1
x2

=

x∗ Ax = (x1 , x2 )

=

H(x)

x1 x1 + 2x2 x2 + (2 − i)x1 x2 + (2 + i)x2 x1

=

|x1 |2 + 2|x2 |2 + 2Re[(2 − i)x1 x2 ]

where ‘Re’ denotes the real part of a complex number
...
Why?

122

CHAPTER 6
...
Note that if we replace x by cx, where c is any complex number, then H(x) simply gets
multiplied by |c|2 and hence one needs to study only those x for which x = 1, i
...
, x is a normalised
vector
...
3
...
3 one knows that if A = A∗ (A is Hermitian) then there exists a unitary matrix
U such that U ∗ AU = D (D = diag(λ1 , λ2 ,
...
So, taking z = U ∗ x (i
...
, choosing zi ’s as linear combination of xj ’s with coefficients
coming from the entries of the matrix U ∗ ), one gets
n






n
2



H(x) = x Ax = z U AU z = z Dz =
i=1

λi |zi | =



λi
i=1

2

n

uji xj


...
4
...
Also, for 1 ≤ i ≤ n,

uji ∗ xj represents the principal axes of the conic

j=1

that they represent in the n-dimensional space
...
4
...
One can easily show that there are more than one way of writing H(x) as sum of
squares
...

This question is answered by ‘Sylvester’s law of inertia’ which we state as the next lemma
...
4
...
, yr are linearly independent linear forms in x1 , x2 ,
...

Proof
...
4
...
Need to show that p
and r are uniquely given by A
...


Since, y = (y1 , y2 ,
...
, zn )t are linear combinations of x1 , x2 ,
...
Choose yp+1 = yp+2 = · · · = yr = 0
...
6
...
, yp such that z1 = z2 = · · · = zq = 0
...

Now, this can hold only if y1 = y2 = · · · = yp = 0, which gives a contradiction
...

Similarly, the case r > s can be resolved
...

We complete this chapter by understanding the graph of
ax2 + 2hxy + by 2 + 2f x + 2gy + c = 0
for a, b, c, f, g, h ∈ R
...


6
...
SYLVESTER’S LAW OF INERTIA AND APPLICATIONS

123

Example 6
...
5 Sketch the graph of 3x2 + 4xy + 3y 2 = 5
...

y

1

2
1

2

1

2
1
− √2

3 2
are (5, (1, 1)t ), (1, (1, −1)t )
...



...


Thus the given graph reduces to
5u2 + v 2 = 5 or equivalently u2 +

v2
= 1
...
That is, the principal
axes are
y + x = 0 and x − y = 0
...


S1

S2

Figure 6
...
EIGENVALUES, EIGENVECTORS AND DIAGONALIZATION

Definition 6
...
6 (Associated Quadratic Form) Let ax2 + 2hxy + by 2 + 2gx + 2f y + c = 0 be the equation
of a general conic
...

We now consider the general conic
...

Proposition 6
...
7 Consider the general conic
ax2 + 2hxy + by 2 + 2gx + 2f y + c = 0
...
an ellipse if ab − h2 > 0,
2
...
a hyperbola if ab − h2 < 0
...
Let A =

a
h

h

...

y

As A is a symmetric matrix, by Corollary 6
...
7, the eigenvalues λ1 , λ2 of A are both real, the corresponding eigenvectors u1 , u2 are orthonormal and A is unitarily diagonalizable with
A=

Let

u
= u1 u2
v

ut
1
ut
2

λ1
0

0
λ2

u1 u2
...
4
...
Then
y
ax2 + 2hxy + by 2 = λ1 u2 + λ2 v 2

and the equation of the conic section in the (u, v)-plane, reduces to
λ1 u2 + λ2 v 2 + 2g1 u + 2f1 v + c = 0
...
λ1 = 0 = λ2
...
4
...
Thus, the given conic reduces to a straight line
2g1 u + 2f1 v + c = 0 in the (u, v)-plane
...
λ1 = 0, λ2 = 0
...

(a) If d2 = d3 = 0, then in the (u, v)-plane, we get the pair of coincident lines v = −d1
...
4
...

d3

...
If λ2 · d3 < 0, the solution set corresponding to the given conic is an empty set
...
If λ2 · d3 > 0, then we get a pair of parallel lines v = −d1 ±

(c) If d2 = 0
...

Also, observe that λ1 = 0 implies that the det(A) = 0
...

3
...

Let λ2 = −α2
...

In this case, we have the following:
(a) suppose d3 = 0
...

The terms on the left can be written as product of two factors as λ1 , α2 > 0
...

(b) suppose d3 = 0
...
So, the equation of the conic reduces to
λ1 (u + d1 )2
α2 (v + d2 )2

= 1
...

As λ1 λ2 < 0, we have
ab − h2 = det(A) = λ1 λ2 < 0
...
λ1 , λ2 > 0
...


we now consider the following cases:
(a) suppose d3 = 0
...

(b) suppose d3 < 0
...
Hence, we do not get any
real ellipse in the (u, v)-plane
...
In this case, the equation of the conic reduces to
λ1 (u + d1 )2
α2 (v + d2 )2
+
= 1
...


126

CHAPTER 6
...


Remark 6
...
8 Observe that the condition
u
= u1 u2
v

x
y

implies that the principal axes of the conic are functions of the eigenvectors u1 and u2
...
4
...
x2 + 2xy + y 2 − 6x − 10y = 3
...
2x2 + 6xy + 3y 2 − 12x − 6y = 5
...
4x2 − 4xy + 2y 2 + 12x − 8y = 10
...
2x2 − 6xy + 5y 2 − 10x + 4y = 7
...

Let
ax2 + by 2 + cz 2 + 2dxy + 2exz + 2f yz + 2lx + 2my + 2nz + q = 0

(6
...
3)

be a general quadric
...
The steps are:
1
...

z

2
...

3
...
Then writing yt = (y1 , y2 , y3 ), the equation (6
...
3) reduces to
2
2
2
λ1 y1 + λ2 y2 + λ3 y3 + 2l1 y1 + 2l2 y2 + 2l3 y3 + q ′ = 0

(6
...
4)

where λ1 , λ2 , λ3 are the eigenvalues of A
...
Complete the squares, if necessary, to write the equation (6
...
4) in terms of the variables z1 , z2 , z3
so that this equation is in the standard form
...
Use the condition y = P t x to determine the centre and the planes of symmetry of the quadric in
terms of the original system
...
4
...
4
...


 
2 1 1
4


 
Solution: In this case, A = 1 2 1 and b = 2 and q = 2
...
So, the equation of the quadric reduces to
1
−2


0 0 1
0
3

6

10
2
2
2
2
2
4y1 + y2 + y3 + √ y1 + √ y2 − √ y3 + 2 = 0
...

12
4 3
2
6

So, the equation of the quadric in standard form is
2
2
2
4z1 + z2 + z3 =

9
,
12

−1 1

where the point (x, y, z)t = P ( 4−53 , √2 , √6 )t = ( −3 , 1 , −3 )t is the centre
...


128

CHAPTER 6
...
1

Introduction and Preliminaries

There are many branches of science and engineering where differential equations naturally arise
...
In this context, the
study of differential equations assumes importance
...
Without
spending more time on motivation, (which will be clear as we go along) let us start with the following
notations
...
The derivatives of
y (with respect to x) are denoted by
y′ =

d2 y
d(k) y
dy ′′
, y = 2 ,
...


The independent variable will be defined for an interval I; where I is either R or an interval a < x <
b ⊂ R
...

Definition 7
...
1 (Ordinary Differential Equation, ODE) An equation of the form
f x, y, y ′ ,
...
1
...
Also,
the unknown function y is to be determined
...
1
...
1
...
, y (n) = 0, and the interval I is not
mentioned in most of the examples
...
y ′ = 6 sin x + 9;
2
...


√ ′ √
y = x + cos y;
2

4
...

5
...


132

CHAPTER 7
...
y ′′ + y = 0
...
y (3) = 0
...
y ′′ + m sin y = 0
...
1
...

In Example 7
...

Definition 7
...
4 (Solution) A function y = f (x) is called a solution of a differential equation on I if
1
...
y satisfies the differential equation for all x ∈ I
...
1
...
Show that y = ce−2x is a solution of y ′ + 2y = 0 on R for a constant c ∈ R
...
By direct differentiation we have y ′ = −2ce−2x = −2y
...
Show that for any constant a ∈ R, y =

a
is a solution of
1−x
(1 − x)y ′ − y = 0

on (−∞, 1) or on (1, ∞)
...

Solution: It can be easily checked
...
1
...
A solution of the form y = g(x) is
called an explicit solution
...

Remark 7
...
7 Since the solution is obtained by integration, we may expect a constant of integration
(for each integration) to appear in a solution of a differential equation
...

To start with, let us try to understand the structure of a first order differential equation of the form
f (x, y, y ′ ) = 0

(7
...
2)

and move to higher orders later
...
1
...
1
...
1
...

Remark 7
...
9 The family of functions {y(
...
In other words, a general solution of Equation (7
...
2) is nothing but a one
parameter family of solutions of the Equation (7
...
2)
...
1
...
Show that for each k ∈ R, y = kex is a solution of y ′ = y
...
Here the parameter is k
...


7
...
INTRODUCTION AND PRELIMINARIES

133

2
...

Solution: This family is represented by the implicit relation
(x − 1)2 + y 2 = a2 ,

(7
...
3)

where a is a real constant
...

dx

(7
...
4)

The function y satisfying Equation (7
...
3) is a one parameter family of solutions or a general solution
of Equation (7
...
4)
...
Consider the one parameter family of circles with center at (c, 0) and unit radius
...
1
...
Show that y satisfies yy ′ + y 2 = 1
...

Now, eliminating c from the two equations, we get
(yy ′ )2 + y 2 = 1
...
1
...
2, we see that y is not defined explicitly as a function of x but implicitly defined
1
by Equation (7
...
3)
...
1
...
2
...

Let us now look at some geometrical interpretations of the differential Equation (7
...
2)
...
1
...
For instance, let us find
1
x
the equation of the curve passing through (0, ) and whose slope at each point (x, y) is −
...

dx
4y
2
It is easy to verify that y satisfies the equation x2 + 4y 2 = 1
...
1
...
Find the order of the following differential equations:

(a) y 2 + sin(y ′ ) = 1
...

(c) (y ′ )3 + y ′′ − 2y 4 = −1
...
Find a differential equation satisfied by the given family of curves:
(a) y = mx, m real (family of lines)
...

(c) x = r2 cos θ, y = r2 sin θ, θ is a parameter of the curve and r is a real number (family of circles
in parametric representation)
...
Find the equation of the curve C which passes through (1, 0) and whose slope at each point (x, y) is
−x

...
DIFFERENTIAL EQUATIONS

7
...
But there are special cases of the function f for which the
above equation can be solved
...


(7
...
1)

Equation (7
...
1) is called a Separable Equation
...
2
...

g(y) dx
Integrating with respect to x, we get
H(x) =

1 dy
dy =
g(y) dx

h(x)dx =

dy
= G(y) + c,
g(y)

where c is a constant
...

Example 7
...
1
1
...

Solution: Here, g(y) = y (y − 1) and h(x) = 1
...


By using partial fractions and integrating, we get
y=

1
,
1 − ex+c

where c is a constant of integration
...
Solve y ′ = y 2
...

x+c
Observe that the solution is defined, only if x + c = 0 for any x
...

ax − 1
Solution: It is easy to deduce that y = −

7
...
1

Equations Reducible to Separable Form

There are many equations which are not of the form 7
...
1, but by a suitable substitution, they can be
reduced to the separable form
...

In this case, we use the substitution, y = xu(x) to get y ′ = xu′ + u
...
For illustration, we consider some examples
...
2
...
2
...
Find the general solution of 2xyy ′ − y 2 + x2 = 0
...
Then
y
y
2 y ′ − ( )2 + 1 = 0
...

2 dx
1+u
x

On integration, we get
1 + u2 =

c
x

or
x2 + y 2 − cx = 0
...

2
4
c
c
This represents a family of circles with center ( 2 , 0) and radius 2
...
Find the equation of the curve passing through (0, 1) and whose slope at each point (x, y) is − 2y
...

dx
2y
Notice that it is a separable equation and it is easy to verify that y satisfies x2 + 2y 2 = 2
...
The equations of the type

dy
a1 x + b 1 y + c1
=
dx
a2 x + b 2 y + c2
can also be solved by the above method by replacing x by x + h and y by y + k, where h and k are to
be chosen such that
a1 h + b 1 k + c1 = 0 = a2 h + b 2 k + c2
...
Thus, if x = 0 then the
dx
a2 x + b 2 y
y

equation reduces to the form y = g( x )
...
2
...
Find the general solutions of the following:

dy
= −x(ln x)(ln y)
...

dx
2
...

dr
dy
x+y
(b) xe
=
, y(0) = 0
...
Obtain the general solutions of the following:
(a)

y
dy
(a) {y − xcosec ( )} = x
...

dy
x−y+2
(c)
=

...
Solve y ′ = y − y 2 and use it to determine lim y
...

x−
→∞

136

7
...
DIFFERENTIAL EQUATIONS

Exact Equations

As remarked, there are no general methods to find a solution of Equation (7
...
2)
...
In this section, we introduce this concept
...
Consider an
equation
dy
= 0, (x, y) ∈ D
...
3
...


(7
...
2)

Definition 7
...
1 (Exact Equation) Equation (7
...
1) is called Exact if there exists a real valued twice continuously differentiable function f : R2 −→R (or the domain is an open subset of R2 ) such that
∂f
∂f
= M and
= N
...
3
...
3
...
3
...

∂x
∂y dx
dx
This implies that f (x, y) = c (where c is a constant) is an implicit solution of Equation (7
...
1)
...
3
...

dy
Example 7
...
3 The equation y + x dx = 0 is an exact equation
...


The proof of the next theorem is given in Appendix 14
...
2
...
3
...
The Equation
(7
...
1) is exact if and only if
∂M
∂N
=

...
3
...
3
...
3
...

Let us consider some examples, where Theorem 7
...
4 can be used to easily find the general solution
...
3
...
Solve
2xey + (x2 ey + cos y )

dy
= 0
...

∂y
∂x

Therefore, the given equation is exact
...

∂y

7
...
EXACT EQUATIONS

137

The first partial differentiation when integrated with respect to x (assuming y to be a constant) gives,
G(x, y) = x2 ey + h(y)
...
Thus, the general solution of
dy
the given equation is
x2 ey + sin y = c
...

2
...
Also, find its general solution
...

∂y
∂x

Hence for the given equation to be exact, m = 2ℓ
...

dx
This equation is not meaningful if ℓ = 0
...

3
...

Solution: Here
M = 3x2 ey − x2 and N = x3 ey + y 2
...
Thus the given equation is exact
...
To determine h(y), we partially differentiate G(x, y) with respect to y and
3
compare with N to get h(y) = y
...


x3
y3
+
=c
3
3

138

CHAPTER 7
...
3
...
But the above equation may become exact, if we multiply it by a proper factor
...
But, if we multiply it with e−x , then the equation reduces to
e−x ydx − e−x dy = 0, or equivalently d e−x y = 0,
an exact equation
...
Formally
Definition 7
...
6 (Integrating Factor) A function Q(x, y) is called an integrating factor for the Equation
(7
...
1), if the equation
Q(x, y)M (x, y)dx + Q(x, y)N (x, y)dy = 0
is exact
...
3
...
Solve the equation ydx − xdy = 0, x, y > 0
...
Multiplying by xy , the equation
reduces to
1
1
ydx −
xdy = 0, or equivalently d (ln x − ln y) = 0
...
Hence, a general solution of the given equation is
xy
1
G(x, y) =
= c, for some constant c ∈ R
...

2
...

Solution: It can be easily verified that the given equation is not exact
...
It may be checked that an integrating factor for the given differential equation is
1
1
=

...
3
...

y x+y
xy x + y

(7
...
6)

Integrating (keeping y constant) Equation (7
...
5), we have
G(x, y) = 4 ln |x| − ln |x + y| + h(y)

(7
...
7)

7
...
EXACT EQUATIONS

139

and integrating (keeping x constant) Equation (7
...
6), we get
G(x, y) = −2 ln |y| − ln |x + y| + g(x)
...
3
...
3
...
3
...
Or equivalently, the solution is
x4 = c x + y y 2
...

Therefore, we suppose that xα y β is an integrating factor for some α, β ∈ R
...

Multiplying the terms M (x, y) and N (x, y) with xα y β , we get
M (x, y) = xα y β 4y 2 + 3xy , and N (x, y) = −xα y β (3xy + 2x2 )
...
That is, the terms
∂y
∂x

4(2 + β)xα y 1+β + 3(1 + β)x1+α y β
and
−3(1 + α)xα y 1+β − 2(2 + α)x1+α y β

y
must be equal
...
That is, the expression 5 is also an
x
integrating factor for the given differential equation
...

x4
x
Thus, we need h(y) = g(x) = c, for some constant c ∈ R
...

G(x, y) = −

Remark 7
...
8
1
...
3
...
3
...

2
...
3
...

3
...
3
...

4
...

(a) Consider a homogeneous equation M (x, y)dx + N (x, y)dy = 0
...


then

1
Mx + Ny

140

CHAPTER 7
...

(c) The equation M (x, y)dx + N (x, y)dy = 0 has e
1 ∂M
∂N

is a function of x alone
...

M
∂y
∂x

g(y)dy

as an integrating factor, if f (x) =

as an integrating factor, if g(y) =

(e) For the equation
yM1 (xy)dx + xN1 (xy)dy = 0
with M x − N y = 0, the function
Exercise 7
...
9

1
is an integrating factor
...
Show that the following equations are exact and hence solve them
...


y
x
dy
(b) (e−x − ln y + ) + (− + ln x + cos y)
= 0
...
Find conditions on the function g(x, y) so that the equation
(x2 + xy 2 ) + {ax2 y 2 + g(x, y)}

dy
=0
dx

is exact
...
What are the conditions on f (x), g(y), φ(x), and ψ(y) so that the equation
(φ(x) + ψ(y)) + (f (x) + g(y))

dy
=0
dx

is exact
...
Verify that the following equations are not exact
...

dy
= 0
...

dx
dy
(c) y + (x + x3 y 2 )
= 0
...

dx
(a) y + (x + x3 y 2 )

5
...

dx
dy
(b) y(xy + 2x2 y 2 ) + x(xy − x2 y 2 )
= 0 with y(1) = 1
...
4
...
4

141

Linear Equations

Some times we might think of a subset or subclass of differential equations which admit explicit solutions
...
In this
context, we have a class of equations, called Linear Equations (to be defined shortly) which admit explicit
solutions
...
4
...
The equation
y ′ + p(x)y = q(x), x ∈ I

(7
...
1)

dy

...
4
...

A first order equation is called a non-linear equation (in the independent variable) if it is neither a linear
homogeneous nor a non-homogeneous linear equation
...
4
...
The equation y ′ = sin y is a non-linear equation
...
The equation y ′ + y = sin x is a linear non-homogeneous equation
...
The equation y ′ + x2 y = 0 is a linear homogeneous equation
...
Multiplying Equation (7
...
1) by eP (x) ,

a

we get

eP (x) y ′ + eP (x) p(x)y = eP (x) q(x) or equivalently

d P (x)
(e
y) = eP (x) q(x)
...


In other words,
y = ce−P (x) + e−P (x)

eP (x) q(x)dx

(7
...
2)

where c is an arbitrary constant is the general solution of Equation (7
...
1)
...
4
...
4
...


(7
...
3)

a

As a simple consequence, we have the following proposition
...
4
...


(7
...
4)

In particular, when p(x) = k, is a constant, the general solution is y = ce−kx , with c an arbitrary constant
...
DIFFERENTIAL EQUATIONS

Example 7
...
5

1
...
4
...


Hence, P (x) = (−1)dx = −x
...
4
...

We can just use the second part of the above proposition to get the above result, as k = −1
...
The general solution of xy ′ = −y, x ∈ I (0 ∈ I) is y = cx−1 , where c is an arbitrary constant
...

x→0,x>0

A class of nonlinear Equations (7
...
1) (named after Bernoulli (1654 − 1705)) can be reduced to linear
equation
...


(7
...
5)

If a = 0 or a = 1, then Equation (7
...
5) is a linear equation
...
We then define
u(x) = y 1−a and therefore
u′ = (1 − a)y ′ y −a = (1 − a)(q(x) − p(x)u)
or equivalently
u′ + (1 − a)p(x)u = (1 − a)q(x),

(7
...
6)

a linear equation
...

Example 7
...
6 For m, n constants and m = 0, solve y ′ − my + ny 2 = 0
...
Then u(x) satisfies
u′ + mu = n
and its solution is
u = Ae−mx + e−mx
Equivalently
y=

nemx dx = Ae−mx +

1
Ae−mx +

n
m

with m = 0 and A an arbitrary constant, is the general solution
...
4
...
In Example 7
...
6, show that u′ + mu = n
...
Find the genral solution of the following:
(a) y ′ + y = 4
...


(c) y ′ − 2xy = 0
...

(e) y ′ + y = e−x
...

(g) (x2 + 1)y ′ + 2xy = x2
...
Solve the following IVP’s:
(a) y ′ − 4y = 5, y(0) = 0
...

m

7
...
MISCELLANEOUS REMARKS

143

(b) y ′ + (1 + x2 )y = 3, y(0) = 0
...

(d) y ′ − y 2 = 1, y(0) = 0
...

4
...
Then show that
y1 + y2 is a solution of
y ′ + a(x)y = b1 (x) + b2 (x)
...
Reduce the following to linear equations and hence solve:
(a) y ′ + 2y = y 2
...

(c) y ′ sin(y) + x cos(y) = x
...

6
...
5

1
y ′ + 4xy + xy 3 = 0, y(0) = √
...
4, we have learned to solve the linear equations
...
Below, we consider a few classes of equations which can
dy
be solved
...
A word of caution is needed here
...

1
...


(7
...
1)

dy
∂f (x, p) ∂f (x, p) dp
dp
=p=
+
·
of equivalently p = g(x, p,
)
...
5
...
5
...
We now assume that Equation
(7
...
2) can be solved for p and its solution is
h(x, p, c) = 0
...
5
...
5
...
5
...
5
...

Solve y = 2px − xp2
...

dx

dp
)(1 − p) = 0
...
DIFFERENTIAL EQUATIONS
That is, either p2 x = c or p = 1
...

x

The first solution is a one-parameter family of solutions, giving us a general solution
...

2
...
If possible we solve for y and we proceed
...
We illustrate it below
...

Solution: We equivalently rewrite the given equation, by (arbitrarily) introducing a new parameter t by
y = a sin t, p = a cos t
from which it follows

dy
dy
dy
= a cos t; p =
=
dt
dx
dt

and so

dx
dt

dx
1 dy
=
= 1 or x = t + c
...

3
...

Find the general solution of x = p3 − p − 1
...
Now, from the given equation, we have
dy
dy dx
=
·
= p(3p2 − 1)
...
The desired solution in this case is in the parametric form, given by
y=

x = t3 − t − 1 and y =

3 4 1 2
t − t +c
4
2

where c is an arbitrary constant
...
5
...
It may not work in all cases
...
Find the general solution of y = (1 + p)x + p2
...
Express the
dp
solution in the parametric form

Exercise 7
...
2

y(p) = (1 + p)x + p2 , x(p) = 2(1 − p) + ce−p
...
Solve the following differential equations:
(a) 8y = x2 + p2
...
6
...

(c) y 2 log y − p2 = 2xyp
...

(e) 2y = 2x2 + 4px + p2
...
6

Initial Value Problems

As we had seen, there are no methods to solve a general equation of the form
y ′ = f (x, y)

(7
...
1)

and in this context two questions may be pertinent
...
Does Equation (7
...
1) admit solutions at all (i
...
, the existence problem)?
2
...
6
...
But there are partial answers if some
additional restrictions on the function f are imposed
...

For a, b ∈ R with a > 0, b > 0, we define
S = {(x, y) ∈ R2 : |x − x0 | ≤ a, |y − y0 | ≤ b} ⊂ I × R
...
6
...
The problem
of finding a solution y of
y ′ = f (x, y), (x, y) ∈ S, x ∈ I with y(x0 ) = y0

(7
...
2)

in a neighbourhood I of x0 (or an open interval I containing x0 ) is called an Initial Value Problem, henceforth
denoted by IVP
...
6
...

Further, we assume that a and b are finite
...

Such an M exists since S is a closed and bounded set and f is a continuous function and let h =
b
min(a, M )
...

Proposition 7
...
2 A function y is a solution of IVP (7
...
2) if and only if y satisfies
x

y = y0 +

f (s, y(s))ds
...
6
...
6
...
Any solution of the IVP (7
...
2) must satisfy the initial condition y(x0 ) = y0
...
6
...


146

CHAPTER 7
...
6
...
6
...
We define
x

y1 = yo +

f (s, y0 )ds
x0

and for n = 2, 3,
...


As yet we have not checked a few things, like whether the point (s, yn (s)) ∈ S or not
...
To get ourselves motivated, let us apply the above method
to the following IVP
...
6
...

Solution: From Proposition 7
...
2, a function y is a solution of the above IVP if and only if
x

y =1−

y(s)ds
...


0

x

y2 = 1 −

0

(1 − s)ds = 1 − x +

x2

...

2!
3!
n!

Note: The solution of the given IVP is
y = e−x and that

lim yn = e−x
...

We now formalise the above procedure
...
6
...
6
...
For x ∈ I with |x −
x0 | ≤ a, define inductively
y0 (x)

= y0 and for n = 1, 2,
...


(7
...
4)

x0

Then y0 , y1 ,
...
are called Picard’s successive approximations to the IVP (7
...
2)
...
6
...

Proposition 7
...
5 The Picard’s approximates yn ’s, for the IVP (7
...
2) defined by Equation (7
...
4) is well
b
defined on the interval |x − x0 | ≤ h = min{a, M }, i
...
, for x ∈ [x0 − h, x0 + h]
...
6
...
We have to verify that for each n = 0, 1, 2,
...
This is needed due to the reason that f (s, yn ) appearing as integrand in Equation (7
...
4)
may not be defined
...
For
n = 1, we notice that, if |x − x0 | ≤ h then
|y1 − y0 | ≤ M |x − x0 | ≤ M h ≤ b
...

The rest of the proof is by the method of induction
...

Assume that for k = 1, 2,
...
Now, by definition of yn , we have
x

yn − y0 =

f (s, yn−1 )ds
...

This shows that (x, yn ) ∈ S whenever |x − x0 | ≤ h
...

Let us again come back to Example 7
...
3 in the light of Proposition 7
...
2
...
6
...


(7
...
5)

Solution: Note that x0 = 0, y0 = 1, f (x, y) = −y, and a = b = 1
...

By Proposition 7
...
2, on this set
M = max{|y| : (x, y) ∈ S} = 2 and h = min{1, 1/2} = 1/2
...
6
...

Observe that the exact solution y = e−x and the approximate solutions yn ’s of Example 7
...
3 exist
1 1
on [−1, 1]
...

2 2
That is, for any IVP, the approximate solutions yn ’s may exist on a larger interval as compared to
the interval obtained by the application of the Proposition 7
...
2
...

Example 7
...
7 Find the Picard’s successive approximations for the IVP
y ′ = f (y), 0 ≤ x ≤ 1, y ≥ 0 and y(0) = 0;
where
f (y) =


y for y ≥ 0
...
6
...
DIFFERENTIAL EQUATIONS

Solution: By definition y0 (x) = y0 ≡ 0 and

x

y1 (x) = y0 +

x

f (y0 )ds = 0 +
0


0ds = 0
...
and lim yn (x) ≡ 0
...
6
...

x2
x2
Also y(x) =
, 0 ≤ x ≤ 1 is a solution of Equation (7
...
6) and the {yn }’s do not converge to

...
6
...

The following result is about the existence of a unique solution to a class of IVPs
...

Theorem 7
...
8 (Picard’s Theorem on Existence and Uniqueness) Let S = {(x, y) : |x − x0 | ≤ a, |y −
∂f
y0 | ≤ b}, and a, b > 0
...
Also, let M, K ∈ R
∂y
be constants such that
∂f
|f | ≤ M, | | ≤ K on S
...
Then the sequence of successive approximations {yn } (defined by Equation (7
...
4))
for the IVP (7
...
2) uniformly converges on |x − x0 | ≤ h to a solution of IVP (7
...
2)
...
6
...

Remark 7
...
9 The theorem asserts the existence of a unique solution on a subinterval |x − x0 | ≤ h of
the given interval |x − x0 | ≤ a
...
A natural question is whether the solution exists on the whole
of the interval |x − x0 | ≤ a
...

Whenever we talk of the Picard’s theorem, we mean it in this local sense
...
6
...
Compute the sequence {yn } of the successive approximations to the IVP
y ′ = y (y − 1), y(x0 ) = 0, x0 ≥ 0
...
Show that the solution of the IVP
y ′ = y (y − 1), y(x0 ) = 1, x0 ≥ 0
is y ≡ 1, x ≥ x0
...
The IVP
y′ =


y, y(0) = 0, x ≥ 0

x2
has solutions y1 ≡ 0 as well as y2 =
, x ≥ 0
...
Consider the IVP
y ′ = y, y(0) = 1 in {(x, y) : |x| ≤ a, |y| ≤ b}
for any a, b > 0
...
6
...

(b) Show that y = ex is the solution of the IVP which exists on whole of R
...
6
...


7
...
INITIAL VALUE PROBLEMS

7
...
1

149

Orthogonal Trajectories

One among the many applications of differential equations is to find curves that intersect a given family
of curves at right angles
...
It is important to
note that we are not insisting that Γ should intersect every member of F, but if they intersect, the
angle between their tangents, at every point of intersection, is 90◦
...
That is, at the common point of intersection, the tangents are
orthogonal
...

Before procedding to an example, let us note that at the common point of intersection, the product
of the slopes of the tangent is −1
...
This gives the slope at any point
(x, y) and is independent of the choice of the curve
...

Example 7
...
11 Compute the orthogonal trajectories of the family F of curves given by
F :

y 2 = cx3 ,

(7
...
7)

where c is an arbitrary constant
...
6
...


(7
...
8)

Elimination of c between Equations (7
...
7) and (7
...
8), leads to
y′ =

3cx2
3 cx3
3y
=
·
=

...
6
...

3y

Solving this differential equation, we get
y2 = −

x2
+ c
...

3
Below, we summarize how to determine the orthogonal trajectories
...
6
...
Equation (7
...
10) is obtained by the elimination of
the constant c appearing in F (x, y, c) = 0 “using the equation obtained by differentiating this equation
with respect to x”
...

f (x, y)

(7
...
11)

Final Step: The general solution of Equation (7
...
11) is the orthogonal trajectories of the given family
...


150

CHAPTER 7
...
6
...
6
...

Solution: Differentiating Equation (7
...
12), we get y ′ = m
...
6
...
Or equivalently,
y−1
y′ =

...

1−y

(7
...
13)

It can be easily verified that the general solution of Equation (7
...
13) is
x2 + y 2 − 2y = c,

(7
...
14)

where c is an arbitrary constant
...
6
...
6
...

Exercise 7
...
13
1
...

(a) y = x + c
...

(c) y 2 = x + c
...

(e) x2 − y 2 = c
...
Show that the one parameter family of curves y 2 = 4k(k + x), k ∈ R are self orthogonal
...
Find the orthogonal trajectories of the family of circles passing through the points (1, −2) and (1, 2)
...
7

Numerical Methods

All said and done, the Picard’s Successive approximations is not suitable for computations on computers
...


(7
...
1)

In this section, we study a simple method to find the “numerical solutions” of Equation (7
...
1)
...
7
...
What is presented here is
at a very rudimentary level nevertheless it gives a flavour of the numerical method
...
In such case, we have
h2
y(x + h) = y + hy ′ + y ′′ + · · ·
2!

7
...
NUMERICAL METHODS

x0

x1

151

x2

xn = x

Figure 7
...
With this in mind, let us think of finding y, where y is the solution of
x − x0
and define
Equation (7
...
1) with x > x0
...
, n
...
, x = xn
...
Define y1 =
y0 + hf (x0 , y0 )
...

Similarly, we define y2 = y1 + hf (x1 , y1 ) and we approximate y(x0 + 2h) = y(x2 ) ≃ y1 + hf (x1 , y1 ) = y2
and so on
...
, n − 1
...
, yn is called the Euler’s method
...
7
...
, (xn , yn )
...
2: Approximate Solution

152

CHAPTER 7
...
1

Introduction

Second order and higher order equations occur frequently in science and engineering (like pendulum
problem etc
...
It has its own flavour also
...

Definition 8
...
1 (Second Order Linear Differential Equation) The equation
p(x)y ′′ + q(x)y ′ + r(x)y = c(x), x ∈ I

(8
...
1)

is called a second order linear differential equation
...
The functions p(·), q(·), and r(·) are called the coefficients of Equation (8
...
1) and
c(x) is called the non-homogeneous term or the force function
...
1
...

Recall that a second order equation is called nonlinear if it is not linear
...
1
...
The equation
y ′′ +

9
sin y = 0


is a second order equation which is nonlinear
...
y ′′ − y = 0 is an example of a linear second order equation
...
y ′′ + y ′ + y = sin x is a non-homogeneous linear second order equation
...
ax2 y ′′ + bxy ′ + cy = 0 c = 0 is a homogeneous second order linear equation
...
Here a, b, and c are real constants
...
1
...
1
...
1
...

Example 8
...
4

1
...


2
...


154

CHAPTER 8
...

Theorem 8
...
5 (Superposition Principle) Let y1 and y2 be two given solutions of
p(x)y ′′ + q(x)y ′ + r(x)y = 0, x ∈ I
...
1
...
1
...

It is to be noted here that Theorem 8
...
5 is not an existence theorem
...
1
...

Definition 8
...
6 (Solution Space) The set of solutions of a differential equation is called the solution space
...
1
...
Note that y(x) ≡ 0 is
also a solution of Equation (8
...
2)
...
1
...
A
moments reflection on Theorem 8
...
5 tells us that the solution space of Equation (8
...
2) forms a real
vector space
...
1
...
That
is, the solution space of a homogeneous linear differential equation is a real vector space
...
This question will be answered in a sequence
of results stated below
...

Definition 8
...
8 (Linear Dependence and Linear Independence) Let I be an interval in R and let f, g :
I −→ R be continuous functions
...

The functions f (·), g(·) are said to be linearly independent if f (·), g(·) are not linear dependent
...
1
...

In other words, we consider a homogeneous linear equation
y ′′ + q(x)y ′ + r(x)y = 0, x ∈ I,

(8
...
3)

where q and r are real valued continuous functions defined on I
...
1
...

Theorem 8
...
9 (Picard’s Theorem on Existence and Uniqueness) Consider the Equation (8
...
3) along
with the conditions
y(x0 ) = A, y ′ (x0 ) = B, for some x0 ∈ I
(8
...
4)
where A and B are prescribed real constants
...
1
...
1
...

A word of Caution: Note that the coefficient of y ′′ in Equation (8
...
3) is 1
...
1
...

An important application of Theorem 8
...
9 is that the equation (8
...
3) has exactly 2 linearly independent solutions
...


8
...
INTRODUCTION

155

Theorem 8
...
10 Let q and r be real valued continuous functions on I
...
1
...
Moreover, if y1 and y2 are two linearly independent solutions of Equation
(8
...
3), then the solution space is a linear combination of y1 and y2
...
Let y1 and y2 be two unique solutions of Equation (8
...
3) with initial conditions

y1 (x0 ) = 1, y1 (x0 ) = 0,


and y2 (x0 ) = 0, y2 (x0 ) = 1 for some x0 ∈ I
...
1
...
1
...
We now claim that y1 and y2 are
linearly independent
...
1
...
If we can show that the only solution for the system (8
...
6) is α = β = 0,
then the two solutions y1 and y2 will be linearly independent
...
Hence the
result follows
...
1
...
Let ζ be
any solution of Equation (8
...
3) and let d1 = ζ(x0 ) and d2 = ζ ′ (x0 )
...

By Definition 8
...
3, φ is a solution of Equation (8
...
3)
...
So, φ
and ζ are two solution of Equation (8
...
3) with the same initial conditions
...
1
...

Thus, the equation (8
...
3) has two linearly independent solutions
...
1
...


1
...
1
...
The solutions y1 and y2 corresponding to the initial conditions

y1 (x0 ) = 1, y1 (x0 ) = 0,


and y2 (x0 ) = 0, y2 (x0 ) = 1 for some x0 ∈ I,

are called a fundamental system of solutions for Equation (8
...
3)
...
Note that the fundamental system for Equation (8
...
3) is not unique
...
Let {y1 , y2 } be a fundamental
c d
system for the differential Equation 8
...
3 and yt = [y1 , y2 ]
...
1
...
That is, if {y1 , y2 } is a fundamental
cy1 + dy2
system for Equation 8
...
3 then {ay1 + by2 , cy1 + dy2 } is also a fundamental system whenever
ad − bc = det(A) = 0
...
1
...

Note that {1 − x, 1 + x} is also a fundamental system
...

1 1

Exercise 8
...
13
1
...


156

CHAPTER 8
...

(b) y ′′ + (y ′ )2 + y sin x = 0
...

(d) (x2 + 1)y ′′ + (x2 + 1)2 y ′ − 5y = sin x
...
By showing that y1 = ex and y2 = e−x are solutions of
y ′′ − y = 0
conclude that sinh x and cosh x are also solutions of y ′′ − y = 0
...
Given that {sin x, cos x} forms a basis for the solution space of y ′′ + y = 0, find another basis
...
2

More on Second Order Equations

In this section, we wish to study some more properties of second order equations which have nice
applications
...

Definition 8
...
1 (General Solution) Let y1 and y2 be a fundamental system of solutions for
y ′′ + q(x)y ′ + r(x)y = 0, x ∈ I
...
2
...
2
...
Note that y is also a solution of Equation (8
...
1)
...
2
...


8
...
1

Wronskian

In this subsection, we discuss the linear independence or dependence of two solutions of Equation (8
...
1)
...
2
...
For x ∈ I, define
W (y1 , y2 )

:=
=

y1
y2


y1

y2



y1 y2 − y1 y2
...

Example 8
...
3

1
...
Then
W (y1 , y2 ) =

sin x
cos x

cos x
≡ −1 for all x ∈ I
...


(8
...
2)

8
...
MORE ON SECOND ORDER EQUATIONS

157



2
...
Let us now compute y1 and y2
...

Therefore, for x ≥ 0,

y1
y2

W (y1 , y2 ) =


y1
x3
= 3

y2
x

3x2
=0
3x2

and for x < 0,
W (y1 , y2 ) =

y1
y2


y1
−x3
=

y2
x3

−3x2
= 0
...

It is also easy to note that y1 , y2 are linearly independent on (−1, 1)
...

Given two solutions y1 and y2 of Equation (8
...
1), we have a characterisation for y1 and y2 to be
linearly independent
...
2
...
Let y1 and y2 be two solutions of Equation (8
...
1)
...
Then for any x ∈ I,
x

W (y1 , y2 ) = W (y1 , y2 )(x0 ) exp(−

q(s)ds)
...
2
...

Proof
...


So
d
W (y1 , y2 )
dx

′′
′′
= y1 y2 − y1 y2

(8
...
4)



= y1 (−q(x)y2 − r(x)y2 ) − (−q(x)y1 − r(x)y1 ) y2

(8
...
5)


y1 y2

(8
...
6)

= −q(x)W (y1 , y2 )
...
2
...

x0

This completes the proof of the first part
...
Alternatively, W (y1 , y2 ) satisfies a first order linear homogeneous equation and therefore
W (y1 , y2 ) ≡ 0 if and only if W (y1 , y2 )(x0 ) = 0
...
2
...
If the Wronskian W (y1 , y2 ) of two solutions y1 , y2 of (8
...
1) vanish at a point
x0 ∈ I, then W (y1 , y2 ) is identically zero on I
...
SECOND ORDER AND HIGHER ORDER EQUATIONS

2
...
2
...

Theorem 8
...
6 Let y1 and y2 be any two solutions of Equation (8
...
1)
...
Then y1
and y2 are linearly independent on I if and only if W (y1 , y2 )(x0 ) = 0
...
Let y1 , y2 be linearly independent on I
...

Suppose not
...
So, by Theorem 2
...
1 the equations


c1 y1 (x0 ) + c2 y2 (x0 ) = 0 and c1 y1 (x0 ) + c2 y2 (x0 ) = 0

(8
...
8)



admits a non-zero solution d1 , d2
...
)
Let y = d1 y1 + d2 y2
...
2
...

Therefore, by Picard’s Theorem on existence and uniqueness of solutions (see Theorem 8
...
9), the solution y ≡ 0 on I
...
That is, y1 , y2 is linearly
dependent on I
...
Therefore, W (y1 , y2 )(x0 ) = 0
...

Suppose that W (y1 , y2 )(x0 ) = 0 for some x0 ∈ I
...
2
...
Suppose that c1 y1 (x) + c2 y2 (x) = 0 for all x ∈ I
...

Since x0 ∈ I, in particular, we consider the linear system of equations


c1 y1 (x0 ) + c2 y2 (x0 ) = 0 and c1 y1 (x0 ) + c2 y2 (x0 ) = 0
...
2
...
6
...
2
...
So, by Definition 8
...
8, y1 , y2 are linearly independent
...
2
...
The interval I = (−1, 1)
...
y1 = x2 |x|, y2 = x3 and W (y1 , y2 ) ≡ 0 for all x ∈ I
...
The functions y1 and y2 are linearly independent
...
2
...
2
...

The following corollary is a consequence of Theorem 8
...
6
...
2
...
2
...
Let y be any solution
of Equation (8
...
1)
...

Proof
...
Let y(x0 ) = a, y ′ (x0 ) = b
...

Also for any x0 ∈ I, by Theorem 8
...
6, W (y1 , y2 )(x0 ) = 0 as y1 , y2 are linearly independent solutions of
Equation (8
...
1)
...
6
...
2
...

Define ζ(x) = d1 y1 + d2 y2 for x ∈ I
...
2
...
Hence, by Picard’s Theorem on existence and uniqueness (see Theorem 8
...
9), ζ = y for all
x ∈ I
...


8
...
MORE ON SECOND ORDER EQUATIONS
Exercise 8
...
9
W (y1 , y2 )
...
Let y1 and y2 be any two linearly independent solutions of y ′′ + a(x)y = 0
...
Let y1 and y2 be any two linearly independent solutions of
y ′′ + a(x)y ′ + b(x)y = 0, x ∈ I
...

3
...
[Hint: Use Exercise 8
...
9
...
]

8
...
2

Method of Reduction of Order

We are going to show that in order to find a fundamental system for Equation (8
...
1), it is sufficient to
have the knowledge of a solution of Equation (8
...
1)
...
2
...
2
...
2
...
The method is described below and is usually called the
method of reduction of order
...
2
...
Assume that y2 = u(x)y1 is a solution
of Equation (8
...
1), where u is to be determined
...
2
...


By letting u′ = v, and observing that y1 is a solution of Equation (8
...
1), we have

v ′ y1 + v(2y1 + py1 ) = 0

which is same as

d
2
2
(vy1 ) = −p(vy1 )
...


Substituting v = u′ and integrating we get
x

u=
x0

1
2 (s) exp(−
y1

s
x0

p(t)dt)ds, x0 ∈ I

and hence a second solution of Equation (8
...
1) is
x

y2 = y1
x0

1

exp(−
2
y1 (s)

s

p(t)dt)ds
...
That is, {y1 , y2 } form a fundamental system for Equation (8
...
1)
...


160

CHAPTER 8
...
2
...
2
...
2
...

4
Solution: With the notations used above, note that x0 = 1, p(x) = , and y2 (x) = u(x)y1 (x), where u
x
is given by
x

u =
1
x

=
1
x

=
1

s
1
exp −
p(t)dt ds
2
y1 (s)
1
1
4
2 (s) exp ln(s ) ds
y1
s2
1
ds = 1 − ;
4
s
x

where A and B are constants
...

x x2
1
1
1
1
Since the term
already appears in y1 , we can take y2 = 2
...
2
...

y2 (x) =

Exercise 8
...
11 In the following, use the given solution y1 , to find another solution y2 so that the two
solutions y1 and y2 are linearly independent
...
y ′′ = 0, y1 = 1, x ≥ 0
...
y ′′ + 2y ′ + y = 0, y1 = ex , x ≥ 0
...
x2 y ′′ − xy ′ + y = 0, y1 = x, x ≥ 1
...
xy ′′ + y ′ = 0, y1 = 1, x ≥ 1
...
y ′′ + xy ′ − y = 0, y1 = x, x ≥ 1
...
3

Second Order equations with Constant Coefficients

Definition 8
...
1 Let a and b be constant real numbers
...
3
...

Let us assume that y = eλx to be a solution of Equation (8
...
1) (where λ is a constant, and is to be
determined)
...

It is easy to note that
L(eλx ) = p(λ)eλx
...
3
...


(8
...
2)

8
...
SECOND ORDER EQUATIONS WITH CONSTANT COEFFICIENTS

161

Equation (8
...
2) is called the characteristic equation of Equation (8
...
1)
...
3
...

Case 1: Let λ1 , λ2 be real roots of Equation (8
...
2) with λ1 = λ2
...
3
...
That is, {eλ1 x , eλ2 x } forms a fundamental system of solutions of Equation (8
...
1)
...

Then p′ (λ1 ) = 0
...

dx
But p′ (λ1 ) = 0 and therefore,
L(xeλ1 x ) = 0
...
3
...
In this case, we have
a fundamental system of solutions of Equation (8
...
1)
...
3
...

So, α − iβ is also a root of Equation (8
...
2)
...
3
...
3
...

Then u and v are solutions of Equation (8
...
1)
...
3
...
3
...

Proof
...

Let λ = α + iβ be a complex root of p(λ) = 0
...
3
...
By Lemma 8
...
2, y1 = eαx cos(βx) and y2 = sin(βx) are
solutions of Equation (8
...
1)
...
It is as good as
saying {eλx cos(βx), eλx sin(βx)} forms a fundamental system of solutions of Equation (8
...
1)
...
3
...
Find the general solution of the follwoing equations
...

(b) 2y ′′ + 5y = 0
...


(d) y ′′ + k 2 y = 0, where k is a real constant
...
Solve the following IVP’s
...

(b) y ′′ − y = 0, y(0) = 1, y ′ (0) = 1
...

(d) y ′′ + 4y ′ + 4y = 0, y(0) = 1, y ′ (0) = 0
...
Find two linearly independent solutions y1 and y2 of the following equations
...

(b) y ′′ + 6y ′ + 5y = 0
...


162

CHAPTER 8
...
Also, in each case, find W (y1 , y2 )
...
Show that the IVP
y ′′ + y = 0, y(0) = 0 and y ′ (0) = B
has a unique solution for any real number B
...
Consider the problem
y ′′ + y = 0, y(0) = 0 and y ′ (π) = B
...
3
...
Compare this with Exercise 4
...
3
...


8
...
we assume that q(·), r(·) and f (·) are real valued
continuous function defined on I
...

(8
...
1)
We assume that the functions q(·), r(·) and f (·) are known/given
...
4
...
The equation
y ′′ + q(x)y ′ + r(x)y = 0
...
4
...
4
...

Consider the set of all twice differentiable functions defined on I
...

Then (8
...
1) and (8
...
2) can be rewritten in the (compact) form
L(y) = f

(8
...
3)

L(y) = 0
...
4
...
4
...
4
...

Theorem 8
...
1
(8
...
2)
...
Let y1 and y2 be two solutions of (8
...
1) on I
...
Let z be any solution of (8
...
1) on I and let z1 be any solution of (8
...
2)
...
4
...

Proof
...
We
therefore have
L(y1 ) = f and L(y2 ) = f
...
4
...

For the proof of second part, note that
L(z) = f and L(z1 ) = 0
implies that
L(z + z1 ) = L(z) + L(z1 ) = f
...
4
...

The above result leads us to the following definition
...
4
...
4
...
4
...
4
...
4
...
4
...

We now prove that the solution of (8
...
1) with initial conditions is unique
...
4
...
Let y1 and y2 be two solutions of the IVP
y ′′ + qy ′ + ry = f, y(x0 ) = a, y ′ (x0 ) = b
...
4
...

Proof
...
Then z satisfies
L(z) = 0, z(x0 ) = 0, z ′ (x0 ) = 0
...
1
...
Or in other words, y1 ≡ y2 on I
...
4
...
e
...
4
...
4
...
4
...
4
...
To repeat, the two steps needed to solve (8
...
1), are:
1
...
4
...
compute a particular solution of (8
...
1)
...

Step 1
...
The remainder of the section is devoted to step 2
...
e
...
4
...

Exercise 8
...
5

1
...
(You may note here that y = −x is a particular solution
...
(First show that y = sin x is a particular solution
...
Solve the following IVPs:
(a) y ′′ + y = 2ex , y(0) = 0 = y ′ (0)
...
)
(b) y ′′ − y = −2 cos x, y(0) = 0, y ′ (0) = 1
...
4
...
1b )
3
...
Let yi ’s be particular solutions of
y ′′ + q(x)y ′ + r(x)y = fi (x), i = 1, 2;
where q(x) and r(x) are continuous functions
...


164

8
...
SECOND ORDER AND HIGHER ORDER EQUATIONS

Variation of Parameters

In the previous section, calculation of particular integrals/solutions for some special cases have been
studied
...
In this section, we
deal with a useful technique of finding a particular solution when the coefficients of the homogeneous
part are continuous functions and the forcing function f (x) (or the non-homogeneous term) is piecewise
continuous
...
5
...
Then we know that
y = c1 y 1 + c2 y 2
is a solution of (8
...
1) for any constants c1 and c2
...
5
...
5
...
The details are given in the following theorem
...
5
...
Let y1 and y2 be two linearly independent solutions
of (8
...
1) on I
...
5
...
5
...
(Note that the integrals in (8
...
4) are the indefinite
integrals of the respective arguments
...
Let u(x) and v(x) be continuously differentiable functions (to be determined) such that
yp = uy1 + vy2 , x ∈ I

(8
...
5)

is a particular solution of (8
...
3)
...
5
...


(8
...
6)

u′ y1 + v ′ y2 = 0
...
5
...


(8
...
8)

We choose u and v so that

Substituting (8
...
7) in (8
...
6), we have

Since yp is a particular solution of (8
...
3), substitution of (8
...
5) and (8
...
8) in (8
...
3), we get
′′

′′



u y1 + q(x)y1 + r(x)y1 + v y2 + q(x)y2 + r(x)y2 + u′ y1 + v ′ y2 = f (x)
...
5
...


(8
...
9)

8
...
VARIATION OF PARAMETERS

165

We now determine u and v from (8
...
7) and (8
...
9)
...
5
...
5
...
Integration of (8
...
10) give us
y2 f (x)
dx and v =
W

u=−

y1 f (x)
dx
W

(8
...
11)

( without loss of generality, we set the values of integration constants to zero)
...
5
...
5
...
Thus the proof is complete
...

Remark 8
...
2
1
...
5
...
Sometimes, it is useful to write (8
...
11) in the form
x

u=−

x0

y2 (s)f (s)
ds and v =
W (s)

x
x0

y1 (s)f (s)
ds
W (s)

where x ∈ I and x0 is a fixed point in I
...
5
...
5
...

2
...
They need not be constants
...

3
...
While using (8
...
4), one has to keep in mind that the coefficient of y ′′ in (8
...
3)
is 1
...
5
...
Find the general solution of
y ′′ + y =

1
, x ≥ 0
...

Here, the solutions y1 = sin x and y2 = cos x are linearly independent over I = [0, ∞) and W =
W (sin x, cos x) = 1
...
5
...

2 + sin x
= −y1

So, the required general solution is
y = c1 cos x + c2 sin x + yp
where yp is given by (8
...
13)
...
5
...
SECOND ORDER AND HIGHER ORDER EQUATIONS

2
...

Solution: Verify that the given equation is
y ′′ −

2 ′
2
y + 2y = x
x
x

and two linearly independent solutions of the corresponding homogeneous part are y1 = x and y2 = x2
...

1 2x
By Theorem 8
...
1, a particular solution yp is given by
yp

=
=

x2 · x
dx + x2
x2
x3
x3
− + x3 =

...
7 are not applicable as the given equation is
not an equation with constant coefficients
...
5
...
Find a particular solution for the following problems:

(a) y ′′ + y = f (x), 0 ≤ x ≤ 1 where f (x) =
(b) y ′′ + y = 2 sec x for all x ∈ (0, π )
...


(c) y ′′ − 3y ′ + 2y = −2 cos(e−x ), x > 0
...


2
...


(b) y ′′ + y = sin x for all x ∈ R
...
Solve the following IVPs:
(a) y ′′ + y = f (x), x ≥ 0 where f (x) =

0
1

if 0 ≤ x < 1
with y(0) = 0 = y ′ (0)
...


(b) y ′′ − y = |x| for all x ∈ [−1, ∞) with y(−1) = 0 and y ′ (−1) = 1
...
6

Higher Order Equations with Constant Coefficients

This section is devoted to an introductory study of higher order linear equations with constant coefficients
...
3)
...
6
...
6
...
, an being real constants
(called the coefficients of the linear equation) and the function f (x) is a piecewise continuous function
defined on the interval I
...
If f (x) ≡ 0, then
(8
...
1) which reduces to
Ln (y) = 0 on I,

(8
...
2)

is called a homogeneous linear equation, otherwise (8
...
1) is called a non-homogeneous linear equation
...

Definition 8
...
1 A function y defined on I is called a solution of (8
...
1) if y is n times differentiable and
y along with its derivatives satisfy (8
...
1)
...
6
...
If u and v are any two solutions of (8
...
1), then y = u − v is also a solution of
(8
...
2)
...
6
...
6
...
6
...

2
...
6
...
Then for any constants (need not be real) c1 , c2 ,
y = c1 y 1 + c2 y 2
is also a solution of (8
...
2)
...

3
...
6
...
This, along with the super-position principle, ensures that
the set of solutions of (8
...
2) forms a vector space over R
...
6
...

As in Section 8
...
6
...
It is easy to note (as in Section 8
...
6
...
6
...
6
...
6
...

Note that p(λ) is of polynomial of degree n with real coefficients
...
Also, in case of complex roots, they will occur in conjugate pairs
...
The proof of the theorem is omitted
...
6
...
6
...
6
...
If λ1 , λ2 ,
...
, eλn x
are the n linearly independent solutions of (8
...
2)
...
If λ1 is a repeated root of p(λ) = 0 of multiplicity k, i
...
, λ1 is a zero of (8
...
3) repeated k times, then
eλ1 x , xeλ1 x ,
...
6
...


168

CHAPTER 8
...
If λ1 = α + iβ is a complex root of p(λ) = 0, then so is the complex conjugate λ1 = α − iβ
...
6
...


These are complex valued functions of x
...
6
...
Thus, in the case of λ1 = α + iβ being a complex root of p(λ) = 0, we
have the linearly independent solutions
eαx cos(βx) and eαx sin(βx)
...
6
...
Find the solution space of the differential equation
y ′′′ − 6y ′′ + 11y ′ − 6y = 0
...

By inspection, the roots of p(λ) = 0 are λ = 1, 2, 3
...

2
...

Solution: Its characteristic equation is
p(λ) = λ3 − 2λ2 + λ = 0
...
So, the linearly independent solutions are 1, ex , xex
and the solution space is
{c1 + c2 ex + c3 xex : c1 , c2 , c3 ∈ R}
...
Find the solution space of the differential equation
y (4) + 2y ′′ + y = 0
...

By inspection, the roots of p(λ) = 0 are λ = i, i, −i, −i
...


8
...
HIGHER ORDER EQUATIONS WITH CONSTANT COEFFICIENTS

169

From the above discussion, it is clear that the linear homogeneous equation (8
...
2), admits n linearly independent solutions since the algebraic equation p(λ) = 0 has exactly n roots (counting with
multiplicity)
...
6
...
, yn be any set of n linearly independent solution of
(8
...
2)
...
6
...
, cn are arbitrary real constants
...
6
...
Find the general solution of y ′′′ = 0
...
So, the general
solution is
y = c1 + c2 x + c3 x2
...
Find the general solution of
y ′′′ + y ′′ + y ′ + y = 0
...
So,
the general solution is
y = c1 e−x + c2 sin x + c3 cos x
...
6
...
Find the general solution of the following differential equations:

(a) y ′′′ + y ′ = 0
...

(c) y iv + 2y ′′ + y = 0
...
Find a linear differential equation with constant coefficients and of order 3 which admits the following
solutions:
(a) cos x, sin x and e−3x
...

(c) 1, ex and x
...
Solve the following IVPs:
(a) y iv − y = 0, y(0) = 0, y ′ (0) = 0, y ′′ (0) = 0, y ′′′ (0) = 1
...

4
...
, an−1 ∈ R be given constants
...
6
...
(8
...
4) is also
called the standard form of the Euler equation
...

dxn
dx

170

CHAPTER 8
...

So, xλ is a solution of (8
...
4), if and only if
λ(λ − 1) · · · (λ − n + 1) + an−1 λ(λ − 1) · · · (λ − n + 2) + · · · + a0 = 0
...
6
...
6
...
6
...
With the above understanding, solve the following homogeneous Euler equations:
(a) x3 y ′′′ + 3x2 y ′′ + 2xy ′ = 0
...

(c) x3 y ′′′ − x2 y ′′ + xy ′ − y = 0
...
6
...

5
...
6
...
Let x = et or equivalently t = ln x
...
Then
dy
(a) show that xd(y) = Dy(t), or equivalently x dx =

dy)
dt
...


(c) with the new (independent) variable t, the Euler equation (8
...
4) reduces to an equation with
constant coefficients
...


We turn our attention toward the non-homogeneous equation (8
...
1)
...
6
...
6
...
6
...
The solution y involves n arbitrary constants
...
6
...

Solving an equation of the form (8
...
1) usually means to find a general solution of (8
...
1)
...
Solving
(8
...
1) essentially involves two steps (as we had seen in detail in Section 8
...

Step 1: a) Calculation of the homogeneous solution yh and
b) Calculation of the particular solution yp
...
Note
that a particular solution is not unique
...
6
...
6
...
6
...
The undetermined coefficients method is applicable for
equations (8
...
1)
...
7

Method of Undetermined Coefficients

In the previous section, we have seen than a general solution of
Ln (y) = f (x) on I

(8
...
6)

can be written in the form
y = yh + yp ,
where yh is a general solution of Ln (y) = 0 and yp is a particular solution of (8
...
6)
...
7
...
7
...
f (x) = keαx ; k = 0, α a real constant
2
...
f (x) = xm
...
f (x) = keαx ; k = 0, α a real constant
...
e
...
Note that Ln (eαx ) =
p(α)eαx
...
Thus
Ln (yp ) = Ap(α)eαx
...


k αx
e is a particular solution of Ln (y) = keαx
...
e
...
e
...

Example 8
...
1

1
...


Solution: Here f (x) = 2ex with k = 2 and α = 1
...

Note that α = 1 is not a root of p(λ) = 0
...
This on substitution gives
Aex − 4Aex = 2ex =⇒ −3Aex = 2ex
...

3

2
...

Solution: The characteristic polynomial is p(λ) = λ3 − 3λ2 + 3λ − 1 = (λ − 1)3 and α = 1
...
Thus, we assume yp = Ax3 ex
...

Solving for A, we get A =

1
x3 ex
, and thus a particular solution is yp =

...
SECOND ORDER AND HIGHER ORDER EQUATIONS

3
...

Solution: The characteristic polynomial is p(λ) = λ3 − λ and α = 2
...

p(α)
6
6
4
...

Exercise 8
...
2 Find a particular solution for the following differential equations:
1
...

2
...

3
...

Case II
...
e
...
Here, we
assume that yp is of the form
yp = eαx A cos(βx) + B sin(βx) ,
and then comparing the coefficients of eαx cos x and eαx sin x (why!) in Ln (y) = f (x), obtain the values
of A and B
...
e
...

Example 8
...
3

1
...


Solution: Here, α = 1 and β = 1
...
Note that the roots of p(λ) = 0 are −1 ± i
...
This gives us

(−4B + 4A)ex sin x + (4B + 4A)ex cos x = 4ex sin x
...

1
ex
On solving for A and B, we get A = −B =
...

2
2
2
...

Solution: Here, α = 0 and β = 1
...

So, let yp = x (A cos x + B sin x)
...
Thus, a particular solution is yp =
2
−1
x cos x
...
7
...
7
...
y ′′′ − y ′′ + y ′ − y = ex cos x
...
y ′′′′ + 2y ′′ + y = sin x
...
y ′′ − 2y ′ + 2y = ex cos x
...
f (x) = xm
...
Then we assume that
yp = Am xm + Am−1 xm−1 + · · · + A0
and then compare the coefficient of xk in Ln (yp ) = f (x) to obtain the values of Ai for 0 ≤ i ≤ m
...
e
...

Example 8
...
5 Find a particular solution of
y ′′′ − y ′′ + y ′ − y = x2
...

Comparing the coefficients of different powers of x and solving, we get A2 = −1, A1 = −2 and A0 = 0
...

Finally, note that if yp1 is a particular solution of Ln (y) = f1 (x) and yp2 is a particular solution of
Ln (y) = f2 (x), then a particular solution of
Ln (y) = k1 f1 (x) + k2 f2 (x)
is given by
yp = k1 yp1 + k2 yp2
...

Example 8
...
6 Find a particular soltution of
y ′′ + y = 2 sin x + sin 2x
...
y ′′ + y = 2 sin x
...
SECOND ORDER AND HIGHER ORDER EQUATIONS

2
...

−1
x cos x = −x cos x
...
7
...
2) is yp1 = 2
2
−1
For the second problem, one can check that yp2 =
sin(2x) is a particular solution
...

3

Exercise 8
...
7 Find a particular solution for the following differential equations:
1
...

2
...

3
...

4
...

5
...

6
...


Chapter 9

Solutions Based on Power Series
9
...
We also looked at Euler
Equations which can be reduced to the above form
...
In general, there are no methods of
finding a solution of an equation of the form
y ′′ + q(x)y ′ + r(x)y = f (x), x ∈ I
where q(x) and r(x) are real valued continuous functions defined on an interval I ⊂ R
...
One such
class of functions is called the set of analytic functions
...
1
...
, an ,
...
An expression of the type


n=0

an (x − x0 )n

(9
...
1)

is called a power series in x around x0
...

In short, a0 , a1 ,
...
are called the coefficient of the power series and x0 is called the center
...
So,
the set
S = {x ∈ R :



n=0

an (x − x0 )n converges}

is a non-empty
...
We are thus led to the following definition
...
1
...
Consider the power series
x−

x3
x5
x7
+

+ ···
...
Recall that the Taylor series expansion around x0 = 0 of sin x is same as the above power
series
...
Also, a2n+1 =

176

CHAPTER 9
...
Any polynomial
a0 + a1 x + a2 x2 + · · · + an xn
is a power series with x0 = 0 as the center, and the coefficients am = 0 for m ≥ n + 1
...
1
...
1
...
1
...

From what has been said earlier, it is clear that the set of points x where the power series (9
...
1) is
convergent is the interval (−R + x0 , x0 + R), whenever R is the radius of convergence
...

Let R > 0 be the radius of convergence of the power series (9
...
1)
...
In
the interval I, the power series (9
...
1) converges
...
e
...


Such a function is well defined as long as x ∈ I
...
1
...
Sometimes, we also use the terminology that (9
...
1) induces a function f on I
...
1
...
We
state one such result below but we do not intend to give a proof
...
1
...
Let


n=1

R ≥ 0 such that


n=1

an (x − x0 )n be a power series with center x0
...


In this case, the power series


n=1

an (x − x0 )n converges absolutely and uniformly on
|x − x0 | ≤ r for all r < R

and diverges for all x with
|x − x0 | > R
...
Suppose R is the radius of convergence of the power series (9
...
1)
...


n

|an | exists and

1

...
1
...

(a) If ℓ = 0, then R =

Note that lim

n

n−
→∞

|an | exists if lim

n−→∞

lim

an+1
and
an

n−→∞

n

|an | = lim

n−→∞

an+1

...
1
...

In case, n |an | does not tend to a limit as n −→ ∞, then the above theorem holds if we replace
lim n |an | by lim sup n |an |
...
1
...
Consider the power series
(x + 1)n
...
So, n |an | = n 1 = 1
...
1
...


Example 9
...
6

(−1)n (x + 1)2n+1

...
Consider the power series
n≥0

x0 = −1, an = 0 for n even and a2n+1 =

(−1)n

...


|an | exists and equals 0
...
Note that

the series converges to sin(x + 1)
...
Consider the power series



x2n
...

So,
2n+1

lim

n−→∞

Thus, lim

n−→∞

n

|a2n+1 | = 0 and

2n

|a2n | = 1
...


We let u = x2
...
But then from Example 9
...
6
...
Therefore, the original power series converges

whenever |x2 | < 1 or equivalently whenever |x| < 1
...
Note that


1
=
x2n for |x| < 1
...
In this case,

4
...
doesn’t have any finite limit as

n −→ ∞
...

5
...
Recall that it represents ex
...
1
...
f is called analytic around x0 if there exists a
δ > 0 such that
f (x) =
an (x − x0 )n for every x with |x − x0 | < δ
...


9
...
1

Properties of Power Series

Now we quickly state some of the important properties of the power series
...
SOLUTIONS BASED ON POWER SERIES

with radius of convergence R1 > 0 and R2 > 0, respectively
...
Note
that both the power series converge for all x ∈ I
...

1
...

In particular, if


n=0

an (x − x0 )n = 0 for all x ∈ I, then
an = 0 for all n = 0, 1, 2,
...
Term by Term Addition
For all x ∈ I, we have



F (x) + G(x) =

(an + bn )(x − x0 )n

n=0

Essentially, it says that in the common part of the regions of convergence, the two power series
can be added term by term
...
Multiplication of Power Series
Let us define

n

c0 = a0 b0 , and inductively cn =

an−j bj
...


H(x) is called the “Cauchy Product” of F (x) and G(x)
...

j=1

4
...


Note that it also has R1 as the radius of convergence as by Theorem 9
...
4 lim

n−→∞

lim

n−→∞

n

|nan | = lim

n−→∞

n

|n| lim

n

n−→∞

|an | = 1 ·

n

1
|an | = · R1 and

1

...
Then for all x ∈ (−r + x0 , x0 + r), we have


d
F (x) = F ′ (x) =
nan (x − x0 )n
...


9
...
SOLUTIONS IN TERMS OF POWER SERIES

179

In the following, we shall consider power series with x0 = 0 as the center
...

Exercise 9
...
1
ets) in x?

1
...


(b) 1 + sin x + (sin x)2 + · · · + (sin x)n + · · ·
2

2

n

(x0 = 0)
...


2
...

2!
4!
(2n)!

= x−

Find the radius of convergence of f (x) and g(x)
...

[Hint: Use Properties 1, 2, 3 and 4 mentioned above
...
]
3
...

(a) 1 + (x + 1) +

(x+1)2
2!

+ ···+

(x+1)n
n!

+ ···
...


9
...


(9
...
1)

Let a and b be analytic around the point x0 = 0
...


(9
...
2)

k=0

In the absence of any information, let us assume that (9
...
1) has a solution y represented by (9
...
2)
...
2
...
2
...
Let us take up an example for
illustration
...
2
...
2
...

Solution: Let
y=



n=0

cn xn
...
2
...
SOLUTIONS BASED ON POWER SERIES

Then y ′ =





ncn xn−1 and y ′′ =

n=0

n=0

Equation (9
...
3), we get


n=0

n(n − 1)cn xn−2
...


n=0

Hence for all n = 0, 1, 2,
...

(n + 1)(n + 2)

Therefore, we have
c2 = − c 0 ,
2!
c4 = (−1)2 c0 ,
4!

...


...


...

c1
c2n+1 = (−1)n (2n+1)!
...
So,
y = c0





(−1)n x2n
(−1)n x2n+1
+ c1
(2n)!
(2n + 1)!
n=0
n=0

or y = c0 cos(x) + c1 sin(x) where c0 and c1 can be chosen arbitrarily
...
That is, cos(x) is a solution of the Equation (9
...
3)
...
2
...

Exercise 9
...
2 Assuming that the solutions y of the following differential equations admit power series
representation, find y in terms of a power series
...
y ′ = −y, (center at x0 = 0)
...
y ′ = 1 + y 2 , (center at x0 = 0)
...
Find two linearly independent solutions of
(a) y ′′ − y = 0, (center at x0 = 0)
...


9
...
Presently, we inquire the question,
namely, whether an equation of the form
y ′′ + a(x)y ′ + b(x)y = f (x), x ∈ I

(9
...
1)

admits a solution y which has a power series representation around x ∈ I
...
3
...
The following is one such result
...


9
...
LEGENDRE EQUATIONS AND LEGENDRE POLYNOMIALS

181

Theorem 9
...
1 Let a(x), b(x) and f (x) admit a power series representation around a point x = x0 ∈ I,
with non-zero radius of convergence r1 , r2 and r3 , respectively
...
Then the Equation
(9
...
1) has a solution y which has a power series representation around x0 with radius of convergence R
...
3
...
3
...
3
...

Secondly, a point x0 is called an ordinary point for (9
...
1) if a(x), b(x) and f (x) admit power
series expansion (with non-zero radius of convergence) around x = x0
...
3
...
3
...

The following are some examples for illustration of the utility of Theorem 9
...
1
...
3
...
Examine whether the given point x0 is an ordinary point or a singular point for the
following differential equations
...

(b) y ′′ +

sin x
x−1 y

= 0, x0 = 0
...

2
...
Also, find the power
series solutions if it exists
...

(b) xy ′′ + y = 0, x0 = 0
...


9
...
4
...

Definition 9
...
1 The equation
(1 − x2 )y ′′ − 2xy ′ + p(p + 1)y = 0, −1 < x < 1

(9
...
1)

where p ∈ R, is called a Legendre Equation of order p
...
4
...

Equation (9
...
1) may be rewritten as
y ′′ −

2x
p(p + 1)
y′ +
y = 0
...
By Theorem 9
...
1, a solution y of
(9
...
1) admits a power series solution (with center at x0 = 0) with radius of convergence R = 1
...
SOLUTIONS BASED ON POWER SERIES

assume that y =



ak xk is a solution of (9
...
1)
...
Substituting the

k=0

expression for

y′ =



kak xk−1 and y ′′ =

k=0


k=0

k(k − 1)ak xk−2

in Equation (9
...
1), we get

k=0

{(k + 1)(k + 2)ak+2 + ak (p − k)(p + k + 1)} xk = 0
...

ak+2 = −

(p − k)(p + k + 1)
ak
...
In general,
a2m = (−1)m
and
a2m+1 = (−1)m

p(p − 2) · · · (p − 2m + 2)(p + 1)(p + 3) · · · (p + 2m − 1)
a0
(2m)!
(p − 1)(p − 3) · · · (p − 2m + 1)(p + 2)(p + 4) · · · (p + 2m)
a1
...
So, by choosing a0 = 1, a1 = 0 and a0 = 0, a1 = 1 in the
above expressions, we have the following two solutions of the Legendre Equation (9
...
1), namely,
p(p + 1) 2
(p − 2m + 2) · · · (p + 2m − 1) 2m
x + · · · + (−1)m
x + ···
2!
(2m)!

(9
...
2)

(p − 1)(p + 2) 3
(p − 2m + 1) · · · (p + 2m) 2m+1
x + · · · + (−1)m
x
+ ···
...
4
...
4
...
4
...
It
now follows that the general solution of (9
...
1) is
y = c1 y 1 + c2 y 2

(9
...
4)

where c1 and c2 are arbitrary real numbers
...
4
...
4
...
Suppose p = n is a non-negative integer
...

(k + 1)(k + 2)

(9
...
5)

Therefore, when k = n, we get
an+2 = an+4 = · · · = an+2m = · · · = 0 for all positive integer m
...
Then y1 in Equation (9
...
2) is a polynomial of degree n
...


9
...
LEGENDRE EQUATIONS AND LEGENDRE POLYNOMIALS

183

Case 2: Now, let n be a positive odd integer
...
4
...
In this case, y2 is an odd polynomial in the sense that the terms of y2 are odd powers of x and hence
y2 (−x) = −y2 (x)
...
4
...

Definition 9
...
3 A polynomial solution Pn (x) of (9
...
1) is called a Legendre Polynomial whenever
Pn (1) = 1
...
Then it can be checked that
Pn (1) = 1 if we choose
(2n)!
1 · 3 · 5 · · · (2n − 1)
an = n
=

...
In general, if n − 2m ≥ 0, then
an−2m = (−1)m

(2n − 2m)!

...
4
...

2
2

Proposition 9
...
4 Let p = n be a non-negative even integer
...
4
...

Similarly, if p = n is a non-negative odd integer, then any polynomial solution y of (9
...
1) which has only
odd powers of x is a multiple of Pn (x)
...
Suppose that n is a non-negative even integer
...
4
...
By
(9
...
4)
y = c1 y 1 + c2 y 2 ,
where y1 is a polynomial of degree n (with even powers of x) and y2 is a power series solution with odd
powers only
...

Similarly, Pn (x) = c′ y1 with c′ = 0
...
A similar proof holds
1
1
when n is an odd positive integer
...
They are used later for the orthogonality properties
of the Legendre polynomials, Pn (x)’s
...
4
...
, are given by
˙
Pn (x) =
Proof
...
Then
(x2 − 1)

d
dx V

1 dn 2
(x − 1)n
...

dx

(9
...
7)

184

CHAPTER 9
...

dx
dx

2(n + 1)x

(x), we have

(x2 − 1)U ′′ + U ′ {2(n + 1)x − 2nx} + U {n(n + 1) − 2n(n + 1)}
2

′′



(1 − x )U − 2xU + n(n + 1)U

or

= 0
= 0
...
4
...
So, by Proposition 9
...
4, we have
Pn (x) = αU (x) = α

dn 2
(x − 1)n for some α ∈ R
...


dn 2
(x − 1)n
dxn

= 2n n! or, equivalently
x=1

1 dn 2
(x − 1)n
2n n! dxn
and thus
Pn (x) =

Example 9
...
6

1
2n n!

=1
x=1

dn 2
(x − 1)n
...
When n = 0, P0 (x) = 1
...
When n = 1, P1 (x) =

1 d 2
(x − 1) = x
...
When n = 2, P2 (x) =

1 d2 2
1
3
1
(x − 1)2 = {12x2 − 4} = x2 −
...

Theorem 9
...
7 Let Pn (x) denote, as usual, the Legendre Polynomial of degree n
...


(9
...
8)

−1

Proof
...



(1 − x2 )Pn (x) + n(n + 1)Pn (x)


(9
...
9)
(9
...
10)

Multiplying Equation (9
...
9) by Pm (x) and Equation (9
...
10) by Pn (x) and subtracting, we get






n(n + 1) − m(m + 1) Pn (x)Pm (x) = (1 − x2 )Pm (x) Pn (x) − (1 − x2 )Pn (x) Pm (x)
...
4
...






(1 − x2 )Pm (x) Pn (x) − (1 − x2 )Pn (x) Pm (x) dx

=
=

Pn (x)Pm (x)dx
−1

−1

x=1



(1 − x2 )Pm (x)Pn (x)dx + (1 − x2 )Pm (x)Pn (x)



(1 − x2 )Pn (x)Pm (x)dx + (1 − x2 )Pn (x)Pm (x)

x=−1
x=1
x=−1

Since n = m, n(n + 1) = m(m + 1) and therefore, we have
1

Pn (x)Pm (x) dx = 0 if m = n
...
4
...


1
−1

2
Pn (x) dx =

2

...
4
...
Let us write V (x) = (x2 − 1)n
...

dxn
dx

dn
dn
V (x) n V (x)dx
...

m
dx
dx

(9
...
12)

Therefore, integrating I by parts and using (9
...
12) at each step, we get
1

I=
−1

d2n
V (x) · (−1)n V (x)dx = (2n)!
dx2n

1
−1

(1 − x2 )n dx = (2n)! 2

Now substitute x = cos θ and use the value of the integral

π
2

1
0

(1 − x2 )n dx
...


0

We now state an important expansion theorem
...

Theorem 9
...
9 Let f (x) be a real valued continuous function defined in [−1, 1]
...

−1

Legendre polynomials can also be generated by a suitable function
...


186

CHAPTER 9
...
4
...
Then


1

=
Pn (x)tn , t = 1
...
4
...
The function h(t) is called the generating function for the Legendre
polynomials
...
4
...
By using the Rodrigue’s formula, find P0 (x), P1 (x) and P2 (x)
...
Use the generating function (9
...
13)
(a) to find P0 (x), P1 (x) and P2 (x)
...

Using the generating function (9
...
13), we can establish the following relations:
(n + 1)Pn+1 (x)
nPn (x)

(2n + 1) x Pn (x) − n Pn−1 (x)

=
=


Pn+1 (x)



xPn (x) − Pn−1 (x)

xPn (x)

=

(9
...
14)
(9
...
15)

+ (n + 1)Pn (x)
...
4
...
4
...
4
...
4
...
The relation (9
...
14) is also known as Bonnet’s recurrence relation
...
4
...
4
...
The readers are required to proof the other two recurrence relations
...
4
...

2
n=0
Or equivalently,



1

(x − t)(1 − 2xt + t2 )− 2 = (1 − 2xt + t2 )
We now substitute


n=0

nPn (x)tn−1
...


n=0

The two sides and power series in t and therefore, comparing the coefficient of tn , we get
xPn (x) − Pn−1 (x) = (n + 1)Pn (x) + (n − 1)Pn−1 (x) − 2n x Pn (x)
...
4
...

To prove (9
...
15), one needs to differentiate the generating function with respect to x (keeping t
fixed) and doing a similar simplification
...
4
...
4
...
4
...
These relations will be helpful in solving the problems given below
...
4
...
Find a polynomial solution y(x) of (1 − x2 )y ′′ − 2xy ′ + 20y = 0 such that y(1) = 10
...
Prove the following:

9
...
LEGENDRE EQUATIONS AND LEGENDRE POLYNOMIALS
1

(a)
−1
1

(b)

Pm (x)dx = 0 for all positive integers m ≥ 1
...


−1
1

(c)

xm Pn (x)dx = 0 whenever m and n are positive integers with m < n
...
Show that Pn (1) =

n(n + 1)
n(n + 1)

and Pn (−1) = (−1)n−1

...
Establish the following recurrence relations
...



(b) (1 − x2 )Pn (x) = n Pn−1 (x) − xPn (x)
...
SOLUTIONS BASED ON POWER SERIES

Part III

Laplace Transform

Chapter 10

Laplace Transform
10
...
Here, F (s) is called integral transform of f (t)
...
This transformation of f (t) into F (s)
provides a method to tackle a problem more readily
...
In view of this, the integral transforms find numerous applications in engineering
problems
...
As we will see in the following, application of Laplace transform reduces a linear
differential equation with constant coefficients to an algebraic equation, which can be solved by algebraic
methods
...

It is important to note here that there is some sort of analogy with what we had learnt during the
study of logarithms in school
...
In a similar way, we first
transform the problem that was posed as a function of f (t) to a problem in F (s), make some calculations
and then use the table of inverse Laplace transform to get the solution of the actual problem
...


10
...
2
...
A function f (t) is said to be a piece-wise continuous function on a closed interval [a, b] ⊂ R, if there exists finite number of points a = t0 < t1 <
t2 < · · · < tN = b such that f (t) is continuous in each of the intervals (ti−1 , ti ) for 1 ≤ i ≤ N and
has finite limits as t approaches the end points, see the Figure 10
...

2
...
For example, see Figure 10
...

Definition 10
...
2 (Laplace Transform) Let f : [0, ∞) −→ R and s ∈ R
...
LAPLACE TRANSFORM

Figure 10
...

(Recall that



b

g(t)dt exists if lim

b−
→∞ 0

0

Remark 10
...
3

g(t)d(t) exists and we define



b

g(t)dt = lim

b−→∞ 0

0

g(t)d(t)
...
Let f (t) be an exponentially bounded function, i
...
,

|f (t)| ≤ M eαt for all t > 0 and for some real numbers α and M with M > 0
...

b
b−→∞ 0

2
...
Then by definition, lim
can use the theory of improper integrals to conclude that

f (t)e−st dt exists
...


s−→∞

Hence, a function F (s) satisfying
lim F (s) does not exist or lim F (s) = 0,

s−→∞

s−→∞

cannot be a Laplace transform of a function f
...
2
...
That is, F (s) is the Laplace transform of the function f (t)
...
In that case, we write
f (t) = L−1 (F (s))
...
2
...
Find F (s) = L(f (t)), where f (t) = 1, t ≥ 0
...

b−→∞ −s 0
s b−→∞ s
0
Note that if s > 0, then
e−sb
lim
= 0
...

s

Example 10
...
5

10
...
DEFINITIONS AND EXAMPLES

193

In the remaining part of this chapter, whenever the improper integral is calculated, we will not explicitly
write the limiting process
...

2
...

Solution: Integration by parts gives


F (s) =
0

=

1
s2



−te−st
s

te−st dt =



+
0

0

e−st
dt
s

for s > 0
...
Find the Laplace transform of f (t) = tn , n a positive integer
...


4
...

Solution: We have
L(eat )



=
0

e−(s−a)t dt

0

1
s−a

=



eat e−st dt =
for s > a
...
Compute the Laplace transform of cos(at), t ≥ 0
...
Hence,
a2 + s2
s2



cos(at)e−st dt =

0

1
s

...
Similarly, one can show that
L(sin(at)) =

s > 0
...

s2 + a2

1
7
...

t
Solution: Note that f (t) is not a bounded function near t = 0 (why!)
...


∞ √
1
1 −st
s

√ e dt =
√ e−τ
L( √ ) =
( substitute τ = st)
τ
s
t
t
0
0


1
1
1
1
= √
τ − 2 e−τ dτ = √
τ 2 −1 e−τ dτ
...
LAPLACE TRANSFORM


Recall that for calculating the integral

1

τ 2 −1 e−τ dτ, one needs to consider the double integral

0

0



e−(x

2

+y 2 )



dxdy =

0

2

2

e−x dx

=

0

It turns out that




1
π
Thus, L( √ ) = √ for s > 0
...


0

π
...

f (t)

L(f (t))

f (t)

L(f (t))

1

1
, s>0
s

t

1
, s>0
s2

tn

n!
, s>0
sn+1

eat

1
, s>a
s−a

sin(at)

a
, s>0
s2 + a2

cos(at)

s
, s>0
s2 + a2

a
, s>a
− a2

cosh(at)

sinh(at)

s2

s2

s
, s>a
− a2

Table 10
...
3

Properties of Laplace Transform

Lemma 10
...
1 (Linearity of Laplace Transform)
L af (t) + bg(t)

1
...
Then


=

af (t) + bg(t) e−st dt

0

=

aL(f (t)) + bL(g(t))
...
If F (s) = L(f (t)), and G(s) = L(g(t)), then
L−1 aF (s) + bG(s) = af (t) + bg(t)
...

Example 10
...
2

1
...

eat + e−at
Solution: cosh(at) =

...


a
,
− a2

s > |a|
...
Similarly,
s2

10
...
PROPERTIES OF LAPLACE TRANSFORM

1

195

1
a

1

2a

a

a

2a

Figure 10
...

s(s + 1)

3
...

s
s+1

1
is f (t) = 1 − e−t
...
3
...

1 s
Then for a > 0, L(f (at)) = F ( )
...
By definition and the substitution z = at, we get
L(f (at))

e−st f (at)dt =

0

=

Exercise 10
...
4



=
1
a



e

s
−az

0

1
a



z

e−s a f (z)dz

0

1 s
f (z)dz = F ( )
...
Find the Laplace transform of
t2 + at + b, cos(wt + θ), cos2 t, sinh2 t;

where a, b, w and θ are arbitrary constants
...
Find the Laplace transform of the function f (·) given by the graphs in Figure 10
...

3
...

s2 + 1 2s + 1

The next theorem relates the Laplace transform of the function f ′ (t) with that of f (t)
...
3
...
Suppose that there exist constants M and T such that
|f (t)| ≤ M eαt for all t ≥ T
...


(10
...
1)

196

CHAPTER 10
...
Note that the condition |f (t)| ≤ M eαt for all t ≥ T implies that
lim f (b)e−sb = 0 for s > α
...


We can extend the above result for nth derivative of a function f (t), if f ′ (t),
...
In this case, a repeated use of Theorem 10
...
5, gives the
following corollary
...
3
...
If f ′ (t),
...


(10
...
2)

In particular, for n = 2, we have
L f ′′ (t) = s2 F (s) − sf (0) − f ′ (0)
...
3
...
3
...
Also, let f (0) = 0
...

Example 10
...
8

1
...

+1

1
s
) = sin t
...

s2 + 1
s +1

2
...

Solution: Note that f (0) = 1 and f ′ (t) = −2 cos t sin t = − sin(2t)
...

s2 + 4

Now, using Theorem 10
...
5, we get
L(f (t)) =

1
s



s2

2
+1
+4

=

s2 + 2

...
3
...
If the function F (s) is differentiable, then
L(tf (t)) = −
Equivalently, L−1 (−

d
F (s)
...

ds

10
...
PROPERTIES OF LAPLACE TRANSFORM


Proof
...
The result is obtained by differentiating both sides with

0

respect to s
...
Suppose that G(s) = L(g(t)) exists
...

ds

s

Thus, G(s) = − F (p)dp for some real number a
...


s

Hence,we have the following corollary
...
3
...
Then
t


L(g(t)) = G(s) =

F (p)dp
...
Find L(t sin(at))
...
3
...
Hence L(t sin(at)) = 2

...

(s − 1)3

2
...


d
By lemma 10
...
9, we know that L(tf (t)) = − ds F (s)
...
Therefore,

d2
F (s)
ds2

L−1

= L−1

d
G(s)
ds

d
ds F (s)

= G(s)
...


Thus we get f (t) = 2t2 et
...
3
...

s

f (τ )dτ
...
By definition,
t

L



f (τ ) dτ =
0

0

t

e−st

f (τ ) dτ
0

t



dt =
0

e−st f (τ ) dτ dt
...
We assume that the
order of the integrations can be changed and therefore
t


0



e−st f (τ ) dτ dt =
0

0


τ

e−st f (τ ) dt dτ
...
LAPLACE TRANSFORM

Thus,
t

L

f (τ ) dτ

t



=

0

e−st f (τ ) dτ dt

0

0


=



0

τ


=

0


e−sτ f (τ )dτ

0



e−sτ f (τ )dτ

0

Example 10
...
13

1
...

s

e−sz dz

0

t
0

sin(az)dz)
...
Hence
s + a2
t

L(
t

2
...

2)
2 + a2 )
s (s + a
s(s


...
3
...
Find the function f (t) such that F (s) =
Solution: We know L(et ) =
L−1

1

...

s
s s
s

4

...


Lemma 10
...
14 (s-Shifting) Let L(f (t)) = F (s)
...

Proof
...


1
...

b
b
Solution: We know L(sin(bt)) = 2

...

s + b2
(s − a)2 + b2

Example 10
...
15

2
...


Solution: By s-Shifting, if L(f (t)) = F (s) then L(eat f (t)) = F (s − a)
...


s
s2 + 36

= L−1

s
s2 + 6 2

= cos(6t)
...
3
...
3
...
We give a few examples to explain the methods for calculating the
inverse Laplace transform of F (s)
...
3
...
Denominator of F has Distinct Real Roots:
If F (s) =

(s + 1)(s + 3)
s(s + 2)(s + 8)

find f (t)
...
Thus,
16s 12(s + 2) 48(s + 8)
3
1
35
f (t) =
+ e−2t + e−8t
...
Denominator of F has Distinct Complex Roots:
If F (s) =

s2

4s + 3
+ 2s + 5

find f (t)
...
Thus,
(s + 1)2 + 22
2 (s + 1)2 + 22
1
f (t) = 4e−t cos(2t) − e−t sin(2t)
...
Denominator of F has Repeated Real Roots:
3s + 4
(s + 1)(s2 + 4s + 4)

If F (s) =

find f (t)
...

(s + 1)(s2 + 4s + 4)
(s + 1)(s + 2)2
s + 1 s + 2 (s + 2)2
1
s+1

Solving for a, b and c, we get F (s) =
f (t) = e−t − e−2t + 2te−2t
...
3
...
3
...
3
...

if t ≥ a

e−sa
, s > 0
...
3
...
Define g(t) by
g(t) =
Then g(t) = Ua (t)f (t − a) and

0
f (t − a)

if 0 ≤ t < a

...


d
1
+ 2 ds − (s+2)
...
LAPLACE TRANSFORM

c

c
g(t)

f(t)

a

d

d+a

Figure 10
...
Let 0 ≤ t < a
...

If t ≥ a, then Ua (t) = 1 and Ua (t)f (t − a) = f (t − a) = g(t)
...
Thus,


L(g(t))

=

e



g(t)dt =
a

0


=

−st

e

−s(t+a)

e−st f (t − a)dt

f (t)dt = e

0

Solution: Let G(s) =

e−5s
s2 −4s−5

e−5s
s2 −4s−5

=e

−5s

e−st f (t)dt

0

e−as F (s)
...
3
...

F (s), with F (s) =

L−1 (F (s)) = L−1
Hence, by Lemma 10
...
19
L−1 (G(s)) =

1
s2 −4s−5
...

3

1
U5 (t) sinh 3(t − 5) e2(t−5)
...
3
...


Solution: Note that
f (t) =

Thus, L(f (t)) = e−2πs

0
t < 2π
(t − 2π) cos(t − 2π) + 2π cos(t − 2π) t > 2π
...
4
10
...
1

Some Useful Results
Limiting Theorems

The following two theorems give us the behaviour of the function f (t) when t −→ 0+ and when t −→ ∞
...
4
...
4
...
Then
lim f (t) = lim sF (s)
...
We know sF (s) − f (0) = L (f ′ (t))
...


s−
→∞

0

as lim e−st = 0
...
4
...
For t ≥ 0, let Y (s) = L(y(t)) = a(1 + s2 )−1/2
...

Solution: Theorem 10
...
1 implies
as
a
1 = lim sY (s) = lim

...

= lim
s−→∞
s−
→∞ (1 + s2 )1/2
s−→∞ ( 1 + 1)1/2
s2
(s + 1)(s + 3)
find f (0+ )
...
4
...
If F (s) =

f (0+ ) = lim sF (s) = lim s ·
s−→∞

s−
→∞

(s + 1)(s + 3)
= 1
...
But this theorem is valid only when f (t) is bounded
as t approaches infinity
...
4
...
Then
lim f (t) = lim sF (s)

t−
→∞

s−→0

provided that sF (s) converges to a finite limit as s tends to 0
...

lim sF (s) =

s−→0



f (0) + lim

s−→0

e−st f ′ (t)dt

0
t

=

e−sτ f ′ (τ )dτ

f (0) + lim lim

s−→0 t−→∞

0

t

=

lim e−sτ f ′ (τ )dτ = lim f (t)
...

t−
→∞
s(s + 2)(s + 8)
Solution: From Theorem 10
...
3, we have

Example 10
...
4 If F (s) =

lim f (t) = lim sF (s) =

t−→∞

s−
→0

lim s ·

s−
→0

2(s + 3)
6
3
=
=
...

Definition 10
...
5 (Convolution of Functions) Let f (t) and g(t) be two smooth functions
...


202

CHAPTER 10
...
(f ⋆ g)(t) = g ⋆ f (t)
...

2

2
...
4
...


1
Remark 10
...
7 Let g(t) = 1 for all t ≥ 0
...
Thus, the
s
Convolution Theorem 10
...
6 reduces to the Integral Lemma 10
...
12
...
5

Application to Differential Equations

Consider the following example
...
5
...


Solution: Let L(g(t)) = G(s)
...

Hence,
F (s) =

as2

G(s)
+ bs + c

+

(as + b)f0
af1

...
5
...
3
...

Example 10
...
2

1
...

if t ≥ 5

with y(0) = 1 and y ′ (0) = 4
...
Thus,
L(f (t)) =

1
e−5s
+

...

s2
s

10
...
APPLICATION TO DIFFERENTIAL EQUATIONS

203

Which gives
s
e−5s
1
+
+ 2
(s + 1)(s − 5) s(s + 1)(s − 5) s (s + 1)(s − 5)
1
5
1
e−5s
6
5
1
+
+
− +
+
6 s−5 s+1
30
s s+1 s−5
1
30 24
25
1
+
− 2 +

+

...

150

=

Remark 10
...
3 Even though f (t) is a discontinuous function at t = 5, the solution y(t) and y ′ (t)
are continuous functions of t, as y ′′ exists
...
Then both y(t) and y ′ (t) are continuous functions of time
...
5
...
Consider the IVP ty ′′ (t) + y ′ (t) + ty(t) = 0, with y(0) = 1 and y ′ (0) = 0
...

Solution: Applying Laplace transform, we have
d 2
d
s Y (s) − sy(0) − y ′ (0) + (sY (s) − y(0)) − Y (s) = 0
...

ds
This equation after simplification can be rewritten as
Y ′ (s)
s
=− 2

...
From Example 10
...
2
...

t

2
...

+ as + b

Solution: Here, Y (s) =

s2

F (s)
1
= F (s) · 2

...
Show that y(t) =

1
a

f (τ )g(t − τ )dτ
...


204

CHAPTER 10
...
Hence,

t
0

f (τ ) sin(a(t − τ ))dτ
...
Solve the following IVP
...


Solution: Taking Laplace transform of both sides and using Theorem 10
...
5, we get
sY (s) − 1 =

Y (s)
1
1
+ 2 −4 2

...
6

s2 − 1
1
1
= −2 2

...


0

Transform of the Unit-Impulse Function

Consider the following example
...
6
...


Solution: Note that δh (t) =

1
U0 (t) − Uh (t)
...

h
s

Remark 10
...
2
1
...
6
...
That is, let
δ(t) = lim δh (t)
...
At origin, this function tends to infinity
...
This
new function, δ(t), is called the unit-impulse function (or Dirac’s delta function)
...
We can also write
δ(t) = lim δh (t) = lim
h−
→0

h−→0

1
U0 (t) − Uh (t)
...
In the strict mathematical sense lim δh (t) does not exist
...


h−
→0

4
...


10
...
TRANSFORM OF THE UNIT-IMPULSE FUNCTION

205

1 − e−hs
5
...
Now, if we take the limit of both sides, as h approaches
hs
zero (apply L’Hospital’s rule), we get
L(δ(t)) = lim

h−
→0

1 − e−hs
se−hs
= lim
= 1
...
LAPLACE TRANSFORM

Part IV

Numerical Applications

Chapter 11

Newton’s Interpolation Formulae
11
...
, N
...
This process is
known as Interpolation
...
Here, we shall consider only those functions which are sufficiently smooth, i
...
, they are
differentiable sufficient number of times
...
Hence, in the following we introduce various difference
operators and study their properties before looking at the interpolation methods
...
, xN are equally spaced, i
...
, xk −
xk−1 = h for each k = 1, 2,
...
The real number h is called the step length
...
Further, yk = f (xk ) gives the value of the function y = f (x) at the k th tabular point
...
, yN are known as nodes or nodal values
...
2
11
...
1

Difference Operator
Forward Difference Operator

Definition 11
...
1 (First Forward Difference Operator) We define the forward difference operator, denoted by ∆, as
∆f (x) = f (x + h) − f (x)
...
Given the step size h, this formula uses the values
at x and x + h, the point at the next step
...

Backward

x0

x1

x k−1 x k

x k+1

Forward

xn

210

CHAPTER 11
...
2
...

We note that
∆2 f (x) =

∆f (x + h) − ∆f (x)
f (x + 2h) − f (x + h) − f (x + h) − f (x)

=
=

f (x + 2h) − 2f (x + h) + f (x)
...

Definition 11
...
3 (rth Forward Difference Operator) The rth forward difference operator, ∆r , is defined
as
∆r f (x)

= ∆r−1 f (x + h) − ∆r−1 f (x),

r = 1, 2,
...


Exercise 11
...
4 Show that ∆3 yk = ∆2 (∆yk ) = ∆(∆2 yk )
...

Example 11
...
5 For the tabulated values of y = f (x) find ∆y3 and ∆3 y2
i
xi
yi

0
0
0
...
1
0
...
2
0
...
3
0
...
4
0
...
5
...
67

Solution: Here,
∆y3 = y4 − y3 = 0
...
35 = 0
...
67 − 3 × 0
...
35 − 0
...
01
...
2
...

j

Thus the rth forward difference at yk uses the values at yk , yk+1 ,
...

Example 11
...
7 If f (x) = x2 + ax + b, where a and b are real constants, calculate ∆r f (x)
...
2
...


Now,
∆2 f (x)
∆3 f (x)

and

Thus, ∆r f (x) = 0

= ∆f (x + h) − ∆f (x) = [2(x + h)h + h2 + ah] − [2xh + h2 + ah] = 2h2 ,
= ∆2 f (x) − ∆2 f (x) = 2h2 − 2h2 = 0
...


Remark 11
...
8 In general, if f (x) = xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an is a polynomial of degree
n, then it can be shown that
∆n f (x) = n! hn and ∆n+r f (x) = 0

for r = 1, 2,
...

Remark 11
...
9
x0
x1
x2

...


...
For a set of tabular values, the horizontal forward difference table is written as:
y0
y1
y2

∆y0 = y1 − y0
∆y1 = y2 − y1
∆y2 = y3 − y2

yn−1
yn

∆2 y0 = ∆y1 − ∆y0
∆2 y1 = ∆y2 − ∆y1
∆2 y2 = ∆y3 − ∆y2

···
···

∆n y0 = ∆n−1 y1 − ∆n−1 y0

∆yn−1 = yn − yn−1

2
...
This is written as:
x0

y0

x1

y1

∆y0
∆2 y 0
∆3 y 0

∆y1
x2

...


...


11
...
2

Backward Difference Operator

Definition 11
...
10 (First Backward Difference Operator) The first backward difference operator, denoted by ∇, is defined as
∇f (x) = f (x) − f (x − h)
...
As it moves in the backward direction, it is called the backward difference operator
...
NEWTON’S INTERPOLATION FORMULAE

Definition 11
...
11 (rth Backward Difference Operator) The rth backward difference operator, ∇r , is
defined as
∇r f (x)

= ∇r−1 f (x) − ∇r−1 f (x − h),

r = 1, 2,
...


with

In particular, for x = xk , we get
∇yk = yk − yk−1 and ∇2 yk = yk − 2yk−1 + yk−2
...

Example 11
...
12 Using the tabulated values in Example 11
...
5, find ∇y4 and ∇3 y3
...
49 − 0
...
14, and
∇3 y3

= ∇2 y3 − ∇2 y2 = (y3 − 2y2 + y1 ) − (y2 − 2y1 + y0 )
= y3 − 3y2 + 3y1 − y0

= 0
...
26 + 3 × 0
...
05 = −0
...

Example 11
...
13 If f (x) = x2 + ax + b, where a and b are real constants, calculate ∇r f (x)
...


Now,
∇2 f (x) =

and

∇3 f (x) =

Thus, ∇r f (x) = 0

∇f (x) − ∆f (x − h) = [2xh − h2 + ah] − [2(x − h)h − h2 + ah] = 2h2 ,

∇2 f (x) − ∇2 f (x) = 2h2 − 2h2 = 0
...


Remark 11
...
14 For a set of tabular values, backward difference table in the horizontal form is written
as:
x0
x1
x2

...


...
2
...

xi
yi

9
5
...
4

11
6
...
8

13
7
...
7

11
...
DIFFERENCE OPERATOR
x
9
10
11
12
13
14

y
5
5
...
0
6
...
5
8
...
4 = 5
...
6
0
...
7
0
...
2 = 0
...
4
0
...
1
-0
...
2-0
...
3
0
...
3 = -0
...
0
0
...
6 = 0
...
3)

In the similar manner, the backward difference table is written as follows:
x
9
10
11
12
13
14

y
5
5
...
8
7
...
1

∇y

∇2 y

∇3 y

∇4 y

∇5 y

0
...
6
0
...
7
0
...
2
0
...
1
-0
...
0
- 0
...
0

-0
...
3

0
...

Exercise 11
...
16

1
...


2
...

3
...
, yk
...

Remark 11
...
17 In general it can be shown that ∆k f (x) = ∇k f (x + kh) or ∆k ym = ∇k yk+m
Remark 11
...
18 In view of the remarks (11
...
8) and (11
...
17) it is obvious that, if y = f (x) is a
polynomial function of degree n, then ∇n f (x) is constant and ∇n+r f (x) = 0 for r > 0
...
2
...
2
...


= δ r−1 f (x +

Thus, δ 2 f (x) = f (x + h) − 2f (x) + f (x − h)
...


Thus, δ 2 uses the table of (xk , yk )
...


214

CHAPTER 11
...
2
...
2
...
e
...

Thus,
Eyi = yi+1 , E 2 yi = yi+2 ,

11
...
5

and E k yi = yi+k
...
2
...
e
...

2
2
2

Thus µ yi = 1 (yi+ 1 + yi− 1 ) and
2
2
2
µ2 y i =

11
...

2
2
2
4

Relations between Difference operators

1
...

Thus,
E ≡1+∆

or ∆ ≡ E − 1
...
Further, ∇(E(f (x)) = ∇(f (x + h)) = f (x + h) − f (x)
...

Thus E ≡ 1 + ∆, gives us
(1 − ∇)(1 + ∆)f (x) = f (x) for all x
...

Similarly,
∆ = (1 − ∇)−1 − 1
...
Let us denote by E 2 f (x) = f (x + h )
...

2
2

Thus,
1

1

δ = E 2 − E− 2
...


11
...
NEWTON’S INTERPOLATION FORMULAE

215

So, we have,
µ2 ≡
That is, the action of

1+

δ2
4

+1

µ≡

or

1+

δ2
4


...

4

4
...

2

Thus,
∆f (x) =

1 2
δ + δµ f (x),
2

i
...
,
∆≡

1 2
1
δ + δµ ≡ δ 2 + δ
2
2

1+

δ2

...
3
...
Verify the validity of the above table
...
Obtain the relations between the averaging operator and other difference operators
...
Find ∆2 y2 , ∇2 y2 , δ 2 y2 and µ2 y2 for the following tabular values:
i
xi
yi

11
...
0
11
...
5
12
...
0
14
...
5
15
...
0
16
...
, N
...

In the following, we shall use forward and backward differences to obtain polynomial function approximating y = f (x), when the tabular points xi ’s are equally spaced
...
NEWTON’S INTERPOLATION FORMULAE

where the polynomial PN (x) is given in the following form:
PN (x)

=

a0 + a1 (x − x0 ) + a2 (x − x0 )(x − x1 ) + · · · + ak (x − x0 )(x − x1 ) · · · (x − xk−1 )

+aN (x − x0 )(x − x1 ) · · · (x − xN −1 )
...
4
...
aN , to be determined using the fact that PN (xi ) = yi for i = 0, 1,
...

So, for i = 0, substitute x = x0 in (11
...
1) to get PN (x0 ) = y0
...
Next,
PN (x1 ) = y1 ⇒ y1 = a0 + (x1 − x0 )a1
...
For i = 2, y2 = a0 + (x2 − x0 )a1 + (x2 − x1 )(x2 − x0 )a2 , or equivalently
h
2h2 a2 = y2 − y0 − 2h(

Thus, a2 =

∆y0
) = y2 − 2y1 + y0 = ∆2 y0
...
Now, using mathematical induction, we get
2h2
ak =

∆k y0
for k = 0, 1, 2,
...

k! hk

Thus,
∆y0
∆2 y0
∆k y0
(x − x0 ) +
(x − x0 )(x − x1 ) + · · · +
(x − x0 ) · · · (x − xk−1 )
2
h
2! h
k! hk
∆N y0
+
(x − x0 )
...

N ! hN

PN (x)

= y0 +

As this uses the forward differences, it is called Newton’s Forward difference formula for interpolation, or simply, forward interpolation formula
...
4
...
, N
...

x − x0
Let u =
, then
h
x − x1 = hu + x0 − (x0 + h) = h(u − 1), x − x2 = h(u − 2),
...

With this transformation the above forward interpolation formula is simplified to the following form:
PN (u) =

=

∆y0
∆2 y0
∆k y0 hk
(hu) +
{(hu)(h(u − 1))} + · · · +
u(u − 1) · · · (u − k + 1)
2
h
2! h
k! hk
∆N y0
+···+
(hu) h(u − 1) · · · h(u − N + 1)
...
(u − N + 1)
...
4
...


(11
...
3)

11
...
NEWTON’S INTERPOLATION FORMULAE

217

For N = 2, we get a quadratic interpolating polynomial:
f (u) ≈ y0 + ∆y0 (u) +

∆2 y0
[u(u − 1)]
2!

(11
...
4)

and so on
...
Otherwise, this gives only an approximation to the true values of f (x)
...

h
(N + 1)!
Similarly, if we assume, PN (x) is of the form
PN (x) = b0 + b1 (x − xN ) + b1 (x − xN )(x − xN −1 ) + · · · + bN (x − xN )(x − xN −1 ) · · · (x − x1 ),
then using the fact that PN (xi ) = yi , we have
b0

=

b1

=

b2

=

yN
1
1
(yN − yN −1 ) = ∇yN
h
h
yN − 2yN −1 + yN −2
1
= 2 (∇2 yN )
2
2h
2h


...


...

k! hk
Thus, using backward differences and the transformation x = xN + hu, we obtain the Newton’s
backward interpolation formula as follows:
bk

=

PN (u) = yN + u∇yN +

u(u + 1) · · · (u + N − 1) N
u(u + 1) 2
∇ yN + · · · +
∇ yN
...
4
...
4
...
4
...

Remark 11
...
3 If the interpolating point lies closer to the beginning of the interval then one uses the
Newton’s forward formula and if it lies towards the end of the interval then Newton’s backward formula
is used
...
4
...
In fact N is so chosen that N th forward/backward difference almost remains
constant
...

Example 11
...
5
1
...
0045
...
001 0
...
003 0
...
005
y 1
...
123 1
...
127 1
...
1285
Solution: For this data, we have the Forward difference difference table
xi
0

...
002

...
004

...
121
1
...
1255
1
...
128
1
...
002
0
...
0015
0
...
0005

∆2 y3
0
...
0010
-0
...
0005

∆3 yi
-0
...
0005
0
...
002
-0
...
0025

218

CHAPTER 11
...
001 and u =
P5 (x)

=

x − x0
, we get
h

u(u − 1)
u(u − 1)(u − 2)
(
...
0015)
2
3!
u(u − 1)(u − 2)(u − 3)
u(u − 1)(u − 2)(u − 3)(u − 4)
(
...
0025)
...
121 + u ×
...
0045) = P5 (0 + 0
...
5)

0
...
0015
× 4
...
5 −
× 4
...
5 × 2
...
002
0
...
5 × 3
...
5 × 1
...
5 × 3
...
5 × 1
...
5
24
120
= 1
...

= 1
...
002 × 4
...
Using the following table for tan x, approximate its value at 0
...
Also, find an error estimate (Note
tan(0
...
85953)
...
70
0
...
87707

0
...
91309

0
...
95045

0
...
98926

Solution: As the point x = 0
...
The forward difference table is:
xi
0
...
72
0
...
76
0
...
84229
0
...
91309
0
...
98926

∆yi
0
...
03602
0
...
03881

∆2 yi
0
...
00134
0
...
0001
0
...
00001

In the above table, we note that ∆3 y is almost constant, so we shall attempt 3rd degree polynomial
interpolation
...
71 − 0
...
70, h = 0
...
5
...
02
polynomial of degree 3, we get
P3 (u) = 0
...
03478u +

Thus,

0
...
0001
u(u − 1) +
u(u − 1)(u − 2)
...
71) ≈ 0
...
03478(0
...
00124
× 0
...
5)
2!

0
...
5 × (−0
...
5)
3!
= 0
...

+

An error estimate for the approximate value is
∆4 y0
u(u − 1)(u − 2)(u − 3)
4!

= 0
...

u=0
...
71) (upto 5 decimal place) is 0
...
and the approximate value,
obtained using the Newton’s interpolating polynomial is very close to this value
...


11
...
NEWTON’S INTERPOLATION FORMULAE

219

3
...
2
...
3) by taking
(i) x0 = 9
...
0
...
5)
...
3 is closer to the values lying in the beginning of tabular values, while
x = 13
...
Therefore, we shall use forward difference formula for
x = 10
...
5
...

2!
3!

Therefore,
(a) for x0 = 9
...
0 and x = 10
...
3) ≈
=

5 +
...
3 +

10
...
0
= 1
...
This gives,
1


...
0
(1
...
3 + (1
...
3 × (−0
...
559
...
0, h = 1
...
3, we have u =
f (10
...
4 +
...
3 +

10
...
0
=
...
This gives,
1


...
3
(
...
7) +
(
...
7) × (−1
...
54115
...
3 is closer to x = 10
...
0 to
be a better approximation
...
5, we use the backward interpolating polynomial, which gives,
f (xN + hu) ≈ y0 + ∇yN u +

∇2 yN
∆3 yN
u(u + 1) +
u(u + 1)(u + 2)
...
0 and x = 13
...
5) ≈ 8
...
6 × (−0
...
5 − 14
= −0
...
This gives,
1

−0
...
0
(−0
...
5 +
(−0
...
5 × (1
...
8125
...
4
...
0

1
...
2
0
...
4
0
...
6
0
...
8
0
...
0
1
...
3 and x = 1
...
The speed of a train, running between two station is measured at different distances from the starting
station
...
from the starting station, then v(x), the speed (in km/hr) of the
train at the distance x is given by the following table:
x
v(x)

0
0

50
60

100
80

150
110

200
90

250
0

Find the approximate speed of the train at the mid point between the two stations
...
NEWTON’S INTERPOLATION FORMULAE
x

3
...
04
0
...
08
0
...
12
0
...
16
0
...
20
0
...
Compute S(0
...

4
...
00 am to 8
...
on May 10, 2005 in
Kanpur:
Time
Temperature

8 am
30

12 noon
37

4 pm
43

8pm
38

Obtain Newton’s backward interpolating polynomial of degree 3 to compute the temperature in Kanpur
on that day at 5
...


Chapter 12

Lagrange’s Interpolation Formula
12
...
, xN
...
In view of this, it is desirable to derive an interpolating formula, which is applicable even for unequally distant points
...
Unlike the previous interpolating formulas, it does not use the notion of
differences, however we shall introduce the concept of divided differences before coming to it
...
2

Divided Differences

Definition 12
...
1 (First Divided Difference) The ratio
f (xi ) − f (xj )
xi − xj
for any two points xi and xj is called the first divided difference of f (x) relative to xi and xj
...

Let us assume that the function y = f (x) is linear
...
e
...
Hence,
δ[xi , xj ] =

f (xi ) − f (xj )
= δ[xj , xi ]
...
e
...

x − x0
Thus, f (x) = f (x0 ) + (x − x0 )δ[x0 , x1 ]
...

Definition 12
...
2 (Second Divided Difference) The ratio
δ[xi , xj , xk ] =

δ[xj , xk ] − δ[xi , xj ]
xk − xi

is defined as second divided difference of f (x) relative to xi , xj and xk
...
LAGRANGE’S INTERPOLATION FORMULA

If f (x) is a second degree polynomial then δ[x0 , x] is a linear function of x
...


In view of the above, for a polynomial function of degree 2, we have δ[x, x0 , x1 ] = δ[x0 , x1 , x2 ]
...

x − x1
This gives,
δ[x, x0 ] = δ[x0 , x1 ] + (x − x1 )δ[x0 , x1 , x2 ]
...

So, whenever f (x) is approximated with a second degree polynomial, the value of f (x) at any point
x can be computed using the above polynomial, which uses the values at three points x0 , x1 and x2
...
2
...

i
xi
f (xi )

0
0
...
12

1
0
...
24

2
0
...
40

Also, find the approximate value of the function at x = 0
...

Solution: We shall first calculate the desired divided differences
...
24 − 1
...
16 − 0
...
40 − 1
...
2 − 0
...
2 − 0
...

x2 − x0

Thus,
f (x) ≈ P2 (x) = 1
...
1) + 20(x − 0
...
16)
...
13) ≈ 1
...
13 − 0
...
13 − 0
...
13 − 0
...
162
...
2
...
Using the following table, which gives values of log(x) corresponding to certain values
of x, find approximate value of log(323
...

x
log(x)

322
...
50893

324
...
51081

325
2
...
Show that
δ[x0 , x1 , x2 ] =

f (x0 )
f (x1 )
f (x2 )
+
+

...
That is,
the second divided difference remains unchanged regardless of how its arguments are interchanged
...
Show that for equidistant points x0 , x1 and x2 , δ[x0 , x1 , x2 ] =
=
, where yk = f (xk ),
2h2
2h2
and h = x1 − x0 = x2 − x1
...
2
...
Show that for a linear function, the second divided difference with respect to any three points, xi , xj
and xk , is always zero
...

Definition 12
...
5 (k th Divided Difference) The k th divided difference of f (x) relative to the tabular points x0 , x1 ,
...
, xk ] =

δ[x1 , x2 ,
...
, xk−1 ]

...

In general,
∆n yi
δ[xi , xi+1 ,
...

δ[x0 , x1 ,
...
2
...
2
...
2
...
2
...

Example 12
...
7 Show that f (x) can be written as
f (x) = f (x0 ) + δ[x0 , x1 ](x − x0 ) + δ[x, x0 , x1 ](x − x0 )(x − x1 )
...
Now since,
δ[x, x0 ] =

f (x) − f (x0 )
,
(x − x0 )

we get the desired result
...
2
...

Further show that P2 (xi ) = f (xi ) for i = 0, 1
...
2
...
, xn ](x − x0 )(x − x1 )(x − x2 ) · · · (x − xn−1 ),

and Rn+1 (x) = (x − x0 )(x − x1 )(x − x2 ) · · · (x − xn )δ[x, x0 , x1 , x2 ,
...

Here, Rn+1 (x) is called the remainder term
...

Further, if f (x) is a polynomial of degree n, then in view of the Remark 12
...
6, the remainder term,
Rn+1 (x) = 0, as it is a multiple of the (n + 1)th divided difference, which is 0
...
LAGRANGE’S INTERPOLATION FORMULA

12
...
However, before going to that, we see below an important result
...
3
...
, xk ] can be written as:
δ[x0 , x1 ,
...
We will prove the result by induction on k
...
For k = 1,
δ[x0 , x1 ] =

f (x1 ) − f (x0 )
f (x0 )
f (x1 )
=
+

...
e
...
, xn ]

=

f (x0 )
f (x1 )
+
(x0 − x1 )(x0 − x2 ) · · · (x0 − xn )
(x1 − x0 )(x1 − x2 ) · · · (x1 − xn )
f (xn )
+··· +

...
, xn+1 ]

=
=

δ[x1 , x2 ,
...
, xn ]
xn+1 − x0
1
f (x1 )
f (x2 )
+
+
xn+1 − x0 (x1 − x2 ) · · · (x1 − xn+1 )
(x2 − x1 )(x2 − x3 ) · · · (x2 − xn+1 )
··· +

f (xn+1 )
1

(xn+1 − x1 ) · · · (xn+1 − xn )
xn+1 − x0

f (x0 )
+
(x0 − x1 ) · · · (x0 − xn )

f (x1 )
f (xn )
+ ··· +
(x1 − x0 )(x1 − x2 ) · · · (x1 − xn )
(xn − x0 ) · · · (xn − xn−1 )

which on rearranging the terms gives the desired result
...

Remark 12
...
2 In view of the theorem 12
...
1 the k th divided difference of a function f (x), remains
unchanged regardless of how its arguments are interchanged, i
...
, it is independent of the order of its
arguments
...
, xn will be zero,(Remark 12
...
6) i
...
,
δ[x, x0 , x1 ,
...
3
...
3
...


n
i=0

f (xi )

n
i=0

(xi − xj )

j=0, j=i

Note that the expression on the right is a polynomial of degree n and takes the value f (xi ) at x = xi
for i = 0, 1, · · · , (n − 1)
...

Remark 12
...
3 In view of the Remark (12
...
9), we can observe that Pn (x) is another form of Lagrange’s
Interpolation polynomial formula as obtained above
...

Remark 12
...
4 We have seen earlier that the divided differences are independent of the order of its
arguments
...
Thus one can use Lagrange’s formula even
when the points x0 , x1 , · · · , xk , · · · , xn are in any order, which was not possible in the case of Newton’s
Difference formulae
...
3
...
This is done by interchanging the roles of x and y, i
...
while using the table
of values, we take tabular points as yk and nodal points are taken as xk
...
3
...
3
11
...
6
12
...
2
14
...
4
17
...
8
19
...
00
...
3)(10 − 9
...
2)(10 − 10
...
8) = −0
...
0704, and

n

= 0
...
1728,

j=0, j=2

n

j=0, j=3

j=0, j=4

(x4 − xj ) = 0
...


(x2 − xj ) = 0
...
LAGRANGE’S INTERPOLATION FORMULA

Thus,
11
...
80
14
...
7 × 0
...
4 × (−0
...
2) × 0
...
00
19
...
4) × (−0
...
8) × 0
...
197845
...
01792 ×

Now to find the value of x such that f (x) = 16, we interchange the roles of x and y and calculate the
following products:
4

4

j=0

(y − yj ) =
=

4

j=0

(16 − yj )

(16 − 11
...
8)(16 − 14
...
0)(16 − 19
...
7168,
n

(y0 − yj ) =

217
...
4688, and

n

j=1

j=0, j=1

(y1 − yj ) = −78
...
5471,

n

j=0, j=3

j=0, j=4

(y4 − yj ) = 839
...


Thus,the required value of x is obtained as:
x

9
...
6
10
...
6 × 217
...
2 × (−78
...
3 × 73
...
40
10
...
0) × (−151
...
8) × 839
...
39123
...
3248 ×

Exercise 12
...
7 The following table gives the data for steam pressure P vs temperature T :
T
P = f (T )

360
154
...
0

373
190
...
0

390
240
...

Exercise 12
...
8 Compute from following table the value of y for x = 6
...
60
2
...
90
1
...
50
1
...
90
1
...
20
2
...
00

12
...
The process involves renaming or re-designating the tabular points
...
This can be generalized for
more number of points
...
These six points in the given order are not equidistant
...
These
−2
−1
0
1
2
3

12
...
GAUSS’S AND STIRLING’S FORMULAS

227

re-designated tabular points in their given order are equidistant
...
3
...

0
1
−1
2
−2
3
0
1
−1
2
−2

Now note that the points x∗ , x∗ , x∗ , x∗ , x∗ and x∗ are equidistant and the divided difference are
−2
−1
0
1
2
3
independent of the order of their arguments
...
Now using the above relations and the transformation x =
i

x0 + hu, we get

f (x∗ + hu) ≈
0




∆2 y−1
∆3 y−1
∆y0
(hu) +
(hu)(hu − h) +
(hu)(hu − h)(hu + h)
h
2h2
3!h3

∆4 y−2
+
(hu)(hu − h)(hu + h)(hu − 2h)
4!h4
5 ∗
∆ y−2
+
(hu)(hu − h)(hu + h)(hu − 2h)(hu + 2h)
...

4!
5!


y0 + u∆y0 + u(u − 1)

(12
...
1)

Similarly using the tabular points x0 , x1 = x0 −h, x2 = x0 +h, x3 = x0 −2h, x4 = x0 +2h, x5 = x0 −3h, and
the re-designating them, as x∗ , x∗ , x∗ , x∗ , x∗ and x∗ , we get another form of interpolating polynomial
−3
−2
−1
0
1
2
as follows:
f (x∗ + hu) ≈
0



∆2 y−1
∆3 y−2
+ u(u2 − 1)
2!
3!


∆4 y−2
∆5 y−3
+u(u2 − 1)(u + 2)
+ u(u2 − 1)(u2 − 22 )

...
4
...
LAGRANGE’S INTERPOLATION FORMULA

Now taking the average of the two interpoating polynomials (12
...
1) and (12
...
2) (called Gauss’s first
and second interpolating formulas, respectively), we obtain Sterling’s Formula of interpolation:





∆y−1 + ∆y0
∆2 y−1
u(u2 − 1) ∆3 y−2 + ∆3 y−1
+ u2
+
2
2!
2
3!
4 ∗
5 ∗

2
2
2
∆ y−2
u(u − 1)(u − 2 ) ∆ y−3 + ∆5 y−2
+u2 (u2 − 1)
+
+ · · ·
...
4
...

In this case one usually writes the diagonal form of the difference table
...
4
...
225 :
x
f (x)

0
...
37638

0
...
28919

0
...
20879

0
...
13427

0
...
06489

Here the point x = 0
...
22
...
20, x−1 =
0
...
22, x1 = 0
...
24, to get the difference table in diagonal form as:
x−2 = 0
...
37638

x−1 =
...
28919

∆y−2 = −
...
00679
∆3 y−2 = −
...
08040
x0 = 0
...
20879

−1

x1 = 0
...
00017

=
...
00074

∆y0 = −
...
00514

y1 = 1
...
06938

x2 = 0
...
06489

(here, ∆y0 = y1 − y0 = 1
...
20879 = −
...
20879 − 1
...
08040; and
∆2 y−1 = ∆y0 − ∆y−1 =
...

0
...
22
Using the Sterling’s formula with u =
= 0
...
225) as follows:
0
...
225) =
+
=


...
07452

...
5)2
2
2
−0
...
52 − 1) (−
...
00074) 2

...
5 (0
...
1708457
1
...
5

Note that tabulated value of cot(πx) at x = 0
...
1708496
...
4
...
05 :
x
y

0
...
00000

0
...
02256

0
...
04511

0
...
06762

0
...
09007

Chapter 13

Numerical Differentiation and
Integration
13
...


13
...
and the
differentiate the polynomial term by term to get an approximated polynomial to the derivative of the
function
...
Newton’s Forward/ Backward
formula is used depending upon the location of the point at which the derivative is to be computed
...
We illustrate
the process by taking (i) Newton’s Forward formula, and (ii) Sterling’s formula
...
(u − n + 1)}
...
2
...
2
...

h



1
∆2 y0
∆3 y0
∆y0 +
(2u − 1) +
(3u2 − 6u + 2) + · · ·
h
2!
3!
∆n y0
n(n − 1)2 n−2
+
nun−1 −
u
+ · · · + (−1)(n−1) (n − 1)!
n!
2


...
2
...
NUMERICAL DIFFERENTIATION AND INTEGRATION

Thus, an approximation to the value of first derivative at x = x0 i
...
u = 0 is obtained as :
df
dx

=
x=x0

1
∆2 y0
∆3 y0
∆n y0
∆y0 −
+
− · · · + (−1)(n−1)

...
2
...
2
...
2
...

h
2
2
3!
2 × 5!

(13
...
6)

Remark 13
...
1 Numerical differentiation using Stirling’s formula is found to be more accurate than
that with the Newton’s difference formulae
...

Now higher derivatives can be found by successively differentiating the interpolating polynomials
...
g
...
2
...

h2
4!

Example 13
...
2 Compute from following table the value of the derivative of y = f (x) at x = 1
...
73
1
...
74
1
...
75
1
...
76
1
...
77
1
...
75, h = 0
...
7489 − 1
...
01 = −0
...
0017290797, ∆2y0 =
...
0000001712,
∆y−1 = −
...
0000173774, ∆3y−1 = −
...
0000001749, ∆4y−2 = −
...
7489) is obtained as:
(i) Using Newton’s Forward difference formula,
f ′ (1
...
0000172047
−0
...
11 − 1) ×
0
...
0000001712
= −0
...

+ (3 × (−0
...
11 + 2) ×
3!

(ii) Using Stirling’s formula, we get:
f ′ (1
...
0017464571) + (−
...
11) ×
...
01
2
(3 × (−0
...
0000001749) + (−
...
0000000022)
2
+ 2 × (−0
...
11) − 1) ×
4!
= −0
...
2
...
1739652000 at x = 1
...

Example 13
...
3 Using only the first term in the formula (13
...
6) show that
f ′ (x∗ ) ≈
0



y1 − y−1

...
15 :
x
ex

1
...
8577

1
...
1582

1
...
4903

Solution: Truncating the formula (13
...
6)after the first term, we get:
f ′ (x∗ ) ≈
0
=
=



1 ∆y−1 + ∆y0
h
2




(y0 − y−1 ) + (y1 − y0 )
2h


y1 − y−1

...
15, we have
0
f ′ (1
...
4903 − 2
...
1630
...
1

Note the error between the computed value and the true value is 3
...
1582 = 0
...

Exercise 13
...
4 Retaining only the first two terms in the formula (13
...
3), show that
f ′ (x0 ) ≈

−3y0 + 4y1 − y2

...
15 from the following table:
x
ex

1
...
1582

1
...
3201

1
...
4903

Also compare your result with the computed value in the example (13
...
3)
...
2
...
2
...

12h

Hence compute from following table the value of the derivative of y = ex at x = 1
...
05
2
...
10
3
...
15
3
...
20
3
...
25
3
...
2
...
05 × k,
k = 0, 1, 2, 3, 4, 5
...
00
0
...
05
0
...
10
0
...
15
0
...
20
0
...
25
0
...
0 by using the formula (13
...
2)
...
1 by using the formula (13
...
6)
...
NUMERICAL DIFFERENTIATION AND INTEGRATION

Similarly, if we have tabular points which are not equidistant, one can use Lagrange’s interpolating
polynomial, which is differentiated to get an estimate of first derivative
...
Let x0 , x1 , x2 , x3 be the tabular points, then the
corresponding Lagrange’s formula gives us:
f (x)

(x − x1 )(x − x2 )(x − x3 )
(x − x0 )(x − x2 )(x − x3 )
f (x0 ) +
f (x1 )
(x0 − x1 )(x0 − x2 )(x0 − x3 )
(x1 − x0 )(x1 − x2 )(x1 − x3 )
(x − x0 )(x − x1 )(x − x2 )
(x − x0 )(x − x1 )(x − x3 )
f (x2 ) +
f (x3 )
+
(x2 − x0 )(x2 − x1 )(x2 − x3 )
(x3 − x0 )(x3 − x1 )(x3 − x2 )



Differentiation of the above interpolating polynomial gives:
df (x)
dx



=

(x − x2 )(x − x3 ) + (x − x1 )(x − x2 ) + (x − x1 )(x − x3 )
f (x0 )
(x0 − x1 )(x0 − x2 )(x0 − x3 )
(x − x2 )(x − x3 ) + (x − x0 )(x − x2 ) + (x − x0 )(x − x3 )
+
f (x1 )
(x1 − x0 )(x1 − x2 )(x1 − x3 )
(x − x1 )(x − x2 ) + (x − x0 )(x − x1 ) + (x − x0 )(x − x3 )
+
f (x2 )
(x2 − x0 )(x2 − x1 )(x2 − x3 )
(x − x1 )(x − x2 ) + (x − x0 )(x − x2 ) + (x − x0 )(x − x1 )
+
f (x3 )
(x3 − x0 )(x3 − x1 )(x3 − x2 )




3
3
3


f (xi )
1


...
2
...

(x2 − x0 )(x2 − x1 )(x2 − x3 )
(x3 − x0 )(x3 − x1 )(x3 − x2 )

Now, generalizing Equation (13
...
7) for n + 1 tabular points x0 , x1 , · · · , xn we get:
df
dx



n

=


(x − xr ) 


r=0

n

f (xi )
n

i=0

(x − xi )

(xi − xj )

j=0, j=i



n



k=0, k=i




1

...
2
...
6 :
x
y

0
...
3836494

0
...
2442376

0
...
7275054

Solution: The given tabular points are not equidistant, so we use Lagrange’s interpolating polynomial with
three points: x0 = 0
...
6, x2 = 0
...
Now differentiating this polynomial the derivative of the function
at x = x1 is obtained in the following form:
df
dx

x=x1



(x1 − x2 )
1
1
(x1 − x0 )
f (x0 ) +
+
f (x1 ) +
f (x2 )
...

Now, using the values from the table, we get:
df
dx

x=0
...
6 − 0
...
3836494 +
+
× 4
...
4 − 0
...
4 − 0
...
6 − 0
...
6 − 0
...
6 − 0
...
7225054
(0
...
4)(0
...
6)
−5
...
221188 + 31
...
6227656
...
3
...
6 is 4
...

Exercise 13
...
8 For the function, whose tabular values are given in the above example(13
...
8), compute the
value of its derivative at x = 0
...

Remark 13
...
9 It may be remarked here that the numerical differentiation for higher derivatives does
not give very accurate results and so is not much preferred
...
3

Numerical Integration
b

f (x)dx, when

Numerical Integration is the process of computing the value of a definite integral,
a

the values of the integrand function, y = f (x) are given at some tabular points
...
This
gives us ’quadrature formula’ for numerical integration
...
Otherwise, one uses Lagrange’s formula for the
interpolating polynomial
...


13
...
1

A General Quadrature Formula

Let f (xk ) = yk be the nodal value at the tabular point xk for k = 0, 1, · · · , xn , where x0 = a and
xn = x0 + nh = b
...
Thus, we get,
b

b

f (x)dx

=

a

y0 +
a

+

∆y0
∆2 y0
∆3 y0
(x − x0 ) +
(x − x0 )(x − x1 ) +
(x − x0 )(x − x1 )(x − x2 )
2
h
2!h
3!h3

∆4 y0
(x − x0 )(x − x1 )(x − x2 )(x − x3 ) + · · · dx
4!h4

This on using the transformation x = x0 + hu gives:
b

n

f (x)dx

=

h

y0 + u∆y0 +

a

0

+

∆3 y0
∆2 y0
u(u − 1) +
u(u − 1)(u − 2)
2!
3!

∆4 y0
u(u − 1)(u − 2)(u − 3) + · · · du
4!

which on term by term integration gives,
b

f (x)dx

=

h ny0 +

n2
∆2 y0
∆y0 +
2
2!

a

+

∆4 y0
4!

n3
n2

3
2

n5
3n4
11n3

+
− 3n2
5
2
3

+

∆3 y0
3!

+ ···

n4
− n3 + n2
4
(13
...
1)

For n = 1, i
...
, when linear interpolating polynomial is used then, we have
b

f (x)dx = h y0 +
a

∆y0
h
= [y0 + y1 ]
...
3
...
NUMERICAL DIFFERENTIATION AND INTEGRATION

Similarly, using interpolating polynomial of degree 2 (i
...
n = 2), we obtain,
b

f (x)dx

8 4

3 2

=

h 2y0 + 2∆y0 +

=

2h y0 + (y1 − y0 ) +

a

∆2 y0
2

1 y2 − 2y1 + y0
h
×
= [y0 + 4y1 + y2 ]
...
3
...
However, this process is not very useful
...


13
...
2

Trapezoidal Rule

Here, the integral is computed on each of the sub-intervals by using linear interpolating formula, i
...
for
n = 1 and then summing them up to obtain the desired integral
...
3
...

2

Thus, we have,
b

f (x)dx =
a

h
h
h
h
h
[y0 + y1 ] + [y1 + y2 ] + · · · + [yk + yk+1 ] + · · · + [yn−2 + yn−1 ] + [yn−1 + yn ]
2
2
2
2
2

i
...

b

=

h
[y0 + 2y1 + 2y2 + · · · + 2yk + · · · + 2yn−1 + yn ]
2

=

f (x)dx

h

a

y0 + yn
+
2

n−1

yi
...
3
...
It is a simple quadrature formula, but is not very accurate
...
3
...

Recall that in the case of linear function, the second forward differences is zero, hence, the Trapezoidal
rule gives exact value of the integral if the integrand is a linear function
...
3
...
3
...
0
0
...
2
0
...
4
0
...
6
0
...
8
0
...
0
1
...
01005 1
...
09417 1
...
28402 1
...
63231 1
...
2479 2
...
1, n = 10,
y0 + y10
1
...
71828
=
= 1
...
81257
...
3
...
1 × [1
...
81257] = 1
...
e
...
Note that for each of these sub-intervals, we have
the three tabular points x2k , x2k+1 , x2k+2 and so the integrand is replaced with a quadratic interpolating
polynomial
...
3
...

3

x2k

In view of this, we have
x2

b

f (x)dx

=

a

x2k+2

x4

f (x)dx +
x0

x2

f (x)dx + · · · +

x2k

f (x)dx + · · · +

xn

f (x)dx
xn−2

h
[(y0 + 4y1 + y2 ) + (y2 + 4y3 + y4 ) + · · · + (yn−2 + 4yn−1 + yn )]
3
h
[y0 + 4y1 + 2y2 + 4y3 + 2y4 + · · · + 2yn−2 + 4yn−1 + yn ] ,
3

=
=

which gives the second quadrature formula as follows:
b

f (x)dx
a

=

h
[(y0 + yn ) + 4 × (y1 + y3 + · · · + y2k+1 + · · · + yn−1 )
3

+ 2 × (y2 + y4 + · · · + y2k + · · · + yn−2 )]





n−1
n−2
h
=
(y0 + yn ) + 4 × 
yi  + 2 × 
yi 
...
3
...


Remark 13
...
3 An estimate for the error E2 in numerical integration using the Simpson’s rule is given
by
b−a 4
E2 = −
∆ y,
(13
...
6)
180
where ∆4 y is the average value of the forth forward differences
...
NUMERICAL DIFFERENTIATION AND INTEGRATION
2

Example 13
...
4 Using the table for the values of y = ex as is given in Example 13
...
2, compute the integral
1

2

ex dx, by Simpson’s rule
...

Solution: Here, h = 0
...
Further,
9

y0 + y10 = 1
...
71828 = 3
...
26845,
i=1, i−odd

and

8

yi = y2 + y4 + y6 + y8 = 5
...

i=2, i−even

Thus,
1
2

ex dx =
0

0
...
71828 + 4 × 7
...
54412] = 1
...
0 1
...
01005 0
...
00189
0
...
01005 0
...
02260 0
...
2 1
...
05336 0
...
00519
0
...
09417 0
...
03117 0
...
4 1
...
11051 0
...
01090
0
...
28402 0
...
04969 0
...
6 1
...
19899 0
...
02207
0
...
63231 0
...
08725 0
...
8 1
...
35142 0
...
9 2
...
47038
1
...
71828
Thus, error due to Trapezoidal rule is,
E1

difference table, which is given below:
∆4 yi
0
...
00171
0
...
00320
0
...
00658
0
...
02071 + 0
...
02598 + 0
...
03879 + 0
...
06518 + 0
...
11896
= − ×
12
9
= −0
...

= −

Similarly, error due to Simpson’s rule is,
E2

1−0 4
∆ y
180
1
0
...
00171 + 0
...
00328 + 0
...
00658 + 0
...
35873 × 10−5
...

1

Example 13
...
5 Compute the integral

f (x)dx, where the table for the values of y = f (x) is given below:
0
...
05
0
...
15
0
...
3
0
...
5
0
...
7
0
...
9
1
...
0785 0
...
2334 0
...
4540 0
...
7071 0
...
8910 0
...
9877 1
...
However, we notice that the tabular points 0
...
10, 0, 15 and 0
...
3
...
2, 0
...
4, 0
...
6, 0
...
8, 0
...
0
...
05, 0
...
2, 1
...
2

f (x)dx =
0
...
05

f (x)dx
0
...
The integrals then can be evaluated in each interval
...
Thus, the first integral is evaluated by using Trapezoidal rule and the second one by Simpson’s rule
(of course, one could have used Trapezoidal rule in both the subintervals)
...
05 and for the second one h = 0
...
Thus,
0
...
05

f (x)dx = 0
...
0

and

f (x)dx

0
...
3090
+ 0
...
2334 = 0
...
1
× (0
...
0000) + 4 × (0
...
7071 + 0
...
9877)
3

=

0
...
5878 + 0
...
9511)
=

0
...
0291775 + 0
...
6346442
0
...
6346526
...

b

Exercise 13
...
6

f (x)dx, where the table for the values

1
...
Also find an error estimate for the computed value
...
09531 0
...
26236 0
...
40546 0
...
53063 0
...
64185 0
...
50
0
...
55
0
...
0
1
...
5
2
...
60
0
...
0
3
...
65
0
...
5
6
...
70
0
...
0
10
...
75
0
...
80
0
...
5
16
...
Using Simpson’s rule, compute the integral

f (x)dx
...


(a) Use the table given in Exercise 13
...
6
...

(b)

x
y

a = 0
...
493

1
...
946

1
...
325

2
...
605

2
...
778

3
...
849

b = 3
...
833

1
...
Compute the integral

f (x)dx, where the table for the values of y = f (x) is given below:
0

x
y

0
...
00

0
...
39

0
...
77

0
...
27

1
...
90

1
...
26

1
...
65

1
...
07

1
...
53

238

CHAPTER 13
...
1

System of Linear Equations

Theorem 14
...
1 (Existence and Non-existence) Consider a linear system Ax = b, where A is a m × n
matrix, and x, b are vectors with orders n × 1, and m × 1, respectively
...
Then exactly one of the following statement holds:
1
...
, un−r are n × 1 vectors satisfying Au0 = b and Aui = 0 for 1 ≤ i ≤ n − r
...
if ra = r = n, the solution set of the linear system has a unique n × 1 vector x0 satisfying Ax0 = 0
...
If r < ra , the linear system has no solution
...
Suppose [C d] is the row reduced echelon form of the augmented matrix [A b]
...
3
...
So, the proof consists of understanding the solution set of the linear system Cx = d
...
Let r = ra < n
...
So, by Remark 2
...
5, the matrix C = [cij ]
has r leading columns
...
Then we observe
the following:
(a) the entries clil for 1 ≤ l ≤ r are leading terms
...
The entry clil = 1;
(b) corresponding is each leading column, we have r basic variables, xi1 , xi2 ,
...
4
...
, xjn−r
...

For 1 ≤ l ≤ r, consider the lth row of [C d]
...
Also, the
first r rows of the augmented matrix [C d] give rise to the linear equations
n−r

xil +

cljk xjk = dl ,
k=1

for 1 ≤ l ≤ r
...
APPENDIX
These equations can be rewritten as
n−r

xil = dl −

cljk xjk = dl ,
k=1

for 1 ≤ l ≤ r
...
, xir , xj1 ,
...
Then the set of solutions consists of


n−r


d1 −
c1jk xjk 

xi1
k=1


...


...



...
 




n−r

 xir  
dr −

crjk xjk 
...
 
xj1



...
 

...




...
1
...
That is,
for 1 ≤ s ≤ n − r, xjs = ks
...


...


...


...


...


...


...


...


...


...


...


...


...

 − · · · − kn−r 
 − k2 
 0 
 −1 
 0 
0






 

...


...


...


...


...


...

 0 
 0 
 0 
0






 
−1
0
0
0

Let us write v0 t = (d1 , d2 ,
...
, 0)t
...
Observe the following:
(a) if we assign ks = 0, for 1 ≤ s ≤ n − r, we get
Cv0 = Cy = d
...
1
...


(14
...
3)

So, using (14
...
2), we get Cv1 = 0
...

So, using (14
...
2), we get Cvt = 0
...
1
...
1
...
, xn )t
...
Therefore, we have Cu0 = d and for
1 ≤ i ≤ n − r, Cui = 0
...

Thus, we have obtained the desired result for the case r = r1 < n
...
r = ra = n, m ≥ n
...
Also, the
number of columns in C equals n = rank (A) = rank (C)
...
4
...
, xn are basic variables
...

0

˜
Therefore, the solution set of the linear system Cx = d is obtained using the equation In x = d
...
Also, by Theorem 2
...
11, the row reduced form of a given
matrix is unique, the solution obtained above is the only solution
...

3
...

As C has n columns, the row reduced echelon matrix [C d] has n + 1 columns
...
We now observe the following:
(a) as rank(C) = r, the (r + 1)th row of C consists of only zeros
...

Thus, the (r + 1)th row of [C d] is of the form (0,
...
Or in other words, dr+1 = 1
...

This linear equation has no solution
...

Therefore, by Theorem 2
...
4, the linear system Ax = b has no solution
...

Corollary 14
...
2 Consider the linear system Ax = b
...

1
...

2
...


242

CHAPTER 14
...
2

Determinant

In this section, S denotes the set {1, 2,
...

Definition 14
...
1
and onto
...
A function σ : S−→S is called a permutation on n elements if σ is both one to one

2
...
That is,
Sn is the set of all permutations of the set {1, 2,
...

2
···
σ(2) · · ·

1
σ(1)
This representation of a permutation is called a two row notation for σ
...
2
...
In general, we represent a permutation σ by σ =

n
σ(n)


...
For each positive integer n, Sn has a special permutation called the identity permutation, denoted Idn ,
1 2 ··· n
such that Idn (i) = i for 1 ≤ i ≤ n
...

1 2 ··· n
3
...
Then
S3

=

τ1 =

1
1
τ4 =

2 3
2 3
1
2

, τ2 =
2 3
3 1

1
1
, τ5 =

2 3
3 2
1
3

, τ3 =
2 3
1 2

1
2
, τ6 =

2 3
1 3
1
3

,
2 3
2 1

(14
...
5)

Remark 14
...
3
1
...
Then σ is determined if σ(i) is known for i = 1, 2,
...
As σ is
both one to one and onto, {σ(1), σ(2),
...
So, there are n choices for σ(1) (any element
of S), n − 1 choices for σ(2) (any element of S different from σ(1)), and so on
...
Thus, the number of elements in Sn is n!
...

2
...
Then both σ and τ are one to one and onto
...
Hence, σ ◦ τ is also a
permutation
...

3
...
Then σ is both one to one and onto
...
Hence, for every element σ ∈ Sn , σ −1 ∈ Sn and is the inverse of σ
...
Observe that for any σ ∈ Sn , the compositions σ ◦ σ −1 = σ −1 ◦ σ = Idn
...
2
...
Then the following holds:
1
...
Then the sets {σ ◦ τ : σ ∈ Sn } and {τ ◦ σ : σ ∈ Sn } have exactly n! elements
...

2
...

Proof
...
It can easily be verified that β = τ −1 ◦ α and γ = α ◦ τ −1
...
Hence the result holds
...
2
...
2
...
Then the number of inversions of σ, denoted n(σ), equals
|{(i, j) : i < j, σ(i) > σ(j) }|
...
, n}|
...
2
...
, n} such that σ(m) = r, σ(r) = m and σ(i) = i for 1 ≤ i = m, r ≤ n
...
Also, note that for any transposition
σ ∈ Sn , σ −1 = σ
...

1 2 3 4
is a transposition as τ (1) = 3, τ (3) =
3 2 1 4
1, τ (2) = 2 and τ (4) = 4
...
Also, check that

Example 14
...
7

1
...

2
...
Then check that

n(τ ) = 3 + 1 + 1 + 1 + 0 + 3 + 2 + 1 = 12
...
Let ℓ, m and r be distinct element from {1, 2,
...
Suppose τ = (m r) and σ = (m ℓ)
...


Therefore, τ ◦ σ = (m r) ◦ (m ℓ) =
Similarly check that σ ◦ τ =

1 2
1 2

1 2
1 2
···
···

···
···


m


r
···
···

···
···

m
r

m

···
···

···
···

r


···
···

r ···
m ···
n
n

n
n

= (r l) ◦ (r m)
...


With the above definitions, we state and prove two important results
...
2
...

Proof
...
If n(σ) = 0, then
σ = Idn = (1 2) ◦ (1 2)
...

For the next step of the induction, suppose that τ ∈ Sn with n(τ ) = k + 1
...
, ℓ − 1 and τ (ℓ) = ℓ
...
Also, note that m > ℓ
...
Then note that
(σ ◦ τ )(i) = i, for i = 1, 2,
...
APPENDIX

So, the definition of “number of inversions” and m > ℓ implies that
n

n(σ ◦ τ )

=
i=1

|{(σ ◦ τ )(j) < (σ ◦ τ )(i), for j = i + 1, i + 2,
...
, n}|
n

+
i=ℓ+1

|{(σ ◦ τ )(j) < (σ ◦ τ )(i), for j = i + 1, i + 2,
...
, n}|
|{τ (j) < τ (i), for j = i + 1, i + 2,
...
, n}|

= n(τ )
...
Hence, by the induction hypothesis, the permutation σ ◦ τ is a composition of
transpositions
...

Hence, τ = σ ◦ α1 ◦ α2 ◦ · · · ◦ αt as σ ◦ σ = Idn for any transposition σ ∈ Sn
...

Before coming to our next important result, we state and prove the following lemma
...
2
...

Proof
...
Hence, t ≥ 2
...
So, let us assume that t ≥ 3
...
The result clearly holds for t = 2
...
Now, let t = k + 1
...
Note that the possible choices for the composition α1 ◦ α2 are
(m r) ◦ (m r) = Idn , (m r) ◦ (m ℓ) = (r ℓ) ◦ (r m), (m r) ◦ (r ℓ) = (ℓ r) ◦ (ℓ m) and (m r) ◦ (ℓ s) = (ℓ s) ◦ (m r),

where ℓ and s are distinct elements of {1, 2,
...
In the first case, we
can remove α1 ◦ α2 and obtain Idn = α3 ◦ α4 ◦ · · · ◦ αt
...
So, by mathematical induction, t − 2 is even and hence t is also even
...
But note that in the
new expression for identity, the positive integer m doesn’t appear in the first transposition, but appears
in the second transposition
...

At this step, either the number of transpositions will reduce by 2 (giving us the result by mathematical
induction) or the positive number m will get shifted to the third transposition
...
2
...
In the
later case, the positive integer m appears exactly once in the expression for identity and hence this
expression does not fix m whereas for the identity permutation Idn (m) = m
...

Hence, the process will surely lead to an expression in which the number of transpositions at some
stage is t − 2 = k − 1
...


Theorem 14
...
10 Let α ∈ Sn
...
, τk and σ1 , σ2 ,
...

Proof
...

Hence by Lemma 14
...
9, k + ℓ is even
...
Thus the result
follows
...
2
...
A permutation σ ∈ Sn is called an odd permutation if σ can
be written as a composition (product) of an odd number of transpositions
...
2
...
Whereas if one of them is odd and the other even then the
permutations σ ◦ τ and τ ◦ σ are both odd
...
2
...

if σ is an odd permutation

Example 14
...
14
1
...
Thus, sgn(Idn ) = 1 and for any transposition σ ∈ Sn , sgn(σ) = −1
...
Using Remark 14
...
12, sgn(σ ◦ τ ) = sgn(σ) · sgn(τ ) for any two permutations σ, τ ∈ Sn
...

Definition 14
...
15 Let A = [aij ] be an n × n matrix with entries from F
...
anσ(n) =
σ∈Sn

sgn(σ)
σ∈Sn

aiσ(i)
...
2
...
Observe that det(A) is a scalar quantity
...
But this expression is very helpful in proving the results related
with “properties of determinant”
...
APPENDIX

2
...
2
...


Observe that this expression for det(A) for a 3 × 3 matrix A is same as that given in (2
...
1)
...
3

Properties of Determinant

Theorem 14
...
1 (Properties of Determinant) Let A = [aij ] be an n × n matrix
...
if B is obtained from A by interchanging two rows, then
det(B) = − det(A)
...
if B is obtained from A by multiplying a row by c then
det(B) = c det(A)
...
if all the elements of one row is 0 then det(A) = 0
...
if A is a square matrix having two rows equal then det(A) = 0
...
Let B = [bij ] and C = [cij ] be two matrices which differ from the matrix A = [aij ] only in the mth
row for some m
...

6
...

7
...

8
...

9
...

10
...

11
...

Proof
...
Suppose B = [bij ] is obtained from A = [aij ] by the interchange of the ℓth
and mth row
...


14
...
PROPERTIES OF DETERMINANT

247

Let τ = (ℓ m) be a transposition
...
2
...
Hence by the
definition of determinant and Example 14
...
14
...


Proof of Part 2
...
Then
bmj = c amj and bij = aij for 1 ≤ i = m ≤ n, 1 ≤ j ≤ n
...


Proof of Part 3
...
anσ(n)
...
Hence, from the condition that A has a row consisting
of all zeros, the value of each term is 0
...

Proof of Part 4
...
Let B be the matrix obtained
from A by interchanging the ℓth and mth rows
...
But the

assumption implies that B = A
...
So, we have det(B) = − det(A) = det(A)
...

Proof of Part 5
...


Proof of Part 6
...
Then bℓj = aℓj + k amj and bij = aij for 1 ≤ i = m ≤ n, 1 ≤ j ≤ n
...
APPENDIX

Then
det(B)

=
σ∈Sn

=
σ∈Sn

=
σ∈Sn

sgn(σ)b1σ(1) b2σ(2) · · · bℓσ(ℓ) · · · bmσ(m) · · · bnσ(n)
sgn(σ)a1σ(1) a2σ(2) · · · (aℓσ(ℓ) + kamσ(m) ) · · · amσ(m) · · · anσ(n)
sgn(σ)a1σ(1) a2σ(2) · · · aℓσ(ℓ) · · · amσ(m) · · · anσ(n)
+k
σ∈Sn

=
σ∈Sn

=

sgn(σ)a1σ(1) a2σ(2) · · · amσ(m) · · · amσ(m) · · · anσ(n)

sgn(σ)a1σ(1) a2σ(2) · · · aℓσ(ℓ) · · · amσ(m) · · · anσ(n)

use Part 4

det(A)
...
First let us assume that A is an upper triangular matrix
...
So, for every σ = Idn ∈ Sn , there exists a
positive integer m, 1 ≤ m ≤ n − 1 (depending on σ) such that m > σ(m)
...
Hence the result follows
...

Proof of Part 8
...
Then using Part 7, det(In ) = 1
...
4
...
Again using Parts 1, 2
and 6, we get det(EA) = det(E) det(A)
...
Suppose A is invertible
...
7
...
That is, there exist elementary matrices E1 , E2 ,
...
Now a
repeated application of Part 8 implies that det(A) = det(E1 ) det(E2 ) · · · det(Ek )
...
Hence, det(A) = 0
...
We show that A is invertible
...
Then by Theorem 2
...
7, the matrix A is not of full rank
...
So, there exist elementary matrices E1 , E2 ,
...
Therefore, by Part 3 and a repeated application of Part 8,
0

det(E1 ) det(E2 ) · · · det(Ek ) det(A) = det(E1 E2 · · · Ek A) = det

B
0

= 0
...
Hence, det(A) = 0
...

Hence our assumption is false and therefore A is invertible
...
Suppose A is not invertible
...
Also, the product matrix
AB is also not invertible
...
Thus, det(AB) = det(A) det(B)
...
Then by Theorem 2
...
7, A is a product of elementary matrices
...
, Ek such that A = E1 E2 · · · Ek
...


14
...
PROPERTIES OF DETERMINANT

249

Proof of Part 11
...
Then bij = aji for 1 ≤ i, j ≤ n
...
2
...
Also sgn(σ) = sgn(σ −1 )
...


Remark 14
...
2
1
...
3
...

2
...
Let B be the submatrix
of A obtained by removing the first row and the first column
...
The reason being is as follows:
for every σ ∈ Sn with σ(1) = 1 is equivalent to saying that σ is a permutation of the elements
{2, 3,
...
That is, σ ∈ Sn−1
...


We are now ready to relate this definition of determinant with the one given in Definition 2
...
2
...
3
...
Then det(A) =

(−1)1+j a1j det A(1|j) , where recall that

j=1

A(1|j) is the submatrix of A obtained by removing the 1st row and the j th column
...
For 1 ≤ j ≤ n, define two matrices


0
0 · · · a1j · · ·
0


 a21 a22 · · · a2j · · · a2n 
Bj = 
...


...


...


...


...


...


...

an1

an2

···

anj

···

ann



n×n

Then by Theorem 14
...
1
...


...

anj

0
a21

...


...


...

an2

···
···

...

···


0

a2n 

...


...

ann n×n

n

det(A) =

det(Bj )
...
3
...
Note that the matrix Bj can be transformed into Cj by j − 1
interchanges of columns done in the following manner:
first interchange the 1st and 2nd column, then interchange the 2nd and 3rd column and so on (the last
process consists of interchanging the (j − 1)th column with the j th column
...
3
...
3
...
Therefore by (14
...
6),
n

n

(−1)j−1 a1j det A(1|j) =

det(A) =
j=1

(−1)j+1 a1j det A(1|j)
...
4

CHAPTER 14
...
4
...

Then
dim(M ) + dim(N ) = dim(M + N ) + dim(M ∩ N )
...
4
...
Since M ∩ N is a vector subspace of V, consider a basis B1 = {u1 , u2 ,
...

As, M ∩ N is a subspace of the vector spaces M and N, we extend the basis B1 to form a basis
BM = {u1 , u2 ,
...
, vr } of M and also a basis BN = {u1 , u2 ,
...
, ws } of N
...
, uk , w1 ,
...
, vr } is a basis
of M + N
...
the set B2 is linearly independent subset of V, and
2
...

The second part can be easily verified
...


(14
...
8)

This system can be rewritten as
α1 u1 + · · · + αk uk + β1 w1 + · · · + βs ws = −(γ1 v1 + · · · + γr vr )
...
, vr ∈ BM
...
, uk , w1 ,
...
Hence, v ∈ M ∩ N and
therefore, there exists scalars δ1 ,
...

Substituting this representation of v in Equation (14
...
8), we get

(α1 − δ1 )u1 + · · · + (αk − δk )uk + β1 w1 + · · · + βs ws = 0
...
, uk , w1 ,
...
Therefore,
by the definition of linear independence, we get
αi − δi = 0, for 1 ≤ i ≤ k and βj = 0 for 1 ≤ j ≤ s
...
4
...

The only solution for this linear system is
αi = 0, for 1 ≤ i ≤ k and γj = 0 for 1 ≤ j ≤ r
...
4
...
And therefore,
the vectors are linearly independent
...
We now count the vectors in the sets B1 , B2 , BM and BN to
get the required result
...
5
...
5

251

Proof of Rank-Nullity Theorem

Theorem 14
...
1 Let T : V −→W be a linear transformation and {u1 , u2 ,
...
Then
1
...
, T (un ))
...
T is one-one ⇐⇒
R(T )
...
If V is finite dimensional vector space then dim(R(T )) ≤ dim(V )
...

Proof
...
For 2), let T be one-one
...
This means that
T (u) = 0 = T (0)
...
If N (T ) = {0} then T (u) = T (v) ⇐⇒
T (u − v) = 0 implies that u = v
...

The other parts can be similarly proved
...

The proof of the next theorem is immediate from the fact that T (0) = 0 and the definition of linear
independence/dependence
...
5
...
If {T (u1), T (u2 ),
...
, un } ⊂ V is linearly independent
...
5
...
Then
dim( Range(T )) + dim(N (T )) = dim(V ),
or ρ(T ) + ν(T ) = n
...
Let dim(V ) = n and dim(N (T )) = r
...
, ur } is a basis of N (T )
...
, ur } is a linearly independent set in V, we can extend it to form a basis of V
...
, un } such that the set {u1 ,
...
, un } is a basis of V
...
, T (un ))

=

L(0,
...
, T (un ))

=

L(T (ur+1), T (ur+2 ),
...
, T (un )}
...
, T (un )} is a linearly independent set
...
Then, there exists scalars, αr+1 , αr+2 ,
...

Or T (αr+1 ur+1 + αr+2 ur+2 + · · · + αn un ) = 0 which in turn implies αr+1 ur+1 + αr+2 ur+2 + · · · + αn un ∈
N (T ) = L(u1 ,
...
So, there exists scalars αi , 1 ≤ i ≤ r such that
αr+1 ur+1 + αr+2 ur+2 + · · · + αn un = α1 u1 + α2 u2 + · · · + αr ur
...

Thus αi = 0 for 1 ≤ i ≤ n as {u1 , u2 ,
...
In other words, we have shown that the set
{T (ur+1), T (ur+2 ),
...
Now, the required result follows
...


252

CHAPTER 14
...
5
...
Then
T is one-one ⇐⇒ T is onto ⇐⇒ T has an inverse
...
Let dim(V ) = n and let T be one-one
...
Hence, by the rank-nullity
Theorem 14
...
3 dim( Range (T )) = n = dim(V )
...
Hence, Range(T ) =
V
...

Suppose T is onto
...
Hence, dim( Range (T )) = n
...
5
...
That is, T is one-one
...
Hence, for every vector u in the range, there is a
unique vectors v in the domain such that T (v) = u
...

That is, T has an inverse
...
Then it is clear that T is one-one and onto
...
6

Condition for Exactness

Let D be a region in xy-plane and let M and N be real valued functions defined on D
...

(14
...
9)
Definition 14
...
1 (Exact Equation) The Equation (14
...
9) is called Exact if there exists a real valued twice
continuously differentiable function f such that
∂f
∂f
= M and
= N
...
6
...
The equation (14
...
9) is exact if and only if
∂M
∂N
=

...
6
...
Let Equation (14
...
9) be exact
...
So, ∂M = ∂y∂x = ∂x∂y = ∂N and so Equation (14
...
10) holds
...
6
...
We now show that Equation (14
...
10) is exact
...
Then

M (x, y)dx + g(y)
∂G
∂x

= M (x, y) which shows that

∂ ∂G
∂ ∂G
∂M
∂N
·
=
·
=
=

...
Let φ(y) = N −

=
=
=
=

∂G
∂G
+
+ φ(y)
∂x
∂y
∂G ∂G dy
+
·
+
∂x
∂y dx
d
d
G(x, y(x)) +
dx
dx
d
f (x, y)
dx

dy
dx
d
dy

φ(y)dy
φ(y)dy

∂G
∂y

or N = φ(y) +

∂G
∂y
Title: Notes on Mathematics - 1021
Description: Lecture is on I Linear Algebra II Ordinary Differential Equation III Laplace Transform IV Numerical Applications