Title: Linear Algebra
Description: This is a simplified treatment of linear algebra. It is concise and self-contained, and can be used to gain an in-depth understanding without further assistance. It is based on the Part IB course lectured by S. J. Wadsley at the University of Cambridge in 2015. All sub-topics are listed in a table of contents for easy navigation. Students who work through these notes should have no further trouble with linear algebra.

Document Preview

Extracts from the notes are shown below.


Part IB — Linear Algebra
Based on lectures by S. J. Wadsley
Notes taken by Dexter Chua

Michaelmas 2015

These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
...


Definition of a vector space (over R or C), subspaces, the space spanned by a subset. ... Direct sums and complementary subspaces. ...

Relation between rank and nullity. ... Change of basis. ... [4]

Determinant and trace of a square matrix. ... Determinant of an endomorphism. ... [3]

Eigenvalues and eigenvectors. ... Characteristic and minimal polynomials. ... Algebraic and geometric multiplicity of eigenvalues. ... [4]

Dual of a finite-dimensional vector space, dual bases and maps. ... [2]

Bilinear forms. ... Symmetric forms and their link with quadratic forms. ... Law of inertia, classification by rank and signature. ... [4]

Inner product spaces, orthonormal sets, orthogonal projection, V = W ⊕ W⊥. ... Adjoints. ... Orthogonality of eigenvectors and properties of eigenvalues. ...
Contents

1 Vector spaces
  1.1 Definitions and examples
  1.2 Linear independence, bases and the Steinitz exchange lemma
  1.3 Direct sums

2 Linear maps
  2.1 Definitions and examples
  2.2 Linear maps and matrices
  2.3 The first isomorphism theorem and the rank-nullity theorem
  ...

3 Duality
  3.1 Dual space
  3.2 Dual maps

4 Bilinear forms I

5 Determinants of matrices

6 Endomorphisms
  6.1 Invariants
  6.2 The minimal polynomial
  6.3 The Cayley-Hamilton theorem
  6.4 Jordan normal form

7 Bilinear forms II

8 Inner product spaces
  ...
  8.4 Spectral theory

...
A vector was just treated as a “list of numbers” representing
a point in space
...
A matrix is treated as a “physical operation” on
vectors that stretches and rotates them
...
We also used matrices to express
and solve systems of linear equations
...

In IB Linear Algebra, we are going to study vectors in an abstract setting
...
We will write down axioms governing how these
operations should behave, just like how we wrote down the axioms of group
theory
...

In the course, we will, of course, prove that this abstract treatment of linear
algebra is just “the same as” our previous study of “vectors as a list of numbers”
...
However, most of the time, looking at these abstractly
will provide a much better fundamental understanding of how things work
...
1 Vector spaces

1.1 Definitions and examples

Notation
...

Intuitively, a vector space V over a field F (or an F-vector space) is a space
with two operations:
– We can add two vectors v1 , v2 ∈ V to obtain v1 + v2 ∈ V
...

Of course, these two operations must satisfy certain axioms before we can
call it a vector space
...

Example
...

An m × n matrix A with coefficients in R can be viewed as a linear map from R^n to R^m via v ↦ Av
...
When confused about
definitions, we can often think what the definition means in terms of Rn
and matrices to get some intuition
...
This is a vector
space
...

(iii) Let [a, b] ⊆ R be a closed interval, then
C([a, b], R) = {f ∈ R^{[a,b]} : f is continuous}
is a vector space, with operations as above
...

Of course, we cannot take a random set, define some random operations called
addition and scalar multiplication, and call it a vector space
...

Definition (Vector space)
...

By abuse of notation, we also write 0 for the trivial vector space {0}
...
For example, it would be difficult to assign these quantities to the
vector space of real-valued continuous functions in [a, b]
...

Proposition
...

Proof is left as an exercise
...
In the case of vector spaces, this is a subspace
...
If V is an F-vector space, then U ⊆ V is an (F-linear)
subspace if
(i) u, v ∈ U implies u + v ∈ U
...

(iii) 0 ∈ U
...

Alternatively, U is a subspace of V if it is itself a vector space, inheriting the
operations from V
...

Example
...

(ii) Let X be a set
...
Then the set of functions with finite support
forms a vector subspace
...

If we have two subspaces U and V , there are several things we can do with
them
...
We will shortly show
that this will be a subspace
...
Instead, we need the sum:
Definition (Sum of subspaces)
...
The sum of U and V is
U + W = {u + w : u ∈ U, w ∈ W }
...
Let U, W be subspaces of V
...



Proof
...
Then
λ(u1 + w1 ) + µ(u2 + w2 ) = (λu1 + µu2 ) + (λw1 + µw2 ) ∈ U + W
...
So
λv1 + µv2 ∈ U ∩ W
...
So done
...

Definition (Quotient spaces)
...

Then the quotient group V /U can be made into a vector space called the quotient
space, where scalar multiplication is given by (λ, v + U ) = (λv) + U
...
Hence
for λ ∈ F, we have λv − λw ∈ U
...


1.2 Linear independence, bases and the Steinitz exchange lemma
...
We
call this a basis because everything in Rn can be (uniquely) written as a sum
of (scalar multiples of) these basis elements
...

We would like to capture this idea in general vector spaces
...
This means we can define the
“dimension” of a vector space as the number of elements in the basis
...
We will
in fact prove a slightly stronger statement than what was stated above, and this
ensures that the dimension of a vector space is well-behaved
...

This is not the case when we study modules in IB Groups, Rings and Modules,
which are generalizations of vector spaces
...
Even for those that have basis, the
behaviour of the “dimension” is complicated when, say, we take submodules
...

Definition (Span)
...
The span of S is defined as
    ⟨S⟩ = { ∑_{i=1}^n λi si : λi ∈ F, si ∈ S, n ≥ 0 }
This is the smallest subspace of V containing S
...
We will not play with infinite sums, since
the notion of convergence is not even well defined in a general vector space
...



     
(i) Let V = R^3 and S = { ... }
...
Note that any subset of S of order 2 has the same span as S
...
Define the function δx : X → F by δx(y) = 1 if y = x, and 0 otherwise
...

Definition (Spanning set)
...
S spans V if ⟨S⟩ = V
...
Let V be a vector space over F and S ⊆ V
...

If S is not linearly independent, we say it is linearly dependent
...
Let V be a vector space over F and S ⊆ V
...

Definition (Finite dimensional)
...

Ideally, we would want to define the dimension as the number of vectors in
the basis
...
It is certainly
plausible that a vector space has a basis of size 7 as well as a basis of size 3
...

We will first have an example:
Example
...
Then S is linearly dependent, since its three elements v1, v2, v3 satisfy the non-trivial relation
    1·v1 + 2·v2 + (−1)·v3 = 0
...


Note that no linearly independent set can contain 0, as 1 · 0 = 0
...
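As a computational aside (not part of the lectures), linear (in)dependence of finitely many vectors of F^n can be tested by comparing the rank of the matrix whose columns are those vectors with the number of vectors. A minimal Python/numpy sketch, with made-up vectors satisfying v3 = v1 + 2 v2:

import numpy as np

def is_linearly_independent(vectors):
    # vectors: list of equal-length 1-D arrays; independent iff the rank of the
    # matrix having these columns equals the number of vectors
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + 2 * v2                                  # forces a linear dependence
print(is_linearly_independent([v1, v2, v3]))      # False
print(is_linearly_independent([v1, v2]))          # True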

There is an alternative way in which we can define linear independence
...
S ⊆ V is linearly dependent if and only if there are distinct s0 , · · · , sn ∈ S and λ1 , · · · , λn ∈ F such that
    ∑_{i=1}^n λi si = s0
...
If S is linearly dependent, then there are some λ1 , · · · , λn ∈ F not all zero and s1 , · · · , sn ∈ S such that ∑ λi si = 0
...
wlog λ1 ≠ 0 (reordering if necessary). Then
    s1 = ∑_{i=2}^n (−λi/λ1) si
...
Conversely, if s0 = ∑_{i=1}^n λi si , then (−1)·s0 + ∑_{i=1}^n λi si = 0 is a non-trivial linear relation among distinct elements of S.
So S is linearly dependent
...
If S = {e1 , · · · , en } is a subset of V over F, then it is a basis if
and only if every v ∈ V can be written uniquely as a finite linear combination
of elements in S, i.e. as
    v = ∑_{i=1}^n λi ei
...
We can view this as a combination of two statements: it can be spanned
in at least one way, and it can be spanned in at most one way
...

In fact, S spanning V is defined exactly to mean that every item v ∈ V can
be written as a finite linear combination in at least one way
...


...
Then we have
    0 = v − v = ∑_{i=1}^n (λi − µi) ei
...
Hence λi = µi
...

On the other hand, if S is not linearly independent, then we have
    0 = ∑_{i=1}^n λi ei
where λi ≠ 0 for some i
...
So there are two ways to write 0 as a linear combination
...

Now we come to the key theorem:
Theorem (Steinitz exchange lemma)
...
Then there is some T′ ⊆ T of order n such that (T \ T′) ∪ S still spans V
...

What does this actually say? This says if T is spanning and S is independent,
there is a way of grabbing |S| many elements away from T and replace them
with S, and the result will still be spanning
...
It tells us that
we cannot have a independent set larger than a spanning set, and most of our
corollaries later will only use this remark
...

Corollary
...
Then there is a re-ordering of the {fi } such that
{e1 , · · · , en , fn+1 , · · · , fm } spans V
...
So it
helps to give a brief overview of what we are going to do in words first
...
The first one is easy
...
Since
T is spanning, we can write
    e1 = ∑ λi ti
for some ti ∈ T, λi ∈ F non-zero
...
The result is
still spanning, since the above formula allows us to write t1 in terms of e1 and
the other ti
...
For the rth element, we again write
    er = ∑ λi ti
...
However, we
cannot do this arbitrarily, since the lemma wants us to replace something in T
with er
...

This is where the linear independence of S kicks in
...
Hence there is something genuinely
from T , and we can safely replace it with er
...

Proof
...

(Note that the case r = 0 is trivial, since we can take Tr0 = ∅, and the case
r = n is the theorem which we want to achieve
...
Since Tr spans V , we can write
    e_{r+1} = ∑_{i=1}^k λi ti ,    λi ∈ F, ti ∈ Tr
...
So there is some j such that tj ∈ (T \ Tr′)
...
    tj = (1/λj) e_{r+1} − ∑_{i≠j} (λi/λj) ti
We let Tr+1′ = Tr′ ∪ {tj}, of order r + 1, and
    Tr+1 = (T \ Tr+1′) ∪ {e1 , · · · , er+1} = (Tr \ {tj}) ∪ {er+1}
Since tj is in the span of Tr ∪ {er+1}, we have tj ∈ ⟨Tr+1⟩
...
So ⟨Tr+1⟩ = V
...

From this lemma, we can immediately deduce a lot of important corollaries
...
Suppose V is a vector space over F with a basis of order n
...

(ii) Any linearly independent set of order n is a basis
...

(iv) Every finite spanning set contains a basis
...

Proof
...

(i) Suppose T is another basis
...

The other direction is less trivial, since T might be infinite, and Steinitz
does not immediately apply
...
Also, S is
spanning
...
So |T | ≤ |S|
...

(ii) Suppose now that T is a linearly independent subset of order n, but
hT i =
6 V
...
We now show that T ∪ {v} is
independent
...

i=1

Then λ0 v ∈ hT i
...
As T is linearly independent, we have
λ0 = · · · = λm = 0
...
This is a contradiction since S is a spanning set of size n
...
If T were linearly dependent, then
there is some t0 , · · · , tm ∈ T distinct and λ1 , · · · , λm ∈ F such that
X
t0 =
λi ti
...
e
...
So T \ {t0 } is a spanning set of
order n − 1, which is a contradiction
...
Let T 0 ⊆ T be a spanning set of least
possible size
...
If |T 0 | has size n, then done
by (iii)
...

So T 0 must be linearly dependent because S is spanning
...
Then
T 0 \ {t0 } is a smaller spanning set
...

(v) Suppose T is a linearly independent set
...
So by (ii), (S \ S 0 ) ∪ T is a basis of V containing T
...

Finally, we can use this to define the dimension
...
If V is a vector space over F with finite basis S, then
the dimension of V , written
dim V = dimF V = |S|
...
However, it does
depend on F
...

After defining the dimension, we can prove a few things about dimensions
...
If V is a finite dimensional vector space over F, U ⊆ V is a proper
subspace, then U is finite dimensional and dim U < dim V
...
Every linearly independent subset of V has size at most dim V
...
We want to show that S
spans U and |S| < dim V
...
So v 6∈ U by maximality
of S
...

Since U =
6 V , there is some v ∈ V \ U = V \ hSi
...
So |S| + 1 ≤ dim V
...



Proposition
...

The proof is not hard, as long as we manage to pick the right basis to do the
proof
...

We need a basis for all four of them, and we want to compare the bases
...

Proof
...
This is a linearly independent
subset of U
...

Similarly, for W , we can obtain a basis
T = {v1 , · · · , vr , wr+1 , · · · , wt }
...
It is sufficient to prove that
S ∪ T is a basis for U + W
...
Suppose u + w ∈ U + W , u ∈ U, w ∈ W
...
So u + w ∈ hS ∪ T i
...

To show linear independence, suppose we have a linear relation
    ∑_{i=1}^r λi vi + ∑_{j=r+1}^s µj uj + ∑_{k=r+1}^t νk wk = 0
...


Since the left hand side is something in U , and the right hand side is something
in W , they both lie in U ∩ W
...
However, since R is a basis of U ∩ W , we can write the
left hand vector just as a sum of vi ’s
...
Then we have
    ∑ λi vi + ∑ νk wk = 0
...
So S ∪ T is
linearly independent
...
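As a sketch (not in the notes), the formula dim(U + W) + dim(U ∩ W) = dim U + dim W can be checked on a concrete example with sympy; the spanning vectors below are made up, and a basis of U ∩ W is extracted from the null space of the block matrix [A | −B]:

from sympy import Matrix

A = Matrix([[1, 0], [0, 1], [1, 1], [0, 0]])   # columns span U
B = Matrix([[1, 1], [1, 0], [2, 1], [0, 1]])   # columns span W

dim_U, dim_W = A.rank(), B.rank()
dim_sum = A.row_join(B).rank()                 # dim(U + W)

# Solutions (x, y) of A x = B y give vectors A x lying in U ∩ W.
N = A.row_join(-B).nullspace()
intersection = [A * v[:A.cols, :] for v in N]
dim_int = Matrix.hstack(*intersection).rank() if intersection else 0

print(dim_U + dim_W, dim_sum + dim_int)        # both are 4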
If V is a finite-dimensional vector space over F and U ⊆ V is a subspace, then
dim V = dim U + dim V /U
...
Combined
with the first isomorphism theorem for vector spaces, this gives the rank-nullity
theorem
...
Let {u1 , · · · , um } be a basis for U and extend this to a basis {u1 , · · · , um ,
vm+1 , · · · , vn } for V
...

It is easy to see that this spans V /U
...

Then
v+U =

X

µi (vi + U ) +

X

λi (ui + U ) =

X

µi (vi + U )
...

To show that they are linearly independent, suppose that
X
λi (vi + U ) = 0 + U = U
...


Then we can write this as a linear combination of the ui ’s
...
Since {u1 , · · · , um , vn+1 , · · · , vn } is a basis for V , we must have
λi = µj = 0 for all i, j
...


1.3 Direct sums
...

Definition ((Internal) direct sum)
...
We say that V is the (internal) direct sum of U and
W if
(i) U + W = V
(ii) U ∩ W = 0
...

Equivalently, this requires that every v ∈ V can be written uniquely as u + w
with u ∈ U, w ∈ W
...

You will show in the example sheets that given any subspace U ⊆ V , U must
have a complementary subspace in V
...
Let V = R , and U = h
i
...

Definition ((External) direct sum)
...


1

Vector spaces

IB Linear Algebra

The difference between these two definitions is that the first is decomposing
V into smaller spaces, while the second is building a bigger space based on two
spaces
...
e
...
So these two are indeed compatible
notions, and this is why we give them the same name and notation
...
If U1 , · · · , Un ⊆ V are subspaces
of V , then V is the (internal) direct sum
V = U1 ⊕ · · · ⊕ Un =

n
M

Ui

i=1

P
if every v ∈ V can be written uniquely as v =
ui with ui ∈ Ui
...

For more details,
see example sheet 1 Q
...

Definition ((Multiple) (external) direct sum)
...

This can be made into an infinite sum if we require that all but finitely many
of the ui have to be zero
...
In particular, we would like to study functions that
respect the structure of the objects
...


2 Linear maps

2.1 Definitions and examples

Definition (Linear map)
...
Let U, V be vector spaces over F
...

(ii) α(λu) = λα(u) for all λ ∈ F, u ∈ U
...

There are a few things we should take note of:
– If we are lazy, we can combine the two requirements to the single requirement that
α(λu1 + µu2 ) = λα(u1 ) + µα(u2 )
...
In particular, α(0) = 0
...
For example,
complex conjugation is a map C → C that is R-linear but not C-linear
...

(i) Let A be an n×m matrix with coefficients in F
...

Then α : Fm → Fn defined by v → Av is linear
...
So we have
α(λu + µv)i =

m
X

Aij (λu + µv)j

j=1
m
X



j=1

Aij uj + µ

m
X

Aij vj

j=1

= λα(u)i + µα(v)i
...

(ii) Let X be a set and g ∈ FX
...
Then mg is linear
...

(iii) Integration I : C([a, b], R) → C([a, b], R) defined by f ↦ ∫_a^x f(t) dt is linear
...


(v) If α, β ∈ L(U, V ), then α + β defined by (α + β)(u) = α(u) + β(u) is linear
...

In this way, L(U, V ) is also a vector space over F
...
Using this, we can show that many
things are linear, like differentiating twice, or adding and then multiplying
linear maps
...

Definition (Isomorphism)
...

If there exists an isomorphism U → V , we say U and V are isomorphic, and
write U ∼
=V
...
If U and V are vector spaces over F and α : U → V , then α is an
isomorphism iff α is a bijective linear map
...
If α is an isomorphism, then it is clearly bijective since it has an inverse
function
...
Then as a function, it has an inverse
β : V → U
...
Let v1 , v2 ∈ V , λ, µ ∈ F
...

Since α is injective, we have
β(λv1 + µv2 ) = λβ(v1 ) + µβ(v2 )
...

Definition (Image and kernel)
...
Then the
image of α is
im α = {α(u) : u ∈ U }
...

It is easy to show that these are subspaces of V and U respectively
...

(i) Let A ∈ Mm,n (F) and α : Fn → Fm be the linear map v 7→ Av
...

The kernel of α contains all solutions to ∑_j Aij xj = 0
...

for some p, q ∈ C ∞ (R, R)
...

Similarly, ker β contains the solutions to the homogeneous equation
f 00 (t) + p(t)f 0 (t) + q(t)f (t) = 0
...
Indeed this is
what we are going to show
...
Let α : U → V be an F-linear map
...

(ii) If α is surjective and S ⊆ U spans U , then α(S) spans V
...

Here (iii) immediately shows that two isomorphic spaces have the same
dimension
...

(i) We prove the contrapositive
...
So there are s0 , · · · , sn ∈ S distinct and λ1 , · · · , λn ∈ F not all
zero such that
!
n
n
X
X
λi α(si ) = α
λi si
...


i=1

This is a non-trivial relation of the si in U
...

(ii) Suppose α is surjective and S spans U
...
Then there is some
u ∈ U such that α(u) = v
...


I=1

Then
v = α(u) =

n
X
i=1

So α(S) spans V
...


2

Linear maps

IB Linear Algebra

(iii) Follows immediately from (i) and (ii)
...
If U and V are finite-dimensional vector spaces over F and α : U → V
is an isomorphism, then dim U = dim V
...
Otherwise, the
proof works just fine for infinite dimensional spaces
...
Let S be a basis for U
...
Since α is injective,
|S| = |α(S)|
...

How about the other way round? If two vector spaces have the same dimension, are they necessarily isomorphic? The answer is yes, at least for
finite-dimensional ones
...
We will show that
they are isomorphic in many ways
...
Suppose V is a F-vector space of dimension n < ∞
...

Proof
...
So
(α(e1 ), · · · , α(en )) is indeed a basis for V
...

Suppose α, β : Fn → V are isomorphism such that Φ(α) = Φ(β)
...
We want to show that α = β
...
 = α
xi ei =
xi α(ei ) =
xi β(ei ) = β 
...

Next, suppose that (v1 , · · · , vn ) is an ordered basis for V
...
 X
α 
...

xn
It is easy to check that this is well-defined and linear
...
So if
xi vi = yi vi , then
xi = yi
...
So α is an isomorphism,
and by construction Φ(α) = (v1 , · · · , vn )
...
2.2 Linear maps and matrices

Recall that our first example of linear maps is matrices acting on Fn
...
Since we know that all
vector spaces are isomorphic to Fn , this means we can represent arbitrary linear
maps on vector spaces by matrices
...

Proposition
...
Then every function f : S → V extends uniquely to a linear map
U →V
...

Proof
...
We have sort-of proved thisP
already just now
...
Then
X
 X
X
α(u) = α
ui ei =
ui α(ei ) =
ui f (ei )
...


So α(u) = β(u) for every u
...

P
For existence, if u ∈ U , we can write u = ui ei in a unique way
...
To show linearity, let λ, µ ∈ F, u, v ∈ U
...

Moreover, α does extend f
...
If U and V are finite-dimensional vector spaces over F with bases
(e1 , · · · , em ) and (f1 , · · · , fn ) respectively, then there is a bijection
Matn,m (F) → L(U, V ),
sending A to the unique linear map α such that α(ei ) =

P

aji fj
...

We can also draw a fancy diagram to display this result
...
Similarly,
we get an isomorphism s(fi ) : V → Fn
...


We can put this into a square:

s(ei )

s(fi )
α

U

V

Then the corollary tells us that every A gives rise to an α, and every α corresponds
to an A that fit into this diagram
...
If α is a linear map U → V , then for each 1 ≤ i ≤ m, we can write α(ei )
uniquely as
n
X
aji fj
α(ei ) =
j=1

for some aji ∈ F
...
The previous proposition tells
us that every matrix A arises in this way, and α is determined by A
...
We call the matrix corresponding to a
linear map α ∈ L(U, V ) under the corollary the matrix representing α with
respect to the bases (e1 , · · · , em ) and (f1 , · · · , fn )
...

Proposition
...
, vs ) and T = (w1 , · · · , wt ) respectively
...

Fr

A

s(R)

U

Fs

B

s(S)
α

V

Ft
s(T )

β

W

Proof
...
Next we write βα(ui ) as a linear

20

2

Linear maps

IB Linear Algebra

combination of w1 , · · · , wt :
!
βα(ui ) = β

X

Aki vk

k

=

X

Aki β(vk )

k

=

X

Aki

X

Bjk wj

j

k

!
=

X X
j

=

X

Bjk Aki

wj

k

(BA)ji wj

j

2.3 The first isomorphism theorem and the rank-nullity theorem
...
This is in fact an easy
corollary of a stronger result, known as the first isomorphism theorem, which
directly relates the kernel and image themselves
...
We will also
provide another proof that does not involve quotients
...
Let α : U → V be a linear map
...
Moreover, α induces an
isomorphism
α
¯ : U/ ker α → im α
(u + ker α) 7→ α(u)
Note that if we view a vector space as an abelian group, then this is exactly
the first isomorphism theorem of groups
...
We know that 0 ∈ ker α and 0 ∈ im α
...
Then
α(λ1 u1 + λ2 u2 ) = λ1 α(u1 ) + λ2 α(u2 ) = 0
...
So ker α is a subspace
...
So im α is a subspace
...
So it remains to show that α
¯ is a linear map
...

So α
¯ is a linear map
...
If α : U → V is a linear map between finitedimensional vector spaces over F (in fact we just need U to be finite-dimensional),
the rank of α is the number r(α) = dim im α
...

Corollary (Rank-nullity theorem)
...

Proof
...
So we
have
dim im α = dim(U/ ker α) = dim U − dim ker α
...

We can also prove this result without the first isomorphism theorem, and say
a bit more in the meantime
...
If α : U → V is a linear map between finite-dimensional vector
spaces over F, then there are bases (e1 , · · · , em ) for U and (f1 , · · · , fn ) for V
such that α is represented by the matrix


Ir 0
,
0 0
where r = r(α) and Ir is the r × r identity matrix
...

Proof
...
Then we can extend this
to a basis of the (e1 , · · · , em )
...
We now show that (f1 , · · · , fk ) is a basis for
im α (and thus k = r)
...
Suppose v ∈ im α
...
By linearity, we can write this as
v=

m
X

λi α(ei ) =

i=1

k
X

λi fi + 0
...

To show linear dependence, suppose that
k
X

µi fi = 0
...


2

Linear maps

So

Pk

i=1

IB Linear Algebra

µi ei ∈ ker α
...
Since (e1 , · · · , em ) is a basis, we must have
µi = 0 for all i
...

Now we extend (f1 , · · · , fr ) to a basis for V , and
(
fi 1 ≤ i ≤ k

...
Let
W = {x ∈ R5 : x1 + x2 + x3 = 0 = x3 − x4 − x5 }
...


...

Consider the linear map α : R^5 → R^2 given by x ↦ (x1 + x2 + x3, x3 − x4 − x5). Then ker α = W
...
We know that α(1, 0, 0, 0, 0) = (1, 0)
and α(0, 0, 1, 0, 0) = (0, 1)
...
So dim W = 3
...
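A quick numerical check (not part of the notes) of the rank-nullity computation above, representing α by the matrix with rows (1, 1, 1, 0, 0) and (0, 0, 1, −1, −1):

import numpy as np

A = np.array([[1, 1, 1,  0,  0],
              [0, 0, 1, -1, -1]], dtype=float)

rank = np.linalg.matrix_rank(A)          # r(α)
nullity = A.shape[1] - rank              # n(α) = dim ker α = dim W
print(rank, nullity, rank + nullity)     # 2 3 5, and 5 = dim R^5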

Example
...
We let
α:U ⊕W →V
(u, w) 7→ u + w,
where the ⊕ is the external direct sum
...

Then we have
dim U + dim W = dim(U ⊕ W ) = r(α) + n(α) = dim(U + W ) + dim(U ∩ W )
...

Corollary
...
Then the following are equivalent
(i) α is injective;
(ii) α is surjective;
23

2

Linear maps

IB Linear Algebra

(iii) α is an isomorphism
...
It is clear that, (iii) implies (i) and (ii), and (i) and (ii) together implies
(iii)
...

Note that α is injective iff n(α) = 0, and α is surjective iff r(α) = dim V = n
...
So the result follows immediately
...
Let A ∈ Mn,n (F) = Mn (F) be a square matrix
...

(ii) There exists C ∈ Mn (F) such that AC = In
...
We call A invertible or non-singular, and write
A−1 = B = C
...
Let α, β, γ, ι : Fn → Fn be the linear maps represented by matrices
A, B, C, In respectively with respect to the standard basis
...

This is true iff α is injective, which is true iff α is an isomorphism, which is true
iff α has an inverse α−1
...

This is true iff α is injective, which is true iff α is isomorphism, which is true iff
α has an inverse α−1
...


2
...
Given a basis {ei } for U , and a basis
{fi } for V , we can obtain a matrix A
...
These will then give rise to two different maps from Fm to our
space U , and the two basis can be related by a change-of-basis map P
...
If we perform a change of basis for both U and V ,
we can stitch the diagrams together as
ιU

U
(ui )

α

U

V

(ei )
P

Fm

ιV

(fi )
A

Fm

V
(vi )

Fn

Q

Fn

B

Then if we want a matrix representing the map U → V with respect to bases
(ui ) and (vi ), we can write it as the composition
B = Q−1 AP
...
Suppose that (e1 , · · · , em ) and (u1 , · · · , um ) are basis for a finitedimensional vector space U over F, and (f1 , · · · , fn ) and (v1 , · · · , vn ) are basis
of a finite-dimensional vector space V over F
...
Then
B = Q−1 AP,
where P and Q are given by
ui =

m
X

Pki ek ,

vi =

n
X

Qki fk
...
So both are
invertible
...
On the one hand, we have
α(ui ) =

n
X

Bji vj =

j=1

XX
j

Bji Q`j f` =

`

X
[QB]`i f`
...

k=1

k=1

`

`

Since the f` are linearly independent, we conclude that
QB = AP
...
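A numerical sketch (not in the notes) of the change-of-basis formula, with made-up matrices: the columns of P and Q express the new bases (ui), (vi) in terms of the old ones, and the matrix of α in the new bases is computed both directly and via B = Q^{-1}AP:

import numpy as np

A = np.array([[1., 2., 0.],
              [0., 1., 1.]])                    # α w.r.t. the standard bases of R^3, R^2
P = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [0., 0., 1.]])                    # columns = new basis (u_i) of R^3
Q = np.array([[2., 1.],
              [1., 1.]])                        # columns = new basis (v_i) of R^2

# column i of B = coordinates of α(u_i) in the basis (v_j), found by solving Q x = A u_i
B_direct = np.linalg.solve(Q, A @ P)
B_formula = np.linalg.inv(Q) @ A @ P
print(np.allclose(B_direct, B_formula))         # True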

Definition (Equivalent matrices)
...

25

2

Linear maps

IB Linear Algebra

Since GLK (F) = {A ∈ Matk (F) : A is invertible} is a group, for each k ≥ 1,
this is indeed an equivalence relation
...

Two matrices are equivalent if and only if they represent the same linear map
with respect to different basis
...
If A ∈ Matn,m (F), then there
GLm (F), Q ∈ GLn (F) so that

I
Q−1 AP = r
0

exists invertible matrices P ∈

0
0



for some 0 ≤ r ≤ min(m, n)
...
But this tells
us there are min(m, n) + 1 orbits of the action above parametrized by r
...
If A ∈ Matn,m (F), then
– The column rank of A, written r(A), is the dimension of the subspace of
Fn spanned by the columns of A
...
Alternatively, it is the column rank of AT
...
However,
it turns out they are always equal
...
e
...
Moreover,
since the rank of a map is independent of the basis, equivalent matrices have
the same column rank
...
If A ∈ Matn,m (F), then r(A) = r(AT ), i
...
the row rank is equivalent
to the column rank
...
We know that there are some invertible P, Q such that


    Q^{-1}AP = [[Ir, 0], [0, 0]],
where r = r(A)
...
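A one-line numerical check (not in the notes) that the row rank and column rank agree, on a random integer matrix:

import numpy as np
rng = np.random.default_rng(1)
A = rng.integers(-3, 4, size=(4, 6)).astype(float)
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T))   # always equal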



2.5 Elementary matrix operations
...
This will give a concrete way to find P and Q, but is less elegant
...

Definition (Elementary matrices)
...




...

Sij = 


...


...



1



...





1
λ




n

...

Eij (λ) = 





1





...

1
This is called a shear, where λ appears at the i, jth entry
...




...




...

Observe that if A is a m × n matrix, then
n
(i) ASij
is obtained from A by swapping the i and j columns
...


(iii) ATin (λ) is obtained from A by rescaling the ith column by λ
...

Proposition
...

We are going to start with A, and then apply these operations to get it into
this form
...
We claim that there are elementary matrices E1m , · · · , Eam and F1n , · · · , Fbn
(these E are not necessarily the shears, but any elementary matrix) such that


I 0
E1m · · · Eam AF1n · · · Fbn = r
0 0
This suffices since the Eim ∈ GLM (F) and Fjn ∈ GLn (F)
...

If A = 0, then done
...
By swapping
row 1 and row i; and then column 1 and column j, we can assume A11 6= 0
...

Now we can add −A1j times column 1 to column j for each j 6= 1, and then
add −Ai1 times row 1 to row i 6= 1
...



...
So by induction on the size of A, we can reduce B to
a matrix of the required form, so done
...
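The procedure in this proof can be carried out mechanically. The following Python/sympy sketch (not from the notes; the input matrix is made up) reduces a matrix over Q to the block form [[Ir, 0], [0, 0]] by elementary row and column operations and returns r, which equals the rank:

from sympy import Matrix

def reduce_to_normal_form(A):
    A = Matrix(A)
    n, m = A.shape
    r = 0
    while True:
        # find a non-zero entry in the lower-right block and move it to (r, r)
        pivot = next(((i, j) for i in range(r, n) for j in range(r, m)
                      if A[i, j] != 0), None)
        if pivot is None:
            break
        i, j = pivot
        A.row_swap(r, i)
        A.col_swap(r, j)
        A[r, :] = A[r, :] / A[r, r]                  # make the pivot 1
        for i2 in range(n):                          # clear the rest of column r
            if i2 != r and A[i2, r] != 0:
                A[i2, :] = A[i2, :] - A[i2, r] * A[r, :]
        for j2 in range(m):                          # clear the rest of row r
            if j2 != r and A[r, j2] != 0:
                A[:, j2] = A[:, j2] - A[r, j2] * A[:, r]
        r += 1
    return A, r

N, r = reduce_to_normal_form([[1, 2, 3], [2, 4, 6], [1, 0, 1]])
print(r)    # 2 (the rank)
print(N)    # Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 0]])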


3 Duality

Duality is a principle we will find throughout mathematics
...
Here we will
look for the dual of vector spaces
...

At first, the definition of the dual might see a bit arbitrary and weird
...
Despite their usefulness, though, they can be
confusing to work with at times, since the dual space of a vector space V will be
constructed by considering linear maps on V , and when we work with maps on
dual spaces, things explode
...
3.1 Dual space

To specify a subspace of Fn , we can write down linearequations
that its elements

1
satisfy
...

However, characterizing a space in terms of equations involves picking some
particular equations out of the many possibilities
...
Hence the solution is to consider all possible such
equations
...

n
We can interpret these equations
 interms of linear maps F → F
...

x3
This works well with the vector space operations
...

So the set of all maps Fn → F that vanishes on U forms a vector space
...

Definition (Dual space)
...
The dual of V is
defined as
V ∗ = L(V, F) = {θ : V → F : θ linear}
...

By convention, we use Roman letters for elements in V , and Greek letters
for elements in V ∗
...

 
x1
– If V = R3 and θ : V → R that sends x2  7→ x1 − x3 , then θ ∈ V ∗
...
Then for any fixed x, θ : V → F defined by f 7→ f (x) is in
V ∗
...
Then f 7→ 0 f (t) dt ∈ V ∗
...

It turns out it is rather easy to specify how the dual space looks like, at least
in the case where V is finite dimensional
...
If V is a finite-dimensional vector space over F with basis (e1 , · · · , en ),
then there is a basis (ε1 , · · · , εn ) for V ∗ (called the dual basis to (e1 , · · · , en ))
such that
εi (ej ) = δij
...
Since linear maps are characterized by their values on a basis, there exists
unique choices for ε1 , · · · , εn ∈ V ∗
...

Suppose θ ∈ V ∗
...
We have θ = ∑_{i=1}^n λi εi if and only if θ(ej) = ∑_{i=1}^n λi εi(ej) for all j, if and only if λj = θ(ej)
...
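A small numerical sketch (not in the notes): if the columns of an invertible matrix E are a basis (e1, ..., en) of F^n, then the rows of E^{-1} are the coordinate vectors of the dual basis (ε1, ..., εn), since row i of E^{-1} times column j of E is δij. The basis below is made up:

import numpy as np

E = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])             # columns e_1, e_2, e_3 (a basis of R^3)
Eps = np.linalg.inv(E)                   # row i = coordinate vector of ε_i
print(np.allclose(Eps @ E, np.eye(3)))   # True, i.e. ε_i(e_j) = δ_ij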

Corollary
...

When V is not finite dimensional, this need not be true
...
In fact for any infinite
dimensional vector space, dim V ∗ is strictly larger than dim V , if we manage to
define dimensions for infinite-dimensional vector spaces
...
Consider the vector space Fn , where we treat each element as a column

vector (with respect to the standard basis)
...
We have
...
This is exactly what we want
...
Let V be a finite-dimensional vector space over F with bases
(e1 , · · · , en ) and (f1 , · · · , fn ), and that P is the change of basis matrix so that
fi =

n
X

Pki ek
...

Then the change of basis matrix from (ε1 , · · · , εn ) to (η1 , · · · , ηn ) is (P −1 )T , i
...

εi =

n
X

P`iT η`
...
For convenience, write Q = P −1 so that
ej =

n
X

Qkj fk
...

So εi =

Pn

`=1

P`iT η`
...

Definition (Annihilator)
...
Then the annihilator of U is
U 0 = {θ ∈ V ∗ : θ(u) = 0, ∀u ∈ U }
...

One might object that W 0 should be a subset of V ∗∗ and not V
...

Example
...
If U = he1 + 2e2 + e3 i and W = hε1 − ε3 , 2ε1 − ε2 i, then U 0 = W
and W 0 = U
...
This is typical
...
Let V be a vector space over F and U a subspace
...

31

3

Duality

IB Linear Algebra

We are going to prove this in many ways
...
Let (e1 , · · · , ek ) be a basis for U and extend to (e1 , · · · , en ) a basis for
V
...
Then we will show that
U 0 = hεk+1 , · · · , εn i
...
This is easy to prove — if j > k, then εj (ei ) = 0
for all i ≤ k
...
On the other hand, suppose θ ∈ U 0
...

j=1

But then 0 = θ(ei ) = λi for i ≤ k
...

Proof
...
This is
obviously linear
...
Moreover, the kernel is U 0
...

Since dim V ∗ = dim V and dim U ∗ = dim U , we’re done
...
We can show that U 0 ' (V /U )∗ , and then deduce the result
...


3
...

If we have a map α : V → W , then after dualizing, the map will go the other
direction, i
...
α∗ : W ∗ → V ∗
...

Definition (Dual map)
...
The dual map to α, written α∗ : W ∗ → V ∗ is given by θ 7→ θ ◦ α
...
So this is a genuine
map
...
Let α ∈ L(V, W ) be a linear map
...

This is not the same as what we remarked at the end of the definition of
the dual map
...

What we want to show here is that α∗ itself as a map W ∗ → V ∗ is linear
...
Let λ, µ ∈ F and θ1 , θ2 ∈ W ∗
...

To show this, we show that for every v ∈ V , the left and right give the same
result
...

So α∗ ∈ L(W ∗ , V ∗ )
...

Proposition
...
Let (e1 , · · · , en ) be a basis for V and (f1 , · · · , fm ) be a basis
for W ; (ε1 , · · · , εn ) and (η1 , · · · , ηm ) the corresponding dual bases
...

Then α∗ is represented by AT with respect to the corresponding dual bases
...
We are given that
m
X

α(ei ) =

Aki fk
...
To do so, we evaluate it at ej
...

k=1

k=1

We can also write this as
α∗ (ηi )(ej ) =

n
X

Aik εk (ej )
...


k=1

So done
...

So we have (βα)∗ = α∗ β ∗
...

Similarly, if α, β : U → V , then (λα + µβ)∗ = λα∗ + µβ ∗
...
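Since α* is represented by the transpose of the matrix of α (as shown above), the identity (βα)* = α*β* is, at the level of matrices, just (BA)^T = A^T B^T; a quick numpy check (not in the notes) with random matrices:

import numpy as np
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))            # a matrix for α : U -> V
B = rng.standard_normal((2, 3))            # a matrix for β : V -> W
print(np.allclose((B @ A).T, A.T @ B.T))   # True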

So in the dual space, we conjugate by the dual of the change-of-basis matrices
...
The following lemma gives us some good tools to do so:
Lemma
...

Then
(i) ker α∗ = (im α)0
...

33

3

Duality

IB Linear Algebra

(iii) im α∗ = (ker α)0
...
However, (i) is almost trivial to
prove, but (iii) is rather hard
...

(i) If θ ∈ W ∗ , then
θ ∈ ker α∗ ⇔ α∗ (θ) = 0
⇔ (∀v ∈ V ) θα(v) = 0
⇔ (∀w ∈ im α) θ(w) = 0
⇔ θ ∈ (im α)0
...

Using (i), we see
n(α∗ ) = dim(im α)0
...

By the rank-nullity theorem, we have r(α) = r(α∗ )
...
We can only show that one
includes the other
...

Let θ ∈ im α∗
...
If v ∈ ker α, then
θ(v) = φ(α(v)) = φ(0) = 0
...

But we know
dim(ker α)0 + dim ker α = dim V,
So we have
dim(ker α)0 = dim V − n(α) = r(α) = r(α∗ ) = dim im α∗
...

Not only do we want to get from V to V ∗ , we want to get back from V ∗ to V
...
We already know that V ∗∗ is isomorphic
to V , since V ∗ is isomorphic to V already
...
To define such an isomorphism, we needed to pick
a basis for V and consider a dual basis
...
There is no natural, canonical, uniquely-defined
isomorphism between V and V ∗
...
The construction of this isomorphism is obvious once we think hard
what V ∗∗ actually means
...

34

3

Duality

IB Linear Algebra

Our isomorphism has to produce something in V ∗∗ given any v ∈ V
...

This is easy, by definition θ ∈ V ∗ is just a linear map V → F
...
We now just have to show that this is linear and is
bijective
...
Let V be a vector space over F
...

We call this the evaluation map
...
It is in some sense a “natural” map
...
We first show that ev(v) ∈ V ∗∗ for all v ∈ V , i
...
ev(v) is linear for any
v
...

So done
...
Let λ, µ ∈ F, v1 , v2 ∈ V
...

To show these are equal, pick θ ∈ V ∗
...

So done
...

Lemma
...

This is very false for infinite dimensional spaces
...

Proof
...
Suppose ev(v) = 0 for some v ∈ V
...
So dimhvi0 = dim V ∗ = dim V
...
So v = 0
...
Since V and V ∗∗ have the same
dimension, this is also surjective
...

From now on, we will just pretend that V and V ∗∗ are the same thing, at
least when V is finite dimensional
...
This says there is a
completely canonical way to choose the isomorphism
...

So we can think of V as a subspace of V ∗∗ in a canonical way
...
Let V, W be finite-dimensional vector spaces over F after identifying
(V and V ∗∗ ) and (W and W ∗∗ ) by the evaluation map
...

(ii) If α ∈ L(V, W ), then α∗∗ = α
...

(i) Let u ∈ U
...
So u annihilates
everything in U 0
...
So U ⊆ U 00
...

So we must have U = U 00
...
The only work we have to do is to show that the dual of
the dual basis is the original basis
...
We know
that
ei (εj ) = δij = εj (ei ), fi (ηj ) = δij = ηj (fi )
...

If α is represented by A, then α∗ is represented by AT
...
So done
...
Let V be a finite-dimensional vector space F and U1 , U2 are
subspaces of V
...

(i) Suppose θ ∈ V ∗
...

(ii) We have
(U1 ∩ U2 )0 = ((U10 )0 ∩ (U20 )0 )0 = (U10 + U20 )00 = U10 + U20
...


4 Bilinear forms I

So far, we have been looking at linear things only
...

For a change, we look at bi linear maps instead
...
It turns out there isn’t much we can say about them,
and hence this chapter is rather short
...

Definition (Bilinear form)
...
Then a function
φ : V × W → F is a bilinear form if it is linear in each variable, i
...
for each
v ∈ V , φ(v, · ) : W → F is linear; for each w ∈ W , φ( · , w) : V → F is linear
...
The map defined by
V ×V∗ →F
(v, θ) 7→ θ(v) = ev(v)(θ)
is a bilinear form
...
Let V = W = Fn
...


Example
...

Example
...
Then
φ : Fm × Fn → F
(v, w) 7→ vT Aw
is bilinear
...
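A small numerical sketch (not in the notes) of this example, checking linearity of φ(v, w) = v^T A w in the first argument for a random A:

import numpy as np
rng = np.random.default_rng(4)

A = rng.standard_normal((3, 2))
phi = lambda v, w: v @ A @ w

v1, v2 = rng.standard_normal(3), rng.standard_normal(3)
w = rng.standard_normal(2)
lam, mu = 2.0, -1.5
print(np.allclose(phi(lam * v1 + mu * v2, w),
                  lam * phi(v1, w) + mu * phi(v2, w)))   # True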

In fact, this is the most general form of bilinear forms on finite-dimensional
vector spaces
...
Let (e1 , · · · , en ) be a basis for
V and (f1 , · · · , fm ) be a basis for W , and ψ : V × W → F
...

P
λi ei and w = µj fj , then by linearity, we get
X

ψ(v, w) = ψ
λi ei , w
X
=
λi ψ(ei , w)
i

=

X

=

X

 X

λi ψ ei ,
µj fj

i

λi µj ψ(ei , fj )

i,j

= λT Aµ
...

We have identified linear maps with matrices, and we have identified bilinear
maps with matrices
...

They are, obviously, two different things
...

Proposition
...

Let ψ : V × W → F be a bilinear form represented by A with respect to
(e1 , · · · , en ) and (f1 , · · · , fm ), and by B with respect to the bases (v1 , · · · , vn )
and (w1 , · · · , wm )
...

The difference with the transformation laws of matrices is this time we are
taking transposes, not inverses
...
We have
Bij = φ(vi , wj )
X

X

Pki ek ,
Q`j f`
X
=
Pki Q`j φ(ek , f` )
X
T
=
Pik
Ak` Q`j
k,`

= (P T AQ)ij
...

If we are given a bilinear form ψ : V × W → F, we immediately get two linear
maps:
ψL : V → W ∗ , ψR : W → V ∗ ,
defined by ψL (v) = ψ(v, · ) and ψR (w) = ψ( · , w)
...
On the other hand, ψR : V ∗ → V ∗ is the identity
map
...
Let (ε1 , · · · , εn ) be a basis for V ∗ dual to (e1 , · · · , en ) of V ; (η1 , · · · , ηn )
be a basis for W ∗ dual to (f1 , · · · , fn ) of W
...


38

4

Bilinear forms I

IB Linear Algebra

Proof
...


So we get
ψL (ei ) =

X

AT`i η`
...

We also have
ψR (fj )(ei ) = Aij
...


Definition (Left and right kernel)
...

Then by definition, v is in the left kernel if ψ(v, w) = 0 for all w ∈ W
...

Similarly, if U ⊆ W , then we write


U = {v ∈ V : ψ(v, u) = 0 for all u ∈ U }
...

If we have a non-trivial left (or right) kernel, then in some sense, some
elements in V (or W ) are “useless”, and we don’t like these
...
ψ is non-degenerate if the left and
right kernels are both trivial
...

Definition (Rank of bilinear form)
...
This is well-defined since r(P T AQ) = r(A) if P and Q are
invertible
...

Lemma
...

Let ψ : V × W → F be a bilinear form represented by A with respect to these
bases
...
In
particular, V and W have the same dimension
...

Proof
...
So we need r(A) = dim V
and r(AT ) = dim W
...
e
...
So done
...
The map
F2 × F2 → F
   
a
b
,
7→ ad − bc
c
d
is a bilinear form
...
We have ψ(v, w) = −ψ(w, v) for all v, w ∈ F2
...
Here we are going to give a
slightly more abstract definition, and spend quite a lot of time trying motivate
this definition
...
It is proved in IA Groups that this
is well-defined
...
Let A ∈ Matn,n (F)
...
Then the determinant of A is
    det A = ∑_{σ∈Sn} ε(σ) ∏_{i=1}^n A_{iσ(i)}
This is a big scary definition
...
We
will eventually prove a formula that is useful for computing the determinant,
which is probably how you were first exposed to the determinant
...
If n = 2, then S2 = {id, (1 2)}
...

When n = 3, then S3 has 6 elements, and
det A = A11 A22 A33 + A12 A23 A31 + A13 A21 A32
− A11 A23 A32 − A22 A31 A13 − A33 A12 A21
...
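The definition can be evaluated directly, if inefficiently, by summing over all permutations. A Python sketch (not in the notes; the test matrix is made up), compared against numpy's determinant:

from itertools import permutations
import numpy as np

def sign(perm):
    # ε(σ) computed by counting inversions
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def det_leibniz(A):
    n = len(A)
    return sum(sign(s) * np.prod([A[i, s[i]] for i in range(n)])
               for s in permutations(range(n)))

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
print(det_leibniz(A), np.linalg.det(A))   # both are 8.0 (up to rounding)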

Lemma
...

Proof
...



Lemma
...
e
...


...


...


...


...


i=1

Proof
...
So
n
Y

Aiσ(i) = 0

i=1

if there is some i ∈ {1, · · · , n} such that i > σ(i)
...
So
the only thing that contributes in the sum is σ = id
...


i=1

To motivate this definition, we need a notion of volume
...
For example,
saying the volume is “1” is meaningless unless we provide the units, e
...
cm3
...

Definition (Volume form)
...
e
...

(ii) Alternating, i
...
if vi = vj for some i 6= j, then
d(v1 , · · · , vn ) = 0
...

We can view A ∈ Matn (F) as n-many vectors in Fn by considering its columns
A = (A(1) A(2) · · · A(n) ), with A(i) ∈ Fn
...
det A is a volume form
...
To see that det is multilinear, it is sufficient to show that each
n
Y

Aiσ(i)

i=1

is multilinear for all σ ∈ Sn , since linear combinations of multilinear forms are
multilinear
...

To show it is alternating, suppose now there are some k, ` distinct such that
A(k) = A(`)
...
By Lagrange’s theorem, we
can write
Sn = An q τ An ,
where An = ker ε and q is the disjoint union
...
So we get
n
X Y
σ∈An i=1

n
X Y

Aiσ(i) =

Aiσ0 (i) ,

σ 0 ∈τ An i=1

But we know that
det A = LHS − RHS = 0
...

We have shown that determinants are volume forms, but is this the only
volume form? Well obviously not, since 2 det A is also a valid volume form
...

Before we show that, we need the following
Lemma
...
Then swapping two entries changes
the sign, i
...

d(v1 , · · · , vi , · · · , vj , · · · , vn ) = −d(v1 , · · · , vj , · · · , vi , · · · , vn )
...
By linearity, we have
0 = d(v1 , · · · , vi + vj , · · · , vi + vj , · · · , vn )
= d(v1 , · · · , vi , · · · , vi , · · · , vn )
+ d(v1 , · · · , vi , · · · , vj , · · · , vn )
+ d(v1 , · · · , vj , · · · , vi , · · · , vn )
+ d(v1 , · · · , vj , · · · , vj , · · · , vn )
= d(v1 , · · · , vi , · · · , vj , · · · , vn )
+ d(v1 , · · · , vj , · · · , vi , · · · , vn )
...


43

5

Determinants of matrices

IB Linear Algebra

Corollary
...

Theorem
...
Then
d(A(1) , · · · , A(n) ) = (det A)d(e1 , · · · , en ),
where {e1 , · · · , en } is the standard basis
...
We can compute
d(A

(1)

(n)

,··· ,A

n
X

)=d

!
(2)

Ai1 ei , A

(n)

,··· ,A

i=1

=
=

n
X

Ai1 d(ei , A(2) , · · · , A(n) )

i=1
n
X

Ai1 Aj2 d(ei , ej , A(3) , · · · , A(n) )

i,j=1

=

X

d(ei1 , · · · , ein )

i1 ,··· ,in

n
Y

Ai j j
...
So we are just summing over distinct tuples, i
...
when there is some σ
such that ij = σ(j)
...


j=1

σ∈Sn

However, by our corollary up there, this is just
d(A(1) , · · · , A(n) ) =

X

ε(σ)d(e1 , · · · , en )

n
Y

Aσ(j)j = (det A)d(e1 , · · · , en )
...

We can rewrite the formula as
d(Ae1 , · · · , Aen ) = (det A)d(e1 , · · · , en )
...

So we know that det A is the volume rescaling factor of an arbitrary parallelopiped,
and this is true for any volume form d
...
Let A, B ∈ Matn (F)
...

44

5

Determinants of matrices

IB Linear Algebra

Proof
...
g
...
Then
we can compute
d(ABe1 , · · · , ABen ) = (det AB)d(e1 , · · · , en ),
but we also have
d(ABe1 , · · · , ABen ) = (det A)d(Be1 , · · · , Ben ) = (det A)(det B)d(e1 , · · · , en )
...

Corollary
...
In fact, when A is
invertible, then det(A−1 ) = (det A)−1
...
We have
1 = det I = det(AA−1 ) = det A det A−1
...

Definition (Singular matrices)
...
Otherwise,
it is non-singular
...
Is the converse
true? If det A 6= 0, then can we conclude that A is invertible? The answer is
yes
...
We will later
prove this fact again by constructing an explicit formula for the inverse, which
involves dividing by the determinant
...

Theorem
...
Then the following are equivalent:
(i) A is invertible
...

(iii) r(A) = n
...
We have proved that (i) ⇒ (ii) above, and the rank-nullity theorem implies
(iii) ⇒ (i)
...
In fact we will show the contrapositive
...
By rank-nullity theorem, n(A) > 0
...
 such that Ax = 0
...
We define B as follows:
λn

1





B=







...


...

1

λk−1
λk
λk+1

...

λn

1

...













1

So AB has the kth column identically zero
...
So it is sufficient
to prove that det(B) 6= 0
...
So done
...
To do so, we introduce the
following notation:
Notation
...

Lemma
...
Then
(i) We can expand det A along the jth column by
det A =

n
X

(−1)i+j Aij det Aˆij
...


j=1

We could prove this directly from the definition, but that is messy and scary,
so let’s use volume forms instead
...
Since det A = det AT , (i) and (ii) are equivalent
...
We have
det A = d(A(1) , · · · , A(n) ),
where d is the volume form induced by the determinant
...
We can move our columns around so that our matrix becomes


Aˆij 0
B=
stuff 1
We get that det B = det Aˆij , since the only permutations that give a non-zero
sum are those that send n to n
...
So we have
det A =

n
X

Aij (−1)n−j (−1)n−i det B

i=1

=

n
X

Aij (−1)i+j det Aˆij
...

46

5

Determinants of matrices

IB Linear Algebra

Definition (Adjugate matrix)
...
The adjugate matrix of A,
written adj A, is the n × n matrix such that (adj A)ij = (−1)i+j det Aˆji
...
If A ∈ Matn (F), then A(adj A) = (det A)In = (adj A)A
...

det A
Note that this is not an efficient way to compute the inverse
...
We compute
[(adj A)A]jk =

n
n
X
X
(adj A)ji Aik =
(−1)i+j det Aˆij Aik
...

Otherwise, if j 6= k, consider the matrix B obtained from A by replacing the
jth column by the kth column
...
But we know that if two columns are the same, the determinant is
zero
...
So
[(adj A)A]jk = det Aδjk
The calculation for [A adj A] = (det A)In can be done in a similar manner, or by
considering (A adj A)T = (adj A)T AT = (adj(AT ))AT = (det A)In
...
So if A is invertible, then its inverse is
given by a rational function (i
...
ratio of two polynomials) in the entries of A
...
There are better ways computationally, such as Gaussian
elimination
...
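A sketch (not in the notes) computing the adjugate from its cofactor definition with sympy, and checking A(adj A) = (det A)In on a made-up matrix:

from sympy import Matrix, eye

def adjugate(A):
    # (adj A)_{ij} = (-1)^{i+j} det of A with row j and column i deleted
    n = A.rows
    return Matrix(n, n, lambda i, j: (-1) ** (i + j) * A.minor_submatrix(j, i).det())

A = Matrix([[1, 2, 3],
            [0, 1, 4],
            [5, 6, 0]])
print(A * adjugate(A) == A.det() * eye(3))   # True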

Lemma
...
Then for any C, we have


A C
det
= (det A)(det B)
...
Suppose A ∈ Matk (F), and B ∈ Mat` (F), so C ∈ Matk,` (F)
...

0 B
Then by definition, we have
det X =

X

ε(σ)

σ∈Sk+`

k+`
Y

Xiσ(i)
...
We only want to sum over permutations σ such
that σ(i) > k if i > k
...
So we can decompose this into
47

5

Determinants of matrices

IB Linear Algebra

σ = σ1 σ2 , where σ1 is a permutation of {1, · · · , k} and fixes the remaining things,
while σ2 fixes {1, · · · , k}, and permutes the remaining
...



A1

stuff

n
 Y

det Ai
=
 i=1

A2



det 



...


0

An

48


Bjσ2 (j) 

6 Endomorphisms

Endomorphisms are linear maps from a vector space V to itself
...
Then we proved that by choosing the right
bases, we can put matrices into a nice form with only 1’s in the diagonal
...
One major objective is to classify all matrices up to similarity, where two
matrices are similar if they represent the same endomorphism under different
bases
...
6.1 Invariants

Definition
...
An endomorphism
of V is a linear map α : V → V
...

When we think about matrices representing an endomorphism of V , we’ll
use the same basis for the domain and the range
...

Lemma
...

If A represents α with respect to (e1 , · · · , en ) and B represents α with respect
to (f1 , · · · , fn ), then
B = P −1 AP,
where P is given by
fi =

n
X

Pji ej
...
This is merely a special case of an earlier more general result for arbitrary
maps and spaces
...
We say matrices A and B are similar or conjugate
if there is some P invertible such that B = P −1 AP
...
GLn (F) acts on
Matn (F) by conjugation:
(P, A) 7→ P AP −1
...
Then A and B are similar iff they
are in the same orbit
...

Our main goal is to classify the orbits, i
...
find a “nice” representative for
each orbit
...
For example, we will show that the rank, trace, determinant and
characteristic polynomial are all such invariants
...
The trace of a matrix of A ∈ Matn (F) is defined by
tr A =

n
X

Aii
...
In fact, we will show a
stronger statement (as well as the corresponding statement for determinants):
Lemma
...

(ii) If A, B ∈ Matn (F) are similar, then tr A = tr B
...

Proof
...


j=1 i=1

(ii) Suppose B = P −1 AP
...

(iii) We have
det(P −1 AP ) = det P −1 det A det P = (det P )−1 det A det P = det A
...
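A quick numerical check (not in the notes) that conjugate matrices share trace and determinant, with random matrices:

import numpy as np
rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3))
P = rng.standard_normal((3, 3))
B = np.linalg.inv(P) @ A @ P
print(np.allclose(np.trace(A), np.trace(B)),
      np.allclose(np.linalg.det(A), np.linalg.det(B)))   # True True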

Definition (Trace and determinant of endomorphism)
...
Then the trace of α is tr α = tr A,
and the determinant is det α = det A
...
We
can also define the determinant without reference to a basis, by defining more
general volume forms and define the determinant as a scaling factor
...

To talk about the characteristic polynomial, we need to know what eigenvalues
are
...
Let α ∈ End(V )
...

v is an eigenvector if α(v) = λv for some λ ∈ F
...
e
...

where ι is the identity function
...
The characteristic polynomial of α is
defined by
χα (t) = det(tι − α)
...
These two
definitions are obviously equivalent up to a factor of −1, but this definition
has an advantage that χα (t) is always monic, i
...
the leading coefficient is 1
...

We know that λ is an eigenvalue of α iff n(α − λι) > 0 iff r(α − λι) < dim V
iff χα (λ) = det(λι − α) = 0
...

If A ∈ Matn (F), we can define χA (t) = det(tI − A)
...
If A and B are similar, then they have the same characteristic polynomial
...

det(tI − P −1 AP ) = det(P −1 (tI − A)P ) = det(tI − A)
...
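A small sympy check (not in the notes) that similar matrices have the same characteristic polynomial; the matrices are made up:

from sympy import Matrix, symbols

t = symbols('t')
A = Matrix([[2, 1], [0, 3]])
P = Matrix([[1, 1], [1, 2]])                    # invertible
B = P.inv() * A * P

print(A.charpoly(t).as_expr())                  # t**2 - 5*t + 6
print(B.charpoly(t).as_expr())                  # the same polynomial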
Let α ∈ End(V ) and λ1 , · · · , λk distinct eigenvalues of α
...

Proof
...
We want to show that they are equal
...
Consider βj ∈ End(V ) defined
by
Y
βj =
(α − λr ι)
...


6

Endomorphisms

IB Linear Algebra

Each summand is zero, unless i 6= j
...

i=1

r6=j

Similarly, we obtain
βj

k
X
i=1

Since we know that

P

!
yi

=

Y

(λj − λr )(yj )
...

r6=j

r6=j

Q
Since we know that r6=j (λ
Pr − λj ) 6= 0, we must have xi = yi for all i
...

So each expression for
The proof shows that any set of non-zero eigenvectors with distinct eigenvalues
is linearly independent
...
We say α ∈ End(V ) is diagonalizable if there is
some basis for V such that α is represented by a diagonal matrix, i
...
all terms
not on the diagonal are zero
...

Theorem
...
Write
Ei for E(λi )
...

(ii) V has a basis of eigenvectors for α
...

Pk
(iv) dim V = i=1 dim Ei
...

– (i) ⇔ (ii): Suppose (e1 , · · · , en ) is a basis for V
...
Then A is diagonal iff each ei is an eigenvector
...
So done
...


52

6

Endomorphisms

6
...
2
...

Definition (Polynomial)
...

We write F[t] for the set of polynomials over F
...
For example, if F = Z/pZ, then tp and t are different polynomials,
even though they define the same function (by Fermat’s little theorem/Lagrange’s
theorem)
...

However, we will later see that if F is R or C, then polynomials are equal
if and only if they represent the same function, and this distinction is not as
important
...
Let f ∈ F[t]
...
In particular, deg 0 = −∞
...

6 0), then there exists
Lemma (Polynomial division)
...

Proof is omitted
...
If λ ∈ F is a root of f , i
...
f (λ) = 0, then there is some g such that
f (t) = (t − λ)g(t)
...
By polynomial division, let
f (t) = (t − λ)g(t) + r(t)
for some g(t), r(t) ∈ F[t] with deg r < deg(t − λ) = 1
...
e
...
Now evaluate this at λ
...

So a0 = 0
...
So done
...
Let f ∈ F[t] and λ a root of f
...
e
...

53

6

Endomorphisms

IB Linear Algebra

We can use the last lemma and induction to show that any non-zero f ∈ F[t]
can be written as
k
Y
f = g(t) (t − λi )ai ,
i=1

where λ1 , · · · , λk are all distinct, ai > 1, and g is a polynomial with no roots in
F
...
A non-zero polynomial f ∈ F[t] has at most deg f roots, counted with
multiplicity
...
Let f, g ∈ F[t] have degree < n
...

Proof
...
This has degree less than n, and
(f − g)(λi ) = 0 for i = 1, · · · , n
...
So f = g
...
If F is infinite, then f and g are equal if and only if they agree on
all points
...
Every non-constant polynomial over C has a root in C
...

We say C is an algebraically closed field
...
So the number of roots is
ai = deg f ,
counted with multiplicity
...

6
...
2

Minimal polynomial
Pm
Notation
...

Theorem (Diagonalizability theorem)
...
Then α is diagonalizable if and only if there exists non-zero p(t) ∈ F[t] such that p(α) = 0, and
p(t) can be factored as a product of distinct linear factors
...
Suppose α is diagonalizable
...
We have
k
M
V =
E(λi )
...


i=1

Now let
p(t) =

k
Y

(t − λi )
...


i=1

So p(α) = 0
...

Conversely, suppose we have our polynomial
p(t) =

k
Y

(t − λi ),

i=1

with λ1 , · · · , λk ∈ F distinct, and p(α) = 0 (we can wlog assume p is monic, i
...

the leading coefficient is 1)
...


i=1

In other words, we want to show
P that for all v ∈ V , there is some vi ∈ Eα (λi )
for i = 1, · · · , k such that v =
vi
...

λj − λi
i6=j

This is a polynomial of degree k − 1, and qj (λi ) = δij
...

i=1

We still have deg q ≤ k − 1, but q(λi ) = 1 for any i
...

Let πj : V → V be given by πj = qj (α)
...


j=1

P
Hence given v ∈ V , we know that v = πj v
...
This is true since
k
Y
1
1
(α − λι )(v) = Q
p(α)(v) = 0
...

So done
...
So πi is a projection
onto the Eα (λi )
...
The minimal polynomial of α ∈ End(V ) is
the non-zero monic polynomial Mα (t) of least degree such that Mα (α) = 0
...

Note that if A represents α, then for all p ∈ F[t], p(A) represents p(α)
...
So the minimal polynomial of α is the minimal
polynomial of A if we define MA analogously
...

Existence is always guaranteed in finite-dimensional cases
...
So ι, α, α2 , · · · , αn are linearly dependent
...


i=0

So deg Mα ≤ n2
...

To show that the minimal polynomial is unique, we will prove the following
stronger characterization of the minimal polynomial:
Lemma
...
Then p(α) = 0 if and only if Mα (t) is
a factor of p(t)
...

Proof
...
Then
p(α) = q(α)Mα (α) + r(α)
...
But deg r < deg Mα
...
So p(α) = 0 iff Mα (t) | p(t)
...
So M2 is just a scalar multiple of M1
...

Example
...

Consider the polynomial p(t) = (t − 1)²
...
So
MA (t) and MB (t) are factors of (t − 1)2
...

So the minimal polynomials are either (t − 1) or (t − 1)2
...
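A sympy sketch of the kind of computation used here (not from the notes; the matrices below are assumptions, with A taken to be the 2×2 identity and B the matrix with a single 1 above the diagonal, both of which are killed by (t − 1)²):

from sympy import Matrix, eye, zeros

A = Matrix([[1, 0], [0, 1]])
B = Matrix([[1, 1], [0, 1]])
I = eye(2)

print((A - I) == zeros(2, 2))          # True:  M_A(t) = t - 1
print((B - I) == zeros(2, 2))          # False: so M_B(t) must be (t - 1)^2
print((B - I) ** 2 == zeros(2, 2))     # True, consistent with the above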


We can now re-state our diagonalizability theorem
...
0)
...
Then α is diagonalizable if and only if Mα (t) is a product of its distinct linear factors
...
(⇐) This follows directly from the previous diagonalizability theorem
...
Then there is some p ∈ F[t] non-zero such
that p(α) = 0 and p is a product of distinct linear factors
...

Theorem
...
Then α and β are
simultaneously diagonalizable (i
...
there exists a basis with respect to which
both are diagonal) if and only if αβ = βα
...
This means that if two operators
do not commute, then they do not have a common eigenbasis
...

Proof
...
But AB represents αβ and BA represents βα
...

(⇐) Suppose αβ = βα
...
Since α is diagonalizable,
we can write
k
M
V =
Eα (λi ),
i=1

where λi are the eigenvalues of V
...
We want to show
that β sends Ei to itself, i
...
β(Ei ) ⊆ Ei
...
Then we want β(v) to be
in Ei
...

So β(v) is an eigenvector of α with eigenvalue λi
...
Note that
Mβ (β|Ei ) = Mβ (β)|Ei = 0
...
So we can choose a basis Bi of eigenvectors for β|Ei
...

Then since V is a direct sum of the Ei ’s, we know that B = ⋃_{i=1}^k Bi is a
basis for V consisting of eigenvectors for both α and β
...
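A numerical sketch (not in the notes) of the easy direction: two matrices that are diagonal with respect to a common (made-up) basis commute:

import numpy as np
rng = np.random.default_rng(2)

P = rng.standard_normal((4, 4))                  # columns = a common eigenbasis
A = P @ np.diag([1., 2., 3., 4.]) @ np.linalg.inv(P)
B = P @ np.diag([5., 0., 5., 1.]) @ np.linalg.inv(P)
print(np.allclose(A @ B, B @ A))                 # True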


6.3 The Cayley-Hamilton theorem
...

Recall that χα (t) = det(tι − α) for α ∈ End(V )
...
Let V be a finite-dimensional vector
space and α ∈ End(V )
...
e
...
In particular,
deg Mα ≤ n
...
It is tempting to prove
this by substituting t = α into det(tι − α) and get det(α − α) = 0, but this is
meaningless, since what the statement χα (t) = det(tι − α) tells us to do is to
expand the determinant of the matrix


t − a11
a12
···
a1n
 a21
t − a22 · · ·
a2n 



...


...


...


...

an1

···

an2

t − ann

to obtain a polynomial, and we clearly cannot substitute t = A in this expression
...

Note also that if ρ(t) ∈ F[t] and


λ1



...

λn
then





ρ(λ1 )

...



...
So if α is
diagonalizable, then the theorem is clear
...
Diagonalizable matrices are nice
...

Definition (Triangulable)
...


...


...


...


...

Lemma
...
In particular, if F = C
(or any algebraically closed field), then every endomorphism is triangulable
...
Suppose that α is triangulable and represented by


λ1 ∗ · · · ∗
 0 λ2 · · · ∗ 



...


...


...

0
0 · · · λn

58

6

Endomorphisms

Then

IB Linear Algebra



t − λ1
 0

χα (t) = det 
...

0


t − λ2

...


···
···

...





...


0

···

t − λn


n
 Y

(t − λi )
...

We are going to prove the converse by induction on the dimension of our
space
...

Suppose α ∈ End(V ) and the result holds for all spaces of dimensions
< dim V , and χα is a product of linear factors
...

Now let U = E(λ) 6= 0, and let W be a complementary subspace to U in V ,
i
...
V = U ⊕ W
...
So χB (t) is also a product of linear factors
...

(Note that in general, β is not α|W in general, since α does not necessarily
map W to W
...
This can
be much more elegantly expressed in terms of quotient spaces, but unfortunately
that is not officially part of the course)
Since dim W < dim V , there is a basis vr+1 , · · · , vn for W such that β is
represented by C, which is upper triangular
...
So α is represented by


λIr stuff
0
C
with respect to (u1 , · · · , ur , vr+1 , · · · , vn ), which is upper triangular
...
Consider the real rotation matrix


cos θ
sin θ

...
This is since the eigenvalues are e±iθ and are not real
...

For this reason, in the rest of the section, we are mostly going to work in C
...

59

6

Endomorphisms

IB Linear Algebra

Theorem (Cayley-Hamilton theorem)
...
Then χα (α) = 0, i
...
Mα (t) | χα (t)
...

Proof
...
By the lemma, we can choose a basis
{e1 , · · · , en } is represented by an upper triangular matrix
...


...


...


...


i=1

Write Vj = he1 , · · · , ej i
...

We also know that dim Vj = j
...

Now note that since A is upper-triangular, we get
α(ei ) =

i
X

Aki ek ∈ Vi
...

Moreover, we have
(α − λj ι)(ej ) =

j−1
X

Akj ek ⊆ Vj−1

k=1

for all j = 1, · · · , n
...
Hence by induction on n − j, we have
n
Y
(α − λi ι)(Vn ) ⊆ Vj−1
...


i=1

So χα (α) = 0 as required
...


60

6

Endomorphisms

IB Linear Algebra

We can see this proof more “visually” as follows: for simplicity of expression,
we suppose n = 4
...
This is exactly what we showed in the proof — after multiplying out the first k
elements of the product (counting from the right), the image is contained in the
span of the first n − k basis vectors
...
We’ll now prove the theorem again, which is somewhat a formalization of the
“nonsense proof” where we just substitute t = α into det(tι − α).

Proof. Let A be a matrix representing α, and let B = tIn − A, viewed as a
matrix with entries in F[t]. Then

    B adj B = det B In = χα (t)In .

The entries of adj B are (up to sign) determinants of (n − 1) × (n − 1)
submatrices of B, hence polynomials in t of degree at most n − 1. So we can write

    adj B = Bn−1 tn−1 + Bn−2 tn−2 + · · · + B0 ,

with Bi ∈ Matn (F). Then we get the result

    (tIn − A)(Bn−1 tn−1 + Bn−2 tn−2 + · · · + B0 ) = (tn + an−1 tn−1 + · · · + a0 )In ,

where χα (t) = tn + an−1 tn−1 + · · · + a0 . We would like to substitute t = A
directly, but this is not justified, since on the left t is a scalar that commutes
with everything, while A need not commute with the matrices Bi .

However, what we can do is to note that since this is true for all values of t,
the coefficients on both sides must be equal. Expanding the left-hand side and
comparing coefficients of tn , tn−1 , · · · , t, 1 in turn, we get

    Bn−1 = In
    Bn−2 − ABn−1 = an−1 In
       ⋮
    B0 − AB1 = a1 In
    −AB0 = a0 In .

Now multiply the equation coming from the coefficient of tj on the left by Aj
and add all of them up. The left-hand side telescopes to 0, while the right-hand
side becomes An + an−1 An−1 + · · · + a1 A + a0 In = χα (A). Summing this up
then gives χα (A) = 0.

It would be nice if we could just substitute t = α directly. In fact, we can do
this, after we develop sufficient machinery.

Lemma. Let V be a finite-dimensional vector space over F, α ∈ End(V ) and
λ ∈ F. Then the following are equivalent:

(i) λ is an eigenvalue of α.

(ii) λ is a root of χα (t).

(iii) λ is a root of Mα (t).

Proof.

– (i) ⇔ (ii): λ is an eigenvalue of α if and only if (α − λι)(v) = 0 has a
  non-trivial solution, iff det(α − λι) = 0, iff χα (λ) = 0.

– (i) ⇒ (iii): Let λ be an eigenvalue, and v be a corresponding eigenvector.
  Then Mα (α)(v) = 0. We also know that

      Mα (α)(v) = Mα (λ)v.

  Since v ≠ 0, we must have Mα (λ) = 0.

– (iii) ⇒ (i): This is not necessary since it follows from the above, but
  we could as well do it explicitly. Suppose λ is a root of Mα (t). Then
  Mα (t) = (t − λ)g(t) for some g ∈ F[t]. Since deg g < deg Mα , by
  minimality of Mα , we must have g(α) ≠ 0. So there is some v ∈ V such
  that g(α)(v) ≠ 0. Then

      (α − λι)g(α)(v) = Mα (α)v = 0.

  So g(α)(v) ∈ Eα (λ) \ {0}. So λ is an eigenvalue.

Example. In our earlier example, we had χA (t) = (t − 1)2 (t − 2). So we know
that the minimal polynomial is one of (t − 1)2 (t − 2) and (t − 1)(t − 2).
Checking directly, (A − I)(A − 2I) = 0. So we know that MA (t) = (t − 1)(t − 2).


6.4   Multiplicities of eigenvalues and Jordan normal form

The following properties will help determine which Jordan normal form a matrix
can have.

Definition (Multiplicities of an eigenvalue). Let α ∈ End(V ) and λ an
eigenvalue of α. Then

(i) The algebraic multiplicity of λ, written aλ , is the multiplicity of λ as a
    root of χα (t).

(ii) The geometric multiplicity of λ, written gλ , is the dimension of the
     corresponding eigenspace, dim Eα (λ).

(iii) cλ is the multiplicity of λ as a root of the minimal polynomial Mα (t).

We will look at some extreme examples:

Example.

– Let α be represented by the single Jordan block

      [ λ  1          ]
      [    λ   ⋱      ]
      [        ⋱   1  ]
      [            λ  ].

  We will later show that aλ = n = cλ and gλ = 1.

– If α = λι, then aλ = gλ = n, cλ = 1.

Lemma.
If λ is an eigenvalue of α, then

(i) 1 ≤ gλ ≤ aλ

(ii) 1 ≤ cλ ≤ aλ .

Proof.

(i) The first inequality is easy. If λ is an eigenvalue, then E(λ) ≠ 0. So
    gλ = dim E(λ) ≥ 1. For the second, let g = gλ and pick a basis v1 , · · · , vg
    of E(λ), extended to a basis of V . With respect to this basis α is represented
    by a matrix of the form

        [ λIg   ∗ ]
        [  0    B ],

    so χα (t) = (t − λ)g χB (t). So aλ ≥ g = gλ .

(ii) Since Mα (λ) = 0, we have cλ ≥ 1, and since Mα (t) | χα (t), we get cλ ≤ aλ .

Lemma. Suppose F = C and α ∈ End(V ). Then the following are equivalent:

(i) α is diagonalizable.

(ii) gλ = aλ for all eigenvalues λ.

(iii) cλ = 1 for all λ.

Proof.

– (i) ⇔ (ii): α is diagonalizable iff dim V = ∑ dim Eα (λi ) = ∑ gλi . But
  gλi ≤ aλi for each i, and ∑ aλi = deg χα = dim V . So we must have
  ∑ gλi = ∑ aλi , which happens if and only if gλi = aλi for all i.

– (i) ⇔ (iii): α is diagonalizable if and only if Mα (t) is a product of distinct
  linear factors, if and only if cλ = 1 for all eigenvalues λ.

Definition (Jordan normal form).
We say A ∈ Matn (C) is in Jordan normal form if it is a block diagonal matrix

    [ Jn1 (λ1 )                          ]
    [           Jn2 (λ2 )                ]
    [                      ⋱             ]
    [                         Jnk (λk )  ]

where k ≥ 1, n1 , · · · , nk ∈ N are such that n = ∑ ni , the λ1 , · · · , λk are not
necessarily distinct, and

    Jm (λ) = [ λ  1          ]
             [    λ   ⋱      ]
             [        ⋱   1  ]
             [            λ  ]

is an m × m matrix.

For example, it might look something like

    [ λ1  1                            ]
    [     λ1  1                        ]
    [         λ1                       ]
    [             λ2                   ]
    [                 λ3  1            ]
    [                     λ3           ]
    [                          ⋱       ]

with all other entries zero.

Then we have the following theorem:
Theorem (Jordan normal form theorem). Every matrix A ∈ Matn (C) is similar
to a matrix in Jordan normal form. Moreover, this Jordan normal form matrix
is unique up to permutation of the blocks.
We will not prove this completely
...
The
remainder of the proof is left for IB Groups, Rings and Modules
...
If α ∈ End(V ) is an endomorphism of a finite-dimensional vector space V over C, then the theorem says
there exists a basis such that α is represented by a matrix in Jordan normal
form, and this is unique as before
...

Example
...
We see that MA determines the Jordan normal form of
A, but χA does not
...
Here λ1 , λ2
and λ3 are distinct complex numbers
...
We do indeed need χA in the second case, since if we are given
MA = (t − λ1 )(t − λ2 ), we know one of the roots is double, but not which one
...

We now want to understand the Jordan normal blocks better. Consider the block

    Jn (λ) = [ λ  1          ]
             [    λ   ⋱      ]
             [        ⋱   1  ]
             [            λ  ].

If (e1 , · · · , en ) is the standard basis for Cn , we have

    Jn (0)(e1 ) = 0,     Jn (0)(ei ) = ei−1 for 2 ≤ i ≤ n.

Thus we know

    Jn (0)k (ei ) = 0 for i ≤ k,     Jn (0)k (ei ) = ei−k for k < i ≤ n.

So for k < n, the nullity of (Jn (λ) − λI)k = Jn (0)k is exactly k, while if k ≥ n,
then we have (Jn (λ) − λI)k = 0, with nullity n.

Note that if A = Jn (λ), then χA (t) = MA (t) = (t − λ)n . So aλ = cλ = n. We
also have n(A − λI) = 1. So gλ = 1.
We have just studied individual Jordan blocks. However, we can also look at
the whole matrix. If A is the block diagonal matrix

    A = [ A1                    ]
        [        A2             ]
        [             ⋱         ]
        [                  Ak   ],

then

    χA (t) = ∏ki=1 χAi (t).

Moreover, if ρ ∈ F[t], then

    ρ(A) = [ ρ(A1 )                          ]
           [          ρ(A2 )                 ]
           [                    ⋱            ]
           [                        ρ(Ak )   ].

Hence

    MA (t) = lcm(MA1 (t), · · · , MAk (t))

and

    n(ρ(A)) = ∑ki=1 n(ρ(Ai )).

Thus if A is in Jordan normal form, we get the following:

– gλ is the number of Jordan blocks in A with eigenvalue λ.

– aλ is the sum of the sizes of the Jordan blocks of A with eigenvalue λ.

– cλ is the size of the largest Jordan block with eigenvalue λ.

Theorem. Let V be a finite-dimensional vector space over C, α ∈ End(V ) and
λ an eigenvalue of α, and suppose α is represented by a matrix A in Jordan
normal form. Then the number of Jordan blocks Jn (λ) in A with n ≥ r is

    n((α − λι)r ) − n((α − λι)r−1 ).

Also, given this information, we can figure out how many Jordan blocks of size
exactly n there are by doing the right subtraction.


Proof. We work blockwise with

    A = [ Jn1 (λ1 )                          ]
        [           Jn2 (λ2 )                ]
        [                      ⋱             ]
        [                         Jnk (λk )  ].

We have previously computed

    n((Jm (λ) − λIm )r ) = r for r ≤ m,     n((Jm (λ) − λIm )r ) = m for r ≥ m.

It is also easy to see that for µ ≠ λ,

    n((Jm (µ) − λIm )r ) = n(Jm (µ − λ)r ) = 0.

Adding up for each block, for r ≥ 1, we have

    n((α − λι)r ) − n((α − λι)r−1 ) = number of Jordan blocks Jn (λ) with n ≥ r.

Intuitively, each time we increase the power by one, we kill off one more column
in each block Jn (λ) that has not yet been reduced to zero, and the nullity
increases by one for each such block. So when we look at the difference in
nullity, we are counting the number of blocks that are affected by the increase
in power, which is the number of blocks of size at least r.
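The theorem gives a practical recipe: the nullities of (α − λι)r determine how
many blocks of each size occur. Here is a minimal numpy sketch of that recipe
(the helper jordan_block_counts, the tolerance and the 5 × 5 example are
illustrative assumptions, not from the notes):

    import numpy as np

    def jordan_block_counts(A, lam, tol=1e-8):
        # number of Jordan blocks J_n(lam) of each size n, from the nullities
        # of (A - lam*I)^r as in the theorem above
        n = A.shape[0]
        N = A - lam * np.eye(n)
        nullity = [0]                                  # nullity of N^0 = I is 0
        for r in range(1, n + 1):
            rank = np.linalg.matrix_rank(np.linalg.matrix_power(N, r), tol=tol)
            nullity.append(n - rank)
        at_least = [nullity[r] - nullity[r - 1] for r in range(1, n + 1)]
        # blocks of size exactly r: the "right subtraction"
        exactly = [at_least[r] - (at_least[r + 1] if r + 1 < n else 0) for r in range(n)]
        return {r + 1: c for r, c in enumerate(exactly) if c > 0}

    # Hypothetical example: two J_2(3) blocks and one J_1(3) block, hidden by a
    # random change of basis.
    J = np.array([[3., 1., 0., 0., 0.],
                  [0., 3., 0., 0., 0.],
                  [0., 0., 3., 1., 0.],
                  [0., 0., 0., 3., 0.],
                  [0., 0., 0., 0., 3.]])
    P = np.random.default_rng(0).normal(size=(5, 5))
    A = P @ J @ np.linalg.inv(P)
    print(jordan_block_counts(A, 3.0))                 # {1: 1, 2: 2}

The recipe recovers exactly the block sizes we started from, even though A itself
is no longer in Jordan normal form.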
...
To show
this, we will reduce it to the case where there is exactly one eigenvalue
...
In
general, we need to work with “generalized eigenspaces”
...
Theorem (Generalized eigenspace decomposition). Let V be a finite-dimensional
vector space over C and α ∈ End(V ). Write

    Mα (t) = ∏ki=1 (t − λi )ci ,

with λ1 , · · · , λk distinct. Then

    V = V1 ⊕ · · · ⊕ Vk ,

where Vi = ker((α − λi ι)ci ) is the generalized eigenspace.

Note that if c1 = · · · = ck = 1, then we recover the diagonalizability theorem.

Proof. We will again prove this by constructing projection maps to each of
the Vi . Let

    pj (t) = ∏i≠j (t − λi )ci .

Then p1 , · · · , pk have no common factor, i.e. their greatest common divisor is 1.
Thus by Euclid’s algorithm, there exist q1 , · · · , qk ∈ C[t] such that

    ∑ pi qi = 1.

We now define πj = (qj pj )(α) ∈ End(V ), so that ∑ πj = ι. Since Mα (t) =
(t − λj )cj pj (t) and Mα (α) = 0, we get (α − λj ι)cj πj = 0. So im πj ⊆ Vj . Now
for any v ∈ V , we have

    v = ι(v) = ∑kj=1 πj (v) ∈ ∑ Vj .

So V = V1 + · · · + Vk .

To show this is a direct sum, note that πi πj = 0 for i ≠ j, since the product
pi pj contains every factor of Mα , so πi πj contains Mα (α) as a factor. Hence

    πi = πi ι = πi ∑j πj = πi2 .

So each πi is a projection, and πj |Vj = ιVj : indeed, for v ∈ Vj and i ≠ j we have
πi (v) = qi (α)pi (α)(v) = 0, since pi contains the factor (t − λj )cj , and so
πj (v) = ∑i πi (v) = ι(v) = v. Hence if v = ∑ vj with vj ∈ Vj , applying πj shows
that vj = πj (v) is uniquely determined. So there is a unique way of writing v as
a sum of things in Vi , and V = V1 ⊕ · · · ⊕ Vk .

Note that we didn’t really use the fact that the vector space is over C, except
to get that the minimal polynomial is a product of linear factors
...
The converse is also
true — if it can be put into Jordan normal form, then the minimal polynomial
is a product of linear factors, since we’ve seen that a necessary and sufficient
condition for the minimal polynomial to be a product of linear factors is for
there to be a basis in which the matrix is upper triangular
...
Further by replacing α by α − λι, we can reduce this to the case where
0 is the only eigenvalue
...
Definition (Nilpotent). We say α ∈ End(V ) is nilpotent if there is some r such
that αr = 0.

Over C, α is nilpotent if and only if 0 is the only eigenvalue of α. This is
since α is nilpotent if and only if the minimal polynomial is tr for some r.
This is the point where we stop
...
There is in fact an elementary proof of it; we’re not doing it, not because it’s
hard, but because we don’t have time.
Example. Let

    A = [ 3  −2  0 ]
        [ 1   0  0 ]
        [ 1   0  1 ].

We know we can find the Jordan normal form by just computing the minimal
polynomial and characteristic polynomial. We first compute the eigenvalues of A:

    χA (t) = det [ t − 3    2     0   ]
                 [  −1      t     0   ]
                 [  −1      0   t − 1 ]
            = (t − 1)2 (t − 2).

We now compute the eigenspaces of A. We have

    A − I = [ 2  −2  0 ]          A − 2I = [ 1  −2   0 ]
            [ 1  −1  0 ]                   [ 1  −2   0 ]
            [ 1   0  0 ],                  [ 1   0  −1 ].

Both have rank 2, and

    EA (1) = ⟨(0, 0, 1)T ⟩,     EA (2) = ⟨(2, 1, 2)T ⟩.

So g1 = 1 < 2 = a1 and A is not diagonalizable. So the minimal polynomial
must also be MA (t) = χA (t) = (t − 1)2 (t − 2), and the Jordan normal form is

    [ 1  1  0 ]
    [ 0  1  0 ]
    [ 0  0  2 ].

We want a basis (v1 , v2 , v3 ) of C3 such that

    Av1 = v1 ,     Av2 = v1 + v2 ,     Av3 = 2v3 .

Equivalently, we have

    (A − I)v1 = 0,     (A − I)v2 = v1 ,     (A − 2I)v3 = 0.

There is an obvious choice of v3 , namely the eigenvector of eigenvalue 2. For
v2 , we need a vector in ker(A − I)2 that is not in ker(A − I); then we can let
v1 = (A − I)v2 . We have

    (A − I)2 = [ 2  −2  0 ]
               [ 1  −1  0 ]
               [ 2  −2  0 ].

The kernel of this is

    ker(A − I)2 = ⟨(0, 0, 1)T , (1, 1, 0)T ⟩.

So we can take

    v2 = (1, 1, 0)T ,     v1 = (A − I)v2 = (0, 0, 1)T ,     v3 = (2, 1, 2)T ,

and with respect to the basis (v1 , v2 , v3 ), A is in Jordan normal form.
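As a sanity check, we can verify numerically that this basis does put A into
Jordan normal form (a minimal numpy check using the vectors computed above):

    import numpy as np

    A = np.array([[3., -2., 0.],
                  [1.,  0., 0.],
                  [1.,  0., 1.]])
    P = np.column_stack([[0., 0., 1.],    # v1
                         [1., 1., 0.],    # v2
                         [2., 1., 2.]])   # v3

    print(np.round(np.linalg.inv(P) @ A @ P, 10))
    # [[1. 1. 0.]
    #  [0. 1. 0.]
    #  [0. 0. 2.]]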

2

7

Bilinear forms II

7

IB Linear Algebra

Bilinear forms II

In Chapter 4, we have looked at bilinear forms in general
...
We are also not looking into general bilinear forms on a single
space, but just those that are symmetric
...
7.1   Symmetric bilinear forms and quadratic forms

Definition (Symmetric bilinear form)
...
A bilinear
form φ : V × V → F is symmetric if
φ(v, w) = φ(w, v)
for all v, w ∈ V
...
If S ∈ Matn (F) is a symmetric matrix, i
...
S T = S, the bilinear form
φ : Fn × Fn → F defined by
T

φ(x, y) = x Sx =

n
X

xi Sij yj

i,j=1

is a symmetric bilinear form
...
Let V be a finite-dimensional vector space over F, and φ : V × V → F
is a symmetric bilinear form
...
e
...
Then
φ is symmetric if and only if M is symmetric
...
If φ is symmetric, then
Mij = φ(ei , ej ) = φ(ej , ei ) = Mji
...
So M is symmetric
...

We are going to see what happens when we change basis
...

Lemma
...
Let (e1 , · · · , en ) and (f1 , · · · , fn ) be bases of V such that
fi =

n
X

Pki ek
...

Proof
...

This motivates the following definition:
Definition (Congruent matrices)
...

It is easy to see that congruence is an equivalence relation
...

Thus, to classify (symmetric) bilinear forms is the same as classifying (symmetric) matrices up to congruence
...

Definition (Quadratic form)
...

Note that quadratic forms are not linear maps (they are quadratic)
...
Let V = R2 and φ be represented by A with respect to the standard
basis
...

y
A21 A22
y
Notice that if A is replaced the symmetric matrix
1
(A + AT ),
2
then we get a different φ, but the same q
...

Proposition (Polarization identity)
...
e
...
g
...
If q : V → F is a quadratic form, then there exists a
unique symmetric bilinear form φ : V × V → F such that
q(v) = φ(v, v)
...
Let ψ : V × V → F be a bilinear form such that ψ(v, v) = q(v)
...
This is clearly a bilinear form, and it is also clearly symmetric
and satisfies the condition we wants
...

73

7

Bilinear forms II

IB Linear Algebra

To prove uniqueness, we want to find out the values of φ(v, w) in terms of
what q tells us
...
We compute
q(v + w) = φ(v + w, v + w)
= φ(v, v) + φ(v, w) + φ(w, v) + φ(w, w)
= q(v) + 2φ(v, w) + q(w)
...

2
So it is determined by q, and hence unique
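A small numerical illustration of the polarization identity (the matrix A and the
vectors below are arbitrary choices): starting from q(x) = xT Ax with a
non-symmetric A, the identity recovers the symmetric bilinear form represented
by ½(A + AT ).

    import numpy as np

    # Hypothetical non-symmetric matrix defining q(x) = x^T A x on R^2.
    A = np.array([[1., 4.],
                  [0., 3.]])
    q = lambda x: x @ A @ x

    def phi(v, w):
        # the symmetric bilinear form recovered from q by polarization
        return 0.5 * (q(v + w) - q(v) - q(w))

    S = 0.5 * (A + A.T)           # symmetric matrix representing phi
    v, w = np.array([1., 2.]), np.array([-3., 5.])
    print(phi(v, w), v @ S @ w)   # both print 25.0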
...
Let V be a finite-dimensional vector space over F, and φ : V ×V → F
a symmetric bilinear form
...

This tells us classifying symmetric bilinear forms is easier than classifying
endomorphisms, since for endomorphisms, even over C, we cannot always make
it diagonal, but we can for bilinear forms over arbitrary fields
...
We induct over n = dim V
...

Suppose we have proven the result for all spaces of dimension less than n
...
We want to show that
we must have φ = 0
...
Since we know that the zero bilinear
form also induces the zero quadratic form, we must have φ = 0
...

If not, pick e1 ∈ V such that φ(e1 , e1 ) 6= 0
...

Since φ(e1 , · ) ∈ V ∗ \ {0}, we know that dim U = n − 1 by the rank-nullity
theorem
...
For this to happen, we need to find them inside U
...
By the
induction hypothesis, we can find a basis e2 , · · · , en for U such that φ|U ×U is
represented by a diagonal matrix with respect to this basis
...
So we’re done
...
Let q be a quadratic form on R3 given by
 
x
q y  = x2 + y 2 + z 2 + 2xy + 4yz + 6xz
...

There are two ways to do this
...

We first find our symmetric bilinear form
...

3 2 1
We then find f1 such that φ(f1 , f1 ) 6= 0
...
So we pick
 
1
f1 = e1 = 0
...

1
v3

Next we need to pick our f2
...

To continue our proof inductively, we also have to pick an f2 such that
φ(f2 , f2 ) 6= 0
...

−1
Then we have q(f2 ) = −8
...
Then
 
5
f3 = −8
1
works
...

With these basis elements, we have
q(af1 + bf2 + cf3 ) = φ(af1 + bf2 + cf3 , af1 + bf2 + cf3 )
= a2 q(f1 ) + b2 q(f2 ) + c2 q(f3 )
= a2 − 8b2 + 8c2
...
We have
x2 + y 2 + z 2 + 2xy + 4yz + 6xz = (x + y + 3z)2 − 2yz − 8z 2

y 2 1 2
= (x + y + 3z)2 − 8 z +
+ y
...

φ y  , y 0  = (x + y + 3z)(x0 + y 0 + 3z 0 ) − 8 z +
8
8
8
z
z0
Why do we know this? This is clearly a symmetric bilinear form, and this also
clearly induces the q given above
...

We now use this form to find our f1 , f2 , f3 such that φ(fi , fj ) = δij
...

This gives our first vector as
 
1
f1 = 0
...

So we have

 
−3
f2 =  0 
...

This gives



−5/8
f3 =  1 
...

Then we can see that the result follows, and we get
1
q(af1 + bf2 + cf3 ) = a2 − 8b2 + c2
...
We can re-scale our
basis by any constant, and get an equivalent expression
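We can check the example numerically. The following numpy sketch uses the
matrix A of φ from above and one valid diagonalizing basis read off from the
completed-square expression (the exact basis vectors are my reading of that
expression, so treat them as an assumption):

    import numpy as np

    # Symmetric matrix of the form phi with
    # q(x, y, z) = x^2 + y^2 + z^2 + 2xy + 4yz + 6xz.
    A = np.array([[1., 1., 3.],
                  [1., 1., 2.],
                  [3., 2., 1.]])

    # One valid diagonalizing basis, consistent with
    # q = (x + y + 3z)^2 - 8(z + y/8)^2 + (1/8) y^2.
    P = np.column_stack([[1., 0., 0.],           # f1
                         [-3., 0., 1.],          # f2
                         [-5/8, 1., -1/8]])      # f3

    print(np.round(P.T @ A @ P, 10))             # diag(1, -8, 0.125)

The product P T AP comes out diagonal, as the theorem says it must, and
re-scaling the columns of P only re-scales the diagonal entries.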
...
Let φ be a symmetric bilinear form over a complex vector space V
...

Proof
...
By reordering the ei , we can assume that
λ1 , · · · , λr 6= 0 and λr+1 , · · · , λn = 0
...
For r + 1 ≤ r ≤ n,
we let µi = 1 (or anything non-zero)
...

µi

Then
1
φ(vi , vj ) =
φ(ei , ej ) =
µi µj

(
0
1

i 6= j or i = j > r
i = j < r
...

Note that it follows that for the corresponding quadratic form q, we have
!
n
r
X
X
q
ai vi =
a2i
...
Every symmetric A ∈ Matn (C) is congruent to a unique matrix of
the form


Ir 0

...
Before that, we do
the equivalent result for real vector spaces
...
Let φ be a symmetric bilinear
space over R
...
Equivalently, the corresponding quadratic forms is
given by
!
p
p+q
n
X
X
X
q
ai vi =
a2i −
a2j
...

Proof
...
By reordering, we may assume


λi > 0 1 ≤ i ≤ p
λi < 0 p + 1 ≤ i ≤ r


λi = 0 i > r
We let µi be defined by
√

 √λ i
µi =
−λi


1

1≤i≤p
p+1≤i≤r
i>r

Defining
vi =

1
ei ,
µi

we find that φ is indeed represented by

Ip

−Iq


,
0

We will later show that this form is indeed unique
...

Definition (Positive/negative (semi-)definite)
...
We say
(i) φ is positive definite if φ(v, v) > 0 for all v ∈ V \ {0}
...

(iii) φ is negative definite if φ(v, v) < 0 for all v ∈ V \ {0}
...

We are going to use these notions to prove uniqueness
...


78

7

Bilinear forms II

IB Linear Algebra

Example
...

0 0n−p
Then φ is positive semi-definite
...

If instead φ is represented by


−Ip
0
,
0
0n−p
then φ is negative semi-definite
...

We are going to use this to prove the uniqueness part of our previous theorem
...
Let φ be a symmetric bilinear form on a
finite-dimensional real vector space V
...

Proof
...
To do so, we characterize p and q in a basis-independent way
...
So it suffices to
show p is unique
...

First we show we can find such at P
...
Then φ restricted to he1 , · · · , ep i is represented by
Ip with respect to e1 , · · · , ep
...

Now suppose P is any subspace of V such that φ|P ×P is positive definite
...

Let Q = hep+1 , · · · , en i
...

0
0
Now if v ∈ P ∩ Q \ {0}, then φ(v, v) > 0 since v ∈ P \ {0} and φ(v, v) ≤ 0 since
v ∈ Q, which is a contradiction
...

We have
dim V ≥ dim(P + Q) = dim P + dim Q = dim P + (n − p)
...

A similar argument shows that q is the maximal dimension of a subspace Q ⊆ V
such that φ|Q×Q is negative definite
...
The signature of a bilinear form φ is the number p − q,
where p and q are as above
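Numerically, p and q (and hence the signature) of a real symmetric matrix can
be read off from its eigenvalues, using the spectral theorem proved later in these
notes. A minimal sketch for the matrix of the earlier example:

    import numpy as np

    A = np.array([[1., 1., 3.],
                  [1., 1., 2.],
                  [3., 2., 1.]])
    eigs = np.linalg.eigvalsh(A)
    p, q = int(np.sum(eigs > 0)), int(np.sum(eigs < 0))
    print(p, q, p - q)    # 2 1 1, matching the diagonal form diag(1, -8, 1/8)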
...

Corollary
...
2

is congruent to precisely one matrix

0
0
...
However, if φ is a bilinear form
on a C-vector space V , then φ(iv, iv) = −φ(v, v)
...
To make them work
for complex vector spaces, we need to modify the definition slightly to obtain
Hermitian forms
...
Let V, W be complex vector spaces
...

1 , w) + µ
(ii) φ(v, λw1 + µw2 ) = λφ(v, w1 ) + µφ(v, w2 )
...

Note that some people have an opposite definition, where we have linearity
in the first argument and conjugate linearity in the second
...

Alternatively, to define a sesquilinear form, we can define a new complex
vector space V¯ structure on V by taking the same abelian group (i
...
the same
underlying set and addition), but with the scalar multiplication C × V¯ → V¯
defined as
¯
(λ, v) 7→ λv
...
Alternatively,
this is a linear map W → V¯ ∗
...
Let V, W be finite-dimensional
complex vector spaces with basis (v1 , · · · , vn ) and (w1 , · · · , wm ) respectively,
and φ : V × W → C be a sesquilinear form
...

for 1 ≤ i ≤ n, 1 ≤ j ≤ m
...
This follows
from
¯ × W → C
...
Then we have
X
φ(v, w) =
λi µj φ(vi , wj ) = λ† Aµ
...
We cannot
just require φ(v, w) = φ(w, v), since φ is linear in the second variable and
conjugate linear on the first variable
...

Definition (Hermitian sesquilinear form)
...

Note that if φ is Hermitian, then φ(v, v) = φ(v, v) ∈ R for any v ∈ V
...
Moreover, for any complex
number λ, we have
φ(λv, λv) = |λ|2 φ(v, v)
...
So it makes sense to talk
about positive (semi-)definite and negative (semi-)definite Hermitian forms
...

Lemma
...
Then φ is Hermitian if and
only if the matrix A representing φ is Hermitian (i
...
A = A† )
...
If φ is Hermitian, then
Aij = φ(ei , ej ) = φ(ej , ei ) = A†ij
...


So done
...
Let φ be a Hermitian form on a finite dimensional vector space V ; (e1 , · · · , en ) and (v1 , · · · , vn ) are bases for V such
that
n
X
vi =
Pki ek ;
k=1

and A, B represent φ with respect to (e1 ,
...

Then
B = P † AP
...
We have
Bij = φ(vi , vj )
X

X

Pki ek ,
P`j e`
=

n
X

P¯ki P`j Ak`

k,`=1

= (P † AP )ij
...
A Hermitian form φ on V is determined
by the function ψ : v 7→ φ(v, v)
...

Proof
...

4

Theorem (Hermitian form of Sylvester’s law of inertia)
...
Then there
exists unique non-negative integers p and q such that φ is represented by


Ip
0
0
 0 −Iq 0
0
0
0
with respect to some basis
...
Same as for symmetric forms over R
...
Technically, an
inner product is just a special case of a positive-definite symmetric bilinear or
hermitian form
...

Many familiar notions such as orthogonality only make sense when we have an
inner product
...


8
...
Let V be a vector space
...
We usually write
(x, y) instead of φ(x, y)
...

We will see that if we have an inner product, then we can define lengths and
distances in a sensible way
...

(i) Rn or Cn with the usual inner product
(x, y) =

n
X

x
¯i yi

i=1

forms an inner product space
...
However, we would not like to think so, and
instead work with general inner products
...
Then the following is an inner product:
1

Z

f¯(t)g(t) dt
...


(f, g) =
0

If V is an inner product space, we can define a norm on V by
p
kvk = (v, v)
...
This gives the notion of
length in inner product spaces
...


83

8

Inner product spaces

IB Linear Algebra

Note also that the norm k · k determines the inner product by the polarization
identity
...
To prove this, we need to prove the Cauchy-Schwarz
inequality
...
Let V be an inner product space and
v, w ∈ V
...

Proof
...
Otherwise, since the norm is positive
definite, for any λ, we get
¯
0 ≤ (v − λw, v − λw) = (v, v) − λ(w,
v) − λ(v, w) + |λ|2 (w, w)
...
We let
λ=

(w, v)

...

(w, w)
(w, w)
(w, w)

So we get
|(w, v)|2 ≤ (v, v)(w, w)
...

With this, we can prove the triangle inequality
...
Let V be an inner product space and v, w ∈ V
...

Proof
...

So done
...
This generalizes the notion
of being “perpendicular”
...
Let V be an inner product space
...

Definition (Orthonormal set)
...
A set {vi :
i ∈ I} is an orthonormal set if for any i, j ∈ I, we have
(vi , vj ) = δij
84

8

Inner product spaces

IB Linear Algebra

It should be clear that an orthonormal set must be linearly independent
...
Let V be an inner product space
...

In an inner product space, we almost always want orthonormal basis only
...

However, we do not know there is always an orthonormal basis, even in the
finite-dimensional case
...
This is what we will do later
...
Suppose V is a finite-dimensional
inner product space with an orthonormal basis v1 , · · · , vn
...


i=1

So v ∈ V can always be written as
n
X
(vi , v)vi
...
Let V be a finite-dimensional inner product space
with an orthonormal basis v1 , · · · , vn , and v, w ∈ V
...


i=1

In particular,
kvk2 =

n
X

|(vi , v)|2
...

However, we will only care about finite-dimensional ones now
...



n
n
X
X
(v, w) =  (vi , v)vi ,
(vj , w)vj 
i=1

=

=

=

n
X
i,j=1
n
X

j=1

(vi , v)(vj , w)(vi , vj )
(vi , v)(vj , w)δij

i,j=1
n
X

(vi , v)(vi , w)
...
2

IB Linear Algebra

Gram-Schmidt orthogonalization

As mentioned, we want to make sure every vector space has an orthonormal
basis, and we can extend any orthonormal set to an orthonormal basis, at least
in the case of finite-dimensional vector spaces
...
The way to do this is the Gram-Schmidt process
...
Let V be an inner product space and
e1 , e2 , · · · a linearly independent set
...

Note that we are not requiring the set to be finite
...

Proof
...
The base
case k = 0 is contentless
...
We
define
k
X
uk+1 = ek+1 −
(vi , ei+1 )vi
...
We have
(vj , uk+1 ) = (vj , ek+1 ) −

k
X
(vi , ek+1 )δij = (vj , ek+1 ) − (vj , ek+1 ) = 0
...

We want to argue that uk+1 is non-zero
...
We also
know
hv1 , · · · , vk , ek+1 i = he1 , · · · , ek , ek+1 i
by assumption
...
So we must have uk+1 non-zero, or else hv1 , · · · , vk i will
be a set of size k spanning a space of dimension k + 1, which is clearly nonsense
...

kuk+1 k

Then v1 , · · · , vk+1 is orthonormal and hv1 , · · · , vk+1 i = he1 , · · · , ek+1 i as required
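The proof is an algorithm, and it is short to write down. Here is a minimal
numpy sketch of the Gram-Schmidt process (the input vectors are arbitrary
illustrative choices, and no attention is paid to numerical stability), together
with the orthogonal projection formula π(v) = ∑(vi , v)vi that appears below:

    import numpy as np

    def gram_schmidt(vectors):
        # orthonormalize a linearly independent list, following the proof above;
        # np.vdot(v, e) is the inner product (v, e), conjugate-linear in v
        basis = []
        for e in vectors:
            u = e - sum(np.vdot(v, e) * v for v in basis)   # subtract projections
            basis.append(u / np.linalg.norm(u))             # u != 0 by independence
        return basis

    # Arbitrary linearly independent vectors (an assumption, not from the notes).
    e = [np.array([1., 1., 0.]), np.array([1., 0., 1.]), np.array([0., 1., 1.])]
    v = gram_schmidt(e)
    print(np.round([[np.vdot(a, b) for b in v] for a in v], 10))  # identity matrix

    # Orthogonal projection of x onto W = span(v1, v2):
    x = np.array([2., -1., 3.])
    proj = sum(np.vdot(vi, x) * vi for vi in v[:2])
    print(np.round([np.vdot(vi, x - proj) for vi in v[:2]], 10))  # x - proj is in W⊥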
...
If V is a finite-dimensional inner product space, then any orthonormal set can be extended to an orthonormal basis
...
Let v1 , · · · , vk be an orthonormal set
...

We now apply the Gram-Schmidt process to this basis to get an orthonormal
basis of V , say (u1 , · · · , un )
...
e
...
So done
...
Let V be an inner product space
and V1 , V2 ≤ V
...
More precisely, we require
(i) V = V1 + V2
(ii) V1 ∩ V2 = 0
(iii) (v1 , v2 ) = 0 for all v1 ∈ V1 and v2 ∈ V2
...

We write V = V1 ⊥ V2
...
If W ≤ V is a subspace of an inner
product space V , then the orthogonal complement of W in V is the subspace
W ⊥ = {v ∈ V : (v, w) = 0, ∀w ∈ W }
...
e
...

Proposition
...

Then
V = W ⊥ W ⊥
...
There are three things to prove, and we know (iii) implies (ii)
...
So it remains to prove (i), i
...
V = W + W ⊥
...
Now let
w=

k
X
(wi , v)wi
...
So we need to show v − w ∈ W ⊥
...

Hence for any λi , we have
X


λj wj , v − w = 0
...
So done
...
Let V1 , V2 be inner product spaces
...

Here we write v1 + v2 ∈ V1 ⊕ V2 instead of (v1 , v2 ) to avoid confusion
...

Proposition
...

Let (e1 , · · · , ek ) be an orthonormal basis of W
...
e
...
Then
(i) π is given by the formula
k
X
π(v) =
(ei , v)ei
...
This says π(v) is the point on W
that is closest to v
...

(i) Let v ∈ V , and define
w=

X

(ei , v)ei
...
We need to show v − w ∈ W ⊥
...

i=1

So v − w is orthogonal to every basis vector in w, i
...
v − w ∈ W ⊥
...

88

8

Inner product spaces

IB Linear Algebra

(ii) This is just Pythagoras’ theorem
...
y)
= kxk2 + kyk2
...
For any w ∈ W , we have
kv − wk2 = kv − π(v)k2 + kπ(v) − wk2 ≥ kv − π(v)k2
with equality if and only if kπ(v) − wk = 0, i
...
π(v) = w
...
3

Adjoints, orthogonal and unitary maps

Adjoints
Lemma
...
Then there exists a unique linear map α∗ : W → V such that
(αv, w) = (v, α∗ w)

(∗)

for all v ∈ V , w ∈ W
...
There are two parts
...
We’ll
first prove it concretely using matrices, and then provide a conceptual reason of
what this means
...

Suppose α is represented by A
...


k

So we get
α∗ (wj ) =

X

(vi , α∗ (wj ))vi =

i

X

A¯ji vi
...
So α∗ is unique
...
Now let α∗
be represented by A†
...

We have
 X
 X
 X
¯ i µj (α(vi ), wj )
α
λi vi ,
µj wj =
λ
i,j

!
=

X
i,j

=

X
i,j

89

¯ i µj
λ

X
k

¯ i A¯ji µj
...

λ

i,j

So done
...
Similarly, we have an isomorphism
¯ ∗
...
So α∗ is in some sense the “dual” of
the map α
...
We call the map α∗ the adjoint of α
...

Definition (Self-adjoint)
...

Then α is self-adjoint if α = α∗ , i
...

(α(v), w) = (v, α(w))
for all v, w
...
e
...
If V = Cn with the usual
inner product, then A ∈ Matn (C) is self-adjoint if and only if A is Hermitian,
i
...
A = A†
...
We will later see that these have real eigenvalues with an
orthonormal basis of eigenvectors
...
We
will first do this for real vector spaces, since the real and complex versions have
different names
...
Let V be a real inner product space
...

90

8

Inner product spaces

IB Linear Algebra

By the polarization identity, α is orthogonal if and only if kα(v)k = kvk for
all v ∈ V
...

There is also an alternative way of characterizing these orthogonal maps
...
Let V be a finite-dimensional space and α ∈ End(V )
...

Proof
...
If α−1 = α∗ , then
(αv, αv) = (v, α∗ αv) = (v, α−1 αv) = (v, v)
...

So we know
α∗ α(vj ) =

n
X
(vi , α∗ αvj ))vi = vj
...
So α∗ = α−1
...
α ∈ End(V ) is orthogonal if and only if α is represented by an
orthogonal matrix, i
...
a matrix A such that AT A = AAT = I, with respect to
any orthonormal basis
...
Let (e1 , · · · , en ) be an orthonormal basis for V
...
So α∗ is represented by AT
...

Definition (Orthogonal group)
...
Then the
orthogonal group of V is
O(V ) = {α ∈ End(V ) : α is orthogonal}
...
So this is
indeed a group
...
Let V be a finite-dimensional real inner product space and
(e1 , · · · , en ) is an orthonormal basis of V
...

This is analogous to our result for general vector spaces and general bases,
where we replace O(V ) with GL(V )
...
Same as the case for general vector spaces and general bases
...
The proofs are almost always identical to the real case, and we will not
write the proofs again
...
Let V be a finite-dimensional complex vector space
...

By the polarization identity, α is unitary if and only if kα(v)k = kvk for all
v ∈V
...
Let V be a finite dimensional complex inner product space and α ∈
End(V )
...

Corollary
...
e
...

Definition (Unitary group)
...
Then the unitary group of V is
U (V ) = {α ∈ End(V ) : α is unitary}
...
Let V be a finite-dimensional complex inner product space
...


8
...
Recall that for general
vector spaces, what we effectively did was to find the orbits of the conjugation
action of GL(V ) on Matn (F)
...
In a more human language,
instead of allowing arbitrary basis transformations, we only allow transforming
between orthonormal basis
...

Lemma
...
Then
(i) α has a real eigenvalue, and all eigenvalues of α are real
...

Proof
...


92

8

Inner product spaces

IB Linear Algebra

(i) Suppose first V is a complex inner product space
...
We pick v ∈ V \ {0} such
that αv = λv
...

λ(v,
¯
Since v 6= 0, we know (v, v) 6= 0
...

For the real case, we pretend we are in the complex case
...
Then α is represented by a symmetric matrix
A (with respect to this basis)
...

By the complex case, A has real eigenvalues only
...
So done
...
We
know every irreducible factor of Mα (t) in R[t] must have degree 1 or 2,
since the roots are either real or come in complex conjugate pairs
...
Then



(α) 6= 0
f
since it has degree less than the minimal polynomial
...

f
So it must be that f (α)(v) = 0
...
Then this is an
α-invariant subspace of V since f has degree 2
...
So if (e1 , e2 ) is an orthonormal basis of
U , then α is represented by a real symmetric matrix, say


a b
b a
But then χα|U (t) = (t − a)2 − b2 , which has real roots, namely a ± b
...

(ii) Now suppose αv = λv, αw = µw and λ 6= µ
...

We know
(αv, w) = (v, αw)
by definition
...

Theorem
...
Then V has an orthonormal basis of eigenvectors of α
...
By the previous lemma, α has a real eigenvalue, say λ
...


93

8

Inner product spaces

IB Linear Algebra

Let U = hvi⊥
...

We now want to prove α sends U into U
...
Then
(v, α(u)) = (αv, u) = λ(v, u) = 0
...
So α|U ∈ End(U ) and is self-adjoint
...
Now let
v

...

Corollary
...
Then
V is the orthogonal (internal) direct sum of its α-eigenspaces
...
Let A ∈ Matn (R) be symmetric
...

Proof
...
Then A is self-adjoint
as an endomorphism of Rn
...
Taking P = (v1 v2 · · · vn ) gives the result
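Numerically, this corollary is exactly what np.linalg.eigh computes: for a real
symmetric matrix it returns an orthogonal matrix of eigenvectors. A small
sketch with an arbitrary symmetric matrix:

    import numpy as np

    A = np.array([[2., 1., 0.],
                  [1., 3., 1.],
                  [0., 1., 2.]])

    eigvals, P = np.linalg.eigh(A)     # orthonormal eigenvectors as columns of P
    print(np.round(P.T @ P, 10))       # identity, so P is orthogonal
    print(np.round(P.T @ A @ P, 10))   # diagonal matrix of eigenvalues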
...
Let V be a finite-dimensional real inner product space and ψ :
V × V → R a symmetric bilinear form
...

Proof
...
Then ψ is represented
by a symmetric matrix A
...
Now let vi = Pki uk
...

Also, ψ is represented by P T AP with respect to (v1 , · · · , vn )
...
So the
signature of ψ is just the number of positive eigenvalues of A minus the number
of negative eigenvalues of A
...
Let V be a finite-dimensional real vector space and φ, ψ symmetric
bilinear forms on V such that φ is positive-definite
...

Proof
...
Choose an orthonormal basis for V
(equipped with φ) (v1 , · · · , vn ) with respect to which ψ is diagonal
...
So done
...
If A, B ∈ Matn (R) are symmetric and A is positive definitive (i
...

vT Av > 0 for all v ∈ Rn \ {0})
...

We can deduce similar results for complex finite-dimensional vector spaces,
with the same proofs
...

(i) If A ∈ Matn (C) is Hermitian, then there exists a unitary matrix U ∈
Matn (C) such that
U −1 AU = U † AU
is diagonal
...

(iii) If φ, ψ are Hermitian forms on a finite-dimensional complex vector space
and φ is positive definite, then there exists a basis for which φ and ψ are
diagonalized
...
e
...
Then there exists some invertible Q such that Q† AQ and
Q† BQ are diagonal
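A minimal real sketch of this simultaneous diagonalization, assuming arbitrary
matrices A (positive definite) and B (symmetric): write A = LLT by Cholesky,
diagonalize L−1 BL−T orthogonally, and change basis accordingly.

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 2.]])            # positive definite
    B = np.array([[0., 3.],
                  [3., 1.]])            # symmetric

    L = np.linalg.cholesky(A)                    # A = L L^T
    M = np.linalg.inv(L) @ B @ np.linalg.inv(L).T
    w, U = np.linalg.eigh(M)                     # M symmetric, U orthogonal
    Q = np.linalg.inv(L).T @ U

    print(np.round(Q.T @ A @ Q, 10))             # identity
    print(np.round(Q.T @ B @ Q, 10))             # diag(w)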
...
How about unitary matrices?
Theorem
...
Then V has an orthonormal basis of α eigenvectors
...
By the fundamental theorem of algebra, there exists v ∈ V \ {0} and
λ ∈ C such that αv = λv
...
Then
V = W ⊥ hvi
...
Let w ∈ W
...
We have
(αw, v) = (w, α−1 v) = (w, λ−1 v) = 0
...
Also, α|W is unitary since α is
...
If we add
v/kvk to this basis, we get an orthonormal basis of V itself comprised of α
eigenvectors
...
The key fact
that leads to the existence of an orthonormal basis of eigenvectors is that α and α∗
commute
...
It turns out this is also a sufficient condition, as you
will show in example sheet 4
...
For example,


cos θ
sin θ
∈ O(R2 )
− sin θ cos θ
cannot be diagonalized (if θ 6∈ πZ)
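For contrast, the same rotation matrix, viewed as a unitary complex matrix,
does have an orthonormal basis of eigenvectors, as the theorem above promises.
A small numpy check (θ is an arbitrary choice):

    import numpy as np

    theta = 0.7                      # any angle not a multiple of pi
    R = np.array([[np.cos(theta),  np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])

    # Over R this has no eigenvalues, but as a unitary complex matrix it is
    # diagonalizable with an orthonormal basis of eigenvectors.
    eigvals, U = np.linalg.eig(R.astype(complex))
    U = U / np.linalg.norm(U, axis=0)           # normalize columns (already unit length)
    print(np.round(eigvals, 6))                 # e^{iθ} and e^{-iθ}, in some order
    print(np.round(U.conj().T @ U, 10))         # identity: eigenvectors are orthonormal
    print(np.round(U.conj().T @ R @ U, 6))      # diagonal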
...


96

