Search for notes by fellow students, in your own course and all over the country.

Browse our notes for titles that look like what you need. You can preview any of the notes via a sample of the contents; once you're happy these are the notes you're after, simply pop them into your shopping cart.


Title: Econometrics using Matrix Algebra, a summary of key concepts
Description: Here's a neat little set of summary sheets of all the concepts covered in the ES30027 module, namely Econometrics with Matrix Algebra. Key topics include: Matrix Algebra Review, Probability Distributions, Hypothesis Testing, Heteroskedasticity, Instrumental Variables & Maximum Likelihood Estimation. Covers all the key concepts you need to remember at the last moment, great to review right before your exam!

Document Preview

Extracts from the notes are below; to see the PDF you'll receive, please use the links above.


ES30027 Egg Timer!

Each topic below lists its sub-topics, a summary of key points & formulae, and key takeaways.

Topic: Matrix Algebra Review
Sub-Topic: Vectors & Matrices; Matrix Operations; Transpose

• An $m \times n$ matrix → bold uppercase letters
• 2 matrices are said to be conformable if they have the appropriate dimensions for the given operation
  o Addition/Subtraction: same number of rows & columns
  o Multiplication: no. of columns of the 1st matrix should equal the no. of rows of the 2nd
• Key properties & facts!
  o $A + B = B + A$
  o $(A + B) + C = A + (B + C)$
  o $(AB)C = A(BC) = ABC$
  o $A(B + C) = AB + AC$ and $(B + C)A = BA + CA$
  o $AB$ not necessarily equal to $BA$
  o $AB = 0$ possible even if $A \ne 0$ and $B \ne 0$
  o $CD = CE$ possible even if $C \ne 0$ and $D \ne E$
• Transpose of an $m \times n$ matrix $A$ is the $n \times m$ matrix $B$ such that $b_{ji} = a_{ij}$, $\forall\, i = 1, \dots, m$ and $j = 1, \dots, n$
• Key properties!
  o $(A')' = A$
  o $(A + B)' = A' + B'$
  o $(AB)' = B'A'$
• Square Matrix: $m \times n$ matrix where $m = n$
• Identity Matrix $I_n$: square matrix with $a_{ii} = 1\ \forall\, i = 1, \dots, n$ and $a_{ij} = 0\ \forall\, i \ne j$

Sub-Topic: Some Special Matrices; Rank; Inverse; Trace

• Linear Dependence: A set of m-vectors $x_1, \dots, x_n$ is linearly dependent if there exist numbers $\gamma_1, \dots, \gamma_n$, not all zero, such that $\sum_i \gamma_i x_i = 0$ (can express one vector as a weighted sum of all other vectors)
• When considering linear dependencies of matrices, can look at either the row or the column vectors that make up the matrix, and see if they are linearly dependent


o ๐‘จ ๐Ÿ
o (๐‘จ๐‘ฉ)

๐Ÿ
๐Ÿ

=๐‘จ
= ๐‘ฉ ๐Ÿ๐‘จ

๐Ÿ

๐Ÿ ๐‘ป

๐Ÿ

o ๐‘จ๐‘ป
= ๐‘จ
๏‚ท Properties of Traces (where A,B &
C are nxn matrices and ๐›พ is a
scalar)
o ๐’•๐’“(๐‘จ + ๐‘ฉ) = ๐’•๐’“(๐‘จ) +
๐’•๐’“(๐‘ฉ)
o ๐’•๐’“(๐œธ๐‘จ) = ๐œธ๐’•๐’“(๐‘จ)
o ๐’•๐’“(๐‘จ๐‘ฉ๐‘ช) = ๐’•๐’“(๐‘ฉ๐‘ช๐‘จ) =
๐’•๐’“(๐‘ช๐‘จ๐‘ฉ)
๏‚ท Key results from matrix
differentiation!
๏‚ท

๐’‚๐’ƒ
๐’ƒ

=

๐’ƒ๐’‚
๐’ƒ

=๐’‚

๐‘‹=

1
3

2
โ†’ ๐‘ฅ = [1
4

2] ๐‘œ๐‘Ÿ

1
2
& ๐‘ฅ = [3 4] ๐‘œ๐‘Ÿ
3
4

• Rank (of a set of vectors): The maximum number of linearly independent vectors that can be chosen from the set
  o For any matrix, rank of row vectors = rank of column vectors
  o Singularity: An $m \times m$ matrix $A$ is singular if the rank of $A$ is less than $m$
• Inverse of a 2×2 matrix:
  $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \rightarrow A^{-1} = \dfrac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$
• Inverse of a diagonal matrix:
  $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 2 \end{pmatrix} \rightarrow A^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{4} & 0 \\ 0 & 0 & \frac{1}{2} \end{pmatrix}$
• Trace [tr(·)]: The sum of the elements on the principal diagonal of a square matrix
• Differentiating a vector with respect to a column vector gives an $m \times n$ matrix of partial derivatives
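A quick numpy check of the 2×2 inverse formula and the cyclic trace property above (the matrix values here are arbitrary illustrations):

```python
# Sanity check: the ad - bc inverse formula and tr(ABC) = tr(BCA) = tr(CAB).
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Inverse via the 2x2 formula, compared with np.linalg.inv
a, b, c, d = A.ravel()
A_inv_formula = np.array([[d, -b], [-c, a]]) / (a * d - b * c)
assert np.allclose(A_inv_formula, np.linalg.inv(A))

# Cyclic property of the trace for conformable square matrices
B = np.array([[0.0, 1.0], [1.0, 0.0]])
C = np.array([[2.0, 0.0], [0.0, 3.0]])
print(np.trace(A @ B @ C), np.trace(B @ C @ A), np.trace(C @ A @ B))
```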

• Key results!
  o $\dfrac{\partial Ab}{\partial b} = A'$ and $\dfrac{\partial b'A}{\partial b} = A$
  o $\dfrac{\partial b'Ab}{\partial b} = (A + A')b$
  o If $A$ is symmetric, $\dfrac{\partial b'Ab}{\partial b} = 2Ab$
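A small numerical sanity check of the quadratic-form result $\partial(b'Ab)/\partial b = (A + A')b$, using a central finite difference ($A$ and $b$ are arbitrary random values):

```python
# Finite-difference check of d(b'Ab)/db = (A + A')b.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
b = rng.normal(size=3)

def f(v):
    return v @ A @ v  # the scalar b'Ab

eps = 1e-6
grad_fd = np.array([(f(b + eps * e) - f(b - eps * e)) / (2 * eps)
                    for e in np.eye(3)])
grad_formula = (A + A.T) @ b
print(np.allclose(grad_fd, grad_formula))  # True
```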

Sub-Topic: Ordinary Least Squares with Matrices

• Convert a scalar relationship into a vector-matrix relationship by stacking the rows for the different observations
• Multivariate true linear model: $y = X\beta + u$, with
  $y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$, $X = [\mathbf{1}\ \ x_1\ \ \cdots\ \ x_{k-1}]$, $\beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}$, $u = (u_1\ \ \cdots\ \ u_n)'$
• Multivariate Linear Fitted Model: $\hat{u} \equiv y - X\hat{\beta}$
• Concept of Ordinary Least Squares: $\min \sum_i \hat{u}_i^2 = \min \hat{u}'\hat{u}$, given $\hat{u}'\hat{u} = \sum_i \hat{u}_i^2$
• $\hat{\beta}_{OLS} = (X'X)^{-1}X'y$ (know how to prove this too!)
• $X'X$ should be a positive definite matrix if $\hat{\beta}_{OLS}$ is to work
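A minimal numpy sketch of the OLS formula on simulated data (the true coefficients and error distribution are illustrative assumptions):

```python
# OLS via beta_hat = (X'X)^{-1} X'y on simulated data.
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])        # first column of ones
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=n)       # u ~ N(0, 1)

# Solving X'X beta = X'y is numerically safer than an explicit inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)                              # close to [1, 2]
```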

Topic: Probability Distributions
Sub-Topic: Random Variables; Random Vectors

• A random vector $x$ is a vector whose elements are themselves random variables
• Expected Value: $E(x) = \begin{pmatrix} E(x_1) \\ \vdots \\ E(x_n) \end{pmatrix}$
• Variance: A matrix containing the variances of each element of $x$ along the principal diagonal, and the covariances between different elements of $x$ off the diagonal

• Assumptions on the classical linear model:
  o $X$ is a fixed, non-stochastic matrix with rank $k$
  o $u$ is a random vector with $E(u) = 0$ and $var(u) = E(uu') = \sigma^2 I$
  o $var(u)$ is the error covariance matrix, which gives the covariance between any 2 elements in $u$
  o Assume that all observations have zero covariance (no autocorrelation), i.e. the variance matrix is diagonal
• Linearity: $\hat{\beta} = Cy$, where $C$ denotes a matrix
• Unbiasedness: $E(\hat{\beta}) = \beta = E(Cy)$
• Residuals: $\hat{u} \equiv y - X\hat{\beta} = (I - X(X'X)^{-1}X')u$
• Expected Sum of Squared Residuals: $E(\hat{u}'\hat{u}) = \sigma^2(n - k)$
  $\Longrightarrow \hat{\sigma}^2 = \dfrac{\hat{u}'\hat{u}}{n - k}$
  Since $\hat{\sigma}^2$ is an unbiased estimator, $E(\hat{\sigma}^2) = E\left(\dfrac{\hat{u}'\hat{u}}{n - k}\right) = \sigma^2$
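Continuing the same sort of simulated example, a sketch of the residuals, the unbiased variance estimator $\hat{\sigma}^2 = \hat{u}'\hat{u}/(n-k)$, and the resulting standard errors:

```python
# Residuals, sigma2_hat, and var(beta_hat) = sigma2_hat (X'X)^{-1}.
import numpy as np

rng = np.random.default_rng(2)
n, k = 500, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat                      # residuals
sigma2_hat = (u_hat @ u_hat) / (n - k)        # unbiased estimator
var_beta = sigma2_hat * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(var_beta))               # standard errors
print(sigma2_hat, se)
```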

• Normal Distribution, $N(\mu, \sigma^2)$
  o $\mu$ is the mean, controls the center point; $\sigma^2$ is the variance, controls the spread

• Key relationships between distributions:
  o If $z_i \sim N(0,1)$, $i = 1, \dots, n$ and the $z_i$ are all independent, then $\sum_i z_i^2 \sim \chi^2(n)$
  o If $z \sim N(0,1)$, $y \sim \chi^2(n)$ and $z$ & $y$ are independent, then $\dfrac{z}{\sqrt{y/n}} \sim t(n)$
• Key properties of joint PDFs:
  o $f_{XY}(x, y) \ge 0$, for all $x$ & $y$
  o $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x, y)\, dy\, dx = 1$
  o $\int_a^b\int_c^d f_{XY}(x, y)\, dy\, dx = P(a \le X \le b,\ c \le Y \le d)$

• Student's t-Distribution, $t(v)$
  o Has a single parameter, the degrees of freedom $v$
  o As $v \to \infty$, $t(v) \to N(0, 1)$ [standard normal distribution]
• Key properties of multivariate normal distributions:
  o An affine transformation (a type of geometric transformation that preserves collinearity) of a normal vector is a normal vector

• Chi-square Distribution, $\chi^2(p)$
  o Defined only for positive values of $x$
  o Right-skewed distribution
  o Has one parameter, the degrees of freedom $p$
• Joint PDFs describe how r.v.'s move together: for 2 r.v.'s, their joint PDF gives the probability of $X$ and $Y$ simultaneously lying in given ranges
• Marginal PDFs:
  $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dy$ and $f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx$
• Independence: $X$ and $Y$ are said to be independent iff $f_{XY}(x, y) = f_X(x) f_Y(y)$; r.v.'s are not independent in general
• If $x \sim N(\mu, \Sigma)$, then $Ax + b \sim N(A\mu + b,\ A\Sigma A')$

Sub-Topic: Distribution of OLS Estimates

• If the covariance between 2 jointly normal random variables is 0, the random variables are independent, i.e., if the off-diagonal terms in $var(x)$ are 0, the elements of $x$ are independent r.v.'s [useful for proving independence]
• $u \sim N(0, \sigma^2 I)$
• The $u_i$ are independent r.v.'s; the variance matrix is a diagonal matrix [zero autocorrelation]
• If $u$ is multivariate normal, each of the $u_i$ will be univariate normal; $\hat{\beta}$ is an affine transformation of $u$, and $x'Ax$ is a scalar

Sub-Topic: Testing A Single Hypothesis

• The respecified statistic:
  $\dfrac{\hat{\beta}_j - \beta_j}{sd(\hat{\beta}_j)} \sim N(0,1) \Longrightarrow \dfrac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim t(n - k)$, which now follows a t-distribution and has $n - k$ degrees of freedom
• Steps for testing a single hypothesis (worked through in the sketch below):
  1. State the null & alternative hypotheses
  2. Compute the t-statistic
  3. Decide on a significance level $p$
  4. Find the corresponding critical value $t_c$
  5. If $|t| > t_c$, reject $H_0$
• P-values: They capture the probabilities of observing OLS estimates that are greater in absolute value than the estimates actually obtained, under the assumption that the true parameter value is 0
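A worked sketch of the 5 steps on simulated data, using scipy.stats.t for the critical value and p-value (the data-generating values are illustrative):

```python
# t-test of H0: beta_j = 0 against a two-sided alternative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, 1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat
sigma2_hat = (u_hat @ u_hat) / (n - k)
se = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(X.T @ X)))

j = 1                                         # test the slope
t_stat = beta_hat[j] / se[j]                  # H0: beta_j = 0
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - k)  # 5% two-sided critical value
p_value = 2 * stats.t.sf(abs(t_stat), df=n - k)
print(t_stat, t_crit, p_value)                # reject H0 if |t| > t_crit
```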

• Using a two-sided F-test! Here, you can jointly test a set of $q$ linear restrictions on the model parameters and express the restrictions as $R\beta = r$
• If $H_0: R\beta = r$ is true, $R\hat{\beta} - r \sim N(0,\ \sigma^2 R(X'X)^{-1}R')$
• F-statistic (learn to derive this!): $F = \dfrac{(R\hat{\beta} - r)'\,[R(X'X)^{-1}R']^{-1}\,(R\hat{\beta} - r)}{q\hat{\sigma}^2} \sim F(q, n - k)$
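A sketch of this F-statistic on simulated data, with an illustrative $R$ and $r$ testing two zero restrictions:

```python
# Joint F-test of H0: beta_2 = 0 and beta_3 = 0 (q = 2 restrictions).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.0, 0.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat
sigma2_hat = (u_hat @ u_hat) / (n - k)

R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
r = np.zeros(2)
q = R.shape[0]

diff = R @ beta_hat - r
middle = np.linalg.inv(R @ np.linalg.inv(X.T @ X) @ R.T)
F = (diff @ middle @ diff) / (q * sigma2_hat)
p_value = stats.f.sf(F, q, n - k)
print(F, p_value)
```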

Topic: Heteroskedasticity
Sub-Topic: OLS Under Heteroskedasticity; Generalised Least Squares ($\hat{\beta}_{GLS}$); Feasible Generalised Least Squares

• Heteroskedasticity: Occurs when $var(u)$ is diagonal, but has different terms along the main diagonal, i.e. the error variance differs across observations
• Under heteroskedasticity, $E(\hat{\beta}_{OLS}) = \beta$: OLS is still unbiased
• How do you choose the method? If we know the structure of the variance matrix, use feasible GLS; otherwise, use White standard errors
• Both provide unbiased estimates of $\beta$ and standard errors in large samples, but perform poorly in small samples
• By standardization, we can re-estimate a new variance matrix, and the transformed model would be homoskedastic; OLS is a type of GLS estimator
• The special case of GLS under heteroskedasticity only is aka Weighted Least Squares ($\hat{\beta}_{WLS}$)
• We do not need to know $var(u)$ fully; rather, we only need to know the entries of $\Omega$ up to a constant:
  $var(u) = \sigma^2 \Omega$
• This is how GLS can be used to find the IV estimator, given that $var(Z'u) = \sigma^2 (Z'Z)$
• Key question: Problem Set 3, Q1(d)
• With two groups A and B of observations, $\Omega = \begin{pmatrix} \sigma_A^2 I & 0 \\ 0 & \sigma_B^2 I \end{pmatrix}$
• How to perform feasible GLS?
  1. Run OLS on each group of observations separately and save the residuals
  2. Estimate each group's error variance from its residuals:
     $\hat{\sigma}_g^2 = \dfrac{1}{n_g - k}\sum_i \hat{u}_i^2$ (the first observation from group B is $n_A + 1$!)

  3. Use the estimated variances to build $\hat{\Omega}$ and apply GLS with it:

     $\hat{\beta}_{FGLS} = (X'\hat{\Omega}^{-1}X)^{-1}X'\hat{\Omega}^{-1}y$
• $\hat{\beta}_{FGLS}$ is not unbiased, but the bias disappears in large samples
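A sketch of the two-group procedure on simulated data (group sizes and variances are illustrative assumptions):

```python
# Two-group feasible GLS: OLS within groups, estimate sigma_A^2 and
# sigma_B^2 from residuals, build Omega_hat, re-estimate by GLS.
import numpy as np

rng = np.random.default_rng(6)
nA, nB, k = 150, 150, 2
X = np.column_stack([np.ones(nA + nB), rng.normal(size=nA + nB)])
u = np.concatenate([rng.normal(0, 1.0, nA),      # group A: variance 1
                    rng.normal(0, 3.0, nB)])     # group B: variance 9
y = X @ np.array([1.0, 2.0]) + u

def ols_sigma2(Xg, yg):
    bg = np.linalg.solve(Xg.T @ Xg, Xg.T @ yg)
    ug = yg - Xg @ bg
    return (ug @ ug) / (len(yg) - k)

s2A = ols_sigma2(X[:nA], y[:nA])                 # first obs of B is nA + 1
s2B = ols_sigma2(X[nA:], y[nA:])

omega_diag = np.concatenate([np.full(nA, s2A), np.full(nB, s2B)])
Oi = 1.0 / omega_diag                            # Omega_hat^{-1} is diagonal
beta_fgls = np.linalg.solve(X.T @ (Oi[:, None] * X), X.T @ (Oi * y))
print(beta_fgls)
```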

Sub-Topic: White Standard Errors

• Given that $var(\hat{\beta}_{OLS}) = (X'X)^{-1}X'\Omega X(X'X)^{-1}$, estimate the middle term $X'\Omega X$ using the squared OLS residuals
• As the sample size tends to infinity, $X'\hat{\Omega}X$ tends to converge to $X'\Omega X$
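A sketch of the sandwich estimator with $\hat{\Omega}$ taken as the diagonal matrix of squared OLS residuals (the heteroskedastic error process is simulated for illustration):

```python
# White (heteroskedasticity-robust) standard errors via the sandwich
# (X'X)^{-1} X' Omega_hat X (X'X)^{-1}.
import numpy as np

rng = np.random.default_rng(7)
n = 400
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
u = rng.normal(size=n) * (0.5 + np.abs(x1))   # error variance depends on x
y = X @ np.array([1.0, 2.0]) + u

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat

XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (u_hat[:, None] ** 2 * X)        # X' Omega_hat X
var_white = XtX_inv @ meat @ XtX_inv
print(np.sqrt(np.diag(var_white)))            # robust standard errors
```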
Topic: Instrumental Variable Estimation
Sub-Topic: The Endogeneity Problem

• Even if $X$ is random, OLS is still an unbiased estimator, provided $X$ is uncorrelated with $u$

๏‚ท ๐ธ ๐›ฝ
โ‰  ๐›ฝ, OLS is biased!
๏‚ท What if you could increase the sample size? Even if there is bias in a finite sample, will this bias
tend to 0 as sample size tends to infinity?
๏‚ท Consistency: An estimator ๐œƒ is said to be consistent if it converges in probability to the true
parameter value ๐œƒ, which requires that:
lim ๐‘ƒ ๐œƒ โˆ’ ๐œƒ โ‰ค ๐œ€ = 1, for all ๐œ€ > 0
...

๏‚ท What does convergence of probability require?
๏‚ท Limit ๐‘› โ†’ โˆž of the probability that the absolute difference between our estimator and our true
value is less than some value ๐œ€ going to 1 for any positive ๐œ€
...

๏‚ท An estimator is consistent if ๐‘๐‘™๐‘–๐‘š ๐›ฝ = ๐›ฝ
...

๏‚ท Using these rules (also need to calculate!): ๐‘๐‘™๐‘–๐‘š ๐›ฝ

= ๐›ฝ + ๐‘๐‘™๐‘–๐‘š

๐‘๐‘™๐‘–๐‘š

• If endogeneity occurs, OLS is biased and $E(\hat{\beta}_{OLS}) \ne \beta$
• If $\text{plim}\left(\dfrac{X'X}{n}\right) = \Sigma_{XX}$, $X$ is well-behaved
• For the specification of IVs, $k$ is the number of variables in $X$ and $l$ is the number of instruments in $Z$ (linearly dependent instruments must be ruled out!)
  o If $l > k$, over-identification; if $l < k$, under-identification
• Key conditions for validity of instruments!
  o Relevance: $\text{plim}\left(\dfrac{Z'X}{n}\right) = \Sigma_{ZX} \ne 0$: variables in $Z$ are correlated with those in $X$
  o Exogeneity: if $\text{plim}\left(\dfrac{Z'u}{n}\right) = 0$, $\hat{\beta}_{IV}$ is consistent and will provide good estimates as long as we have a large number of observations. If not (if it equals some $\Sigma_{Zu} \ne 0$), then $\text{plim}(\hat{\beta}_{IV}) \ne \beta$ and $\hat{\beta}_{IV}$ is inconsistent even in large samples
• "Exogenous" variables are not correlated with $u$, and can be used as instruments for themselves
• Endogenous Variables: Variables correlated with $u$
• The IV Estimates:
  o $\hat{\beta}_{IV} = (Z'X)^{-1}Z'y$ (for $l = k$)
  o $\text{plim}\, \hat{\beta}_{IV} = \beta$, hence $\hat{\beta}_{IV}$ is consistent!
  o $var(Z'u) = \sigma^2 Z'Z$

• 2SLS Estimation: Need to perform OLS in 2 stages
  1. Regress each endogenous variable $x_j \in X$ on $Z$ and obtain predicted values:
     $\hat{x}_j = Z(Z'Z)^{-1}Z'x_j$ for $j = 1, \dots, k$
     $\hat{x}_j = x_j$ for all exogenous variables
  2. Regress $y$ on $\hat{X}$ and obtain the OLS slope estimate $\hat{\beta}_{2SLS} = (\hat{X}'\hat{X})^{-1}\hat{X}'y$ [also need to prove that $\hat{\beta}_{2SLS} = \hat{\beta}_{IV}$]
• Need to use the GLS variance estimator $var(\hat{\beta}_{GLS})$, since the estimated variance matrix of OLS is wrong
• Are instruments uncorrelated with $u$? Use the Sargan test
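A sketch of the two stages with one endogenous regressor and two instruments (an over-identified, simulated example):

```python
# 2SLS: stage 1 fits x_hat = Z(Z'Z)^{-1}Z'x, stage 2 regresses y on X_hat.
import numpy as np

rng = np.random.default_rng(10)
n = 100_000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
e = rng.normal(size=n)
x = 0.6 * z1 + 0.4 * z2 + e                   # endogenous regressor
u = 0.5 * e + rng.normal(size=n)
y = 2.0 * x + u

Z = np.column_stack([np.ones(n), z1, z2])
# Stage 1: predicted values of the endogenous variable
x_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x)
# The constant is exogenous, so it instruments itself (x_hat_j = x_j)
X_hat = np.column_stack([np.ones(n), x_hat])
# Stage 2: OLS of y on X_hat
beta_2sls = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)
print(beta_2sls)                              # slope near 2
```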

• Key Characteristics:
  o The IV must be correlated with economic growth, but must not directly affect the probability of conflict
  o Standard errors may change when using either approach, but IV estimation gives better standard errors
• Is the IV relevant?
  o It should have a positive relationship with economic growth
  o Switching to the IV from OLS regressions should not result in a loss of significance of the coefficient; ideally the IV coefficient should be larger than the OLS coefficient
• Step 2: Regress the dependent variable on the predicted values from Step 1; the coefficients shouldn't change a lot, and they should still be significant
• Use an F-test when regressing the dependent variable on the dummy variables

Topic: Maximum Likelihood Estimation

• Likelihood function takes $y$ as given and tells the likelihood (probability) of obtaining some parameter value $\theta$:
  $L(\theta; y) = f_Y(y; \theta)$
• Key steps:
  1. Write down the likelihood function
  2. Calculate the log likelihood ($\ln L(\theta; y)$) by taking logs of the likelihood function
  3. Set the derivative of the log likelihood function w.r.t. $\theta$ (the "score") equal to 0
  4. Solve for the parameter values; these are the ML estimates, $\hat{\theta}_{ML}$
• For the classical linear model:
  1. Write down the likelihood function:
     $L(\beta, \sigma^2; y) = f_U(u; \beta, \sigma^2)$
     Since the individual $u_i$ are independent normal r.v.'s with the same distribution,
     $L(\beta, \sigma^2; y) = \prod_{i=1}^{n} f_u(u_i; \beta, \sigma^2)$
  2. Calculate the log likelihood:


๐‘›
๐‘›
1
(๐‘ฆ ๐‘ฆ โˆ’ 2๐‘ฆ ๐‘‹๐›ฝ + ๐›ฝ ๐‘‹ ๐‘‹๐›ฝ)
ln ๐ฟ(๐›ฝ, ๐œŽ ; ๐‘ฆ) = โˆ’ ln(2๐œ‹) โˆ’ ln ๐œŽ โˆ’
2
2
2๐œŽ
  3. Set the score equal to 0:
     $\begin{pmatrix} \dfrac{\partial \ln L(\beta, \sigma^2; y|X)}{\partial \beta} \\ \dfrac{\partial \ln L(\beta, \sigma^2; y|X)}{\partial \sigma^2} \end{pmatrix} = \begin{pmatrix} -\dfrac{1}{\sigma^2}\left(-X'y + X'X\beta\right) \\ -\dfrac{n}{2\sigma^2} + \dfrac{1}{2\sigma^4}(y - X\beta)'(y - X\beta) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$
  4. Solve for the ML estimates:

๐›ฝ = (๐‘‹ ๐‘‹) ๐‘‹ ๐‘ฆ
1
๐œŽ = ๐‘ฆ โˆ’ ๐‘‹๐›ฝ
๐‘ฆ โˆ’ ๐‘‹๐›ฝ
๐‘›
๏‚ท ๐›ฝ is unbiased, but ๐œŽ is not
...
e
...
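A sketch checking the closed-form ML solution against a numerical maximiser of the same log likelihood (scipy.optimize.minimize on $-\ln L$; the log-variance parameterisation is a convenience assumption):

```python
# Closed-form ML estimates vs numerical maximisation of ln L.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(11)
n, k = 300, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

def neg_loglik(params):
    beta, log_s2 = params[:k], params[k]      # log-parameterise sigma^2 > 0
    s2 = np.exp(log_s2)
    u = y - X @ beta
    return 0.5 * n * np.log(2 * np.pi * s2) + (u @ u) / (2 * s2)

res = minimize(neg_loglik, x0=np.zeros(k + 1))
beta_ml = np.linalg.solve(X.T @ X, X.T @ y)   # closed form
s2_ml = ((y - X @ beta_ml) @ (y - X @ beta_ml)) / n
print(res.x[:k], beta_ml)                     # agree
print(np.exp(res.x[k]), s2_ml)                # agree; note divisor n, not n-k
```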

• Estimator variance disappears as the sample size increases, i.e., $\lim_{n \to \infty} var(\hat{\theta}) = 0$
• $\hat{\sigma}^2_{ML}$ is consistent

Sub-Topic: Regression with Binary Dependent Variables

• Major drawbacks of using regular OLS estimates (the linear probability model):
  o OLS predictions of $\hat{y}$ can be less than 0 or greater than 1, which are hard to interpret when dependent values are binary variables. This problem may disappear in large samples
  o Since $var(u_i)$ is a function of $x_i$, the model exhibits heteroskedasticity. But we can use White Standard Errors


Sub-Topic: Probit

• An ML alternative to the LPM
• Define $y^*$, a latent variable (which is unobserved) such that $y^* = x'\beta + u$, with $y = 1$ whenever $y^* > 0$, so that $P(y = 1) = \Phi\left(\dfrac{x'\beta}{\sigma}\right)$
• Similarly, $P(y = 0) = 1 - \Phi\left(\dfrac{x'\beta}{\sigma}\right)$
• These equations are non-linear in parameters, so OLS can't be used
• Order the data such that the 1st $m$ observations have $y = 0$ and the remaining $n - m$ observations have $y = 1$; then
  $L(\beta, \sigma^2; y|X) = \prod_{i=1}^{m}\left(1 - \Phi\left(\dfrac{x_i'\beta}{\sigma}\right)\right) \prod_{i=m+1}^{n} \Phi\left(\dfrac{x_i'\beta}{\sigma}\right)$
• Taking the log likelihood, which needs to be maximised numerically (since there is no closed-form expression for the standard normal cdf):
  $\ln L(\beta, \sigma^2; y|X) = \sum_i \left[ y_i \ln \Phi\left(\dfrac{x_i'\beta}{\sigma}\right) + (1 - y_i)\ln\left(1 - \Phi\left(\dfrac{x_i'\beta}{\sigma}\right)\right) \right]$
• Probit estimates $\hat{\beta}$ & $\hat{\sigma}$ obtained
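A numerical probit sketch using scipy (here $\sigma$ is normalised to 1, the usual identification choice, and the data are simulated):

```python
# Probit MLE: maximise the log likelihood above numerically.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(12)
n = 2_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.5, 1.0])
y_star = X @ beta_true + rng.normal(size=n)   # latent variable
y = (y_star > 0).astype(float)

def neg_loglik(beta):
    p = norm.cdf(X @ beta)
    p = np.clip(p, 1e-10, 1 - 1e-10)          # guard the logs
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(neg_loglik, x0=np.zeros(2))
print(res.x)                                  # close to [-0.5, 1.0]
```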


Sub-Topic: Marginal Effects

• Helps to interpret $\hat{\beta}_j$, as a derivative indicating how much a one-unit increase in each $X$ variable will increase the expected value of $y$
• In probit, marginal effects: $\dfrac{\partial P(y = 1)}{\partial x_j} = \dfrac{\partial \Phi(x'\beta)}{\partial x_j} = \phi(x'\beta)\beta_j$
• Can be evaluated at any value of $x$; however, it is more common to use the mean values of each $X$ variable, denoted $\bar{x}$, a row vector. The estimated marginal effect then just uses the probit coefficients in place of the true model parameters
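A tiny sketch of the marginal effect at the mean; the coefficient values and $\bar{x}$ below are assumed for illustration, not taken from a real fit:

```python
# Probit marginal effect at the mean: phi(x_bar' beta) * beta_j.
import numpy as np
from scipy.stats import norm

beta_hat = np.array([-0.5, 1.0])              # assumed probit estimates
x_bar = np.array([1.0, 0.1])                  # constant plus mean of x_1
marginal_effects = norm.pdf(x_bar @ beta_hat) * beta_hat
print(marginal_effects[1])                    # effect of a unit change in x_1
```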


