Title: Course notes for microeconometrics, econometrics & causality
Description: Notes on a course focused on causality in economics, microeconometrics with some notes on papers as well.

Microeconometrics Notes

Balázs Faragó

Contents
1 Angrist Pischke Ch ...
    Chapter 1 - Questions about questions
    Chapter 2 - The experimental ideal
2 Duflo et al. 2008 - Using randomization in development economics - a toolkit
    Other methods to control for selection bias
...
5 Lecture 1 - Causality I
    Randomization
...
    Bertrand paper on discrimination
...
7 OLS Regression & Causality
    Why still use controls?
    Good & Bad controls
    ... The effect of military service in Denmark
...
9 IV Part 2
    IV with treatment heterogeneity
...
    Sharp RD
    Within estimation
    Pitfalls of Fixed Effects
    Difference in Differences DiD
    Two-way fixed effects
    Nearest Neighbour matching
    Kernel matching

1&2
1
...
R
...
How would you do it and can it be done?
Questions which cannot be answered by an experiment are F
...
Q
...

Identification strategy - the way in which you use observational data
Advice: Ask: What is your mode of statistical inference?
i
...
1
...
What is the sample? 3
...
2

Chapter 2 - The experimental ideal

RCTs are the ideal
...

But, this observed outcome can be written as:

\[
Y_i =
\begin{cases}
Y_{1i} & \text{if } D_i = 1 \\
Y_{0i} & \text{if } D_i = 0
\end{cases}
= Y_{0i} + (Y_{1i} - Y_{0i}) D_i
\]


Figure 1
...

This term captures the average difference between the health of the hospitalized
...
2
...
(Y0i is the "effect" of no treatment
...
e
...
And this becomes simply:
= E[Y1i − Y0i |Di = 1] = E[Y1i − Y0i ]
The effect of randomly-assigned hospitalization on the hospitalized is the same as the
effect of hospitalization on a randomly chosen patient
...
2
...

Yi = α + ρDi + ηi


With the treatment switched on and off:

This correlation reflects the difference in (no-treatment) potential outcomes between those
who get treated and those who don't
...
If controls Xi are uncorrelated with the treatment Di , they will NOT affect the
estimate of ρ
...
They will reduce your standard error
...


Chapter 2

Duflo et al. 2008 - Using randomization in development economics - a toolkit
RCT via random assignment is: 1
...
Internally valid
Remember that:
E[Y1i − Y0i ]
(the average difference between outcomes with and without treatment) estimates only the
overall effect of the treatment, which may comprise many factors
...
0
...

Controlling for observables
Theoretically, it could be that we have a set of observable variables X and we can condition
our data on these: E[Yi0 |X, Di = 1] − E[Yi0 |X, Di = 0] = 0
...


Definition 2
...
1: Fully non-parametric matching
If the dimension of X (variables we want to control for) is not large, we compute the
difference between treatment and control within each cell formed by the various
values of X, and the treatment effect is a weighted average of these differences
...
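A minimal sketch of the cell-by-cell calculation described in the definition above. The data, the cell labels and the choice of weights (number of treated units per cell) are all assumptions made for illustration, not something taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy data: (cell of X, treatment dummy D, outcome Y).
rows = [
    ("young", 1, 12.0), ("young", 1, 14.0), ("young", 0, 10.0),
    ("old", 1, 20.0), ("old", 0, 19.0), ("old", 0, 17.0),
]


def cell_matching_effect(rows):
    """Weighted average of within-cell treated-minus-control mean differences.

    Assumption: each cell is weighted by its number of treated units.
    """
    cells = defaultdict(lambda: {0: [], 1: []})
    for x, d, y in rows:
        cells[x][d].append(y)

    weighted_sum, total_weight = 0.0, 0
    for groups in cells.values():
        treated, control = groups[1], groups[0]
        if not treated or not control:
            continue  # no comparison is possible in this cell
        diff = sum(treated) / len(treated) - sum(control) / len(control)
        weighted_sum += len(treated) * diff
        total_weight += len(treated)
    return weighted_sum / total_weight


effect = cell_matching_effect(rows)
print(effect)
```

With many variables in X most cells end up empty on one side, which is one way to see why the cell approach only works when the dimension of X is small.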
Propensity score matching is better
...



Controlling for the propensity score leads to an unbiased estimate of the treatment effect
under the assumption (equation) from before
...


Chapter 3

Bertrand and Mullainathan 2004 - A Field Experiment on Labor Market Discrimination
Basic summary about the paper:
• A study was conducted to analyze racial differences in callback rates for job
applicants
• Applicants with White names received more callbacks than those with
African-American names
• The study found that resume characteristics were less predictive of callback rates for
African-Americans
...
Jobs: sales, administrative support, clerical services,
and customer services
More specific details about setup, inference etc
...

Spearman´s Rank Correlation Coefficient
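\[
r_s \in [-1, 1], \qquad
r_s = 1 - \frac{6 \sum_i d_i^2}{n(n^2 - 1)} = \frac{\operatorname{cov}(R(X), R(Y))}{\sigma_{R(X)} \sigma_{R(Y)}}
\]
where di = difference between the 2 ranks of each observation, n = number of observations,
R(Xi) is the rank of observation Xi (the first expression assumes there are no tied ranks)
Close to 0 ⇒ Weak monotonic relationship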
...
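A small sketch checking that the two expressions for Spearman's coefficient agree. The data are made up and contain no ties, which the d_i formula needs; the helper names are hypothetical.

```python
def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_from_d(x, y):
    """1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def spearman_from_cov(x, y):
    """Covariance of the ranks divided by the product of their standard deviations."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / n
    sx = (sum((a - mx) ** 2 for a in rx) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in ry) / n) ** 0.5
    return cov / (sx * sy)


x = [10, 20, 30, 40, 50]  # hypothetical data
y = [1, 3, 2, 5, 4]
r_d = spearman_from_d(x, y)
r_cov = spearman_from_cov(x, y)
print(r_d, r_cov)
```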


Chapter 4

Chetty et al. 2016 - The Effects of Exposure to Better Neighborhoods on Children:
New Evidence from the Moving to Opportunity Experiment
Main result from study/topic: Moving households in poverty to a better neighbourhood
is good for them
...
This experiment was done via some voucher program that helped households in
impoverished neighbourhoods to move
...

- They estimate the treatment effects of growing up in these very different environments
by replicating the intent-to-treat (ITT) specifications used in prior work (e.g., Kling,
Liebman, and Katz 2007)
...


Definition 4
...
1: Intent-To-Treat (ITT) / (encouragement design)
In field studies or RCTs, even if participants assigned to a treatment group do
not fully follow through with the prescribed treatment in reality, the ITT analysis
includes them in the evaluation
...




Definition 4
...
2: Treatment-On-The-Treated (TOT)
This refers to an analysis that specifically evaluates the impact of the treatment
among those who actually receive it
...
TOT, therefore,
focuses on the subset that complies with or "takes" the treatment, providing insights
into the treatment's effectiveness when implemented as intended
...
Our results consistently show that the benefits of relocating
to lower-poverty areas decrease as the child’s age at the move increases
...
While we don’t identify a distinct ”critical age” for moving to a better
neighborhood, precise estimates are limited due to small sample sizes at each child age in
the MTO data
...
This is because: the ages at
which children move are perfectly correlated with length of exposure, making it difficult to
distinguish age-varying disruption effects from an age-invariant disruption cost combined
with an exposure effect
...

- Despite underlying uncertainties, experimental results support the conclusion that
subsidized housing vouchers for moving to lower-poverty areas yield greater benefits for
younger children
...

- We find no systematic differences in the treatment effects of MTO on children’s
long-term outcomes by gender, race, or site
...

- To address this, we test the null hypothesis that treatment effects for main subgroups
(gender, race, site, and age) are all zero using F-tests and a nonparametric permutation
test
...
05 using F-tests and p <
0
...

children are not an artifact of analyzing multiple subgroups
...
0
...

The problem is an increased risk of obtaining false-positive results by chance alone
when performing numerous tests
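
As a rough illustration of how fast this risk grows, the sketch below computes the probability of at least one false positive across several tests. It assumes the tests are independent, which is a simplification that the notes do not state; the function name is hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """P(at least one false positive) over n_tests independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


for m in (1, 5, 10, 20):
    print(m, round(familywise_error_rate(0.05, m), 3))
```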

4
...
1

Balancing and attrition

The studies compared roughly 195 variables across the randomized groups to see if the
mean differences between groups are statistically significant
...

There is a lot of partial compliance in the treatment group, i.e. only about half of the
people offered vouchers moved
...
X is a vector of baseline covariates and si are dummies for
site (city?)
The voucher offer was used as an instrument (IV) for the treatment
...

TOT = ITT / treatment-uptake rate
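
A toy calculation of this ratio. The numbers are invented (not from the MTO data): z is the voucher offer, d is whether the household actually moved, and moving is assumed to raise the outcome by 4 for everyone who moves.

```python
# Hypothetical data, 4 households without an offer and 4 with an offer.
z = [0, 0, 0, 0, 1, 1, 1, 1]          # offer of the voucher (random assignment)
d = [0, 0, 0, 0, 1, 1, 0, 0]          # only half of those offered actually move
y = [10, 10, 10, 10, 14, 14, 10, 10]  # outcome; movers gain 4 in this toy example


def group_mean(values, group, level):
    """Mean of `values` among observations whose `group` entry equals `level`."""
    selected = [v for v, g in zip(values, group) if g == level]
    return sum(selected) / len(selected)


itt = group_mean(y, z, 1) - group_mean(y, z, 0)      # effect of the offer
take_up = group_mean(d, z, 1) - group_mean(d, z, 0)  # treatment-uptake rate
tot = itt / take_up                                  # effect on those who moved
print(itt, take_up, tot)
```

The ITT is diluted by the households that never move, and dividing by the take-up rate scales it back up to the effect on the movers.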
Some more lingo:
ATE - Average treatment effect
ATET - Average treatment effect on the treated
(kinda like a more specific version of TOT)

Chapter 5

Lecture 1- Causality I
...
These are potential outcomes, so they
do not depend on whether someone is actually treated
...
This is the difference in potential
outcomes
...


5
...
1

Statistical solution to the counterfactual problem

ATE = E[∆i ] - Average treatment effect (how much, on average, a population is affected
by a treatment)
ATET = E[∆i |Di = 1] = E[Y1i |Di = 1] − E[Y0i |Di = 1] - Average treatment effect on the
treated
...

s
...
s
...

Now we use a trick (add and subtract E[Y0i |Di = 1]): E[Y1i |Di = 1] − E[Y0i |Di = 1] + E[Y0i |Di = 1] − E[Y0i |Di = 0]
...
This is ATET + bias
...
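A toy check of the "ATET + bias" decomposition. Both potential outcomes are written down for every unit, which is never possible with real data; all numbers are hypothetical.

```python
# Hypothetical potential outcomes for 4 units; units 2 and 3 select into treatment.
y0 = [1.0, 2.0, 3.0, 4.0]
y1 = [3.0, 3.0, 5.0, 5.0]
d = [0, 0, 1, 1]


def mean(values):
    return sum(values) / len(values)


treated = [i for i, di in enumerate(d) if di == 1]
control = [i for i, di in enumerate(d) if di == 0]

# What we can compute from observed data: treated mean minus control mean.
naive = mean([y1[i] for i in treated]) - mean([y0[i] for i in control])
# The two unobservable pieces it is made of.
atet = mean([y1[i] - y0[i] for i in treated])
selection_bias = mean([y0[i] for i in treated]) - mean([y0[i] for i in control])
print(naive, atet, selection_bias)
```

Here the treated would have had higher outcomes even without treatment, so the naive comparison overstates the effect.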
It can tell us about decision making as well
...




5
...
2

Randomization

We randomize: (Y1i , Y0i )⊥D
With randomization, treatment status is independent of potential outcomes
...
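A sketch of why randomization removes the selection bias: averaging the naive treated-minus-control difference over every possible way of assigning 2 of 4 units to treatment gives back the ATE. The potential outcomes are hypothetical.

```python
from itertools import combinations

# Hypothetical potential outcomes for 4 units.
y0 = [1.0, 2.0, 3.0, 4.0]
y1 = [2.0, 4.0, 3.0, 7.0]
n = len(y0)
ate = sum(a - b for a, b in zip(y1, y0)) / n

diffs = []
for treated in combinations(range(n), 2):  # every equally likely assignment
    control = [i for i in range(n) if i not in treated]
    diff = (sum(y1[i] for i in treated) / len(treated)
            - sum(y0[i] for i in control) / len(control))
    diffs.append(diff)

average_diff = sum(diffs) / len(diffs)
print(ate, average_diff)
```

Any single assignment can still be off; it is the expectation over assignments that equals the ATE.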



Chapter 6

Lecture 2- Causality & II
...


6
...
t
...
2

Examples

Not including lab experiments
- Field experiments
- Natural experiments

6
...
1

Bertrand paper on discrimination

Measuring Discrimination on the labour market
...
Treatment effect:
E[Y1i − Y0i ]
People used to just include a lot of dummies, but nowadays this is uncommon, since we
don't trust that we can include all the right dummies
...
Yi = αi + γNi + ϵi
From randomization Cov(Ni , ϵi ) = 0
...



Ni is the binary variable describing whether a name is stereotypically Black or White
...

They also distinguished between high- and low-quality CVs: having a high-quality CV
didn't make any difference for Black-sounding names but did for White-sounding names
...
2
...
3

Problems with experiments

Definition 6
...
1: Internal validity
Does the experiment provide an estimate of the causal effect in the population under
the study?

Definition 6
...
2: External validity
The extent to which the result can be generalized outside of the experimental framework
...
3
...
When not everyone in the treatment group ends up treated, or when
someone outside it (in the control group) takes the treatment on their own
...
This way the experiment shows the effect of the randomization (the assignment), so
the result includes the contamination across treatment and control groups, which can be
interesting in itself, especially given the imperfections of actual policy
...
The measured effect is that
of Z now (the ITT)
...
3
...

Random dropout - Just a problem for statistical power
...



External validity problems
- Maybe your volunteers are non-representative, or the sample is otherwise not representative
...
3
...
They might perform ”better”
...
3
...
Negative performance,
basically the opposite of the Hawthorne effects
...
3
...


Chapter 7

OLS Regression & Causality
7
...
Let's say there is this model: yi = α + ρSi + γAi + ϵi
2
...
We can use the bivariate regression formula to derive the bias of ρ in the incorrectly
specified model:
\[
\rho_{OLS} = \frac{Cov(S_i, y_i)}{Var(S_i)}
\]
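
A noise-free numerical sketch of where this formula leads. Plugging the long model into Cov(Si, yi)/Var(Si) gives the standard omitted variable bias result ρ + γ·Cov(Si, Ai)/Var(Si), assuming ϵi is uncorrelated with Si. The data and parameter values below are hypothetical, and ϵi is set to zero so the equality is exact.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


# Hypothetical data: schooling s and (normally unobserved) ability a.
s = [1.0, 2.0, 3.0, 4.0, 5.0]
a = [1.0, 1.0, 2.0, 3.0, 3.0]
alpha, rho, gamma = 1.0, 0.5, 2.0
y = [alpha + rho * si + gamma * ai for si, ai in zip(s, a)]  # long model, no noise

rho_short = cov(s, y) / cov(s, s)                # regression of y on s only
ovb_formula = rho + gamma * cov(s, a) / cov(s, s)
print(rho_short, ovb_formula)
```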
4
...
1
...
1
...

di


Why still use controls?

1
...
Check for random assignment
...
Instead of checking all the mean differences like in the Chetty paper, you
can try to predict the treatment indicator with the control variables and use an F-test to
see if their coefficients are jointly 0
...
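A minimal sketch of this balance check, assuming NumPy is available. It regresses the treatment dummy on the controls and returns the overall F-statistic computed from R²; the function name and the toy data are hypothetical, and a p-value would additionally need the F distribution (not computed here).

```python
import numpy as np


def balance_f_stat(d, X):
    """F-statistic for H0: all slope coefficients are 0 in a regression of d on X."""
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])  # add an intercept
    coef, *_ = np.linalg.lstsq(design, d, rcond=None)
    resid = d - design @ coef
    r2 = 1.0 - float(resid @ resid) / float(((d - d.mean()) ** 2).sum())
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


d = [0, 0, 0, 0, 1, 1, 1, 1]             # treatment indicator
x_balanced = [1, 2, 3, 4, 1, 2, 3, 4]    # same distribution in both groups
x_unbalanced = [1, 1, 2, 2, 3, 3, 4, 4]  # treated have higher values

f_balanced = balance_f_stat(d, x_balanced)
f_unbalanced = balance_f_stat(d, x_unbalanced)
print(f_balanced, f_unbalanced)
```

A large F-statistic means the controls predict treatment, which casts doubt on random assignment.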
3
7
...
1

Good & Bad controls
Lundborg et al. 2019 - The effect of military service in Denmark

Military service was supposed to be randomly assigned so they checked this
...
They did this and
these variables were not predictive so all good
...

Military service was found to have an effect on earnings for those with the highest IQ-s
...
3
...


Definition 7
...
2: Good controls
Variables that you can think of as being fixed at the time the treatment variable was
determined
...
3
...

Without occupation, the regression seems to suffer from omitted variables, but including
it would actually create selection bias
...
If you limit the study to
white collar jobs only (by controlling), in the control group you get people who manage to
get a white collar job without college, while in the treatment group you have those plus
those who can only get a white collar job because they have a degree
...

Consider 3 groups, AB, AW and BW, i.e. always a blue collar job no matter the degree,
always white collar, and blue-white (need the degree to get a white collar job)
...

- If the treatment changes the control, it is not a control
...
V
...

Zero conditional mean assumption is unlikely to be fulfilled in many cases
...
We have the following (real) relationship:

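\[
y_{si} = f_i(s) = \pi_0 + \pi_1 \underbrace{s}_{\text{observed independent variable (ex: schooling)}} + \underbrace{\eta_i}_{\text{error/unmeasured}}
\]
but \( E[\eta_i \mid s_i] \neq 0 \)
\[
\eta_i = \underbrace{A_i'}_{\text{unobserved effect (ex: ability)}} \gamma + \upsilon_i
\]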

Note: Ai and υi are uncorrelated
Try estimating:
yi = α + ρsi + ηi
⇒ OVB, since Ai in ηi is correlated with si
Use IV when:
∃ a variable zi that is
...
Correlated with si [1st stage]
2
...
Correlated w
...
Uncorrelated w
...
INSTRUMENTAL VARIABLES I
...



si = Xi′π10 + π11 zi + ζ1i (1st stage: regress si on zi)
yi = Xi′π20 + π21 zi + ζ2i (reduced form: regress yi on zi)
By the exclusion restriction
...
coefficient
1st stage reg
...
If zi ∈ {0,1} with P(zi = 1) = p, then Cov(yi , zi ) = {E[yi |zi = 1] − E[yi |zi = 0]} p(1 − p)
2SLS - Basically, you are allowed to have several instruments (overidentified
model)
...

The parameter of interest is estimated basically the same way as before: it is that ratio,
or it can be computed via the covariances
...
0
...
So they started to think about sources of variation that
are not related to ability, i.e. institutional factors
...
This is just based on the specific date of birth (quarter
of the year you're born in)
...
⇒ some people will be
kept in school for longer just because they were born later in the year, so those born
earlier in the year may be prone to drop out of school earlier
...

So quarter of birth is uncorrelated with ability but correlated with education, so it
could be an instrument
...
People born in the 1st
quarter earn less than those born in the 4th, etc
...
But this is impossible because ρ must be
consistent unbiased, which is also what we want to find etc
...

- Overidentification tests are not very useful because they still rely on the same assumption
- For the 1st stage, as a rule of thumb you need an F-stat of 10
...
0
...
Think about treatment as a chain
...

- Now the instrument must also be independent of potential treatment status
...

Monotonicity assumption: being assigned to treatment should never make anyone less likely
to be treated (D1i ≥ D0i for all i, i.e. there are no defiers)
...
0
...
independence assumption
2
...
monotonicity
4
...

This parameter is called the local average treatment effect
...

The LATE Theorem
\[
\frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} = E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}]
\]
This is the average treatment effect for the group D1i > D0i (the compliers)
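
A toy population illustrating the theorem (not a proof). Every unit has hypothetical potential treatment statuses (d0, d1) and potential outcomes (y0, y1), and each unit is observed once under Z = 0 and once under Z = 1 to mimic an instrument that is independent of the potential outcomes. There are no defiers, so monotonicity holds.

```python
# (d0, d1, y0, y1) for each hypothetical unit.
units = [
    (0, 1, 10.0, 16.0),  # complier, effect 6
    (0, 1, 12.0, 14.0),  # complier, effect 2
    (1, 1, 20.0, 21.0),  # always-taker
    (0, 0, 5.0, 50.0),   # never-taker
]


def mean(values):
    return sum(values) / len(values)


def observed(z):
    """Observed (D, Y) pairs when the instrument is set to z for everyone."""
    rows = []
    for d0, d1, y0, y1 in units:
        d = d1 if z == 1 else d0
        y = y1 if d == 1 else y0
        rows.append((d, y))
    return rows


reduced_form = mean([y for _, y in observed(1)]) - mean([y for _, y in observed(0)])
first_stage = mean([d for d, _ in observed(1)]) - mean([d for d, _ in observed(0)])
wald = reduced_form / first_stage

complier_effect = mean([y1 - y0 for d0, d1, y0, y1 in units if d1 > d0])
print(wald, complier_effect)
```

The large effect for the never-taker and the small one for the always-taker play no role: the ratio only picks up the compliers.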
Proof:

1

Sharp RD

Di = 1[xi ≥ x0 ] is the treatment indicator
...

threshold = x0
xi is the "forcing variable"
The regression becomes: yi = α + βxi + ρDi + ηi
You can also use some p-order polynomial
...
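A minimal sketch of the linear sharp RD regression, assuming NumPy is available. The data are generated without noise from yi = 1 + 0.5·xi + 2·Di with a hypothetical threshold x0 = 5, so OLS should return the jump ρ = 2 exactly.

```python
import numpy as np

x0 = 5.0                      # hypothetical threshold
x = np.arange(10.0)           # forcing variable: 0, 1, ..., 9
D = (x >= x0).astype(float)   # sharp rule: treated iff x >= x0
y = 1.0 + 0.5 * x + 2.0 * D   # noise-free outcome with a jump of 2 at x0

# Regress y on a constant, the forcing variable and the treatment indicator.
design = np.column_stack([np.ones_like(x), x, D])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
alpha_hat, beta_hat, rho_hat = coef
print(rho_hat)
```

With real data the functional form of x matters a lot, since a misspecified trend can be mistaken for a jump.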
- You do this by including interactions between the forcing variable and
the threshold indicator
...
1
...
δ = size of the neighbourhood
...

However you need a lot of data
...

Those to the left and right of the threshold should be the same; also, the control group
should not react to the treatment (Lucas critique?)
...
2

Fuzzy RD

In a reduced-form version, the actual treatment is unobserved, but we know there is a jump
in the probability of treatment
...
In the end you get LATE (but it is even more ”local” because you
only get it for those close to the threshold
...

Figure 10
...
The points are the actual data, binned
...


10
...

A way of using panel data is through fixed effects
...
of individuals observed is
larger than the time dimension but this does not necessarily have to be the case
...
When
using fixed effects, we allow E[ηi |Xi1
...
But this could be a lot of individuals

3
...


On the properties of within estimation
1
...

2
...

3
...

4
...
This provides the within estimators and ηˆi in a single
step
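
A small sketch of within estimation by demeaning. The panel is hypothetical: two individuals with fixed effects 0 and 10, a true slope of 2 and no noise. The fixed effect is correlated with x, so pooled OLS is biased while the within estimator is not.

```python
# Hypothetical panel: (individual, x, y) with y = alpha_i + 2 * x.
panel = [
    ("A", 1.0, 2.0), ("A", 2.0, 4.0), ("A", 3.0, 6.0),        # alpha_A = 0
    ("B", 4.0, 18.0), ("B", 5.0, 20.0), ("B", 6.0, 22.0),     # alpha_B = 10
]


def slope(xs, ys):
    """OLS slope of ys on xs (with an intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def demean_by_individual(panel):
    """Subtract each individual's own mean from x and y."""
    groups = {}
    for ind, x, y in panel:
        groups.setdefault(ind, []).append((x, y))
    xs, ys = [], []
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        xs += [x - mx for x, _ in obs]
        ys += [y - my for _, y in obs]
    return xs, ys


pooled = slope([x for _, x, _ in panel], [y for _, _, y in panel])
within = slope(*demean_by_individual(panel))
print(pooled, within)
```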

10
...
2

1st-Differences

We can also take first differences instead
...


It would typically fail if
there is some time-specific unobserved shock that affects both the outcome and our X
variable of interest
...
3
...
Specifically, downward bias
...

Impossible to estimate time-invariant regressors
The deviation from the individual-specific mean will always be zero for such a variable
...
Random effects could do this but has unrealistic
assumptions
...
Since we are relying on
within-individual variation
...

Violation of strict exogeneity assumption
see above
...


4

Difference in Differences DiD

In fixed effects, we looked at individual level data
...

Parallel trend assumption
...


10
...
1

2x2 DiD

You can also estimate this in a regression
...
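A sketch of the 2x2 DiD computed two ways, assuming NumPy is available: from the four cell means, and as the coefficient on the treat × post interaction in a regression. The eight observations are hypothetical.

```python
import numpy as np

# Hypothetical data: two observations per (group, period) cell.
treat = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
post = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
y = np.array([9, 11, 11, 13, 10, 12, 15, 17], dtype=float)


def cell_mean(t, p):
    return y[(treat == t) & (post == p)].mean()


# (change in the treated group) minus (change in the control group)
did_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# Regression: y = b0 + b1*treat + b2*post + b3*(treat*post) + error
design = np.column_stack([np.ones_like(y), treat, post, treat * post])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
did_regression = coef[3]
print(did_means, did_regression)
```

The regression form is handy because it also gives standard errors and lets you add controls.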
10
...
2

Two-way fixed effects

When treatment doesn't turn on at the same time for all individuals, there are group
and time dummies
...
4
...

Using this, we can create an event study graph to:
1
...
We look at the
post-treatment (lags) effects for this
...
Study if the parallel trends assumption makes sense
...


Figure 10
...
5

Matching estimators

Matching is useful when you can't use RD, IV or DiD, but it ranks lower for establishing causality
...

Definition 10
...
1:
All variables that are relevant for jointly determining treatment and outcomes are
observed and included in Xi
...
Not Testable!

Definition 10
...
2: Overlap/Common support assumption
All treated individuals have a control counterpart in the population
...
If each treated individual is compared to several
untreated ones, we can use weights that reflect the importance of these untreated individuals
...
5
...
So we match each
treated individual to the control individuals with the same, or as close as possible, propensity score
...

What variables should be included when estimating the propensity scores?
Only variables that simultaneously influence the treatment decision and the outcome
variable should be included. It should also be clear that only variables that are unaffected
by treatment (or the anticipation of it) should be included in the model
...


10
...
1

Nearest Neighbour matching

How many neighbours to match? - Bias Variance trade-off
...
i
...
If someone has already been matched, can that person also be matched
to someone else
...
This can
be avoided by imposing a tolerance level on the maximum propensity score distance
(caliper)
...
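A minimal sketch of one-to-one nearest neighbour matching on the propensity score with a caliper. It matches with replacement (a control can be reused), which is only one of the possible choices; the scores, the caliper of 0.05 and the function name are hypothetical.

```python
def nearest_neighbour_match(treated_scores, control_scores, caliper):
    """For each treated unit, return the index of the closest control,
    or None if even the closest control is further away than the caliper."""
    matches = []
    for p in treated_scores:
        j = min(range(len(control_scores)),
                key=lambda k: abs(control_scores[k] - p))
        matches.append(j if abs(control_scores[j] - p) <= caliper else None)
    return matches


treated_scores = [0.30, 0.62, 0.90]        # hypothetical propensity scores
control_scores = [0.28, 0.35, 0.60, 0.70]
matches = nearest_neighbour_match(treated_scores, control_scores, caliper=0.05)
print(matches)
```

The third treated unit has no control within the caliper, so it is left unmatched instead of being paired with a bad match.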


10
...
2

Kernel matching

Kernel matching uses all control observations but weights them according to
how close they are to the treated observation being matched
...

Benefit: low variance (but bad matches may also be used)

3: Common support problem displayed in example 2
A more formal test:
Compare the minima and maxima of the propensity score in the treatment and control
groups. Example: assume that the propensity score lies within the interval [0
...
94] in
the treatment group and within [0
...
89] in the control group
...
We then delete
all observations whose propensity score is smaller than the minimum and larger than the
maximum in the opposite group
...
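A sketch of the minima and maxima comparison. The scores are made up for illustration and are not the numbers from the example; the function name is hypothetical.

```python
def common_support(treated_scores, control_scores):
    """Keep only observations whose propensity score lies inside the range
    covered by BOTH groups."""
    lower = max(min(treated_scores), min(control_scores))
    upper = min(max(treated_scores), max(control_scores))
    keep_treated = [p for p in treated_scores if lower <= p <= upper]
    keep_control = [p for p in control_scores if lower <= p <= upper]
    return (lower, upper), keep_treated, keep_control


treated_scores = [0.10, 0.40, 0.95]  # hypothetical propensity scores
control_scores = [0.05, 0.30, 0.85]
bounds, keep_treated, keep_control = common_support(treated_scores, control_scores)
print(bounds, keep_treated, keep_control)
```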

Balancing: If the distribution of X-variables in both groups is similar, we say that the
X-variables are balanced, which means that the propensity score matching did a good job
...

A simple approach is to use a two-sample t-test to check if there are significant differences
in covariate means for both groups
...
If
still not satisfactory, it may indicate a failure of the CIA.
There is no easy way to get standard errors for matching estimators, so we use bootstrapping.