Statistics 502 Lecture Notes
Peter D.
Contents

1 Research Design Principles
    1.1 Induction
    1.2 Model of a process or system
    1.3 Experiments and observational studies

2 Comparing Two Treatments
    ...
    2.5 Relating samples to (super)populations
    2.6 Normal distribution
    2.7 Introduction to the t-test
    2.8 Two sample tests
    2.9.1 The non-central t-distribution
    2.9.2 Computing the Power of a test
    2.10 Checking Assumptions of a two-sample t-test
    2.10.1 Two-sample t-test with unequal variances
    2.10.2 Which two-sample t-test to use?

3 Comparing Several Treatments
    Introduction to ANOVA
    A model for treatment variation
    Model Fitting
    Partitioning sums of squares
    The ANOVA table
    Unbalanced Designs
    Sampling distribution of the F-statistic
    Comparing group means
    Treatment Comparisons
    Orthogonal Contrasts
    Model Diagnostics
    Variance stabilizing transformations

4 Multifactor Designs
    Data analysis
    Evaluating additivity
    Randomized complete block designs
    Unbalanced designs

5 ...
    Nested Designs
    Repeated measures analysis

List of Figures

    Wheat yield distributions
    The superpopulation model
    The t-distribution
    The .05 rejection region
    Randomization distribution for the t-statistic
    Critical regions and the non-central t-distribution
    Bacteria data
    Coagulation data
    Normal-theory and randomization distributions of the F-statistic
    Power
    Yield-density data
    Crab residuals
    Data and log data
    Conditional Plots
    Mean-variance relationship
    Plots of transformed poison data
    Comparison between types I and II, with delivery in color
    Three datasets exhibiting non-additive effects
    Results of the experiment
    Marginal plots for pain data
    Oxygen uptake data
    Diagnostic plots
    Potato data
Chapter 1
Research Design Principles
1.1 Induction
Example (survey): Do you favor increasing the gas tax for public transportation?
• Specific cases: 200 people called for a telephone survey
• Inferential goal: get information on the opinion of the entire city
...
Some took hormones, others did not.
1.2 Model of a process or system
Input variables consist of:
• controllable factors: measured and determined by the scientist;
• uncontrollable factors: measured but not determined by the scientist;
• noise factors: unmeasured, uncontrolled factors (experimental variability or “error”).

For any interesting process, there are inputs such that variability in input → variability in output. If variability in an input factor x leads to variability in output y, we say x is a source of variation.
...
It may be hard to say what is input and what is output.
Example (Women’s Health Initiative, WHI):
• Population: Healthy, post-menopausal women in the U.S.
• Input variables:
  1. ...
  2. demographic variables (age, race, diet, family history, ...)
  3. unmeasured variables (?)
• Output variables:
  1. ... (MI)
  2. invasive breast cancer
  ...
Observational population: 93,676 women enlisted starting in 1991, tracked over eight years on average.
...
Experimental Study (WHI randomized controlled trial):
1. ...

                     age group
            1 (50-59)   2 (60-69)   3 (70-79)
  clinic 1     n11         n12         n13
         2     n21         n22         n23
       ...     ...         ...         ...

  ni,j = # of women in study, in clinic i and in age group j
       = # of women in block i, j
Question: Why did they randomize within a block?

2. Women on treatment had higher incidence of
   • CHD
   • breast cancer
   • stroke
   • pulmonary embolism
   and lower incidence of
   • colorectal cancer
   • hip fracture

3. That is, our inductive inference is:
   (specific) Higher CHD rate in treatment population than control
   suggests
   (general) if everyone in the population were treated, the incidence rate of CHD would increase.
[Diagram: Observational study — a second variable x2 may be a cause of both x1 and y.]

Randomization can eliminate correlation between x1 and y due to a different cause x2, aka a confounder.
Steps in designing an experiment:
1. Identify research hypotheses to be tested.
2. Choose a set of experimental units, which are the units to which treatments will be randomized.
3. Choose a response/output variable.
4. Determine potential sources of variation in response:
   (a) factors of interest
   (b) nuisance factors
5. Decide on the experimental procedure and how treatments are to be randomly assigned.
Three principles in Experimental Design:
1. Replication: Replicates are runs of an experiment or sets of experimental units that have the same values of the control variables. (Larger n → more evidence.)
2. Randomization: Random assignment of treatments to experimental units. Makes confounding unlikely.
3. Blocking: The goal is to evenly distribute treatments across large potential sources of variation.
Experimental units: three plots of land, each to be divided into a 2 × 2 grid.

[Diagram: Plot 1 (bad soil), Plot 2, Plot 3 (good soil).]
Chapter 2
Comparing Two Treatments
Example: Wheat yield
Factor of interest: Fertilizer type, A or B
Question: Is one fertilizer better than another, in terms of yield?

Experimental material: One plot of land to be divided into 2 rows of 6 subplots.

Design question: How to assign treatments/factor levels to the plots? We want to avoid confounding a treatment effect with another potential source of variation.

Potential sources of variation: Fertilizer, soil, sun, water, ...

Implementation: If we assign treatments randomly, we can avoid any pre-experimental bias in results: 12 playing cards, 6 red, 6 black, were shuffled and dealt.
Results:

  A     A     B     B     A     B
 26.9  11.4  26.6  23.7  25.3  28.5
  B     B     A     A     B     A
 14.2  17.9  16.5  21.1  24.3  19.6

How much evidence is there that fertilizer type is a source of yield variation? Evidence about differences between two populations is generally measured by comparing summary statistics across the two sample populations.
Location:
• sample mean or average: ȳ = (1/n) Σ_{i=1}^n yi
• sample median: q̂(1/2) is a/the value y(1/2) such that
    #(yi ≤ y(1/2))/n ≥ 1/2
    #(yi ≥ y(1/2))/n ≥ 1/2
  To find the median, sort the data in increasing order, and call these values y(1), . . . , y(n). If there are no ties, then
    if n is odd, then y((n+1)/2) is the median;
    if n is even, then all numbers between y(n/2) and y(n/2+1) are medians.
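These definitions are easy to check numerically. A small Python sketch (the notes themselves use R), using the fertilizer-A yields from the wheat example; note that since n = 6 is even, any value between y(3) and y(4) is a median by the definition above, and the library reports the conventional midpoint:

```python
import statistics

yA = [26.9, 11.4, 25.3, 16.5, 21.1, 19.6]   # fertilizer A yields

ybar = sum(yA) / len(yA)          # sample mean
med = statistics.median(yA)       # midpoint of the two middle order statistics
print(round(ybar, 4), med)
```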
[Figure 2.1: Wheat yield distributions — histograms, density estimates and empirical CDFs.]

yA = (26.9, 11.4, 25.3, 16.5, 21.1, 19.6)
yB = (26.6, 23.7, 28.5, 14.2, 17.9, 24.3)
> mean(yA)
[1] 20.13333
> mean(yB)
[1] 22.53333
> median(yA)
[1] 20.35
> sd(yA)
[1] 5.712676
> sd(yB)
[1] 5.432005
> quantile(yA,c(.25,.75))
   25%    75%
17.275 24.250
> quantile(yB,c(.25,.75))
   25%    75%
19.350 26.025
Would you recommend B over A for future plantings?
Do you think these results generalize to a larger population?
A statistical hypothesis test evaluates the plausibility of H0 in light of the data. We can evaluate H0 by answering the following questions:
• Is a mean difference of 2.4 large compared to experimental noise?
To answer this, we compare the observed difference |ȳB − ȳA| = 2.4 to the values of |ȲB − ȲA| that could have been observed if H0 were true. Hypothetical values of |ȲB − ȲA| that could have been observed under H0 are referred to as samples from the null distribution.

Let g(YA, YB) = |ȲB − ȲA|, where {Y1,A, . . . , Y6,A}, {Y1,B, . . . , Y6,B} are the yields under a hypothetical treatment assignment. This is a function of the outcome of the experiment. Since we will use it to perform a hypothesis test, we will call it a test statistic. For the observed assignment,

    g(yA, yB) = |22.53 − 20.13| = 2.4 = gobs

Hypothesis testing procedure: Compare gobs to g(YA, YB) for values of YA and YB that could have been observed, if H0 were true.
In the actual experiment, cards were shuffled and dealt R, R, B, B, . . . and the yields
    26.9 11.4 26.6 23.7 25.3 28.5 14.2 17.9 16.5 21.1 24.3 19.6
were obtained. Now consider a hypothetical assignment: cards are shuffled and dealt B, R, B, B, . . .; crops were grown and wheat yields obtained:

  B     A     B     B     A     A
 26.9  11.4  26.6  23.7  25.3  28.5
  A     B     B     A     A     B
 14.2  17.9  16.5  21.1  24.3  19.6

Under this hypothetical treatment assignment,
    (YA, YB) = ({11.4, 25.3, 28.5, 14.2, 21.1, 24.3}, {26.9, 26.6, 23.7, 17.9, 16.5, 19.6})
    |ȲB − ȲA| = 1.07
Under our randomization scheme, there were

    12!/(6!6!) = (12 choose 6) = 924

equally likely ways the treatments could have been assigned. Repeating the above calculation for each assignment gives the set of values {g1, . . . , g924}. This enumerates all potential pre-randomization outcomes of our test statistic, assuming no treatment effect. The resulting probability distribution,

    Pr(g(YA, YB) ≤ x | H0) = #{gk ≤ x}/924,

is sometimes called the randomization distribution, because it is obtained by the randomization scheme of the experiment. For our data, Pr(g(YA, YB) ≥ 2.4 | H0) = 0...., so a difference of 2.4 or more is not unlikely under the null hypothesis.

Generically, a p-value is “the probability, under the null hypothesis, of obtaining a result as or more extreme than the observed result.”

[Figure 2.2: Randomization distribution for the wheat example.]
Approximating a randomization distribution: We don’t want to have to enumerate all (n choose n/2) possible treatment assignments. Instead, we repeat the following Nsim times:
(a) simulate a treatment assignment, e.g. by randomly permuting the observed labels;
(b) compute the value of the test statistic, given the simulated treatment assignment and under H0.
The empirical distribution of {g1, . . . , gNsim} approximates the null distribution:

    #(|gk| ≥ 2.4)/Nsim ≈ Pr(g(YA, YB) ≥ 2.4 | H0)

The approximation improves as Nsim is increased. In R:

y<-c(26.9,11.4,26.6,23.7,25.3,28.5,14.2,17.9,16.5,21.1,24.3,19.6)
x<-c("A","A","B","B","A","B","B","B","A","A","B","A")
g<-NULL
for(nsim in 1:5000) {
  xsim<-sample(x)
  g[nsim]<- abs( mean(y[xsim=="B"]) - mean(y[xsim=="A"]) )
}
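Since there are only (12 choose 6) = 924 assignments here, the randomization distribution can also be enumerated exactly rather than simulated. A Python sketch of the same computation (the notes use R):

```python
from itertools import combinations

# Observed wheat yields and the observed treatment assignment.
y = [26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6]
x = ["A", "A", "B", "B", "A", "B", "B", "B", "A", "A", "B", "A"]

def gstat(a_idx):
    """|mean(Y_B) - mean(Y_A)| when the plots in a_idx are assigned to A."""
    yA = [y[i] for i in a_idx]
    yB = [y[i] for i in range(len(y)) if i not in a_idx]
    return abs(sum(yB) / len(yB) - sum(yA) / len(yA))

g_obs = gstat([i for i, xi in enumerate(x) if xi == "A"])  # = 2.4

# Enumerate all C(12,6) = 924 equally likely assignments of A to 6 plots.
null = [gstat(a) for a in combinations(range(12), 6)]
p_value = sum(g >= g_obs - 1e-12 for g in null) / len(null)
print(len(null), round(g_obs, 2))
```

The small epsilon guards against floating-point ties when comparing against the observed statistic.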
General hypothesis testing procedure, given data {y1, . . . , yn}:
1. Choose a test statistic g(y): it should be chosen so that it can differentiate between H0 and H1.
2. Obtain a null distribution: a probability distribution over the possible outcomes of g(Y) under H0, where Y = {Y1, . . . , Yn} are potential experimental results that could have happened under H0.
3. Compute the p-value: the probability under H0 of observing a test statistic g(Y) as or more extreme than the observed statistic g(y).
If the p-value is small ⇒ evidence against H0. If the p-value is large ⇒ not evidence against H0.

Even if we follow these guidelines, we must be careful in our specification of H0, H1 and g(Y) for the hypothesis testing procedure to be useful.
• What does the p-value say about the probability that the null hypothesis is true? Try using Bayes rule to figure this out.
                          truth
  action         H0 true             H0 false
  accept H0      correct decision    type II error
  reject H0      type I error        correct decision

As we discussed:
• The p-value can be a measure of evidence against H0;
• the smaller the p-value, the larger the evidence against H0.
Decision procedure:
1. Compute the p-value by comparing the observed test statistic to the null distribution;
2. reject H0 if the p-value ≤ α, otherwise accept H0.
This procedure is called a level-α test. It controls the pre-experimental probability of a type I error:

    Pr(type I error|H0) = Pr(reject H0|H0) = Pr(p-value ≤ α|H0) = α

Single Experiment Interpretation: If you use a level-α test for your experiment where H0 is true, then before you run the experiment there is probability α that you will erroneously reject H0.
Typically we need to be more specific than “H0 false” to calculate the power.

2.5 Relating samples to (super)populations

If the experiment is
• large,
• complicated, or
• non- or partially randomized,
then a null distribution based on randomization may be difficult to obtain.
Consider the following model for our experiment:
• There is a large/infinite population of plots of similar size/shape/composition as the plots in our experiment.
• The plots that received A in our experiment can be viewed as independent samples from a population distribution pA, and likewise for plots receiving B:

    Y1,A, . . . , YnA,A ∼ i.i.d. pA
    Y1,B, . . . , YnB,B ∼ i.i.d. pB

Furthermore, as the sample sizes increase,

    ȲA → µA
    s²A → σ²A
    F̂A(x) = #{Yi,A ≤ x}/nA → FA(x) = ∫_{−∞}^{x} pA(y) dy
[Figure 2.3: The superpopulation model.]

For example, if µB > µA, we would recommend B over A, and vice versa. In our experiment we observed 22.53 = ȳB > ȳA = 20.13.
How much evidence is it? Should we reject the null hypothesis? Consider our test statistic:

    g(YA, YB) = ȲA − ȲB

To decide whether to reject or not, we need to know the distribution of g(YA, YB) under H0, where

    Y1,A, . . . , YnA,A ∼ i.i.d. pA
    Y1,B, . . . , YnB,B ∼ i.i.d. pB

This will involve assumptions about/approximations to pA and pB.
2.6 Normal distribution

The normal distribution is useful because in many cases,
• our data are approximately normally distributed, and/or
• our sample means are approximately normally distributed.
These are both due to the central limit theorem:

    X1 ∼ P1(µ1, σ1), X2 ∼ P2(µ2, σ2), . . . , Xm ∼ Pm(µm, σm)
    ⇒ Σ Xj ∼˙ normal( Σ µj , √(Σ σ²j) )

Sums of varying quantities are approximately normally distributed. For example, suppose

    Yi = a1 × seedi + a2 × soili + a3 × wateri + a4 × suni + · · ·

The empirical distribution of crop yields from a population of fields with varying quantities of seed, soil, water, sun, . . . will then be approximately normal, with parameters depending on the means and the variability of seed, soil, water, sun, . . .

Similarly for sample means: suppose y1, . . . , yn ∼ i.i.d. P with mean µ and variance σ², and imagine repeating the sampling many times to obtain datasets y(1), . . . , y(m), each of size n. Then the collection of sample means {ȳ(1), . . . , ȳ(m)} will look approximately normally distributed, with

    sample mean of {ȳ(1), . . . , ȳ(m)} ≈ µ
    sample variance of {ȳ(1), . . . , ȳ(m)} ≈ σ²/n

i.e. the sampling distribution of the mean is approximately normal(µ, σ²/n), even if the sampling distribution of the data are not normal.
Basic properties of the normal distribution:
• Y ∼ normal(µ, σ) ⇒ aY + b ∼ normal(aµ + b, |a|σ);
• Y1, . . . , Yn ∼ i.i.d. normal(µ, σ) ⇒ Ȳ ∼ normal(µ, σ/√n).
How does this help with hypothesis testing? Consider testing H0 : µA = µ = µB (treatment doesn’t affect the mean). Then under H0,

    ȲB − ȲA ∼ normal( 0, √(σ²A/nA + σ²B/nB) ).

So if we knew the variances, we’d have a null distribution.
2.7 Introduction to the t-test

Consider a simple one-sample hypothesis test:

    Y1, . . . , Yn ∼ i.i.d. P, with mean µ and variance σ²,

and suppose we wish to test H0 : µ = µ0.
• Ȳ would make a good test statistic: it is sensitive to deviations from H0, and
  – Ȳ is approximately normal, Ȳ ∼˙ normal(µ, σ/√n), so the scaled quantity f(Y) = (Ȳ − µ0)/(σ/√n) is approximately standard normal under H0. Is f(Y) a statistic?
• Problem: We don’t usually know σ². We can instead plug in the sample standard deviation s.

One-sample t-statistic:

    t(Y) = (Ȳ − µ0)/(s/√n)

What is the null distribution of t(Y)? It seems that s ≈ σ, so

    (Ȳ − µ0)/(s/√n) ≈ (Ȳ − µ0)/(σ/√n),

so t(Y) is approximately normally distributed.
χ² distribution:

    Z1, . . . , Zn ∼ i.i.d. normal(0, 1) ⇒ Σ Z²i ∼ χ²n, the chi-squared distribution with n degrees of freedom;
                                          Σ (Zi − Z̄)² ∼ χ²n−1.
    Y1, . . . , Yn ∼ i.i.d. normal(µ, σ) ⇒ (Y1 − µ)/σ, . . . , (Yn − µ)/σ ∼ i.i.d. normal(0, 1)
                                          ⇒ (1/σ²) Σ (Yi − µ)² ∼ χ²n
                                             (1/σ²) Σ (Yi − Ȳ)² ∼ χ²n−1

Why n − 1 rather than n? The vector (Z1 − Z̄, . . . , Zn − Z̄) is not a vector of n independent standard normals. In fact, it has a singular multivariate normal distribution.

[Figure 2.4: χ² distributions, n = 9, 10, 11.]

Question: Which vector do you expect to be bigger, (Z1, . . . , Zn) or (Z1 − Z̄, . . . , Zn − Z̄)?

Getting back to the problem at hand,

    ((n − 1)/σ²) s² = (1/σ²) Σ (Yi − Ȳ)² ∼ χ²n−1,

a known distribution we can look up in a table.
Suppose Y1, . . . , Yn ∼ i.i.d. normal(µ, σ). Then:
• √n(Ȳ − µ)/σ ∼ normal(0, 1);
• ((n − 1)/σ²) s² ∼ χ²n−1;
• Ȳ and s² are independent.

[Figure 2.5: The t-distribution.]

If Z ∼ normal(0, 1) and X ∼ χ²ν are independent, then Z/√(X/ν) has a t-distribution with ν degrees of freedom. Then

    [√n(Ȳ − µ)/σ] / √( ((n − 1)s²/σ²) / (n − 1) ) = (Ȳ − µ)/(s/√n) ∼ tn−1

This is still not a statistic, as µ is unknown, but under a specific hypothesis, like H0 : µ = µ0, it is a statistic:

    t(Y) = (Ȳ − µ0)/(s/√n) ∼ tn−1 if E(Y) = µ0.

It is called the t-statistic.
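A simulation makes the practical meaning of this null distribution concrete: if the data really are normal with mean µ0, a test that rejects when |t(Y)| exceeds the t-quantile rejects at its nominal rate. A Python sketch (numpy/scipy assumed; the notes use R):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, mu0, sigma = 6, 20.0, 5.0
tcrit = stats.t.ppf(0.975, df=n - 1)   # two-sided 0.05 critical value

# Simulate the one-sample t-statistic under H0: normal data with mean mu0.
nsim, reject = 20000, 0
for _ in range(nsim):
    y = rng.normal(mu0, sigma, size=n)
    t = (y.mean() - mu0) / (y.std(ddof=1) / np.sqrt(n))
    reject += abs(t) > tcrit
rate = reject / nsim
print(round(rate, 3))   # should be close to the nominal 0.05
```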
• What happens to the distribution of t(Y) when n → ∞? Why?
• What if the data are not normal?
  – What is the distribution of √n(Ȳ − µ)/σ for large and small n?
  – What is the distribution of ((n − 1)/σ²)s² for large and small n?
  – Are √n(Ȳ − µ) and s² independent for small n? What about for large n?
Two-sided, one-sample t-test:
1. Sampling model: Y1, . . . , Yn ∼ i.i.d. normal(µ, σ²)
2. Null hypothesis: H0 : µ = µ0
3. Alternative hypothesis: H1 : µ ≠ µ0
4. Test statistic: t(Y) = √n(Ȳ − µ0)/s
   • Pre-experiment we think of this as a random variable, an unknown quantity that is to be randomly sampled from a population.
   • Post-experiment it is a fixed number, tobs = t(y).
5. Null distribution: under H0, t(Y) ∼ tn−1.
6. p-value:

    p-value = Pr(|t(Y)| ≥ |t(y)| | H0)
            = Pr(|Tn−1| ≥ |t(y)|)
            = 2 × Pr(Tn−1 ≥ |t(y)|)
            = 2*(1-pt(tobs,n-1))   (in R)

Level-α decision procedure: Reject H0 if |tobs| ≥ t(n−1),1−α/2.

[Figure 2.6: A t8 null distribution and α = 0.05 rejection region. (For α = 0.05, t(n−1),1−α/2 ≈ 2.)]

Is this a level-α test?
Confidence intervals via hypothesis tests: Recall that
• H0 : E(Y) = µ0 is rejected if √n|(ȳ − µ0)/s| ≥ t1−α/2;
• H0 : E(Y) = µ0 is not rejected if √n|(ȳ − µ0)/s| ≤ t1−α/2, i.e. if

    |ȳ − µ0| ≤ (s/√n) × t1−α/2
    ȳ − (s/√n) × t1−α/2 ≤ µ0 ≤ ȳ + (s/√n) × t1−α/2

In this case we say µ0 is in the acceptance region; otherwise it is in the rejection region. The interval ȳ ± (s/√n) t1−α/2 is a 100 × (1 − α)% confidence interval for µ. This interval contains only those values of µ that are not rejected by this level-α test.
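For the fertilizer-A yields from the wheat example, the resulting 95% interval can be computed directly. A Python sketch (the notes do this in R; scipy is assumed for the t-quantile):

```python
import math
import statistics
from scipy import stats

yA = [26.9, 11.4, 25.3, 16.5, 21.1, 19.6]
n = len(yA)
ybar = sum(yA) / n
s = statistics.stdev(yA)                # sample standard deviation, ~5.71
tcrit = stats.t.ppf(0.975, df=n - 1)    # t_{1-alpha/2, n-1} for alpha = 0.05
half = tcrit * s / math.sqrt(n)
lo, hi = ybar - half, ybar + half
print(round(lo, 2), round(hi, 2))  # → 14.14 26.13
```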
Coverage interpretation: Suppose you
1. gather data;
2. compute the 95% confidence interval for µ.
Further suppose H0 : E(Y) = µ0 is true. Then the pre-experimental probability that your interval will contain µ0 is 95%. It is
• the pre-experimental probability that your confidence interval will cover the true value;
• the large-sample fraction of experiments in which the confidence interval covers the true mean.
2.8 Two sample tests

Recall the wheat example:

    yA = (26.9, 11.4, 25.3, 16.5, 21.1, 19.6)
    yB = (26.6, 23.7, 28.5, 14.2, 17.9, 24.3)

Sampling model:

    Y1,A, . . . , YnA,A ∼ i.i.d. normal(µA, σ)
    Y1,B, . . . , YnB,B ∼ i.i.d. normal(µB, σ)

In addition to normality we assume for now that both variances are equal. Hence if H0 : µA = µB is true, then

    ȲB − ȲA ∼ normal( 0, σ √(1/nA + 1/nB) ).

We estimate σ² with the pooled sample variance

    s²p = [ (nA − 1)s²A + (nB − 1)s²B ] / (nA + nB − 2),

giving the two-sample t-statistic

    t(YA, YB) = (ȲB − ȲA) / ( sp √(1/nA + 1/nB) ).

Exercises:
1. Show that (nA + nB − 2)s²p/σ² ∼ χ²nA+nB−2 (recall how the χ² distribution was defined).
2. Show that the two-sample t-statistic has a t-distribution with nA + nB − 2 d.o.f.
Here α = 0.05, so if the null hypothesis is true, the probability of a type I error is 0.05.
• ȳA = 20.13, s²A = 32.63, nA = 6
  ȳB = 22.53, s²B = 29.51, nB = 6
• s²p = 31.07, so tobs = 2.4/(5.574 × 0.577) = 0.75
• Hence the p-value = Pr(|T10| ≥ 0.75) = 0.47
• Hence H0 : µA = µB is not rejected at level α = 0.05.

> t.test(y[x=="A"],y[x=="B"],var.equal=TRUE)
        Two Sample t-test
t = -0.7458, df = 10, p-value = 0.4727
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -9.570623  4.770623
sample estimates:
mean of x mean of y
 20.13333  22.53333
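The same pooled computation, done by hand and checked against a library implementation — a Python sketch (the notes use R's t.test; here scipy's equal-variance test plays that role):

```python
from scipy import stats

yA = [26.9, 11.4, 25.3, 16.5, 21.1, 19.6]
yB = [26.6, 23.7, 28.5, 14.2, 17.9, 24.3]

# Pooled two-sample t-statistic, computed by hand.
nA, nB = len(yA), len(yB)
ybarA, ybarB = sum(yA) / nA, sum(yB) / nB
s2A = sum((y - ybarA) ** 2 for y in yA) / (nA - 1)
s2B = sum((y - ybarB) ** 2 for y in yB) / (nB - 1)
s2p = ((nA - 1) * s2A + (nB - 1) * s2B) / (nA + nB - 2)
t = (ybarA - ybarB) / (s2p ** 0.5 * (1 / nA + 1 / nB) ** 0.5)

# scipy's equal-variance two-sample t-test should agree.
res = stats.ttest_ind(yA, yB, equal_var=True)
print(round(t, 4), round(res.pvalue, 2))
```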
Comparison to the randomization test: We could also use the two-sample t-statistic in a randomization test. We would
• obtain the null distribution {t(YA, YB)(1), . . . , t(YA, YB)(nsim)} of values that we could have seen under H0 and different treatment assignments;
• compare the null distribution of the outcomes to the observed outcome.

[Figure 2.7: The t-based null distribution for the wheat example.]

The comparison can be made using any statistic we like. Recall, to obtain a sample of t(YA, YB) from the null distribution, we
1. simulate a treatment assignment, e.g. by permuting the observed labels;
2. compute the value of t(YA, YB) under this treatment assignment and assuming the null hypothesis.
Repeating this nsim times gives t(1), . . . , t(nsim). To obtain the p-value, we compare tobs to this empirical distribution:

    p-value = #(|t(j)| ≥ |tobs|)/nsim

In R:

t.stat<-NULL
for(nsim in 1:5000) {
  xsim<-sample(x)
  t.stat[nsim]<-t.test(y[xsim=="B"],y[xsim=="A"],var.equal=TRUE)$stat
}

[Figure 2.8: Randomization distribution for the t-statistic.]

For the wheat data, #(|t(j)| ≥ 0.75)/nsim comes out to about 0.47 = Pr(|TnA+nB−2| ≥ 0.75), essentially the same as the normal-theory p-value.
The two-sample t-test assumes:
(1) data are independent samples;
(2) each population is normally distributed;
(3) the two populations have the same variance.
• Imagined universes:
  – Randomization test: numerical responses remain fixed; we imagine only hypothetical treatment assignments.
  – t-test: treatment assignments remain fixed; we imagine a population of different experimental units and/or conditions, giving different numerical responses.
• Scope of inference:
  – Randomization test: inference is about these particular experimental units.
  – t-test: under our assumptions, inference claims to be generalizable to other units/conditions, i.e. to a larger population.
Some history:
• ... tests based on Σ_{i=1}^n Yi, Yi ∈ {0, 1};
• Laplace (1800s): used the normal distribution to model measurement error in experiments;
• Gosset/“Student” (1908): derived the t-distribution;
• Box and Andersen (1955): approximation of randomization null distributions (and a long discussion);
• ... at the 0.05 level.
Confidence intervals for a difference: Suppose

    Y1,A, . . . , YnA,A ∼ i.i.d. normal(µA, σ)
    Y1,B, . . . , YnB,B ∼ i.i.d. normal(µB, σ)

Consider evaluating whether δ is a reasonable value for the mean difference:

    H0 : µA − µB = δ
    H1 : µA − µB ≠ δ

Under H0, you should be able to show that

    [ (ȲA − ȲB) − δ ] / ( sp √(1/nA + 1/nB) ) ∼ tnA+nB−2

Thus a given difference δ is accepted at level α iff

    |ȳA − ȳB − δ| / ( sp √(1/nA + 1/nB) ) ≤ t1−α/2,nA+nB−2,

i.e. iff

    (ȳA − ȳB) − sp √(1/nA + 1/nB) t1−α/2,nA+nB−2 ≤ δ ≤ (ȳA − ȳB) + sp √(1/nA + 1/nB) t1−α/2,nA+nB−2.

Wheat example: A 95% C.I. for µA − µB is (−9.57, 4.77).

Questions:
• What does the fact that 0 is in the interval say about H0 : µA = µB?
• What is the interpretation of this interval?
• Could we have constructed an interval via randomization tests?
Two sample t-test:
• H0 : µA = µB versus H1 : µA ≠ µB
• Gather data.
• Perform a level-α hypothesis test: reject H0 if |tobs| ≥ t1−α/2,nA+nB−2.
Recall, if α = 0.05, then

    Pr(type I error|H0 true) = Pr(reject H0 |H0 true) = 0.05

But what about the type II error? E.g. µB − µA = 0.0001 and µB − µA = 10,000 are both instances of the alternative hypothesis, and presumably

    Pr(reject H0 |µB − µA = 0.0001) < Pr(reject H0 |µB − µA = 10,000)

For example, in the case of the two-sample test, for a specific difference δ, we may ask what is

    1 − Pr(type II error|µB − µA = δ) = Pr(reject H0 |µB − µA = δ)?

We define the power of a two-sample hypothesis test against a specific alternative to be

    Power(δ, σ, nA, nB) = Pr(reject H0 | µB − µA = δ)
                        = Pr(|t(YA, YB)| ≥ t1−α/2,nA+nB−2 | µB − µA = δ).

So now we want to work out the probability of getting a value of the t-statistic greater than this critical value when a specific alternative hypothesis is true.
If we suppose YA,1, . . . , YA,nA ∼ i.i.d. normal(µA, σ) and YB,1, . . . , YB,nB ∼ i.i.d. normal(µB, σ), where µB − µA = δ, then we need to know the distribution of

    t(YA, YB) = (ȲB − ȲA) / ( sp √(1/nA + 1/nB) )    (∗)

under this alternative. More specifically, if µB − µA = δ then

    t(YA, YB) ∼ t*nA+nB−2, γ ,  where  γ = δ / ( σ √(1/nA + 1/nB) )

is the non-centrality parameter.
2.9.1 The non-central t-distribution

A non-central t random variable can be represented as

    T = (Z + γ) / √(X/ν)

where
• γ is a constant (the non-centrality parameter);
• Z is standard normal;
• X is χ² with ν degrees of freedom, independent of Z.

Self-check question: Derive the above result about the distribution of t(YA, YB).

Unlike the central t-distribution, the non-central t is not centered at zero when γ ≠ 0. In fact, it can be shown that

    E[ t(YA, YB) | µB − µA = δ ] = δ/(σ √(1/nA + 1/nB)) × √(ν/2) Γ((ν − 1)/2)/Γ(ν/2),

where ν = nA + nB − 2, the degrees of freedom, and Γ(x) is the Gamma function, a generalization of the factorial:
[Figure 2.9: Non-central t-distributions, γ = 0, 1, 2.]

• Γ(1) = 1, Γ(1/2) = √π.
...
Note that

    (ȲB − ȲA) / ( σ √(1/nA + 1/nB) ) ∼ normal( δ/(σ √(1/nA + 1/nB)), 1 ).
We also know that for large nA, nB, s ≈ σ, so the non-central t-distribution will (for large enough nA, nB) look approximately normal with
• mean δ/(σ √(1/nA + 1/nB));
• standard deviation 1.
2.9.2 Computing the Power of a test

Recall our level-α testing procedure:
1. compute the observed statistic tobs;
2. compute the p-value by comparing |tobs| to the tnA+nB−2 null distribution;
3. reject H0 if the p-value ≤ α.
For this procedure, you have shown in the homework that

    Pr(reject H0 |µB − µA = 0) = Pr(p-value ≤ α|µB − µA = 0)
                               = Pr(|t(YA, YB)| ≥ t1−α/2,nA+nB−2 |H0)
                               = Pr(|TnA+nB−2| ≥ t1−α/2,nA+nB−2)
                               = α

But what is the probability of rejection under Hδ : µB − µA = δ? (Hopefully this is bigger than α!)

    Pr(reject H0 | µB − µA = δ) = Pr( |t(YA, YB)| > t1−α/2,nA+nB−2 | µB − µA = δ )
                                = Pr( |T*nA+nB−2,γ| > t1−α/2,nA+nB−2 )
                                = Pr( T*nA+nB−2,γ > t1−α/2,nA+nB−2 ) + Pr( T*nA+nB−2,γ < −t1−α/2,nA+nB−2 )
                                = [1 − Pr( T*nA+nB−2,γ < t1−α/2,nA+nB−2 )] + Pr( T*nA+nB−2,γ < −t1−α/2,nA+nB−2 )
[Figure 2.10: Critical regions and the non-central t-distribution, γ = 0, 1.]

If we have a rough idea of δ and σ we can evaluate the power using this formula. In R:

t.rcrit <- qt( 1-alpha/2, nA+nB-2 )
t.gamma <- delta/(sigma*sqrt(1/nA + 1/nB))
t.power <- pt( -t.rcrit, nA+nB-2, ncp=t.gamma ) +
           1 - pt( t.rcrit, nA+nB-2, ncp=t.gamma )

The normal approximation is given by using:

t.power <- pnorm( -t.rcrit, mean=t.gamma ) +
           1 - pnorm( t.rcrit, mean=t.gamma )

This will be very reasonable for large nA, nB. When γ is large, the second term pt(-t.rcrit, nA+nB-2, ncp=t.gamma) is negligible compared to 1 - pt(t.rcrit, nA+nB-2, ncp=t.gamma). However, including both terms is not only correct, it also means that we can use the same R code for positive and negative δ values.
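The same power formula, as a Python sketch using scipy's non-central t distribution (scipy.stats.nct; the notes use R's pt with its ncp argument). A useful sanity check is that δ = 0 recovers the level α itself:

```python
from scipy import stats

def power(delta, sigma, nA, nB, alpha=0.05):
    """Power of the level-alpha two-sample t-test when muB - muA = delta."""
    df = nA + nB - 2
    tcrit = stats.t.ppf(1 - alpha / 2, df)
    gamma = delta / (sigma * (1 / nA + 1 / nB) ** 0.5)  # non-centrality
    nct = stats.nct(df, gamma)
    return nct.cdf(-tcrit) + 1 - nct.cdf(tcrit)

print(round(power(0.0, 6, 6, 6), 3))  # delta = 0 recovers alpha = 0.05
```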
Example: Suppose µB − µA = 2...., σ = 5..... What would be the power at α = 0.05 for such an experiment?

> t.gamma<- delta/(sigma*sqrt(1/nA +1/nB))
> t.gamma
[1] 0....
> t.rcrit<-qt(1-alpha/2,nA+nB-2)
> t.rcrit
[1] 2.228139
> t.power<- pt(-t.rcrit,nA+nB-2, ncp=t.gamma) + 1 - pt(t.rcrit,nA+nB-2, ncp=t.gamma)
> t.power
[1] 0....

With larger nA and nB, the non-centrality parameter and the power both increase:

> t.gamma<- delta/(sigma*sqrt(1/nA +1/nB))
> t.gamma
[1] 1.054669
> t.rcrit<-qt(1-alpha/2,nA+nB-2)
> t.rcrit
[1] 2....
> t.power<- pt(-t.rcrit,nA+nB-2,ncp=t.gamma) + 1 - pt(t.rcrit,nA+nB-2,ncp=t.gamma)
> t.power
[1] 0....
Choosing a sample size: Suppose an agricultural company wants to know whether or not there are any effects of a soil additive on crop yield. In practice, a difference in yield of less than 3 units is considered inconsequential. How many plots should be in their study?

Procedure:
• CRD with n plots assigned to each of the two treatments;
• declare a difference between A and B if |t(YA, YB)| > t1−α/2,2n−2.

Question: What is the probability that a difference is declared, assuming
• δ = 3;
• σ = 6;
• various values of n?
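This question can be answered by evaluating the power formula over a grid of sample sizes. A Python sketch (scipy.stats.nct assumed, as above; the notes do the equivalent in R):

```python
from scipy import stats

def power(delta, sigma, n, alpha=0.05):
    """Power of the level-alpha two-sample t-test with n units per group."""
    df = 2 * n - 2
    tcrit = stats.t.ppf(1 - alpha / 2, df)
    gamma = delta / (sigma * (2 / n) ** 0.5)   # non-centrality parameter
    nct = stats.nct(df, gamma)
    return nct.cdf(-tcrit) + 1 - nct.cdf(tcrit)

# delta = 3, sigma = 6: power for a range of per-group sample sizes n.
for n in (5, 10, 20, 40, 80):
    print(n, round(power(3, 6, n), 3))
```

Power increases steadily with n, so the company can pick the smallest n reaching whatever power it considers adequate.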
[Figure 2.11: Power as a function of sample size.]

2.10 Checking Assumptions of a two-sample t-test

Recall that in a two-sample t-test we make three assumptions:
(1) data are independent samples;
(2) each population is normally distributed;
(3) the variances of the two groups are equal.
(Strictly speaking we only require these to hold under the null hypothesis of no difference. However, we would need to do something else if we wanted to compute the power, e.g. simulation.)
Assumption (1) will generally hold if the experiment is properly randomized (and blocks are accounted for; more on this later).

Assumption (2) may be checked with a normal probability plot: order the observations within a group, yA(1) ≤ · · · ≤ yA(nA), and compare them to the corresponding standard normal quantiles z(k−1/2)/nA (the −1/2 is a continuity correction). Thus we plot the pairs

    ( z(1−1/2)/nA , yA(1) ), . . . , ( z(nA−1/2)/nA , yA(nA) ).

If the population is approximately normal, these points should fall roughly along a straight line.

Assumption (3) may be checked roughly by fitting straight lines to the probability plots and examining their slopes. (Why?) Formal hypothesis tests for equal variances may also be performed.
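One way to compute the plotted pairs, as a Python sketch (scipy assumed; R users would simply call qqnorm). The slope of a fitted line estimates the population standard deviation, which is why comparing slopes across groups is a rough check of assumption (3):

```python
from scipy import stats

# Normal-scores (QQ) coordinates for the fertilizer A yields, by hand.
yA = sorted([26.9, 11.4, 25.3, 16.5, 21.1, 19.6])
n = len(yA)
z = [stats.norm.ppf((k - 0.5) / n) for k in range(1, n + 1)]
pairs = list(zip(z, yA))   # roughly linear if the population is normal

# Slope of the fitted line ~ population standard deviation.
slope, intercept = stats.linregress(z, yA)[:2]
print(round(slope, 2), round(intercept, 2))
```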
[Figure 2.12: Normal scores plots.]
[Figure 2.13: Normal scores plots.]

2.10.1 Two-sample t-test with unequal variances

If the two sample variances appear very different, we might instead use the statistic

    tdiff(yA, yB) = (ȳB − ȳA) / √( s²A/nA + s²B/nB ),

which may be compared to a t-distribution with degrees of freedom

    ν = ( s²A/nA + s²B/nB )² / [ (s²A/nA)²/(nA − 1) + (s²B/nB)²/(nB − 1) ].
This is known as Welch’s approximation; it may not give an integer d.f.

Note: This t-distribution is not, in fact, the exact sampling distribution of tdiff(yA, yB) under the null hypothesis that µA = µB and σ²A ≠ σ²B. So we are stuck if we want an exact answer! (See the Behrens–Fisher problem.)
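Welch's statistic and approximate degrees of freedom for the wheat data, computed by hand and checked against scipy's unequal-variance t-test (a Python sketch; the notes' R equivalent is t.test with var.equal=FALSE):

```python
from scipy import stats

yA = [26.9, 11.4, 25.3, 16.5, 21.1, 19.6]
yB = [26.6, 23.7, 28.5, 14.2, 17.9, 24.3]

nA, nB = len(yA), len(yB)
s2A = stats.tvar(yA)   # sample variances (denominator n - 1)
s2B = stats.tvar(yB)
vA, vB = s2A / nA, s2B / nB

t = (sum(yB) / nB - sum(yA) / nA) / (vA + vB) ** 0.5
df = (vA + vB) ** 2 / (vA ** 2 / (nA - 1) + vB ** 2 / (nB - 1))

# scipy's Welch test uses the same approximation.
res = stats.ttest_ind(yB, yA, equal_var=False)
print(round(t, 4), round(df, 2))
```

Here the two sample variances are nearly equal, so the Welch d.f. comes out just under the pooled value of 10.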
2.10.2 Which two-sample t-test to use?

tdiff(yA, yB) vs. t(yA, yB):
• If nA > nB, but σA < σB, and µA = µB, then the two-sample test based on comparing t(yA, yB) to a t-distribution on nA + nB − 2 d.o.f. will reject more than 5% of the time.
  – If the null hypothesis of no treatment effect whatsoever (equal means and equal variances) is scientifically relevant, then we are computing a valid p-value, and this higher rejection rate is a good thing! Since when the variances are unequal the null hypothesis is false.
• In the reverse situation, the test based on t(yA, yB) will reject too rarely; likewise we will underestimate the probability of Type I error.
In short: one should be careful about applying the test based on t(yA, yB) if the sample standard deviations appear very different, and it is not reasonable to assume equal means and variances under the null hypothesis.
Chapter 3
Comparing Several Treatments

3.1 Introduction to ANOVA

Example (Bacterial growth):
• Background: Bacteria grow on meat products stored in plastic wrap.
• Treatments: Meat wrapped in
  1. ambient air;
  2. a vacuum;
  3. 1% CO, 40% O2, 59% N;
  4. 100% CO2.
• Response: Bacterial count was measured for each steak after 9 days of storage at 4°C.
...
[Figure 3.1: Bacteria data (log bacteria count/cm² by treatment).]

• Question: Is there significant variation due to treatment?

Possible analysis method: Perform pairwise t-tests of

    H0ij : µi = µj versus H1ij : µi ≠ µj

for each of the (4 choose 2) = 6 possible pairwise comparisons.
Problem: If there is no treatment effect at all, then Pr(reject H0ij |H0ij true) = α for each comparison, and

    Pr(reject H0ij for some i, j | H0ij true for all i, j)
        = 1 − Pr(accept H0ij for all i, j | H0ij true for all i, j)
        ≈ 1 − Π_i (1 − α) = 1 − (1 − α)^6

If α = 0.05, then 1 − .95^6 = 0.26. So, even though the pairwise error rate is 0.05, the experiment-wise error rate is approximately 0.26.
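The experiment-wise rate is a one-line calculation (the ≈ reflects the fact that the six tests are not actually independent):

```python
# Experiment-wise error rate for k independent level-alpha tests:
# 1 - (1 - alpha)^k, with alpha = 0.05 and the 6 pairwise comparisons.
alpha, k = 0.05, 6
ewer = 1 - (1 - alpha) ** k
print(round(ewer, 4))  # → 0.2649
```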
We return to this multiple-comparisons problem later. For now, we will discuss a method of testing the global hypothesis of no variation due to treatment:

    H0 : µi = µj for all i ≠ j versus H1 : µi ≠ µj for some i ≠ j

To do this, we will compare treatment variability to experimental variability.
3.1.1 A model for treatment variation

Data: yi,j = measurement from the jth replicate under the ith treatment,
    i = 1, . . . , t indexes treatments; j = 1, . . . , r indexes replicates.

Treatment means model:

    yi,j = µi + εi,j,   E[εi,j] = 0,   V[εi,j] = σ²

• µi is the ith treatment mean;
• εi,j represents within-treatment variation/error/noise.

Treatment effects model:

    yi,j = µ + τi + εi,j

where τ1, . . . , τt are the treatment effects, representing between-treatment variation. In this model, we typically restrict Σ τi = 0. (Why?) The above represent two parameterizations of the same model:

    µi = µ + τi ⇔ τi = µi − µ

Reduced model:

    yi,j = µ + εi,j,   E[εi,j] = 0,   V[εi,j] = σ²

This is a special case of the above two models with
• µ = µ1 = · · · = µt, or
• τi = 0 for all i.
3.1.2 Model Fitting

What are good estimates of the parameters? One criterion used to evaluate different values of µ = {µ1, . . . , µt} is the sum of squared errors

    SSE(µ) = Σi Σj (yi,j − µi)².

Estimating treatment means: SSE is quadratic → convex, so its minimizer µ̂ solves

    (∂/∂µi) SSE(µ) = (∂/∂µi) Σ_{j=1}^r (yi,j − µi)² = −2 Σj (yi,j − µi) = −2r(ȳi· − µi) = 0,

giving µ̂ = {ȳ1·, . . . , ȳt·}.

Estimating the variance: within group i, the sample variance s²i estimates σ². If we have more than one group, we want to pool our estimates to be more precise:

    s² = [ (r − 1)s²1 + · · · + (r − 1)s²t ] / [ (r − 1) + · · · + (r − 1) ]
       = [ Σj (y1,j − ȳ1·)² + · · · + Σj (yt,j − ȳt·)² ] / [ t(r − 1) ]
       = ΣΣ (yi,j − µ̂i)² / [ t(r − 1) ]
       = SSE(µ̂)/[t(r − 1)] ≡ MSE

Consider the following assumptions:
A0: Data are independently sampled from their respective populations.
A1: Populations have the same variance.
A2: Populations are normally distributed.
Then:
• A0 → µ̂ is an unbiased estimator of µ;
• A0+A1 → s² is an unbiased estimator of σ²;
• A0+A1+A2 → (µ̂, s²) are the minimum variance unbiased estimators of (µ, σ²).

Within-treatment variability: SSE(µ̂) ≡ SSE is a measure of within-treatment variability:

    SSE = Σ_{i=1}^t Σ_{j=1}^r (yi,j − ȳi·)²

It is the sum of squared deviations of observation values from their group mean.
Between-treatment variability: The analogous measure of between treatment variability is
SST =
t X
r
X
2
(¯
yi· − y¯·· ) = r
i=1 j=1
t
X
(¯
yi· − y¯·· )2
i=1
We call this the treatment sum of squares
...
1
...
We wish to test

  H0 : {µ_1, ..., µ_t} all equal  versus  H1 : {µ_1, ..., µ_t} not all equal  ⇔  Σ_{i=1}^t (µ_i − µ̄)² > 0

Probabilistically,

  Σ_{i=1}^t (µ_i − µ̄)² > 0  ⇒  a large Σ_{i=1}^t (ȳ_i· − ȳ_··)² will probably be observed.

Inductively,

  a large Σ_{i=1}^t (ȳ_i· − ȳ_··)² observed  ⇒  Σ_{i=1}^t (µ_i − µ̄)² > 0 is plausible.

So a large value of SST or MST gives evidence that there are differences between the true treatment means.
To calibrate "large," consider the behavior of MST under the null. Suppose H0 is true, so µ_1 = · · · = µ_t = µ. Then

  E(Ȳ_i·) = µ  →  E(√r Ȳ_i·) = √r µ
  V(Ȳ_i·) = σ²/r  →  V(√r Ȳ_i·) = σ²

So under the null, √r Ȳ_1·, ..., √r Ȳ_t· are t independent random variables, each with variance σ². Recall that if X_1, ..., X_t are i.i.d. with variance σ², then an unbiased estimate of the variance is Σ(X_i − X̄)²/(t − 1). Therefore

  Σ(√r Ȳ_i − √r Ȳ)² / (t − 1)  is an unbiased estimate of V(√r Ȳ_i) = σ².

But

  Σ(√r Ȳ_i − √r Ȳ)² / (t − 1) = r Σ(Ȳ_i − Ȳ)² / (t − 1) = SST/(t − 1) = MST,

so E(MST|H0) = σ².

Let's summarize our potential estimators of σ²:
• Under H0:
  – E(MSE|H0) = σ²
  – E(MST|H0) = σ²
• Under H1:
  – E(MSE|H1) = σ²
  – E(MST|H1) = σ² + r·v_τ², where v_τ² = Σ_i τ_i²/(t − 1) > 0.
Here is an idea for a test statistic:
• If H0 is true, MST and MSE both estimate σ², so F = MST/MSE should be close to 1.
• If H0 is false, E(MST) > E(MSE), so F = MST/MSE will tend to be larger than 1.

Thus the test statistic F(Y) = MST/MSE is sensitive to deviations from the null, and can be used to measure evidence against H0.
Example (Bacterial Growth):

y1<-c(7.98, 7.26, 5.80)
y3<-c(7.33, 7.51, 2.66)
r<-3
t<-4

ybar.t<- c(mean(y1),mean(y2),mean(y3),mean(y4))
s2.t  <- c(var(y1),var(y2),var(y3),var(y4))

SSE<- sum( (r-1)*s2.t )
SST<- r*sum( (ybar.t-mean(ybar.t))^2 )

MST<- SST/(t-1)
MSE<- SSE/(t*(r-1))
MST/MSE

The F-ratio seems pretty far away from 1.
Is it far enough from 1 to rule out H0? To rule out H0, we need to look at how likely such a value is under a random assignment of treatments to experimental units. Let y be the vector of all 12 observations and x the corresponding vector of treatment labels. Compare the observed F-statistic to its randomization distribution:

F.obs<-anova(lm(y~as.factor(x)))$F[1]

F.null<-NULL
for(nsim in 1:1000) {
  x.sim<-sample(x)
  F.null<-c(F.null, anova(lm(y~as.factor(x.sim)))$F[1])
}
mean(F.null>=F.obs)

Essentially none of the permuted treatment assignments produce an F-ratio near the observed value, so the randomization p-value is approximately zero.
3.4 Partitioning sums of squares

Intuition:

  Total variability = variability due to treatment + variability due to other things
                    = between-treatment variability + within-treatment variability

Proposition:

  SSTotal ≡ Σ_{i=1}^t Σ_{j=1}^r (y_ij − ȳ_··)² = SST + SSE
Figure: histogram of F.null, the randomization distribution of the F-statistic.
If we believe H1,
• our fitted value of y_ij is ȳ_i·, i.e. the same for all j within group i;
• the residual for observation {i, j} is (y_ij − ȳ_i·);
• the model lack-of-fit is measured by the residual sum of squares = sum of squared residuals = Σ_i Σ_j (y_ij − ȳ_i·)² = SSE.

If we believe H0,
• our fitted value of y_ij is ȳ_··, i.e. our predicted value of another observation in any group is ȳ_·· = µ̂;
• the model lack-of-fit is measured by the residual sum of squares = sum of squared residuals = Σ Σ (y_ij − ȳ_··)² = SSTotal.

The difference in lack-of-fit between the two models is SSTotal − SSE = SST, the variability "explained" by the treatment means. The variance explained by different sources can be compared and analyzed.
3.5 The ANOVA table

ANOVA = Analysis of Variance

Source of variation  Degrees of Freedom  Sums of Squares  Mean Squares         F-ratio
Treatment            t − 1               SST              MST = SST/(t−1)      MST/MSE
Noise                t(r − 1)            SSE              MSE = SSE/[t(r−1)]
Total                tr − 1              SSTotal
Bacteria example:

Source of variation  Degrees of Freedom  Sums of Squares  Mean Squares  F-ratio
Treatment            3                   32.958           10.986        94.927
Noise                8                    0.926            0.116
Total                11                  33.884
Using the ANOVA table:
• No model assumptions → the table gives a descriptive decomposition of variation.
• Assuming treatments were randomly assigned → hypothesis tests, p-values.
Understanding Degrees of Freedom: Consider an experiment with t treatments/groups:

  Data                      Group means
  y_11, ..., y_1r           ȳ_1·
  y_21, ..., y_2r           ȳ_2·
  ...                       ...
  y_t1, ..., y_tr           ȳ_t·

Grand mean:  ȳ_·· = (1/tr) Σ_i Σ_j y_ij = (1/t) Σ_i ȳ_i·
We can "decompose" each observation as follows:

  y_ij = ȳ + (ȳ_i − ȳ) + (y_ij − ȳ_i)

This leads to

  (y_ij − ȳ)       =   (ȳ_i − ȳ)              +   (y_ij − ȳ_i)
  total variation  =   between-group variation +   within-group variation

All data can be decomposed this way, leading to the following vectors of length tr:

  Total            Between (treatment)      Within (error)
  y_11 − ȳ         (ȳ_1 − ȳ)                (y_11 − ȳ_1)
  ...              ...                      ...
  y_1r − ȳ         (ȳ_1 − ȳ)                (y_1r − ȳ_1)
  y_21 − ȳ    =    (ȳ_2 − ȳ)          +     (y_21 − ȳ_2)
  ...              ...                      ...
  y_tr − ȳ         (ȳ_t − ȳ)                (y_tr − ȳ_t)
To understand degrees of freedom, consider x_1, x_2, x_3 and x̄ = (x_1 + x_2 + x_3)/3:

  ( x_1 − x̄ )   ( c_1 )
  ( x_2 − x̄ ) = ( c_2 )
  ( x_3 − x̄ )   ( c_3 )

How many degrees of freedom does this vector have? How many components can vary independently?

  c_1 + c_2 = x_1 + x_2 − 2(x_1 + x_2 + x_3)/3
            = x_1/3 + x_2/3 − 2x_3/3
            = x_1/3 + x_2/3 + x_3/3 − x_3
            = x̄ − x_3
            = −c_3

So c_1, c_2, c_3 can't all be independently varied; only two at a time can be arbitrarily changed. In general,

  (x_1 − x̄, ..., x_m − x̄)

is an m-dimensional vector lying in an (m − 1)-dimensional subspace, having m − 1 degrees of freedom.
Note that
• dof = dimension of the space the vector lies in,
• SS = squared length of the vector.

With this in mind, write the data and the means as vectors of length tr:

  y = (y_11, ..., y_1r, y_21, ..., y_tr)ᵀ
  ȳ_trt = (ȳ_1·, ..., ȳ_1·, ȳ_2·, ..., ȳ_t·)ᵀ   (each group mean repeated r times)
  ȳ_·· = 1 ȳ_··                                (the grand mean repeated tr times)
Recall that two vectors u and v are orthogonal/perpendicular/at right angles if

  u · v ≡ Σ_{i=1}^m u_i v_i = 0

We can show that b = (ȳ_trt − ȳ_··) and c = (y − ȳ_trt) are orthogonal:

  b · c = Σ_{l=1}^{rt} b_l c_l
        = Σ_{i=1}^t Σ_{j=1}^r (ȳ_i· − ȳ_··)(y_ij − ȳ_i·)
        = Σ_{i=1}^t (ȳ_i· − ȳ_··) Σ_{j=1}^r (y_ij − ȳ_i·)
        = Σ_{i=1}^t (ȳ_i· − ȳ_··) × 0
        = 0

So the vector a = (y − ȳ_··) is the vector sum of two orthogonal vectors.
  a = (y − ȳ_··),  b = (ȳ_trt − ȳ_··),  c = (y − ȳ_trt)

Now recall
• ||a||² = ||y − ȳ_··||² = SSTotal
• ||b||² = ||ȳ_trt − ȳ_··||² = SST
• ||c||² = ||y − ȳ_trt||² = SSE

What do we know about right triangles?

  ||a||²  = ||b||² + ||c||²
  SSTotal = SST + SSE

So the ANOVA decomposition is an application of Pythagoras' Theorem. Furthermore,
• (y − ȳ_··) lies in a (tr − 1)-dimensional subspace;
• (ȳ_trt − ȳ_··) lies in a (t − 1)-dimensional subspace;
• (y − ȳ_trt) lies in a t(r − 1)-dimensional subspace, orthogonal to that of b.

This last bit means the degrees of freedom must add, so

  df(y − ȳ_··) = (t − 1) + t(r − 1) = tr − 1
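The orthogonality and the resulting Pythagorean identity can be verified numerically; a small sketch in Python with arbitrary simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
t, r = 4, 3
y = rng.normal(size=(t, r))             # t groups, r replicates

ybar = y.mean()                          # grand mean
gbar = y.mean(axis=1, keepdims=True)     # group means

a = (y - ybar).ravel()                   # total deviation vector
b = np.repeat(gbar - ybar, r)            # between-group (treatment) vector
c = (y - gbar).ravel()                   # within-group (residual) vector

# b and c are orthogonal, so squared lengths add (Pythagoras):
assert abs(np.dot(b, c)) < 1e-10
assert np.isclose(a @ a, b @ b + c @ c)  # SSTotal = SST + SSE
```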
3.7 Unbalanced Designs

A design is balanced if the sample sizes are equal across treatments. Otherwise it is said to be unbalanced. For an unbalanced design the data are y_ij, i = 1, ..., t, j = 1, ..., r_i, with total sample size N = r_1 + · · · + r_t. How might the ANOVA decomposition work in this case?

Partitioning sources of variation, unequal sample sizes: group i now contributes r_i components to each vector of length N:

  Total             Between           Within
  y_11 − ȳ          (ȳ_1 − ȳ)         (y_11 − ȳ_1)
  ...               ...               ...
  y_1r₁ − ȳ         (ȳ_1 − ȳ)         (y_1r₁ − ȳ_1)
  ...          =    ...          +    ...
  y_tr_t − ȳ        (ȳ_t − ȳ)         (y_tr_t − ȳ_t)

Here ȳ = ȳ_·· is the grand mean of all N observations.
Might we take a different average? First, let's check orthogonality:

  b · c = Σ_{i=1}^t Σ_{j=1}^{r_i} (ȳ_i· − ȳ_··)(y_ij − ȳ_i·)
        = Σ_{i=1}^t (ȳ_i· − ȳ_··) Σ_{j=1}^{r_i} (y_ij − ȳ_i·)
        = Σ_{i=1}^t (ȳ_i· − ȳ_··) × 0 = 0

Note: c must be orthogonal to any vector that is constant "within a group."

What about degrees of freedom?
• dof(c) = Σ_{i=1}^t (r_i − 1) = N − t should be clear;
• dof(a) = N − 1 should be clear;
• dof(b) = ?

Note that in general,

  (1/t) Σ_{i=1}^t ȳ_i· ≠ ȳ_··

But in the vector b we have r_i copies of (ȳ_i· − ȳ_··), and

  (1/Σ r_i) Σ r_i ȳ_i· = ȳ_··

and so the vector b does sum to zero: dof(b) = t − 1.
The ANOVA table in the unbalanced case:

Source     Degrees of Freedom  Sum of Squares  Mean Square        F-Ratio
Treatment  t − 1               SST             MST = SST/(t−1)    MST/MSE
Noise      N − t               SSE             MSE = SSE/(N−t)
Total      N − 1               SSTotal
Now suppose the following model is correct:

  y_ij = µ_i + ε_ij,  V(ε_ij) = σ²

Does MSE still estimate σ²?

  MSE = SSE/(N − t)
      = [Σ_{j=1}^{r₁}(y_1j − ȳ_1·)² + · · · + Σ_{j=1}^{r_t}(y_tj − ȳ_t·)²] / [(r_1 − 1) + · · · + (r_t − 1)]
      = [(r_1 − 1)s_1² + · · · + (r_t − 1)s_t²] / [(r_1 − 1) + · · · + (r_t − 1)]

So MSE is a weighted average of a collection of unbiased estimates of σ², so it is still unbiased. One can show
• E(MSE) = σ²
• E(MST) = σ² + v_τ², where
  – v_τ² = Σ_{i=1}^t r_i τ_i² / (t − 1)
  – τ_i = µ_i − µ̄
  – µ̄ = Σ_{i=1}^t r_i µ_i / Σ_{i=1}^t r_i
Example (blood coagulation): 24 animals were randomly assigned to four diets A, B, C and D, and the blood coagulation time of each animal was recorded. Ideally the design would be balanced; however, due to resource constraints, r_A = 4, r_B = 6, r_C = 6, r_D = 8.

The ANOVA table can be used descriptively without further assumptions; for hypothesis tests and confidence intervals we need a sampling model:

  y_ij = µ_i + ε_ij,  ε_{1,1}, ..., ε_{D,8} ~ i.i.d. normal(0, σ)

This model implies
• independence of errors
• constant variance
• normally distributed data

Another way to write it:

  y_A1, ..., y_A4 ~ i.i.d. normal(µ_A, σ)
  y_B1, ..., y_B6 ~ i.i.d. normal(µ_B, σ)
  y_C1, ..., y_C6 ~ i.i.d. normal(µ_C, σ)
  y_D1, ..., y_D8 ~ i.i.d. normal(µ_D, σ)

So we are viewing the 4 samples under A as a random sample from the population of coagulation times that would be present if all animals got A (and similarly for samples under B, C and D).
Figure: coagulation times under each of the four diets.
> anova(lm(ctime~diet))
          Df Sum Sq Mean Sq F value    Pr(>F)
diet       3  228.0    76.0  13.571 4.658e-05
Residuals 20  112.0     5.6
The F-stat is large, but how unlikely is it under H0 ?
Two viewpoints on null distributions:
• The 24 animals we selected are fixed; the only randomness is the assignment of treatments to animals.
• The 24 animals we selected are random samples from a larger population of animals.

Randomization tests correspond to the first viewpoint, sampling theory to the second.
3.9 Sampling distribution of the F-statistic

Recall the χ² distribution:

  Y_1, ..., Y_n ~ i.i.d. normal(µ, σ)  ⇒  (1/σ²) Σ (Y_i − Ȳ)² ∼ χ²_{n−1}

Also,

  X_1 ∼ χ²_{k₁}, X_2 ∼ χ²_{k₂}, X_1 ⊥ X_2  ⇒  X_1 + X_2 ∼ χ²_{k₁+k₂}

(Null) distribution of SSE:

  SSE/σ² = (1/σ²) [ Σ_j (Y_1j − Ȳ_1·)² + · · · + Σ_j (Y_tj − Ȳ_t·)² ]
         ∼ χ²_{r₁−1} + · · · + χ²_{r_t−1}  (independent terms)
         ∼ χ²_{N−t}   (recall Σ(r_i − 1) = N − t)

So SSE/σ² ∼ χ²_{N−t}. A similar argument shows that under H0, SST/σ² ∼ χ²_{t−1}, independently of SSE.
Introducing the F-distribution: If

  X_1 ∼ χ²_{k₁}, X_2 ∼ χ²_{k₂}, X_1 ⊥ X_2  ⇒  (X_1/k_1)/(X_2/k_2) ∼ F_{k₁,k₂}

F_{k₁,k₂} is the "F-distribution with k_1 and k_2 degrees of freedom." So under H0, F = MST/MSE ∼ F_{t−1,N−t}, and we reject when F_obs exceeds a critical value F_crit. How to determine F_crit?

Level-α testing procedure:
1. Gather data;
2. Construct ANOVA table;
3. Reject H0 if F_obs > F_crit, where F_crit satisfies Pr(F > F_crit | H0) = α for F ∼ F_{t−1,N−t}.

So F_crit is the 1 − α quantile of an F_{t−1,N−t} distribution.
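To make the null distribution concrete, here is a Monte Carlo sketch of it in Python (the notes use R; the group sizes 4, 6, 6, 8 and the observed value 13.571 are taken from the coagulation example above):

```python
import random, statistics

# Simulate the null distribution of F = MST/MSE for t = 4 groups of
# sizes 4, 6, 6, 8; under H0 any common mu and sigma work.
random.seed(1)
sizes = [4, 6, 6, 8]
t, N = len(sizes), sum(sizes)

def f_stat(groups):
    means = [statistics.fmean(g) for g in groups]
    grand = statistics.fmean(x for g in groups for x in g)
    sst = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (sst / (t - 1)) / (sse / (N - t))

fs = []
for _ in range(2000):
    groups = [[random.gauss(0, 1) for _ in range(n)] for n in sizes]
    fs.append(f_stat(groups))

# the observed F of 13.571 sits far out in the tail of this distribution
tail = sum(f >= 13.571 for f in fs) / len(fs)
```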
Back to the example:

> anova(lm(ctime~diet))
Analysis of Variance Table

Response: ctime
          Df Sum Sq Mean Sq F value    Pr(>F)
diet       3  228.0    76.0  13.571 4.658e-05 ***
Residuals 20  112.0     5.6
Figure 3.5: Normal-theory (F(3,20), F(3,10), F(3,5), F(3,2) densities) and randomization distributions of the F-statistic.

Fobs<-anova(lm(ctime~diet))$F[1]
Fsim<-NULL
for(nsim in 1:1000) {
  diet.sim<-sample(diet)
  Fsim<-c(Fsim, anova(lm(ctime~diet.sim))$F[1] )
}
> mean(Fsim>=Fobs)
[1] 2e-04
> 1-pf(Fobs,3,20)
[1] 4.658e-05
3.10 Comparing group means

The F-test gives evidence that some differences between the treatment means exist. We can explore this further by making treatment comparisons.

If H0 is rejected we
• estimate µ_i with ȳ_i;
• estimate σ_i² with
  – s_i²: if variances are very unequal, this might be a better estimate;
  – s² ≡ MSE: if variances are similar, the pooled estimate is more precise.

Standard practice: Unless there is strong evidence to the contrary, we typically assume V(Y_ij) = V(Y_kl) = σ², and use s² ≡ MSE to estimate σ².

Confidence intervals for treatment means: Similar to the one-sample case,

  (Ȳ_i − µ_i)/SE(Ȳ_i·) ∼ t_{N−t},  where SE(Ȳ_i·) = √(s²/r_i).

As a result,

  Ȳ_i ± SE(Ȳ_i·) × t_{1−α/2,N−t}

is a 100 × (1 − α)% confidence interval for µ_i.
A handy rule of thumb: If θ̂ is an estimator of θ, then in many situations

  θ̂ ± 2 × SE(θ̂)

is an approximate 95% CI for θ.

For the coagulation data: t_{.975,20} = qt(.975,20) = 2.086, and

  SE(Ȳ_i) = √(s²/r_i) = √(5.6/r_i) = 2.37/√r_i

Diet  µ̂_diet  r_i  SE(µ̂_diet)  95% CI
C     68       6    0.97        (66.0, 70.0)
B     66       6    0.97        (64.0, 68.0)
A     61       4    1.18        (58.5, 63.5)
D     61       8    0.84        (59.3, 62.8)
Power calculations for the F-test

Recall, the power of a test is the probability of rejecting H0 when it is false:

  Power(µ, σ, r) = Pr(reject H0 | µ, σ, r)
                 = Pr(F_obs > F_{1−α,t−1,N−t} | µ, σ, r)

Sampling distributions:

  Y_11, ..., Y_1r ~ i.i.d. normal(µ_1, σ)
  ...
  Y_t1, ..., Y_tr ~ i.i.d. normal(µ_t, σ)

Under this sampling model, F = F(Y) has a non-central F distribution with degrees of freedom t − 1 and N − t and non-centrality parameter λ = r Σ τ_i² / σ². In many texts, power is expressed as a function of the quantity Φ:

  Φ = √( r Σ τ_i² / (σ² t) ) = √(λ/t)

Let's try to understand what Φ represents:

  Φ² = (Σ τ_i²/t) / (σ²/r)
     = treatment variation / experimental uncertainty
     = treatment variation × experimental precision

Note that "treatment variation" means "average squared treatment effect size." In R:

  Power(µ, σ, r) = Pr(reject H0 | µ, σ, r)
                 = Pr(F_obs > F_{1−α,t−1,N−t} | µ, σ, r)
                 = 1-pf( qf(1-alpha,t-1,N-t) , t-1 , N-t , ncp=lambda )
Figures 3.6 and 3.7: Power of the F-test as a function of the number of replicates r (t = 4, α = 0.05), for fixed ratios of between- to within-treatment variance.

Back to the coagulation example:

          Df Sum Sq Mean Sq F value    Pr(>F)
diet       3  228.0    76.0  13.571 4.658e-05
Residuals 20  112.0     5.6

The hypothesis of no differences between the diets is rejected. But what are the differences?
Contrasts

Differences between sets of means can be evaluated by estimating contrasts: linear combinations of the treatment means,

  C = Σ_{i=1}^t k_i µ_i,  with coefficients satisfying Σ_{i=1}^t k_i = 0.

Contrasts are functions of the unknown parameters. We can estimate them, and obtain standard errors for them.

Parameter estimates: Let

  Ĉ = Σ_{i=1}^t k_i µ̂_i = Σ_{i=1}^t k_i ȳ_i·

Then E(Ĉ) = C, so Ĉ is an unbiased estimator of C.
Standard errors:

  V(Ĉ) = Σ_{i=1}^t V(k_i ȳ_i·)
       = Σ_{i=1}^t k_i² σ²/r_i
       = σ² Σ_{i=1}^t k_i²/r_i

So an estimate of V(Ĉ) is

  s_C² = s² Σ_{i=1}^t k_i²/r_i

The standard error of a contrast is an estimate of its standard deviation:

  SE(Ĉ) = s √( Σ_i k_i²/r_i )

t-distributions for contrasts: Consider

  Ĉ/SE(Ĉ) = Σ_{i=1}^t k_i ȳ_i· / ( s √(Σ k_i²/r_i) )

If the data are normally distributed, then under H0 : C = Σ k_i µ_i = 0,

  Ĉ/SE(Ĉ) ∼ t_{N−t}

Exercise: Prove this result.

Hypothesis test:
• H0 : C = 0 versus H1 : C ≠ 0;
• Level-α test: Reject H0 if |Ĉ/SE(Ĉ)| > t_{1−α/2,N−t};
• p-value: Pr(|t_{N−t}| > |Ĉ(y)/SE(Ĉ(y))|) = 2*(1-pt( abs(c_hat/se_c_hat), N-t)).
Example: Recall in the coagulation example µ̂_A = 61, µ̂_B = 66, and their 95% confidence intervals were (58.5, 63.5) and (64.0, 68.0). Consider the contrast C = µ_B − µ_A, i.e. k = (−1, +1, 0, 0):

  Ĉ = µ̂_B − µ̂_A = 5
  SE(Ĉ) = s √(1/r_A + 1/r_B) = √(5.6 × (1/4 + 1/6)) = 1.53

Hypothesis test: H0 : C = 0.

  t_obs = Ĉ/SE(Ĉ) = 5/1.53 = 3.27
  p-value = 2*(1-pt(3.27,20)) = 0.0038

Confidence intervals: Based on the normality assumptions, a 100 × (1 − α)% confidence interval for C is given by

  Ĉ ± t_{1−α/2,N−t} × SE(Ĉ)
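A numerical check of the contrast calculations above, sketched in Python using the coagulation numbers (the notes' own computations are in R):

```python
import math

# Contrast C = mu_B - mu_A: k = (-1, +1, 0, 0),
# group means (A, B, C, D) = (61, 66, 68, 61), sizes (4, 6, 6, 8), MSE = 5.6
k    = [-1, 1, 0, 0]
ybar = [61, 66, 68, 61]
r    = [4, 6, 6, 8]
mse  = 5.6

c_hat  = sum(ki * yi for ki, yi in zip(k, ybar))                   # 5
se_c   = math.sqrt(mse * sum(ki**2 / ri for ki, ri in zip(k, r)))  # ~1.53
t_stat = c_hat / se_c                                              # ~3.27
```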
Orthogonal Contrasts

What use are contrasts? Consider the data in the figure below, in which grain yield was recorded at five plant densities:

> anova(lm(y~as.factor(x)))
             Df Sum Sq Mean Sq F value    Pr(>F)
as.factor(x)  4 87.600  21.900  29.278 1.690e-05 ***
Residuals    10  7.480   0.748

In this case, the treatment levels have an ordering to them (this is not always the case). Differences between treatments can then be described by polynomial contrasts C_1 (linear), C_2 (quadratic), and so on, with two properties:
• They are contrasts: the coefficients of each sum to zero;
• They are orthogonal: C_i · C_j = 0 for i ≠ j.
Figure: grain yield as a function of plant density (10, 20, 30, 40, 50).

This is the "sum to zero" part, i.e. C_i · 1 = 0 for each contrast.

• C_1 is big if the means µ_1, ..., µ_5 are monotonically increasing or monotonically decreasing, i.e. C_1 measures the linear component of the relationship.
• C_2 is big if "there is a bump," up or down, in the middle of the treatments, i.e. C_2 is measuring the quadratic component of the relationship.
You can produce these contrast coefficients in R:

> contr.poly(5)
                .L         .Q            .C         ^4
[1,] -6.324555e-01  0.5345225 -3.162278e-01  0.1195229
[2,] -3.162278e-01 -0.2672612  6.324555e-01 -0.4780914
[3,] -3.481950e-18 -0.5345225  4.893692e-16  0.7171372
[4,]  3.162278e-01 -0.2672612 -6.324555e-01 -0.4780914
[5,]  6.324555e-01  0.5345225  3.162278e-01  0.1195229

These are scaled versions of the polynomial contrast coefficients (each column has unit length). This doesn't change their sum, and it doesn't change the orthogonality.
trt.means<-tapply(y,x,mean)
c.hat<-trt.means%*%contr.poly(5)
> c.hat
           .L        .Q        .C      ^4
[1,] 3.794733 -3.741657 0.3162278 0.83666
> 3*c.hat^2
       .L  .Q  .C  ^4
[1,] 43.2  42 0.3 2.1
> sum(3*c.hat^2)
[1] 87.6

So r times the squared contrast estimates sum to SST = 87.6.
Recall, we represented treatment variation as a vector that lived in a (t − 1)-dimensional space. The t − 1 = 4 orthogonal contrasts split that space, and SST, into four one-degree-of-freedom pieces:

Source     Df  Sum of Squares  Mean Square  F-ratio
Lin trt     1  43.20           43.20        57.75
Quad trt    1  42.00           42.00        56.15
Cubic trt   1   0.30            0.30         0.40
Quart trt   1   2.10            2.10         2.81
Error      10   7.48            0.75
Total      14  95.08
Multiple comparisons: With t treatments there is a pairwise hypothesis H0ij : µ_i = µ_j for every pair of treatments. Should we perform hypothesis tests for all comparisons? The more hypotheses we test, the higher the probability that at least one of them will be rejected, regardless of their validity. Two error rates are relevant:

• Comparison-wise type I error rate: Pr(reject H0ij | H0ij is true).
• Experiment-wise type I error rate: Pr(reject one or more H0ij | all H0ij true).

Consider the procedure:
1. Gather data;
2. Compute the pairwise t-statistics t_ij;
3. Reject each H0ij for which |t_ij| > t_{1−α_C/2,N−t}.

The comparison-wise type I error rate is of course

  P(|t_ij| > t_{1−α_C/2,N−t} | H0ij) = α_C

The experiment-wise type I error rate is the probability that we say differences between treatments exist when no differences exist:

  P(|t_ij| > t_crit for some i, j | H0) ≥ α_C

with equality only if there are two treatments total. What is the experiment-wise rate in this analysis procedure?
If the x pairwise tests were independent, the experiment-wise rate would be f(x) = 1 − (1 − α_C)^x. Then f′(0) = −log(1 − α_C) and

  f(x) ≈ f(0) + x f′(0)
  1 − (1 − α_C)^x ≈ (1 − 1) − x log(1 − α_C)
                 ≈ x α_C for small α_C.

The tests are not independent, but a union bound always applies. Therefore

  P(one or more H0ij rejected | H0) ≤ Σ_{i,j} P(H0ij rejected | H0) = (t choose 2) α_C

Bonferroni: The Bonferroni procedure for controlling the experiment-wise type I error rate is as follows:
1. Choose an experiment-wise rate α_E;
2. Perform each pairwise test at level α_C = α_E / (t choose 2).
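A tiny worked example of the Bonferroni correction (a Python sketch; t = 4 as in the coagulation data):

```python
# Bonferroni sketch: t treatments give m = t(t-1)/2 pairwise tests; testing
# each at alpha_C = alpha_E / m bounds the experiment-wise rate by alpha_E.
t = 4
m = t * (t - 1) // 2            # 6 pairwise comparisons
alpha_E = 0.05
alpha_C = alpha_E / m           # ~0.00833 per comparison

# if the tests happened to be independent, the exact experiment-wise rate
# would be 1 - (1 - alpha_C)^m, just below the alpha_E bound
fwer_indep = 1 - (1 - alpha_C) ** m
```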
Fisher's Protected LSD (Fisher's least-significant-difference method): Another approach to controlling type I error makes use of the F-test:
1. Perform ANOVA and compute the F-statistic;
2. If F is not significant at level α_E, stop and do not reject any H0ij;
3. If F is significant, reject each H0ij with |t_ij| > t_{1−α_C/2,N−t}.

Under this procedure,
• Experiment-wise type I error rate is α_E;
• Comparison-wise type I error rate is

  P(reject H0ij | H0ij) = P(F > F_crit and |t_ij| > t_crit | H0ij)
                        = P(|t_ij| > t_crit | F > F_crit, H0ij) P(F > F_crit | H0ij)

  – If H0 is also true, this is less than α_E, and is generally between α_C × α_E and α_E, depending on how many treatments there are.
Model Diagnostics

Our model is y_ij = µ_i + ε_ij, with assumptions

A1: the {ε_ij}'s are independent;
A2: the {ε_ij}'s have constant variance;
A3: the {ε_ij}'s are normally distributed.

Under these assumptions, F = MST/MSE has a (possibly non-central) F-distribution. If in addition

  H0: µ_i = µ for all i = 1, ..., t

then the non-centrality parameter is zero and F = MST/MSE ∼ F_{t−1,N−t}.

What should we do if the model assumptions are not correct? We could have written a single assumption A0 : {ε_ij} ∼ i.i.d. normal(0, σ²). We don't do this because some violations of assumptions are more serious than others. We will discuss A1 when we talk about blocking.

Detecting violations with residuals

Violations of assumptions can be checked via residual analysis:

  ε̂_ij = y_ij − ȳ_i·

ε̂_ij is called the residual for observation i, j.
Checking Normality Assumptions:
• Histogram: Make a histogram of the ε̂_ij's. If there are enough observations, graphically compare the histogram to a N(0, s²) distribution.
• Normal probability, or qq-plot: If ε_ij ∼ N(0, σ²), then the ordered residuals (ε̂_(1), ..., ε̂_(N)) plotted against the expected order statistics of a standard normal sample should fall roughly on a straight line.

How non-normal can a sample from a normal population look? You can always check yourself by simulating data in R.
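The notes suggest simulating with qqnorm in R; a rough numeric analogue, sketched in Python, summarizes a qq-plot by the correlation between the sorted data and standard normal quantiles (the function and cutoffs here are illustrative, not part of the notes):

```python
import random
from statistics import NormalDist

# Summarize a qq-plot by the correlation of sorted data with standard
# normal quantiles: near 1 for normal data, lower for skewed data.
def qq_corr(sample):
    n = len(sample)
    xs = sorted(sample)
    qs = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mx, mq = sum(xs) / n, sum(qs) / n
    sxy = sum((x - mx) * (q - mq) for x, q in zip(xs, qs))
    sxx = sum((x - mx) ** 2 for x in xs)
    sqq = sum((q - mq) ** 2 for q in qs)
    return sxy / (sxx * sqq) ** 0.5

random.seed(0)
normal_sample = [random.gauss(0, 1) for _ in range(100)]
skewed_sample = [random.expovariate(1) for _ in range(100)]
# the normal sample hugs the qq line; the skewed one bends away from it
```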
Example (Crab data): A survey of crab populations was carried out at six sites, with 25 transects per site.

Data: y_ij = population total in transect j of site i.

Data description: the site sample means range from roughly 9 to 69; the site sample medians (17, 10, 5, 2, 2, 4) are all well below the corresponding means; and the site sample standard deviations range from about 10 to over 107, growing with the site means. Means far above medians indicate strong right-skew.
Figure: histograms and normal qq-plots of three simulated samples from normal populations.
Figure: histograms of the crab counts at each site.

Figure: histogram and normal qq-plot of the residuals (fit1$res) from the one-way fit.
ANOVA: fitting anova(lm(crab[,2]~as.factor(crab[,1]))) gives an F-statistic of 2.9669 on 5 and 144 degrees of freedom for the site effect.

Residual diagnostic plots: See the figures above.

Conclusion: Data are clearly not normally distributed.
Checking the constant variance assumption:
(a) tabulate the residual variance in each treatment:

  Trt         1                     ...   t
  Residuals   (ε̂_11, ..., ε̂_1n)     ...   (ε̂_t1, ..., ε̂_tn)
  Sample var  s_1²                  ...   s_t²

(b) compare the s_i² across treatments.

What should we do if the data are clearly not normally distributed? Consider a randomization test, or converting the data to ranks for the overall test of treatment effects.
Fitted Value plots

For many types of data there is a mean-variance relationship: typically, groups with large means tend to have large variances. This can be detected by plotting the residuals ε̂_ij against the fitted values ŷ_ij = ȳ_i·. The residuals will be 'large' in group i if group i has a 'large' variance, and 'small' in group i if group i has a 'small' variance. For the crab data, listing the sites in order of their means shows the standard deviations increasing along with the means (the site with mean 9.24 has standard deviation near 10, while the sites with means above 50 have standard deviations above 100).

Figure 3.12: Fitted values versus residuals.
Reject H0 : V(ε_ij) = σ² for all i, j if F0 > F_{t−1,t(n−1),1−α}.

Crab data: F0 ≈ 2.9, which exceeds the 0.95 quantile F_{t−1,t(n−1),.95} ≈ 2.3, so the hypothesis of constant variance is rejected at the 0.05 level.

Crab data: So the assumptions that validate the use of the F-test are violated.
Variance stabilizing transformations

Recall one justification for the normal model was that if the noise ε_ij = X_ij1 + X_ij2 + · · · is the result of the addition of unobserved additive, independent effects, then by the central limit theorem ε_ij will be approximately normal. But in many situations the unobserved effects are multiplicative rather than additive: y_ij = µ_i × X_ij1 × X_ij2 × · · ·. In that case

  log y_ij = log µ_i + log X_ij1 + log X_ij2 + · · ·

so the effects are additive on the log scale. Also note that by the central limit theorem the errors should be approximately normally distributed on that scale. Summaries of the log-transformed crab counts by site show much more comparable spreads across sites.
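A simulation sketch of this point (in Python; the eight uniform "shocks" and the group means are invented for illustration): multiplicative noise makes raw standard deviations grow with the mean, while log-scale standard deviations stay comparable:

```python
import math, random, statistics

random.seed(0)

def group(mu, n=500):
    # y = mu * X1 * ... * X8, independent positive multiplicative shocks
    return [mu * math.prod(random.uniform(0.5, 1.5) for _ in range(8))
            for _ in range(n)]

small, big = group(10), group(100)
sd = statistics.stdev
raw_ratio = sd(big) / sd(small)   # ~10: raw sd scales with the mean
log_ratio = (sd([math.log(y) for y in big]) /
             sd([math.log(y) for y in small]))  # ~1: log scale is stable
```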
Figure: crab populations by site, on the raw and log scales.
Figure 3.14: Diagnostics after the log transformation.
> anova(lm(log(crab[,2]+1/6)~as.factor(crab[,1])))
                      Df Sum Sq Mean Sq F value  Pr(>F)
as.factor(crab[, 1])   5  54.73   10.95   2.325 0.04604 *
Residuals            144 678.24    4.71
Other transformations: For data having multiplicative effects, we showed that

  σ_i ∝ µ_i,

in which case the log transformation stabilizes the variance. In general, we may observe

  σ_i ∝ µ_i^α,

i.e. the standard deviation of a group depends on the group mean. Consider the class of power transformations, transformations of the form Y*_ij = Y_ij^λ. If σ_i ∝ µ_i^α, then taking λ = 1 − α will stabilize the variances to some extent.

Estimation of α:

  σ_i ∝ µ_i^α  ⇔  σ_i = c µ_i^α
  log σ_i = log c + α × log µ_i,  so  log s_i ≈ log c + α log ȳ_i·

Thus we may use the following procedure:
(1) Plot log s_i vs. log ȳ_i·;
(2) Fit a least squares line;
(3) The slope α̂ of the line is an estimate of α;
(4) Analyze y*_ij = y_ij^(1−α̂).
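Steps (1)-(3) amount to a least squares fit of log s_i on log ȳ_i·; here is a Python sketch with hypothetical group summaries (numbers loosely in the range of the crab data, not the actual values):

```python
import math

# Estimate alpha by regressing log(s_i) on log(ybar_i).
# Hypothetical (mean, sd) pairs with sd roughly proportional to mean:
ybar = [9.2, 10.6, 33.6, 50.4, 68.4]
s    = [10.0, 12.4, 50.4, 57.0, 107.4]

x = [math.log(m) for m in ybar]
z = [math.log(sd) for sd in s]
n = len(x)
mx, mz = sum(x) / n, sum(z) / n
alpha_hat = (sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) /
             sum((xi - mx) ** 2 for xi in x))
lam = 1 - alpha_hat   # alpha_hat near 1 suggests the log transformation
```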
Here are some common transformations:

  Mean-var. relation   α     λ = 1 − α   transformation    y*_ij
  σ_y ∝ µ⁰             0     1           no transform!     y_ij
  σ_y ∝ µ^(1/2)        1/2   1/2         square root       y_ij^(1/2)
  σ_y ∝ µ^(3/4)        3/4   1/4         quarter power     y_ij^(1/4)
  σ_y ∝ µ              1     0           log               log y_ij
  σ_y ∝ µ^(3/2)        3/2   −1/2        reciprocal root   y_ij^(−1/2)
  σ_y ∝ µ²             2     −1          reciprocal        1/y_ij

• All the mean-variance relationships here are examples of power laws.
• α = 1 is the multiplicative model discussed previously.

The Box-Cox family of power transformations takes the form

  y*(λ) = (y^λ − 1)/λ

For λ = 0, it's natural to define the transformation as

  y*(0) = lim_{λ→0} y*(λ) = lim_{λ→0} (y^λ − 1)/λ = [y^λ ln y / 1]_{λ=0} = ln y

Note that for a given λ ≠ 0 it will not change the results of the ANOVA on the transformed data if we transform using

  y* = y^λ  or  y*(λ) = (y^λ − 1)/λ = a y^λ + b.
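The λ → 0 limit can be checked numerically (a Python sketch):

```python
import math

# (y**lam - 1)/lam approaches ln(y) as lam -> 0
y = 7.3

def boxcox(y, lam):
    return (y ** lam - 1) / lam

vals = [boxcox(y, lam) for lam in (0.5, 0.1, 0.001)]
# the sequence approaches math.log(y) = 1.9879... as lam shrinks
```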
To summarize the procedure: If the data present strong evidence of nonconstant variance,
(1) Plot log s_i vs. log ȳ_i·. If the relationship looks linear, then
(2) Fit a least squares line: lm( log s_i ~ log ȳ_i· );
(3) The slope α̂ of the line is an estimate of α;
(4) Analyze y*_ij = y_ij^(1−α̂).

This procedure is called the "Box-Cox" transformation (George Box and David Cox first proposed the method in a paper in 1964).
A few words of caution:
• For variance stabilization via the Box-Cox procedure to do much good, the linear relationship between the log-means and log-standard deviations should be quite tight.
• Don't get too carried away with the Box-Cox transformations: if α̂ = 0.47, don't analyze y*_ij = y_ij^0.53; just use y_ij^(1/2). (Square roots are far easier to interpret.)
• Remember to make sure that you describe the units of the transformed data, and make sure that readers of your analysis will be able to understand that the model is additive in the transformed data, but not in the original data.
• Try to think about whether the associated non-linear model for y_ij makes sense.
• Don't assume that the transformation is a magical fix: remember to look at residuals and diagnostics after you do the transform.
• Remember that the mean-variance relationship might not be cured by a transform in the Box-Cox class (e.g. if the variance is not a power-law function of the mean).
• Keep in mind that statisticians disagree on the usefulness of transformations: some regard them as a 'hack' more than a 'cure':
  – It can be argued that if the scientist who collected the data had a good reason for using certain units, then one should not just transform the data in order to bang it into an ANOVA-shaped hole.
• The sad truth: as always, you will need to exercise judgment while performing your analysis.
Example (Crab data): Looking at the plot of means vs. standard deviations suggests α ≈ 1, implying a log transformation. Since some of the counts are zero we cannot use log y_ij directly; instead we can use y*_ij = log(y_ij + 1/6). (Adding a small constant before taking logs handles the zeros.) For the transformed data this gives us a ratio of the largest to smallest standard deviation of approximately 2, which is acceptable based on the rule of 3: on the transformed scale the site standard deviations all lie between roughly 2.1 and 4.2, whereas the raw-scale standard deviations ranged from about 10 to over 107.

Figure: log sample standard deviations versus log sample means, with the least squares line.

The least squares fit lm(log_sd ~ log_mean) gives a slope of 0.9839 on log_mean; α̂ ≈ 1 confirms that the log transformation is appropriate.
Chapter 4

Multifactor Designs

4.1 Factorial designs

Example (Poisons): An experiment was run to study the effects of two factors on the survival time of animals:
• Treatments:
  – Type ∈ {I, II, III}
  – Delivery ∈ {A, B, C, D}
• Response: time to death, in minutes.

We could study each factor in its own one-factor experiment. But which Delivery to use for the experiment testing for Type effects? Which Type to use for the experiment testing for Delivery effects? To compare different Type×Delivery combinations, we need to do experiments under all 12 treatment combinations, i.e. with 48 animals, 4 assigned to (I, A), 4 assigned to (I, B), and so on.
It might be helpful to visualize the design as follows:

          Delivery
Type      A        B        C        D
I         y_I,A    y_I,B    y_I,C    y_I,D
II        y_II,A   y_II,B   y_II,C   y_II,D
III       y_III,A  y_III,B  y_III,C  y_III,D

This type of design is called a factorial design.

Factors: categories of treatments.
Levels of a factor: the different treatments in a category.

So in this case, Type and Delivery are both factors.
4.1.1 Data analysis

Let's first look at a series of plots.

Marginal plots: Based on these marginal plots, it looks like (III, A) would be the most effective combination. But the difference between types I and II seems to depend on delivery.

Cell plots: We can also plot the data for each Type×Delivery combination; sometimes each group is called a cell.

The spread of the responses appears to grow with their level, so before modeling we should check the mean-variance relationship. Let's take care of this before we go any further: plotting means versus standard deviations on both the raw and log scale gives the relationship shown in the figures below.
Figure 4.1: Marginal plots.

The least squares line for the log standard deviations on the log means has slope α̂ ≈ 2. This suggests a reciprocal transformation, i.e. analyzing

  y_ij = 1/(time to death)

Figure 4.5 shows the mean-variance relationship for the transformed data (see also Figure 4.6). Note that, looking at Type alone, the experiment is a one-factor ANOVA with 3 treatment levels and 16 reps per treatment. Conversely, looking at Delivery, the experiment is a one-factor ANOVA with 4 treatment levels and 12 reps per treatment.
Figure 4.2: Times to death, plotted by Type within each Delivery (panels A-D).

Figure 4.3: Times to death for each Type×Delivery combination.

Figure 4.4: Means versus standard deviations, on the raw and log scales.

Figure 4.5: Mean-variance relationship for transformed data.
Fitting one-factor ANOVAs for Type alone, for Delivery alone, and for the 12 Type×Delivery cells as a single factor gives:

          Df  Sum Sq Mean Sq F value
type       2 0.34877 0.17439  25.621
Residuals 45 0.30628 0.00681

          Df  Sum Sq Mean Sq F value
deliv      3 0.20414 0.06805  6.6401
Residuals 44 0.45091 0.01025

           Df  Sum Sq Mean Sq F value
cells      11 0.56862 0.05169  21.531
Residuals  36 0.08643 0.00240
Figure 4.6: Plots of transformed poison data.
The third ANOVA is "valid," but we'd like a more specific result: we'd like to know which factors are sources of variation, and the relative magnitude of their effects, i.e. how the treatment variation splits between Type and Delivery.

4.1.2 Additive effects model

  Y_ijk = µ + a_i + b_j + ε_ijk,
  i = 1, ..., t1;  j = 1, ..., t2;  k = 1, ..., r

  a_1, ..., a_t1 = additive effects of factor 1;
  b_1, ..., b_t2 = additive effects of factor 2.

Notes:
1. As in the one-factor case, the parameters are not identifiable without side conditions, e.g. Σ a_i = Σ b_j = 0.
2. A model with a separate mean per cell has t1 × t2 mean parameters. In contrast, the additive model only has 1 + (t1 − 1) + (t2 − 1).
Parameter estimation and ANOVA decomposition:

  y_ijk = ȳ_··· + (ȳ_i·· − ȳ_···) + (ȳ_·j· − ȳ_···) + (y_ijk − ȳ_i·· − ȳ_·j· + ȳ_···)
        =  µ̂   +      â_i        +      b̂_j        +      ε̂_ijk

These are the least-squares parameter estimates under the sum-to-zero side conditions:

  Σ â_i = Σ (ȳ_i·· − ȳ_···) = n ȳ_··· − n ȳ_··· = 0,  and similarly Σ b̂_j = 0.

To obtain the set-to-zero side conditions, add â_1 and b̂_1 to µ̂, subtract â_1 from the â_i's, and subtract b̂_1 from the b̂_j's.

You should be able to show that these vectors are orthogonal, and so

  Σ_i Σ_j Σ_k (y_ijk − ȳ_···)² = Σ_i Σ_j Σ_k â_i² + Σ_i Σ_j Σ_k b̂_j² + Σ_i Σ_j Σ_k ε̂_ijk²
  SSTotal                      =       SSA        +        SSB       +      SSE
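The orthogonality of this decomposition is easy to verify numerically for a balanced design; a Python sketch with arbitrary data (t1 = 3, t2 = 4, r = 4, as in the poison layout):

```python
import numpy as np

# Balanced two-way layout: t1 x t2 cells with r replicates (arbitrary data)
rng = np.random.default_rng(1)
t1, t2, r = 3, 4, 4
y = rng.normal(size=(t1, t2, r))

m   = y.mean()
a   = y.mean(axis=(1, 2)) - m           # a_hat_i
b   = y.mean(axis=(0, 2)) - m           # b_hat_j
res = y - (m + a[:, None, None] + b[None, :, None])   # additive residuals

sstot = ((y - m) ** 2).sum()
ssa   = t2 * r * (a ** 2).sum()
ssb   = t1 * r * (b ** 2).sum()
sse   = (res ** 2).sum()
assert np.isclose(sstot, ssa + ssb + sse)   # orthogonal decomposition
```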
For the transformed poison data:

             Df  Sum Sq Mean Sq F value
dat$type      2 0.34877 0.17439  71.708
dat$delivery  3 0.20414 0.06805  27.982
Residuals    42 0.10214 0.00243
---

Does this adequately represent what is going on in the data? What do we mean by additive? Assuming the model is correct, we have:

  E[Y | type=I, delivery=A] = µ + a1 + b1
  E[Y | type=II, delivery=A] = µ + a2 + b1

This says the difference between Type I and Type II is a1 − a2, regardless of Delivery.
• The full model allows differences between Types to vary across levels of Delivery;
• The reduced/additive model says differences are constant across levels of Delivery.

Interaction model:

  Y_ijk = µ + a_i + b_j + (ab)_ij + ε_ijk

  a_1, ..., a_t1 = additive effects of factor 1;
  b_1, ..., b_t2 = additive effects of factor 2;
  (ab)_ij = interaction terms = deviations from additivity.
This is a full model: it fits a separate mean for each treatment combination:

  E(Y_ijk) = µ_ij = µ + a_i + b_j + (ab)_ij

Parameter estimation and ANOVA decomposition:

  y_ijk = ȳ_··· + (ȳ_i·· − ȳ_···) + (ȳ_·j· − ȳ_···) + (ȳ_ij· − ȳ_i·· − ȳ_·j· + ȳ_···) + (y_ijk − ȳ_ij·)
        =  µ̂   +      â_i        +      b̂_j        +       (ab)^_ij                 +    ε̂_ijk

Deciding between the additive/reduced model and the interaction/full model is tantamount to deciding whether the variance explained by the (ab)^_ij's is large or not.
4.1.3 Evaluating additivity

Recall the additive model

  y_ijk = µ + a_i + b_j + ε_ijk

• i = 1, ..., t1 indexes the levels of factor 1;
• j = 1, ..., t2 indexes the levels of factor 2;
• k = 1, ..., r indexes the replicates within each cell.

Parameter estimates are obtained via the following decomposition:

  y_ijk = ȳ_··· + (ȳ_i·· − ȳ_···) + (ȳ_·j· − ȳ_···) + (y_ijk − ȳ_i·· − ȳ_·j· + ȳ_···)
        =  µ̂   +      â_i        +      b̂_j        +     ε̂_ijk

Fitted value:

  ŷ_ijk = µ̂ + â_i + b̂_j = ȳ_i·· + ȳ_·j· − ȳ_···

Note that in this model the fitted value in one cell depends on data from the others. The model is adequate if it fits well in every cell, i.e. if

  ȳ_i1· − ȳ_i2· ≈ b̂_1 − b̂_2 for all i = 1, ..., t1

and similarly for differences across levels of factor 1. This adequacy can be assessed by looking at differences between the sample means from each cell and the fitted values under the additive model:

  ȳ_ij· − (µ̂ + â_i + b̂_j)

This difference measures the deviation from additivity in cell i, j. It is called an interaction.
Interactions and the full model: The interaction terms also can be derived by taking the additive decomposition above one step further: the residual in the additive model can be written

  ε̂^A_ijk = y_ijk − ȳ_i·· − ȳ_·j· + ȳ_···
          = (y_ijk − ȳ_ij·) + (ȳ_ij· − ȳ_i·· − ȳ_·j· + ȳ_···)
          = ε̂^I_ijk + (ab)^_ij

This suggests the following full model, or interaction model,

  y_ijk = µ + a_i + b_j + (ab)_ij + ε_ijk

with parameter estimates obtained from the following decomposition:

  y_ijk = ȳ_··· + (ȳ_i·· − ȳ_···) + (ȳ_·j· − ȳ_···) + (ȳ_ij· − ȳ_i·· − ȳ_·j· + ȳ_···) + (y_ijk − ȳ_ij·)
        =  µ̂   +      â_i        +      b̂_j        +       (ab)^_ij                 +    ε̂_ijk

Fitted value:

  ŷ_ijk = µ̂ + â_i + b̂_j + (ab)^_ij = ȳ_ij· = µ̂_ij

This is a full model for the treatment means: the estimate of the mean in each cell depends only on data from that cell.

Residual:

  ε̂_ijk = y_ijk − ŷ_ijk = y_ijk − ȳ_ij·

Thus the full model ANOVA decomposition partitions the variability among the cell means ȳ_11·, ȳ_12·, ..., ȳ_t1t2· into
• variation attributable to each factor separately: the â_i and b̂_j;
• what is "leftover": (ab)^_ij = ȳ_ij· − (µ̂ + â_i + b̂_j).
Comparing the additive and full-model ANOVA tables:

           Df  Sum Sq Mean Sq F value
pois$deliv  3 0.20414 0.06805  27.982
pois$type   2 0.34877 0.17439  71.708
Residuals  42 0.10214 0.00243

                     Df  Sum Sq Mean Sq F value
pois$deliv            3 0.20414 0.06805 28.3431
pois$type             2 0.34877 0.17439 72.6347
pois$deliv:pois$type  6 0.01571 0.00262  1.0904
Residuals            36 0.08643 0.00240

Note that
• 0.10214 = 0.01571 + 0.08643, that is SSE_add = SSAB_int + SSE_int;
• 42 = 6 + 36, that is dof(SSE_add) = dof(SSAB_int) + dof(SSE_int);
• SSA, SSB, dof(A), dof(B) are unchanged in the two models;
• MSE_add ≈ MSE_int, but the degrees of freedom are larger in the additive model.
MULTIFACTOR DESIGNS
116
2
– E(M SAB) = σ 2 + rτAB
> σ2
...
Under H0 : (ab)ij = 0 ,
FAB = M SAB/M SE ∼ F(t1 −1)×(t2 −1),t1 t2 (r−1)
Evidence against H0 can be evaluated by computing the p-value
...
We may then want to combine them to improve our estimate of σ 2
...
For the poison data:

                      Df  Sum Sq Mean Sq F value    Pr(>F)
pois$deliv             3 0.20414 0.06805 28.3431 1.376e-09 ***
pois$type              2 0.34877 0.17439 72.6347  2.31e-13 ***
pois$deliv:pois$type   6 0.01571 0.00262  1.0904    0.3867
Residuals             36 0.08643 0.00240

The interaction p-value of 0.3867 gives little evidence of non-additivity, so we may want to use the additive model. Now we want to compare means across levels of one of the factors. Under the additive model, each level of type is observed under all four delivery conditions with r = 4 replications, so we have 4 × 4 = 16 observations for each level of type.
[Figure 4.…: poison rates for types 1 and 2, plotted across delivery conditions]

A naive two-sample t-test comparing types 1 and 2, using only the 32 observations from those two types, gives
• s12 = 0.047
• t-statistic = −1.…, p-value ≈ 0.112.
Under the additive model, however, every observation contributes to the estimate of σ, i.e. sMSE = √MSE = √0.00240 ≈ 0.049, and the statistic

t(ȳ·1·, ȳ·2·) = (ȳ·1· − ȳ·2·) / ( sMSE √(2/(4 × 4)) )

is compared to a t-distribution with 36 (rather than 30) degrees of freedom.
The relationship between the cell-means model and the parameters in the interaction model is as follows:

µij = µ·· + (µi· − µ··) + (µ·j − µ··) + (µij − µi· − µ·j + µ··)
    = µ + ai + bj + (ab)ij
and so the parameters satisfy the constraints
• Σi ai = 0
• Σj bj = 0
• Σi (ab)ij = 0 for each j, and Σj (ab)ij = 0 for each i.
Suppose we are interested in the difference between F1 = 1 and F1 = 2. Note that a1 − a2 is a contrast of treatment means, estimated by â1 − â2 = ȳ1·· − ȳ2··.
• Under the assumption that the data are normally distributed,

â1 − â2 ∼ normal( a1 − a2 , σ √(2/(r × t2)) )

and is independent of MSE (why?).
• So

[(â1 − â2) − (a1 − a2)] / √( MSE × 2/(r × t2) ) = [(â1 − â2) − (a1 − a2)] / SE(â1 − â2) ∼ tν

where ν is the degrees of freedom associated with our estimate of σ², i.e. the residual degrees of freedom in our model.
The quantity t(.975,ν) × SE(â1 − â2) is sometimes called the least significant difference (LSD) for comparing levels of Factor 1: two sample means for Factor 1 that differ by more than this amount give evidence that the corresponding treatment means differ.
Important note: The LSD depends on which factor you are looking at. The comparison of levels of Factor 2 depends on

V(ȳ·1· − ȳ·2·) = V(ȳ·1·) + V(ȳ·2·)
               = σ²/(r × t1) + σ²/(r × t1)
               = 2σ²/(r × t1)
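For the poison data the two LSDs work out as follows (a Python sketch; the quantile t(.975,42) = 2.018 corresponds to the 42 error degrees of freedom of the additive fit):

```python
import math

MSE, r = 0.00243, 4        # residual mean square and reps per cell
t1, t2 = 4, 3              # delivery levels, type levels
tq = 2.018                 # t_{.975, 42}

# Comparing two type means: each is an average of r*t1 = 16 observations.
LSD_type = tq * math.sqrt(2 * MSE / (r * t1))
# Comparing two delivery means: each is an average of r*t2 = 12 observations.
LSD_deliv = tq * math.sqrt(2 * MSE / (r * t2))

assert abs(LSD_type - 0.035) < 1e-3
assert LSD_deliv > LSD_type   # fewer observations per mean -> larger LSD
```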
Recall the two-factor ANOVA for the poison data:

                      Df  Sum Sq Mean Sq F value    Pr(>F)
pois$deliv             3 0.20414 0.06805 28.3431 1.376e-09 ***
pois$type              2 0.34877 0.17439 72.6347  2.31e-13 ***
pois$deliv:pois$type   6 0.01571 0.00262  1.0904    0.3867
Residuals             36 0.08643 0.00240

Let's assume there is no interaction term.
            Df  Sum Sq Mean Sq F value    Pr(>F)
pois$deliv   3 0.20414 0.06805  27.982 2.192e-10 ***
pois$type    2 0.34877 0.17439  71.708 2.865e-14 ***
Residuals   42 0.10214 0.00243

So there is strong evidence against the hypothesis that the additive effects are zero for either factor. To compare levels of type at α = 0.05,

LSDtype = t(.975,42) × √(2 × MSE/16) = 2.018 × √(2 × 0.00243/16) = 2.018 × 0.0174 = 0.035
The sample mean rates for the three types:

Type    I     II    III
Mean  0.2…  0.…   0.4…

Pairs of type means that differ by more than LSDtype = 0.035 are deemed significantly different.
[Figure 4.…: mean rates by delivery (A, B, C, D) for each type (I, II, III)]

Comparing effects of poison delivery: At αC = 0.05,

LSDdeliv = t(.975,42) × SE(b̂1 − b̂2) = 2.018 × √(2 × 0.0024/(4 × 3)) = 2.018 × 0.02 = 0.04

The delivery sample means (approximately 0.19, 0.29, …) can then be compared pairwise; differences exceeding 0.04 are deemed significant.
Interpretation of estimated additive effects: If the additive model is clearly wrong, can we still interpret additive effects? The full model is

yijk = µij + εijk

with factor levels {a1, …, at1} and {b1, …, bt2}. The additive model is

yijk = µ + ai + bj + εijk

Sometimes this is called the "main effects" model. Even under the full model, â1 − â2 = ȳ1·· − ȳ2·· estimates the difference in response between factor 1 = 1 and factor 1 = 2, averaged over the levels of factor 2 that appear in the experiment. Now, how do we interpret this effect?
• If additivity is correct, â1 − â2 can further be interpreted as the estimated difference in response between having factor 1 = 1 and factor 1 = 2, for every level of factor 2 in the experiment.
• If additivity is not correct, this last interpretation fails. As we can see, though, the additive effects still have a very definite interpretation under the full model: differences averaged over the observed levels of the other factor.
Dissecting the interaction: Sometimes if there is an interaction, we might want to go in and compare individual cell means:

ȳ11· ȳ12· ȳ13· ȳ14·
ȳ21· ȳ22· ȳ23· ȳ24·

A large interaction SS in the ANOVA table gives us evidence, for example, that µ1j − µ2j varies across levels j of factor 2.
Contrasts for examining interactions: Suppose we want to compare the effect of (factor 1 = 1) to (factor 1 = 2) across levels of factor 2.
[Figure 4.…: cell means for factor 1 = 1 and factor 1 = 2, plotted against the levels j = 1, …, 4 of factor 2]

4.2 Randomized complete block designs
Typically, one hopes the e.u.'s (experimental units) are homogeneous or nearly so:
• scientifically: If the units are nearly homogeneous, then any observed variability in response can be attributed to variability in factor levels.
• statistically: Homogeneous units give a small error variance σ², and hence more precise effect estimates.
But what if the e.u.'s are not homogeneous?
Example: Let F2 be a factor that is a large source of variation/heterogeneity, but is not recorded (age of animals in experiment, gender, field plot conditions, soil conditions). If a factor like F2 varies across experimental units, the resulting variation in response inflates the apparent error variance unless we account for it.
If F2 is a known, potentially large source of variation, we can control for it pre-experimentally with a block design.
Objective: To have less variation among units within blocks than between blocks.
Example (Nitrogen timing):
• Treatment: timing of nitrogen application, levels 1, …, 6; level 4 is "standard".
• Response: nitrogen uptake (ppm × 10⁻²).
• Experimental material: one irrigated field.
Soil moisture is thought to be a source of variation in response. The field is divided into a 4 × 6 grid. Within each row or block, each of the 6 treatments is randomly allocated. The experimental units are blocked into presumably more homogeneous groups, and the blocks are complete, in that each treatment appears in each block.
[Figure 4.…: soil moisture varies from dry to wet along the irrigation direction of the field]

Figure 4.…: treatment and data versus location (assigned treatments and responses shown by row and column of the field)
This design is a randomized complete block design. The ANOVA decomposition:

Source    SS     df    MS     F
Trt     201.32    5  40.26  5.59
Block   197.00    3  65.67  9.12
Error   108.01   15   7.20
Total   506.33   23

Ignoring the blocks and fitting a one-factor ANOVA gives:

> anova(lm( c(y) ~ as.factor(c(trt)) ))
                  Df  Sum Sq Mean Sq F value Pr(>F)
as.factor(c(trt))  5 201.316  40.263  2.3761 0.08…
Residuals         18 305.012  16.945
Figure 4.13: Marginal plots and residuals
Including the blocks as a factor:

> anova(lm( c(y) ~ as.factor(c(trt)) + as.factor(c(row)) ))
                  Df  Sum Sq Mean Sq F value   Pr(>F)
as.factor(c(trt))  5 201.316  40.263  5.5917 0.004191 **
as.factor(c(row))  3 197.004  65.668  9.1198 0.001…   **
Residuals         15 108.008   7.2005

A model with a trt:row interaction would use all 15 remaining degrees of freedom, leaving none for error, so the interaction cannot be tested from these data.
• Can we test for interaction? Do we care about interaction in this case, or just main effects? Suppose it were true that "in row 2, timing 6 is significantly better than timing 4, but in row 3, treatment 3 is better."
• Should block effects be included even when they are not of interest?
– If we don't estimate block effects, we'll have more dof for error.
– If "row" is a big source of variation, then ignoring it may lead to an overly large MSE.
Consider comparing the F-stat from a CRD with that from an RCB. According to Cochran and Cox (1957),

MSEcrd = [ SSB + r(t − 1) MSErcbd ] / (rt − 1)
       = MSB × (r − 1)/(rt − 1) + MSErcbd × r(t − 1)/(rt − 1)

a weighted average of the block mean square and the RCB error mean square. In general, the effectiveness of blocking is a function of MSEcrd / MSErcb. For the nitrogen example, this ratio is about 2.
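Plugging in the nitrogen-timing numbers (a Python check; MSB and MSErcbd are taken from the RCB table above):

```python
# Cochran & Cox estimate of the MSE a CRD would have had, built from RCB output.
r, t = 4, 6                   # blocks and treatments
MSB, MSE_rcbd = 65.668, 7.2005

MSE_crd = (MSB * (r - 1) + MSE_rcbd * r * (t - 1)) / (r * t - 1)
ratio = MSE_crd / MSE_rcbd    # effectiveness of blocking

assert abs(MSE_crd - 14.83) < 0.01
assert 2.0 < ratio < 2.1      # "about 2" for the nitrogen example
```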
4.3 Unbalanced designs
Example: Observational study of 20 fatal accidents. For each accident, the speed of the vehicle was recorded, along with two factors:
• I = road type (interstate/two-lane)
• R = rainy (rainy/not rainy)
The cell means and sample sizes:

            rainy      not rainy
interstate  15 (n=8)   20 (n=2)
two-lane     5 (n=2)   10 (n=8)

Let's naively compute sums of squares based on the decompositions

yijk = ȳ··· + (ȳij· − ȳ···) + (yijk − ȳij·)
yijk = ȳ··· + (ȳi·· − ȳ···) + (ȳ·j· − ȳ···) + (ȳij· − ȳi·· − ȳ·j· + ȳ···) + (yijk − ȳij·)

The grand mean is ȳ··· = 12.5, and

SSM = 8(15 − 12.5)² + 2(20 − 12.5)² + 2(5 − 12.5)² + 8(10 − 12.5)² = 325
SSR = 10 × (13 − 12.5)² + 10 × (12 − 12.5)² = 5
SSI = 10 × (16 − 12.5)² + 10 × (9 − 12.5)² = 245

But look at the cell means:
• "no rain" − "rain" = 5 regardless of road type
• "interstate" − "two-lane" = 10 regardless of rain
There is absolutely no interaction! The problem is that SSR, SSI, SSRI are not orthogonal (as computed this way) and so SSM ≠ SSR + SSI + SSRI.
• Cell means show, for both road types, speeds were higher, by 5 mph on avg, on clear days.
• Marginal means for rain are dominated by interstate accidents.
• Marginal means for no rain are dominated by two-lane accidents.
We say that the marginal effects are unbalanced.
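The accident-example arithmetic can be reproduced from the cell means and counts above (a Python sketch):

```python
# Unbalanced two-way layout: cell means and sample sizes from the text.
mean = {("int", "rain"): 15, ("int", "dry"): 20,
        ("two", "rain"):  5, ("two", "dry"): 10}
n    = {("int", "rain"):  8, ("int", "dry"):  2,
        ("two", "rain"):  2, ("two", "dry"):  8}

N = sum(n.values())
grand = sum(n[c] * mean[c] for c in mean) / N                  # 12.5

def marginal(level, axis):
    cells = [c for c in mean if c[axis] == level]
    return sum(n[c] * mean[c] for c in cells) / sum(n[c] for c in cells)

def ls_mean(level, axis):
    cells = [c for c in mean if c[axis] == level]
    return sum(mean[c] for c in cells) / len(cells)            # equal weights

# Naive (marginal) sums of squares fail to partition the model SS:
SSM = sum(n[c] * (mean[c] - grand) ** 2 for c in mean)         # 325
SSR = sum(10 * (marginal(w, 1) - grand) ** 2 for w in ("rain", "dry"))
SSI = sum(10 * (marginal(rd, 0) - grand) ** 2 for rd in ("int", "two"))
SSRI = SSM - SSR - SSI    # 75, despite zero interaction in the cell means

assert marginal("rain", 1) == 13 and ls_mean("rain", 1) == 10
assert (SSM, SSR, SSI) == (325, 5, 245)
```

The nonzero leftover SSRI = 75 is exactly the non-orthogonality discussed above.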
Solution: Least-squares means. The idea:
1. Get estimates of the treatment means, µ̂ij = ȳij·, for each cell.
2. Estimate the marginal mean for a level of one factor by averaging the cell means over the levels of the other factor,

µ̂i· = (1/t2) Σj ȳij· ,   µ̂·j = (1/t1) Σi ȳij· ,

giving each cell equal weight regardless of its sample size.
Accident example:

                interstate  two-lane      rainy  not rainy
marginal mean       16          9           13      12
LS mean             17.5        7.5         10      15

Comparison of means can be made in the standard way:
Testing for interaction: Consider

H0 : yijk = µ + ai + bj + εijk
H1 : yijk = µ + ai + bj + (ab)ij + εijk

How do we evaluate H0 when the data are unbalanced? I'll first outline the procedure, then explain why it works.
1. Compute SSEf = min over µ, a, b, (ab) of Σi Σj Σk ( yijk − [µ + ai + bj + (ab)ij] )²
2. Compute SSEr = min over µ, a, b of Σi Σj Σk ( yijk − [µ + ai + bj] )²
3. Compute SSI = SSEr − SSEf.
Allowing for interaction improves the fit and reduces error variance, so SSI ≥ 0. If SSI is large, i.e. SSEr is much bigger than SSEf, this suggests the additive model does not fit well and the interaction term should be included in the model.
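A minimal sketch of this procedure in Python (with made-up unbalanced data; the additive least-squares fit is computed here by backfitting — alternating between row and column effects — rather than by matrix algebra):

```python
# SSI = SSE(reduced/additive) - SSE(full/cell-means) for unbalanced data.
data = {  # data[(i, j)] = replicate responses; note the unequal cell sizes
    (0, 0): [9.0, 11.0, 10.0], (0, 1): [15.0],
    (1, 0): [13.0],            (1, 1): [26.0, 24.0, 28.0, 26.0],
}

# Full model: the fitted value in each cell is the cell mean.
SSE_f = sum((v - sum(ys) / len(ys)) ** 2 for ys in data.values() for v in ys)

# Reduced (additive) model, fit by backfitting; the constant is absorbed
# into the row effects a[i].
rows = sorted({i for i, _ in data})
cols = sorted({j for _, j in data})
a = {i: 0.0 for i in rows}
b = {j: 0.0 for j in cols}
for _ in range(200):  # plenty of sweeps for this tiny problem
    for i in rows:
        pts = [v - b[j] for (ii, j), ys in data.items() if ii == i for v in ys]
        a[i] = sum(pts) / len(pts)
    for j in cols:
        pts = [v - a[i] for (i, jj), ys in data.items() if jj == j for v in ys]
        b[j] = sum(pts) / len(pts)
SSE_r = sum((v - a[i] - b[j]) ** 2 for (i, j), ys in data.items() for v in ys)

SSI = SSE_r - SSE_f
assert SSI >= -1e-9   # interaction can only improve the fit
```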
A painful example: A small-scale clinical trial was done to evaluate the effect of painkiller dosage (trt ∈ {1, 2, 3}) on pain reduction y for cancer patients in a variety of age groups. The design is slightly unbalanced:

> table(trt,ageg)
   ageg
trt 50 60 70 80
  1  1  2  3  4
  2  1  3  3  3
  3  2  1  4  3
> tapply(y,trt,mean)
        1         2         3
    0.9…  -2.200000 -1.920000

To see the effects of the imbalance we can
• plot the data by treatment and by age group;

Figure 4.14: Marginal plots for pain data
• compute the LS means and compare them to the marginal means. For the age groups,

marginal mean:  -1.4395833      …           …    -0.0916667
> age_lsm
        50          60          70          80
-1.0000000          …           …   -0.2305556

Figure 4.15: Interaction plots for pain data

What are the differences between LS means and marginal means? Not as extreme as in the accident example, but the differences can be explained by looking at the interaction plot and the slight imbalance in the design:
• The youngest patients (ageg=50) were imbalanced towards the higher doses, so we might expect their marginal mean to be too low: compare the marginal mean = −1.44 to the LS mean = −1.00.
• The oldest patients (ageg=80) were imbalanced towards the lower doses, so we might expect their marginal mean to be too high: compare the marginal mean = −0.09 to the LS mean = −0.23.
Let's look at the main effects in our model:

> trt_coef<- trt_lsm-mean(cellmeans)
> age_coef<- age_lsm-mean(cellmeans)

What linear modeling commands in R will get you the same thing? With sum-to-zero contrasts, the coefficients of the full (interaction) model reproduce these LS-mean-based effects:

> options(contrasts=c("contr.sum","contr.poly"))
> fit_full<-lm( y~as.factor(ageg)*as.factor(trt))
> fit_full$coef[2:4]   # the ageg effects: these match age_coef
> fit_full$coef[5:6]   # the trt effects: these match trt_coef (e.g. 0.5826389)

Note that the coefficients in the reduced/additive model are not the same:

> fit_add<-lm( y~as.factor(ageg)+as.factor(trt))
> fit_add$coef[2:4]
> fit_add$coef[5:6]

These differ from age_coef and trt_coef: in an unbalanced design, dropping the interaction changes the main-effect estimates.
4.3.1 Non-orthogonal sums of squares
Consider the following ANOVA tables obtained from R:

> anova(lm( y~as.factor(ageg)+as.factor(trt) ))
                Df  Sum Sq Mean Sq F value  Pr(>F)
as.factor(ageg)  3  13.355   4.452  0.9606 0.4…
as.factor(trt)   2  28.254  14.127  3.0482 0.06613 .
Residuals       24 111.230   4.635

> anova(lm( y~as.factor(trt)+as.factor(ageg) ))
                Df  Sum Sq Mean Sq F value Pr(>F)
as.factor(trt)   2  31.588  15.794  3.4077 0.0498 *
as.factor(ageg)  3  10.021   3.340  0.7207 0.5…
Residuals       24 111.230   4.635

The sum of squares attributed to each factor depends on the order in which the factors enter the model.
Where do these sums of squares come from? What do the F-tests represent? By typing ?anova.lm we find that R computes a sequential analysis of variance table: the reductions in the residual sum of squares as each term of the formula is added in turn are given as the rows of a table, plus the residual sum of squares.
For example, let
• A be the main effects of factor 1,
• B be the main effects of factor 2,
• C be their interaction.
For the formula y ~ A + B + C, R does the following:
0. Calculate SS0 = residual sum of squares from the model (0): yijk = µ + εijk
1. Calculate SS1 = residual sum of squares from the model (A): yijk = µ + ai + εijk
2. Calculate SS2 = residual sum of squares from the model (AB): yijk = µ + ai + bj + εijk
3. Calculate SS3 = residual sum of squares from the model (ABC): yijk = µ + ai + bj + (ab)ij + εijk
The rows of the table are then SS0 − SS1, SS1 − SS2, SS2 − SS3, and SS3.
This is actually what R presents in an ANOVA table:

> ss0<-sum( lm( y~1 )$res^2 )
> ss1<-sum( lm( y~as.factor(ageg) )$res^2 )
> ss2<-sum( lm( y~as.factor(ageg)+as.factor(trt) )$res^2 )
> ss3<-sum( lm( y~as.factor(ageg)*as.factor(trt) )$res^2 )
> ss0-ss1
[1] 13.3554
> ss1-ss2
[1] 28.254
> ss2-ss3
[1] 53.75015
> ss3
[1] 57.480
> anova(lm( y~as.factor(ageg)*as.factor(trt) ))
                               Df Sum Sq Mean Sq F value  Pr(>F)
as.factor(ageg)                 3 13.355   4.452  1.3940 0.27688
as.factor(trt)                  2 28.254  14.127  4.4239  0.0274 *
as.factor(ageg):as.factor(trt)  6 53.750   8.958  2.8054  0.0417 *
Residuals                      18 57.480   3.193
• In an unbalanced design, the estimates of one set of parameters depend on whether or not you are estimating the others, i.e. they are not orthogonal, and in general SSA ≠ SSA|B and SSB ≠ SSB|A.
The bottom line: For unbalanced designs, there is no unambiguous "variance due to factor 1" or "variance due to factor 2". This is essentially because of the non-orthogonality, and so the part of the variance attributable to one factor overlaps with the part attributable to the other. This will become more clear when you get to regression next quarter.
4.4 Analysis of covariance
Example (Oxygen ventilation): Researchers are interested in measuring the effects of exercise type on maximal oxygen uptake. Twelve subjects took part.
Design: Six subjects were randomly assigned to a 12-week step aerobics program, the remaining six to 12 weeks of outdoor running on flat terrain.
Response: y = change in O2 uptake.
Initial analysis: CRD with one two-level factor. The first panel of Figure 4.16 plots the data by group. The second thing to do is a two-sample t-test:

t(yA, yB) = (ȳA − ȳB) / ( sp √(1/6 + 1/6) ) = 10.47 / (6.239 × √(1/3)) = 2.907
2 × P( t10 ≥ 2.907 ) = 0.0157

However, a plot of residuals versus age of the subject indicates a couple of causes for concern with our inference:
1. the residual variation is strongly related to the age of the subject;
2. the two treatment groups differ in their age distributions.
Figure 4.16: Oxygen uptake data

Question: Is the observed difference in ȳA, ȳB due to
• treatment?
• age?
• both?
A linear model for ANCOVA: Let yi,j be the response of the jth subject in treatment i, and xi,j that subject's age:

yi,j = µ + ai + b × xi,j + εi,j

This model gives a linear relationship between age and response for each group:

                intercept      slope       error
if i = A:  Yi = (µ + aA)  +  b × xi,j  +  εi,j
if i = B:  Yi = (µ + aB)  +  b × xi,j  +  εi,j

Unbiased parameter estimates can be obtained by minimizing the residual sum of squares:

(µ̂, â, b̂) = arg min over µ, a, b of Σi,j ( yi,j − [µ + ai + b × xi,j] )²
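This arg-min has a closed form for two groups: the common slope is the pooled within-group covariance of (x, y) over the pooled within-group variance of x, and each group's intercept is then its mean response minus the slope times its mean age. A sketch in Python, with made-up (age, response) data:

```python
# Common-slope ANCOVA fit for two groups: y = intercept_g + b*x + error.
# The data points below are invented purely for illustration.
groups = {
    "A": [(20, 1.0), (25, 3.0), (30, 4.0)],
    "B": [(22, 5.0), (28, 8.0), (34, 10.0)],
}

def mean(v):
    return sum(v) / len(v)

sxy = sxx = 0.0
for pts in groups.values():
    mx = mean([x for x, _ in pts])
    my = mean([y for _, y in pts])
    sxy += sum((x - mx) * (y - my) for x, y in pts)
    sxx += sum((x - mx) ** 2 for x, _ in pts)
b = sxy / sxx                                   # common slope estimate
intercept = {g: mean([y for _, y in pts]) - b * mean([x for x, _ in pts])
             for g, pts in groups.items()}

# The adjusted treatment effect compares intercepts, i.e. groups at equal age:
effect = intercept["A"] - intercept["B"]
assert b > 0 and effect != 0.0
```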
Figure 4.17: ANOVA and ANCOVA fits to the oxygen uptake data

The fitted model is shown in the second panel of Figure 4.17.
Variance decomposition: Consider the following two ANOVAs:

> anova(lm(o2_change~grp))
          Df Sum Sq Mean Sq F value  Pr(>F)
grp        1 328.97  328.97  8.4504 0.01565 *
Residuals 10 389.30   38.93

> anova(lm(o2_change~grp+age))
          Df Sum Sq Mean Sq F value    Pr(>F)
grp        1 328.97  328.97  42.063 0.0001133 ***
age        1 318.91  318.91  40.777 0.0001274 ***
Residuals  9  70.39    7.82

The second one decomposes the variation in the data that is orthogonal to treatment (SSE from the first ANOVA) into a part that can be ascribed to age (SS age in the second ANOVA) and everything else (SSE from the second ANOVA).
Now consider two other ANOVAs:

> anova(lm(o2_change~age))
          Df Sum Sq Mean Sq F value  Pr(>F)
age        1 576.09  576.09  40.519 8.…e-05 ***
Residuals 10 142.18  14.218

> anova(lm(o2_change~age+grp))
          Df Sum Sq Mean Sq F value  Pr(>F)
age        1 576.09  576.09 73.6594  …e-05 ***
grp        1  71.79   71.79  9.1788  0.014…  *
Residuals  9  70.39    7.82

Here the variation orthogonal to age is decomposed into a treatment part and a residual part. Note that SS grp is much smaller when age is accounted for first: treatment and age are not orthogonal, because the two treatment groups differ in age.
Blocking, ANCOVA and unbalanced designs: Suppose we have a factor of interest (say F1) that we will randomly assign to experimental material, but it is known that there is some nuisance factor (say F2) that is suspected to be a large source of variation.
• If F2 can be blocked on pre-experimentally, a randomized complete block design keeps F1 and F2 orthogonal. As a result, we can separately estimate the effects of F1 and F2.
• If F2 is a covariate measured on each unit,
– ANCOVA allows for more precise control of variation due to F2,
– but if the realized design is unbalanced with respect to F2, the F1 and F2 effect estimates become correlated.
This is primarily a problem for experiments with small sample sizes: the probability of design imbalance decreases as a function of sample size (as does the correlation between the parameter estimates), as long as F1 is randomly assigned.
Chapter 5: Nested Designs

Example (Potato): Sulfur added to soil kills bacteria, but too much sulfur can damage crops.
Factors of interest:
1. Potato type ∈ {A, B}
2. Sulfur additive ∈ {low, high}
Experimental material: Four plots of land, each divided into four subplots.
Experimental Design: A split-plot design:
1. Each sulfur level was randomly assigned to two of the four whole plots.
2. Each potato type was randomly assigned to two subplots per plot.

  L        H        H        L
 A B      B A      B A      B A
 A B      A B      A B      A B

Randomization: Sulfur type was randomized to whole plots; potato type was randomized to subplots. This gives:
• 8 responses for each potato type
• 8 responses for each sulfur type
• 4 responses for each type×sulfur combination
> fit.full<-lm( y~type*sulfur )
> anova(fit.full)
            Df  Sum Sq Mean Sq F value Pr(>F)
type         1 1.48840 1.48840 13.4459 0.003… **
sulfur       1 0.54022 0.54022  4.8803 0.047… *
type:sulfur  1 0.00360 0.00360  0.0325 0.86…
Residuals   12 1.32838 0.11070

> fit.add<-lm( y~type+sulfur )
> anova(fit.add)
          Df  Sum Sq Mean Sq F value  Pr(>F)
type       1 1.48840 1.48840 14.5267 0.00216 **
sulfur     1 0.54022 0.54022  5.2725 0.03893 *
Residuals 13 1.33198 0.10246
Figure 5.1: Potato data (responses plotted by sulfur level, high and low, for types A and B)
[Figure 5.…: normal-quantile plot of fit.add$res, and residuals versus fitted values]
Randomization test: Consider comparing the observed outcome to the population of other outcomes that could've occurred under
• different treatment assignments;
• no treatment effects.
Number of possible treatment assignments:
• (4 choose 2) = 6 ways to assign sulfur to whole plots;
• for each sulfur assignment there are (4 choose 2)⁴ = 6⁴ = 1296 type assignments.
So we have 6 × 1296 = 7776 possible treatment assignments.
The two null distributions can be approximated by simulation, re-randomizing sulfur across whole plots and type within each whole plot:

F.t.null<-F.s.null<-NULL
for(nsim in 1:5000) {
  t.sim<-c( sample(c("A","A","B","B")), sample(c("A","A","B","B")),
            sample(c("A","A","B","B")), sample(c("A","A","B","B")) )
  s.sim<-rep( sample(c("low","low","high","high")), rep(4,4) )
  fit.sim<-anova( lm( y~as.factor(t.sim)+as.factor(s.sim) ) )
  F.t.null<-c(F.t.null, fit.sim[1,4])
  F.s.null<-c(F.s.null, fit.sim[2,4])
}
> mean(F.t.null>=F.t.obs)
[1] 0.0…
> mean(F.s.null>=F.s.obs)
[1] 0.352

where F.t.obs and F.s.obs are the F-statistics from the additive fit to the observed data.
What happened?

F(rand)type ≈ F1,13  ⇒  p(rand)type ≈ p(anova)type
F(rand)sulfur ≉ F1,13  ⇒  p(rand)sulfur ≉ p(anova)sulfur

The F-distribution approximates a null randomization distribution if treatments are randomly assigned to experimental units; here that holds for type, randomized within each plot, but not for sulfur, which had only six possible whole-plot assignments.
• The precision of an effect is related to the number of independent treatment assignments made.
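The limited precision at the whole-plot level can be seen by enumerating the sulfur randomization directly (a Python sketch with made-up whole-plot means; the statistic used is simply the absolute difference of sulfur-group means):

```python
from itertools import combinations

# Four whole-plot (field) means; sulfur = "low" was applied to two of them.
field_mean = [5.2, 4.6, 4.9, 4.4]   # made-up values
obs_low = (0, 1)                     # fields that actually received low sulfur

def diff(low_pair):
    low = [field_mean[i] for i in low_pair]
    high = [field_mean[i] for i in range(4) if i not in low_pair]
    return abs(sum(low) / 2 - sum(high) / 2)

null = [diff(p) for p in combinations(range(4), 2)]   # all 6 assignments
p_value = sum(d >= diff(obs_low) - 1e-12 for d in null) / len(null)

# With only 6 possible assignments, the smallest attainable p-value is 1/6:
assert len(null) == 6 and p_value >= 1 / 6
```

No matter how extreme the observed difference, a whole-plot factor with four plots can never achieve a randomization p-value below 1/6.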
[Figure 5.…: null randomization distributions of F.t.null (type) and F.s.null (sulfur)]

It is difficult to tell the difference between sulfur effects and field effects.
Note:
• From the point of view of type alone, the design is an RCB: each whole plot is a block, and we have 2 observations of each type per block.
– Degrees of freedom breakdown for the sub-plot analysis:

Source            dof
block=whole plot    3
type                1
subplot error      11
subplot total      15

• From the point of view of sulfur alone, the design is a CRD:
– each whole plot is an experimental unit, and there are only four of them.
– Degrees of freedom breakdown for the whole-plot analysis:

Source            dof
sulfur              1
whole plot error    2
whole plot total    3

• Recall, degrees of freedom quantify the precision of our estimates and the shape of the null distribution.
• We compare the MS of a factor to the variation among the experimental units of that factor's level.
• The level of an interaction is the level of the smallest experimental unit involved.
Sub-plot analysis:

> anova(lm( y ~ type + as.factor(field) + type:sulfur ))
                 Df  Sum Sq Mean Sq F value    Pr(>F)
type              1 1.48840 1.48840 54.697  2.326e-05 ***
as.factor(field)  3 1.59648 0.53216 19.5575 0.000…    ***
type:sulfur       1 0.00360 0.00360  0.1323 0.72…
Residuals        10 0.27212 0.02721

Whole-plot analysis:

> anova(lm( y ~ sulfur + as.factor(field) ))
                 Df  Sum Sq Mean Sq F value Pr(>F)
sulfur            1 0.54022 0.54022  3.6748 0.07936 .
as.factor(field)  2 1.05626 0.52813  3.5925 0.05989 .
Residuals        12 1.76410 0.14701

In this second table R compares each mean square to the 12-df sub-plot residual mean square, which is not the right comparison for sulfur: sulfur should be compared to the variation among whole plots, i.e.
> MSS<-anova(lm(y~sulfur+as.factor(field)))[1,3]
> MSF<-anova(lm(y~sulfur+as.factor(field)))[2,3]
> F.sulfur<-MSS/MSF
> F.sulfur
[1] 1.0229
> 1-pf(F.sulfur,1,2)
[1] 0.418…
5.1.1 Mixed-effects approach
What went wrong with the normal sampling model approach? What is wrong with the following model?

yijk = µ + ai + bj + (ab)ij + εijk

where
• i indexes sulfur level, i ∈ {1, 2};
• j indexes type level, j ∈ {1, 2};
• k indexes reps, k ∈ {1, …, 4};
and {εijk} ∼ i.i.d. normal(0, σ).
[Figure 5.…: residuals plotted by field; l = low sulfur, h = high sulfur]
What about independence? Figure 5.… plots the residuals by field. The figure indicates that residuals are more alike within a field than across fields, and so observations within a field are positively correlated. This dependence within whole plots
• affects the amount of information we have about factors applied at the whole-plot level: within a given plot, we can't tell the difference between plot effects and whole-plot factor effects.
If residuals within a whole plot are positively correlated, the most intuitively straightforward way to analyze such data (in my opinion) is with a hierarchical mixed-effects model:

yijkl = µ + ai + bj + (ab)ij + γik + εijkl
Here the index k represents whole-plot reps, k = 1, …, r1, and l represents sub-plot reps, l = 1, …, r2, with

{γik} ∼ i.i.d. normal(0, σw)
{εijkl} ∼ i.i.d. normal(0, σs)

Now every subplot within the same whole plot has something in common, namely γik, so observations within a whole plot are positively correlated.
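The within-plot correlation this model induces is σw²/(σw² + σs²), which can be checked by simulation (a Python sketch with arbitrarily chosen variance components):

```python
import random

random.seed(1)
s_w, s_s = 1.0, 0.5    # whole-plot and sub-plot standard deviations (assumed)
pairs = []
for _ in range(20000):  # one whole plot with two subplots per draw
    gamma = random.gauss(0, s_w)       # shared whole-plot effect
    y1 = gamma + random.gauss(0, s_s)  # subplot 1
    y2 = gamma + random.gauss(0, s_s)  # subplot 2
    pairs.append((y1, y2))

n = len(pairs)
m1 = sum(p[0] for p in pairs) / n
m2 = sum(p[1] for p in pairs) / n
cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
v1 = sum((a - m1) ** 2 for a, _ in pairs) / n
v2 = sum((b - m2) ** 2 for _, b in pairs) / n
corr = cov / (v1 * v2) ** 0.5

# Theoretical intraclass correlation: s_w^2 / (s_w^2 + s_s^2) = 0.8
assert abs(corr - 0.8) < 0.05
```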
Such models can be fit in R with the lme command. To use this command, you need the nlme package:

library(nlme)
fit.me<-lme( y~type+sulfur, random= ~1|as.factor(field) )
>summary(fit.me)
...
Fixed effects: y ~ type + sulfur
              Value Std.Error DF t-value p-value
(Intercept)   4.…   0.2599650 11  18.…    0.0000
typeB         0.6100 0.0791575 11  7.7061  0.0000
sulfurlow    -0.3675 0.3633602  2 -1.0114  0.4183
> anova(fit.me)
            numDF denDF F-value p-value
(Intercept)     1    11  ….2946  <.0001
type            1    11 59.3848  <.0001
sulfur          1     2  1.0229  0.4183