Statistics | More Info | Notesale | Buy and Sell Study Notes Online | Extra Student Income | University Notes

Search for notes by fellow students, in your own course and all over the country.

Browse our notes for titles which look like what you need, you can preview any of the notes via a sample of the contents. After you're happy these are the notes you're after simply pop them into your shopping cart.

My Basket

Buy These Notes

You have nothing in your shopping cart yet.

Title: Statistics
Description: This is a statistics PDF.

Buy These Notes Preview

Document Preview

Extracts from the notes are below, to see the PDF you'll receive please use the links above

Statistics 502 Lecture Notes
Peter D
...
1 Induction
...
2 Model of a process or system
...
3 Experiments and observational studies
1
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

1
1
1
2
6

2 Comparing Two Treatments
8
2
...
9
2
...
11
2
...
16
2
...
17
2
...
18
2
...
21
2
...
23
2
...
29
2
...
36
2
...
1 The non-central t-distribution
...
9
...
40
2
...
43
2
...
1 Two-sample t-test with unequal variances
...
10
...
t(yA , yB )

...
1 Introduction to ANOVA
...
1
...

3
...
2 Model Fitting
...
1
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...
1
...

3
...
5 The ANOVA table
...
1
...

3
...
7 Unbalanced Designs
...
1
...

3
...
9 Sampling distribution of the F -statistic
3
...
10 Comparing group means
...
1
...

Treatment Comparisons
...
2
...

3
...
2 Orthogonal Contrasts
...
2
...

Model Diagnostics
...
3
...

3
...
2 Variance stabilizing transformations
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

57
60
63
65
68
70
73
75
78
78
80
83
85
86
92

4 Multifactor Designs
4
...

4
...
1 Data analysis:
...
1
...

4
...
3 Evaluating additivity:
...
1
...
2 Randomized complete block designs
...
3 Unbalanced designs
...
3
...

4
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...
2

3
...
1 Nested Designs
...
1
...
153
5
...
2 Repeated measures analysis:
...
1
2
...
3
2
...
5
2
...
7
2
...
9
2
...
11

Wheat yield distributions
...

The superpopulation model
...

The t-distribution
...
05 rejection region
...

Randomization distribution for the t-statistic
...

Critical regions and the non-central t-distribution
...

2
...

3
...
2
3
...
4
3
...
6
3
...
8
3
...
10
3
...
12
3
...
14

...

...

...

...

...

...
46

Bacteria data
...

Coagulation data
...

Normal-theory and randomization distributions of the F -statistic
Power
...

Yield-density data
...

Crab residuals
...

Data and log data
...

iii

10
15
20
25
26
28
32
33
39
41

50
58
69
72
73
77
77
81
88
89
90
92
94
95

LIST OF FIGURES

iv

3
...
100
4
...
2
4
...
4
4
...
6
4
...

...

...

...
8
4
...
10
4
...
12
4
...
14
4
...
16
4
...

Conditional Plots
...

Mean-variance relationship
...

Plots of transformed poison data
...

Comparison between types I and II, with delivery in color
...

Three datasets exhibiting non-additive effects
...

Results of the experiment
...

Marginal plots for pain data
...

Oxygen uptake data
...

...

...

...

...

...

...
1
5
...
3
5
...

Diagnostic plots
Potato data
...

...

...

148
149
151
154

...

...

potato

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

Chapter 1
Research Design Principles
1
...

Example (survey): Do you favor increasing the gas tax for public transportation?
• Specific cases: 200 people called for a telephone survey
• Inferential goal: get information on the opinion of the entire city
...
Some took hormones, others did not
...

1
...
RESEARCH DESIGN PRINCIPLES

2

Input variables consist of
controllable factors: measured and determined by scientist
uncontrollable factors: measured but not determined by scientist
noise factors: unmeasured, uncontrolled factors (experimental variability
or “error”)
For any interesting process, there are inputs such that:
variability in input → variability in output
If variability in an input factor x leads to variability in output y, we say x
is a source of variation
...

1
...
It may be hard to say what is input and what
is output
...

Example (Women’s Health Initiative, WHI):
• Population: Healthy, post-menopausal women in the U
...

• Input variables:
1
...
demographic variables (age, race, diet, family history,
...
unmeasured variables (?)
• Output variables
1
...
MI)

CHAPTER 1
...
invasive breast cancer
3
...
Observational population: 93,676 women enlisted starting in 1991,
tracked over eight years on average
...

2
...

3
...

Experimental Study (WHI randomized controlled trial):
1
...
e
...

age group
1 (50-59) 2 (60-69) 3 (70-79)
clinic 1
n11
n12
n13
2
n21
n22
n23

...

...

...

...

ni,j = # of women in study, in clinic i and in age group j
= # of women in block i, j

CHAPTER 1
...

Question: Why did they randomize within a block ?
2
...
Women on treatment had higher incidence of
• CHD
• breast cancer
• stroke
• pulmonary embolism
and lower incidence of
• colorectal cancer
• hip fracture
3
...
That is, our inductive inference
is
(specific) Higher CHD rate in treatment population than control
suggests
(general) if everyone in the population were treated, the incidence
rate of CHD would increase
...
RESEARCH DESIGN PRINCIPLES

5

Observational study

X2
cause

cause

...

...

...

...

Randomization can eliminate correlation between x1 and y due to a
different cause x2 , aka a confounder
...
RESEARCH DESIGN PRINCIPLES

1
...
Identify research hypotheses to be tested
...
Choose a set of experimental units, which are the units to which
treatments will be randomized
...
Choose a response/output variable
...
Determine potential sources of variation in response:
(a) factors of interest
(b) nuisance factors
5
...
Decide on the experimental procedure and how treatments are to be
randomly assigned
...

Three principles in Experimental Design
1
...

Replicates are runs of an experiment or sets of experimental units that
have the same values of the control variables
...
, n
...
e
...
( larger n → more evidence )
...
Randomization: Random assignment of treatments to experimental units
...
Makes confounding unlikely
...
RESEARCH DESIGN PRINCIPLES

7

3
...
The goal is to evenly distribute treatments across large potential
sources of variation
...

Experimental units: three plots of land, each to be divided into a 2 × 2 grid
...

Plot 1

bad soil

Plot 2

Plot 3

good soil

Chapter 2
Comparing Two Treatments
Example: Wheat yield
Factor of interest: Fertilizer type, A or B
...

Question: Is one fertilizer better than another, in terms of yield?
Experimental material: One plot of land to be divided into 2 rows of 6
subplots
...
Design question: How to assign treatments/factor levels to the plots?
Want to avoid confounding a treatment effect with another potential source
of variation
...
Potential sources of variation: Fertilizer, soil, sun, water
...
Implementation: If we assign treatments randomly, we can avoid
any pre-experimental bias in results: 12 playing cards, 6 red, 6 black were
shuffled and dealt
...

...

8

CHAPTER 2
...
Results:
A
A
B
B
A
B
26
...
4 26
...
7 25
...
5
B
B
A
A
B
A
14
...
9 16
...
1 24
...
6
How much evidence is there that fertilizer type is a source of yield variation?
Evidence about differences between two populations is generally measured by
comparing summary statistics across the two sample populations
...

2
...

Location:
• sample mean or average : y¯ =

1
n

Pn

i=1

yi

• sample median : qˆ(1/2) is a/the value y(1/2) such that
#(yi ≤ y(1/2) )/n ≥ 1/2

#(yi ≥ y(1/2) )/n ≥ 1/2

To find the median, sort the data in increasing order, and call
these values y(1) ,
...
If there are no ties, then
if n is odd, then y( n+1 ) is the median;
2
if n is even, then all numbers between y( n2 ) and y( n+1 ) are
2
medians
...
COMPARING TWO TREATMENTS

10

0

5

10

15
y

20

25

0
...
02
0
...
00

0
...
00

0
...
02

F(y)
0
...
6

Density
0
...
8

0
...
0

Histogram of y

10

15

20
y

25

30

5 10
20
30
N = 12 Bandwidth = 2
...
06
10

15

20
yA

25

30

0

5

10

15
y

20

25

0
...
0

0
...
00 0
...
02

Histogram of yB

Density
0
...
4 0
...
8

Density
0
...
05

1
...
133

Figure 2
...
025) , y(
...
9,11
...
3,16
...
1,19
...
6,23
...
5,14
...
9,24
...
COMPARING TWO TREATMENTS

> mean(yA)
[1] 20
...
53333

> median(yA)
[1] 20
...
712676
> sd(yB)
[1] 5
...
25,
...
275 24
...
25,
...
350 26
...

Would you recommend B over A for future plantings?
Do you think these results generalize to a larger population?

2
...

11

CHAPTER 2
...

A statistical hypothesis test evaluates the plausibility of H0 in light of
the data
...
We can evaluate H0 by
answering the following questions:
• Is a mean difference of 2
...
4 large compared to experimental noise?
To answer the above, we need to compare
{|¯
yB − y¯A | = 2
...

Hypothetical values of |¯
yB − y¯A | that could have been observed under H0 are
referred to as samples from the null distribution
...
, Y6,A }, {Y1,B ,
...

This is a function of the outcome of the experiment
...
Since
we will use it to perform a hypothesis test, we will call it a test statistic
...
9, 11
...
, 24
...
4 = gobs
Hypothesis testing procedure: Compare gobs to g(YA , YB ) for values of
YA and YB that could have been observed, if H0 were true
...
Cards were shuffled and dealt R, R, B, B,
...
COMPARING TWO TREATMENTS

13

2
...
9 11
...
6 23
...
3 28
...
2 17
...
5 21
...
3 19
...
Cards are shuffled and dealt B, R, B, B,
...
Crops were grown and wheat yields obtained:
B
A
B
B
A
A
26
...
4 26
...
7 25
...
5
A
B
B
A
A
B
14
...
9 16
...
1 24
...
6
Under this hypothetical treatment assignment,
(YA , YB ) = {11
...
3,
...
6}
|Y¯B − Y¯A | = 1
...

CHAPTER 2
...

Under our randomization scheme, there were

12!
12
=
= 924
6!6!
6
equally likely ways the treatments could have been assigned
...
, g924 }
This enumerates all potential pre-randomization outcomes of our test
statistic, assuming no treatment effect
...

Pr(g(YA , YB ) ≤ x|H0 ) =

#{gk ≤ x}
924

This distribution is sometimes called the randomization distribution, because it is obtained by the randomization scheme of the experiment
...
4|H0 ) = 0
...
4 or more is not unlikely under the null hypothesis
...
Generically, a p-value is
“The probability, under the null hypothesis, of obtaining a result as or more
extreme than the observed result
...
00

0
...
04
0
...
10
0
...
12

CHAPTER 2
...
2: Randomization distribution for the wheat example
Approximating a randomization
distribution We don’t want to have

n
to enumerate all n/2 possible treatment assignments
...

(b) compute the value of the test statistic, given the simulated treatment
assignment and under H0
...
, gNsim } approximates the null distribution :
#(|gk |) ≥ 2
...
4|H0 )
Nsim
The approximation improves if Nsim increased
...
9,11
...
6,23
...
3,28
...
2,17
...
5,21
...
3,19
...
COMPARING TWO TREATMENTS
g<-real()
for(nsim in 1:5000) {
xsim<-sample(x)
g[nsim]<- abs ( mean(y[xsim=="B"]) - mean(y[xsim=="A"] )
}

2
...
, yn }:
1
...

g(y) should be chosen so that it can differentiate between H0 and H1
...
Obtain a null distribution : A probability distribution over the possible outcomes of g(Y) under H0
...
, Yn } are potential
experimental results that could have happened under H0
...
Compute the p-value: The probability under H0 of observing a test
statistic g(Y) as or more extreme than the observed statistic g(y)
...

If the p-value is large ⇒ not evidence against H0
Even if we follow these guidelines, we must be careful in our specification of
H0 , H1 and g(Y) for the hypothesis testing procedure to be useful
...

• What does the p-value say about the probability that the null hypothesis is true? Try using Bayes rule to figure this out
...
COMPARING TWO TREATMENTS

2
...

action
accept H0
reject H0

truth
H0 true
H0 false
correct decision
type II error
type I error
correct decision

As we discussed
• The p-value can measure of evidence against H0 ;
• The smaller the p-value, the larger the evidence against H0
...
Compute the p-value by comparing observed test-stat to the null distribution;
2
...

This procedure is called a level-α test
...

Pr(type I error|H0 ) = Pr(reject H0 |H0 )
= Pr(p-value ≤ α|H0 )
= α
Single Experiment Interpretation: If you use a level-α test for your experiment where H0 is true, then before you run the experiment
there is probability α that you will erroneously reject H0
...

CHAPTER 2
...
Typically we need to be more specific
than “H0 false” to calculate the power
...
5

Relating samples to (super)populations

If the experiment is
• large
• complicated
• non- or partially randomized
then a null distribution based on randomization may be difficult to obtain
...

Consider the following model for our experiment:
• There is a large/infinite population of plots of similar size/shape/composition
as the plots in our experiment
...
COMPARING TWO TREATMENTS

19

• The plots that received A in our experiment can be viewed as independent samples from pA , and likewise for plots receiving B
...
, YnA ,A ∼ i
...
d
...
, YnB ,B ∼ i
...
d
...
Furthermore, if Y1,A ,
...
In general, as the
sample sizes increases
Y¯A → µA
s2A → σA2
#{Yi,A ≤ x}
= FˆA (x) → FA (x) =
nA

Z

x

pA (x)dx
−∞

CHAPTER 2
...
06

0
...
13

µA

0
...
00

0
...
02

0
...
04

sA=5
...
53

µB

0
...
02

0
...
04

sB=5
...
02
0
...
06

0
...
3: The superpopulation model

20
yB

25

30

35

CHAPTER 2
...
For example, if µB > µA , we
would recommend B over A, and vice versa
...
53 = y¯B > y¯A = 20
...
How much evidence is it? Should we
reject the null hypothesis? Consider our test statistic:
g(YA , YB ) = Y¯A − Y¯B
To decide whether to reject or not, we need to know the distribution of
g(YA , YB ) under H0
...
, YnA ,A ∼ i
...
d
...
, YnB ,B ∼ i
...
d
...

This will involve assumptions about/approximations to pA and pB
...
6

Normal distribution

The normal distribution is useful because in many cases,
• our data are approximately normally distributed, and/or
• our sample means are approximately normally distributed
...
COMPARING TWO TREATMENTS

22

These are both due to the central limit theorem:


X1 ∼ P1 (µ1 , σ1 )


X
m
qX

X
X2 ∼ P2 (µ2 , σ2 )
σj2
⇒
Xi ∼normal
˙
µj ,

...


j=1

Xm ∼ Pm (µm , σm ) 
Sums of varying quantities are approximately normally distributed
...

Yi = a1 × seedi + a2 × soili + a3 × wateri + a4 × suni + · · ·
The empirical distribution of crop yields from a population of fields with
varying quantities of seed, soil, water, sun,
...
and the variability
of seed, soil, water, sun,
...
, yn ∼ i
...
d
...
, yn ∼ i
...
d
...

...
, yn

∼ i
...
d
...
, y¯(m) } will look approximately normally distributed
with
sample mean {y (1) ,
...
, y (m) } ≈ σ 2 /n
i
...
the sampling distribution of the mean is approximately normal(µ, σ 2 /n),
even if the sampling distribution of the data are not normal
...
COMPARING TWO TREATMENTS

23

Basic properties of the normal distribution:
• Y ∼ normal(µ, σ) ⇒ aY + b ∼normal(aµ + b, |a|σ)
...
, Yn ∼ i
...
d
...

How does this help with hypothesis testing? Consider testing H0 :
µA = µ = µB (trt doesn’t affect mean)
...
So if we knew the variances, we’d have a null
distribution
...
7

Introduction to the t-test

Consider a simple one-sample hypothesis test:
Y1 ,
...
i
...
P , with mean µ and variance σ 2
...

• Y¯ would make a good test stat: it is sensitive to deviations from H0
...
COMPARING TWO TREATMENTS

24

– Y¯ is approximately normal
...
Is
f (Y) a statistic?
• Problem: Don’t usually know σ 2
...

One-sample t-statistic:
t(Y) =

Y¯ − µ0
√
s/ n

What is the null distribution of t(Y)? It seems that
s ≈ σ so

Y¯ − µ0
Y¯ − µ0
√ ≈
√
s/ n
σ/ n

so t(Y) ≈ normally distributed
...

χ2 distribution
Z1 ,
...
i
...
normal(0, 1) ⇒

X

Zi2 ∼ χ2n , chi-squared dist with n degrees of freedom

X

¯ 2 ∼ χ2n−1
(Zi − Z)

Y1 ,
...
i
...
normal(µ, σ) ⇒ (Y1 − µ)/σ,
...
i
...
normal(0, 1)
1 X
(Yi − µ)2 ∼ χ2n
⇒
σ2
1 X
(Yi − Y¯ )2 ∼ χ2n−1
σ2
¯
...
In fact, it is has a singular multivariate normal

0
...
COMPARING TWO TREATMENTS

25

0
...
04

n=9
n=10
n=11

0

5

10

15
X

20

25

30

Figure 2
...
Which vector do you
¯
...
, Zn ) or (Z1 − Z,
Getting back to the problem at hand,
n−1 2 n−1 1 X
s =
(Yi − Y¯ )2 ∼ χ2n−1 ,
σ2
σ2 n − 1
a known distribution we can look up on a table
...
, Yn ∼ i
...
d
...
3

0
...
COMPARING TWO TREATMENTS

0
...
1

p(t)
0
...
5: The t-distribution
•
•

√

n(Y¯ − µ)/σ ∼normal(0,1)

n−1 2
s
σ2

∼ χ2n−1

• Y¯ , s2 are independent
...

σ2

Then
√ ¯
Z
n(Y − µ)/σ
p
= q
n−1 2
X/(n − 1)
s /(n − 1)
σ2
=

Y¯ − µ
√ ∼ tn−1
s/ n

This is still not a statistic, as µ is unknown, but under a specific hypothesis,
like H0 : µ = µ0 , it is a statistic:
t(Y) =

Y¯ − µ0
√ ∼ tn−1 if E(Y ) = µ0
s/ n

It is called the t-statistic
...
COMPARING TWO TREATMENTS

27

• What happens to the distribution of t(Y) when n → ∞? Why?
• What if the data are not normal?
√
– What is the distribution of n(Y¯ − µ)/σ for large and small n?
– What is the distribution of n−1
s2 for large and small n?
σ2
√
– Are n(Y¯ − µ) and s2 independent for small n? What about for
large n?
Two-sided, one-sample t-test:
1
...
, Yn ∼ i
...
d
...
Null hypothesis: H0 : µ = µ0
3
...
Test statistic: t(Y) = n(Y¯ − µ0 )/s
• Pre-experiment we think of this as a random variable, an unknown
quantity that is to be randomly sampled from a population
...

5
...

6
...

p-value =
=
=
=
=

Pr(|t(Y)| ≥ |t(y)||H0 )
Pr(|Tn−1 | ≥ |t(y)|)
2 × Pr(Tn−1 ≥ |t(y)|)
2 ∗ (1 − pt(tobs, n − 1))
t
...
Level-α decision procedure: Reject H0 if

28

0
...
1

density
0
...
3

CHAPTER 2
...
6: A t8 null distribution and α = 0
...
05, t(n−1),1−α/2 ≈ 2 )
...

Is this a level-α test?
Confidence intervals via hypothesis tests: Recall that
• H0 : E(Y ) = µ0 is rejected if
√
n|(¯
y − µ0 )/s| ≥ t1−α/2
• H0 : E(Y ) = µ0 is not rejected if
√
n|(¯
y − µ0 )/s| ≤ t1−α/2
1
|¯
y − µ0 | ≤ √ s × t1−α/2
n
s
s
y¯ − √ × t1−α/2 ≤ µ0 ≤ y¯ + √ × t1−α/2
n
n

CHAPTER 2
...
Otherwise
it is in the rejection region
...
This
interval contains only those values of µ that are not rejected by this level-α
test
...
gather data;
2
...

Further suppose H0 : E(Y ) = µ0 is true
...
It is
• the pre-experimental probability that your confidence interval will cover
the true value;
• the large sample fraction of experiments in which the confidence interval
covers the true mean
...
8

Two sample tests

Recall the wheat example:
B
A
B
B
A
A
26
...
4 26
...
7 25
...
5
A
B
B
A
A
B
14
...
9 16
...
1 24
...
6

CHAPTER 2
...
, YnA A ∼
Y1B ,
...
i
...
normal(µA , σ)
i
...
d
...

In addition to normality we assume for now that both variances are equal
...

Hence if H0 is true then
Y¯B − Y¯A ∼ N

r

0, σ

1
1
+
nA nB

...
Show that (nA + nB − 2)s2p /σ 2 ∼ χ2nA +nB −2 (recall how the χ2 distribution was defined )
...
Show that the two-sample t-statistic has a t-distribution with nA +nB −
2 d
...

CHAPTER 2
...
05, so if the null hypothesis is true, the probability of a type I
error is 0
...

• y¯A = 20
...
63, nA = 6

y¯B = 22
...
51, nB = 6

p
• s2p = 31
...
4/(5
...
75
• Hence the p-value= Pr(|T10 | ≥ 0
...
47
• Hence H0 : µA = µB is not rejected at level α = 0
...
test( y[x=="A"],y[x=="B"], var
...
7458, df = 10, p-value = 0
...
570623 4
...
13333 22
...
7
Comparison to the randomization test: We could also use the twosample t-statistic in a randomization test
...
, (YA , YB )(nsim) }
that we could have seen under H0 and different treatment assignments
• compare the null distribution of the outcomes to the observed outcome
...
0

0
...
2

0
...
4

CHAPTER 2
...
7: The t-based null distribution for the wheat example
The comparison can be made using any statistic we like
...

Recall, to obtain a sample of t(YA , YB ) from the null distribution,
we
1
...
Compute the value of t(YA , YB ) under this treatment assignment and
assuming the null hypothesis
...
, t(nsim)
To obtain the p-value, we compare tobs to this empirical distribution:
p-value =
t
...
sim<-real()
for(nsim in 1:5000) {

#(|t(j) | ≥ |tobs |)
nsim

33

0
...
1

Density
0
...
3

0
...
COMPARING TWO TREATMENTS

−6

−4

−2

0
t(YA,YB)

2

4

6

Figure 2
...
stat
...
test(y[xsim=="B"],y[xsim=="A"],var
...
stat
...
stat
...
75)
= 0
...
47 = Pr(|TnA +nB −2 | ≥ 0
...
COMPARING TWO TREATMENTS

34

(1) Data are independent samples
(2) Each population is normally distributed
(3) The two populations have the same variance
• Imagined Universes:
– Randomization Test: Numerical responses remain fixed, we
imagine only hypothetical treatment assignments
– t-test: Treatment assignments remain fixed, we imagine a
population of different experimental units and/or conditions, giving different numerical responses
...

– t-test: under our assumptions, inference claims to be generalizable to other units / conditions, i
...
to a larger population
...

Pn

i=1

Yi , Y i ∈

Laplace (1800s): Used normal distribution to model measurement error in
experiments
...

Gosset/Student (1908): Derived the t-distribution
...

CHAPTER 2
...
”
Box and Andersen (1955): Approximation of randomization null distributions (and a long discussion)
...
05 level
...
, YnA A ∼ i
...
d
...
, YnB B ∼ i
...
d
...

Consider evaluating whether δ is a reasonable value for the mean difference:
H 0 : µA − µB = δ
H1 : µA − µB 6= δ
Under H0 , you should be able to show that
(Y¯A − Y¯B ) − δ
p
∼ tnA +nB −2
sp 1/nA + 1/nB
Thus a given difference δ is accepted at level α iff
|¯
y − y¯B − δ|
pA
sp 1/nA + 1/nB
r
(¯
yA − y¯B ) − sp

≤ t1−α/2,nA +nB −2

1
1
+
t1−α/2,nA +nB −2 ≤ δ
nA nB

r
≤ (¯
yA − y¯B ) + sp

Wheat example: A 95% C
...
for µA − µB is (−9
...
77)
...
COMPARING TWO TREATMENTS

36

Questions:
• What does the fact that 0 is in the interval say about H0 : µA = µB ?
• What is the interpretation of this interval?
• Could we have constructed an interval via randomization tests?

2
...

Two sample t-test:
• H 0 : µA = µB

H1 : µA 6= µB

• Gather data
• Perform a level α hypothesis test: reject H0 if
|tobs | ≥ t1−α/2,nA +nB −2
Recall, if α = 0
...
05, or more precisely:
Pr(type I error|H0 true) = Pr(reject H0 |H0 true) = 0
...
g
...
0001 and µB −µA = 10, 000
are both ways instances of the alternative hypothesis
...
0001) < Pr(reject H0 |µB − µA = 10, 000)

CHAPTER 2
...
For example, in
the case of the two-sample test, for a specific difference δ, we may ask what
is:
1 − Pr(type II error|µB − µA = δ) = Pr(reject H0 |µB − µA = δ)?
We define the power of a two-sample hypothesis test against a specific
alternative to be:
Power(δ, σ, nA , nB ) = Pr(reject H0 | µB − µA = δ)

= Pr(|t(YA , YB )| ≥ t1−α/2,nA +nB −2 µB − µA = δ)
...

However, now we want to work out the probability of getting a value of
the t-statistic greater than this critical value, when a specific alternative
hypothesis is true
...

If we suppose YA1 ,
...
i
...
normal(µA , σ) and YB1 ,
...
i
...
normal(µB , σ), where µB − µA = δ then we need to know the distribution of
Y¯B − Y¯A
t(YA , YB ) = q

...

1
nB

(∗)

CHAPTER 2
...
More specifically, if µB − µA = δ then


δ

t(YA , YB ) ∼ t∗nA +nB −2  q
σ n1A + n1B
|
{z
}
non-centrality
parameter

2
...
1

The non-central t-distribution

A noncentral t-distribution random variable can be represented as
Z +γ
T =p
X/ν
where
• γ is a constant;
• Z is standard normal;
• X is χ2 with ν degrees of freedom, independent of Z
...

Self-check question: Derive the above result about the distribution of
t(YA , YB )
...

In fact, it can be shown that
pν

δ

E(t(YA , YB ) | µB − µA = δ) = q
σ n1A +

×
1
nB

Γ( ν−1
)
2
ν
Γ( 2 )

2

where ν = nA + nB − 2, the degrees of freedom, and Γ(x) is the Gamma
function, a generalization of the factorial:

39

0
...
COMPARING TWO TREATMENTS

0
...
1

0
...
3

gamma=0
gamma=1
gamma=2

−2

0

2
t

4

6

Figure 2
...

• Γ(1) = 1, Γ( 12 ) =

√

π
...

nA nB
Hence



δ
Y¯ − Y¯A
qB
∼ normal  q
σ n1A + n1B
σ n1A +


1
nB

, 1
...
COMPARING TWO TREATMENTS

40

We also know that for large nA , nB , s ≈ σ, so non-central t-distribution will
(for large enough nA , nB ) look approximately normal with
p
• mean δ/(σ (1/nA ) + (1/nB ));
• standard deviation 1
...

2
...
2

Computing the Power of a test

Recall our level-α testing procedure:
1
...

2
...

3
...

For this procedure, you have shown in the homework that
Pr( reject H0 |µB − µA = 0) =
=
=
=

Pr( p-value ≤ α|µB − µA = 0)
Pr(|t(YA , YB )| ≥ t1−α/2,nA +nB −2 |H0 )
Pr(|TnA +nB −2 | ≥ t1−α/2,nA +nB −2 )
α

But what is the probability of rejection under Hδ : µB − µA = δ? (Hopefully
this is bigger than α! )

Pr(reject H0 | µB − µA = δ) = Pr |t(YA , YB )| > t1−α/2,nA +nB −2 µB − µA = δ

= Pr |Tn∗A +nB −2,γ | > t1−α/2,nA +nB −2

= Pr Tn∗A +nB −2,γ > t1−α/2,nA +nB −2

+ Pr Tn∗A +nB −2,γ < −t1−α/2,nA +nB −2

= [1 − Pr Tn∗A +nB −2,γ < t1−α/2,nA +nB −2 ]

+ Pr Tn∗A +nB −2,γ < −t1−α/2,nA +nB −2

41

0
...
COMPARING TWO TREATMENTS

0
...
1

0
...
3

gamma=0
gamma=1

−2

0

2
t

4

6

Figure 2
...
f
...
If we have a rough idea of δ and σ we can evaluate the
power using this formula
...
rcrit <- qt( 1-alpha/2 , nA + nB -2 )
t
...
power <- pt(-t
...
gamma ) +
1- pt(t
...
gamma )
The normal approximation is given by using:
t
...
power <- pnorm(-t
...
gamma ) +
1- pnorm(t
...
gamma )
This will be very reasonable for large nA , nB
...

CHAPTER 2
...
rcrit, nA + nB − 2, ncp=t
...
rcrit, nA + nB − 2, ncp=t
...
However, including both
terms is not only correct, it also means that we can use the same R code for
positive and negative δ values
...

Example: Suppose
• µB − µA = 2
...
07, σ = 5
...
05 for such an
experiment?
> t
...
gamma
[1] 0
...
rcrit<-qt(1-alpha/2,nA+nB-2)
> t
...
228139
> t
...
rcrit,nA+nB-2, ncp=t
...
rcrit,nA+nB-2, ncp=t
...
power
[1] 0
...
gamma<- delta/(sigma*sqrt(1/nA +1/nB))
> t
...
054669
> t
...
rcrit
[1] 2
...
COMPARING TWO TREATMENTS

43

> t
...
rcrit,nA+nB-2,ncp=t
...
rcrit,nA+nB-2,ncp=t
...
power
[1] 0
...
Suppose an agricultural company
wants to know whether or not there are any effects of a soil additive on crop
yield
...
In practice, a difference in yield of less than
3 units is considered inconsequential
...
How many plots should
be in their study?
Procedure:
• CRD with n plots assigned to each of the two treatments;
• Declare a difference between A and B if |t(YA , YB )| > t
...

Question: What is the probability that a difference is declared, assuming
• δ = 3;
• σ = 6;
• various values of n
...
11
...
10

Checking Assumptions of a two-sample
t-test

Recall that in a two sample t-test we make three assumptions:

44

0
...
COMPARING TWO TREATMENTS

0
...
1

p(t)
0
...
3

5
20
80

−4

−2

0
t

2

4

6

0
...
4
0
...
8

−6

20

40

60

80

100

nA=nB
Figure 2
...

CHAPTER 2
...

(3) Variances of the two groups are equal
(Strictly speaking we only require this to hold under the null hypothesis of no
difference
...
However we would need to do something else if we wanted to compute
the power, e
...
simulation )

Assumption (1) will generally hold if the experiment is properly randomized
(and blocks are accounted for, more on this later)
...

2

The

2

−1
2

nA

2

is a continuity correction
...
Thus
we plot the pairs
(z

1
− 12
nA

, yA(1) ),
...

nA

2

Assumption (3) may be checked roughly by fitting straight lines to the probability plots and examining their slopes (Why?)
Formal hypothesis tests for equal variances may also be performed
...

14

●

●

●
●

●
●

●

●

●

●

●
●
●

●

●

●

●

Sample Quantiles
−1
...
6 −0
...
0
−0
...
5

●

Sample Quantiles
−0
...
5 1
...
0
0
...
0
Theoretical Quantiles

−1
...
0
1
...
0
0
...
0
Theoretical Quantiles

●

●
●

●

−1
...
0
1
...
0
0
...
0
Theoretical Quantiles

−1
...
0
1
...
0
0
...
0
Theoretical Quantiles

●

●

●

●

●

●

Sample Quantiles
−0
...
5 1
...
0
0
...
0
Theoretical Quantiles

●
●

●

●

−1
...
0
1
...
0
0
...
0
Theoretical Quantiles

−1
...
0
1
...
0
0
...
0
Theoretical Quantiles

●

Sample Quantiles
−1
...
4 0
...
0
0
...
0
Theoretical Quantiles

●

●

●
●
●

●

−1
...
0
1
...
12: Normal scores plots
...
0
−0
...
0

●

●
●

Sample Quantiles
−2
...
0
0
...
0
1
...
0

−1
...
0
1
...
5
−0
...
5
0
...
5

Sample Quantiles
−1
...
0
1
...
5 −1
...
5

●
●

●

Sample Quantiles
−0
...
5 1
...
4
−1
...
6

CHAPTER 2
...
0
0
...
0
Theoretical Quantiles

CHAPTER 2
...
10
...

This is known as Welch’s approximation; it may not give an integer d
...

Note: This t-distribution is not, in fact the exact sampling distribution of
tdiff (yA , yB ) under the null hypothesis that µA = µB , and σA2 6= σB2
...
So we are stuck if we want an exact answer! (See Behrens-Fisher
problem
...

2
...
2

Which two-sample t-test to use? tdiff (yA , yB )
vs
...

• If nA > nB , but σA < σB , and µA = µB then the two sample test based
on comparing t(yA , yB ) to a t-distribution on nA + nB − 2 d
...
will
reject more than 5% of the time
...
e
...
COMPARING TWO TREATMENTS

48

is scientifically relevant, then we are computing a valid p-value,
and this higher rejection rate is a good thing! Since when the
variances are unequal the null hypothesis is false
...

Likewise we will underestimate the probability of Type I error
...

In short: one should be careful about applying the test based on t(yA , yB )
if the sample standard deviations appear very different, and it is not reasonable to assume equal means and variances under the null hypothesis
...
1

Introduction to ANOVA

Example (Bacterial growth):
• Background: Bacteria grows on meat products stored in plastic wrap
...

• Treatments: Meat wrapped in
1
...

3
...

ambient air;
a vacuum;
1%CO, 40%O2 , 59%N ;
100%CO2
...
Bacterial count was measured for each
steak after 9 days of storage at 4◦ C
...
48
5
...
26
3
...
19

...
04

...
COMPARING SEVERAL TREATMENTS

log bact count/cm2
4
5
6
7

●
●

50

●
●
●

●
●
●
●

1
2
3
4

●
●
●

●
●

3

●

●

1
...
5

2
...
5
Treatment

3
...
5

4
...
1: Bacteria data
• Question: Is there significant variation due to treatment?
Possible analysis method: Perform paired t-tests of
H0ij : µi = µj

versus

H1ij : µi 6= µj

for each of the 42 = 6 possible pairwise comparisons
...

Problem: If there is no treatment effect at all, then

Pr(reject H0ij

Pr(reject H0ij |H0ij true) = α
some i, j|H0ij true all i, j) = 1 − Pr( accept H0ij all i, j|H0ij true all i, j)
Y
≈ 1 − (1 − α)
i
If α = 0
...
956 = 0
...
COMPARING SEVERAL TREATMENTS

51

So, even though the pairwise error rate is 0
...
26
...
For now, we will discuss a method of testing
the global hypothesis of no variation due to treatment:
H0 : µi = µj for all i 6= j

versus

H1 : µi 6= µj for some i 6= j

To do this, we will compare treatment variability to experimental variability
...

3
...
1

A model for treatment variation

Data: yi,j = measurement from the jth replicate under th ith treatment
...
, t indexes treatments
j = 1,
...

Treatment means model:
yi,j = µi + i,j
E[i,j ] = 0
V [i,j ] = σ 2
• µi is the ith treatment mean,
• i,j represents within treatment variation/error/noise
...
, τt are the treatment effects, representing between treatment variation

CHAPTER 3
...

P
In this model, we typically restrict
τi = 0 (Why?) The above represent
two parameterizations of the same model:
µj = µ + τ j ⇔ τ j = µj − µ
Reduced model:
yi,j = µ + i,j
E[i,j ] = 0
V [i,j ] = σ 2
This is a special case of the above two models with
• µ = µ1 = · · · µt , or
• τi = 0 for all i
...

3
...
2

Model Fitting

What are good estimates of the parameters? One criteria used to evaluate
different values of µ = {µ1 ,
...

Estimating treatment means: SSE is quadratic → convex
...

minimizer µ
r

∂ X
∂
SSE(µ) =
(yi,j − µi )2
∂µi
∂µi j=1
X
= −2
(yi,j − µi )
= −2r(¯
yi· − µi )

CHAPTER 3
...
, y¯t· }
...
If we
have more than one group, we want to pool our estimates to be more precise:
(r − 1)s21 + · · · + (r − 1)s2t
(r − 1) + · · · + (r − 1)
P
P
(y1j − y¯1· )2 + · · · + (y1j − y¯1· )2
=
t(r − 1)
PP
(yi,j − µ
ˆi )2
=
t(r − 1)
ˆ
SSE(µ)
=
≡ M SE
t(r − 1)
Consider the following assumptions:
s2 =

A0: Data are independently sampled from their respective populations
A1: Populations have the same variance
A2: Populations are normally distributed
ˆ is an unbiased estimator of µ;
• A0 → µ
• A0+A1 → s2 is an unbiased estimator of σ 2 ;
ˆ s2 ) are the minimum variance unbiased esti• A0+A1+A2 → (µ,
mators of (µ, σ 2 )
...

ˆ ≡ SSE is a measure of within
Within-treatment variability: SSE(µ)
treatment variability:
SSE =

t X
r
X

(yi,j − y¯i· )2

i=1 j=1

It is the sum of squared deviation of observation values from their group
mean
...
COMPARING SEVERAL TREATMENTS

54

Between-treatment variability: The analogous measure of between treatment variability is
SST =

t X
r
X

2

(¯
yi· − y¯·· ) = r

i=1 j=1

t
X

(¯
yi· − y¯·· )2

i=1

We call this the treatment sum of squares
...
1
...
, µt } all equal versus H1 : {µ1 ,
...
, µt } not all equal ⇔

t
X

(µi − µ
¯)2 > 0

i=1

Probabilistically,
t
X

2

(µi − µ
¯) > 0 ⇒ a large

i=1

t
X

(¯
yi· − y¯·· )2 will probably be observed

i=1

Inductively,
a large

X
i=1t

(¯
yi· − y¯·· )2 observed ⇒

X

(µi − µ
¯)2 > 0 is plausible

i=1t

So a large value of SST or M ST gives evidence that there are differences
between the true treatment means
...
Then
√
√
E(Y¯i· ) = µ → E( rY¯i· ) = rµ
√
V (Y¯i· ) = σ 2 /r → V ( rY¯i· ) = σ 2
√
√
So under the null rY¯1· ,
...
Recall that if X1 ,
...
i
...
P , then an

CHAPTER 3
...
Therefore,
variance (Xi − X)
P√ ¯ √ ¯ 2
√
( r Yi − r Y )
is an unbiased estimate of V ( rY¯i ) = σ 2
t−1
But
P
P√ ¯ √ ¯ 2
( r Yi − r Y )
r (Y¯i − Y¯ )2
=
t−1
t−1
SST
=
t−1
= M ST,
so E(M ST |H0 ) = σ 2
...

Lets summarize our potential estimators of σ 2 :
• Under H0 :
– E(M SE|H0 ) = σ 2
– E(M ST |H0 ) = σ 2
• Under H1 :
– E(M SE|H1 ) = σ 2
– E(M ST |H1 ) = σ 2 + rvτ2
Here is an idea for a test statistic:
• If H0 is true:

CHAPTER 3
...

Thus the test statistic F (Y) = M ST /M SE is sensitive to deviations from
the null, and can be used to measure evidence against H0
...

Example (Bacterial Growth):
y1<-c(7
...
98,7
...
26,5
...
80)
y3<-c(7
...
33,7
...
51,2
...
66)
r<-3
t<-4
ybar
...
t<- c(var(y1),var(y2),var(y3),var(y4))
SSE<- sum( (r-1)*s2
...
t-mean(ybar
...
8728
0
...
958
0
...
58

The F-ratio seems pretty far away from 1
...

CHAPTER 3
...

To rule out H0 , we need to look at how likely such a treatment assignment
is
...
66 6
...
80 5
...
44 5
...
41 7
...
04 3
...
91 3
...
obs<-anova(lm(y~as
...
obs
[1] 94
...
null<-NULL
for(nsim in 1:1000) {
x
...
null<-c(F
...
factor(x
...
null>=F
...
null)
[1] 92
...

3
...
4

Partitioning sums of squares

Intuition:
Total variability = variability due to treatment
+ variability due to other things
= between treatment variability + within treatment variability
Proposition:
SSTotal ≡

t X
r
X
i=1 j=1

(yi,j − y¯·· )2 = SST + SSE

CHAPTER 3
...
0

0
...
2
0
...
4

Histogram of F
...
null
Figure 3
...
COMPARING SEVERAL TREATMENTS

59

for all j
...
e
...

• the residual for observation {i, j} is (yi,j − y¯i· )
...

i

j

If we believe H0 ,
• our fitted value of yi,j is y¯·· , i
...
our predicted value of another
observation in any group is y¯·· = µ
ˆ
...

• the model lack-of-fit is measured by the residual sum of squares=
sum of squared residuals=
XX
(yi,j − y¯·· )2 = SST
...
The variance explained by different sources can be compared
and analyzed
...
COMPARING SEVERAL TREATMENTS

3
...
5

60

The ANOVA table

ANOVA = Analysis of Variance
Source of variation
Treatment
Noise
Total

Bacteria example:
Source of variation
Treatment
Noise
Total

Degrees of Freedom Sums of Squares
t−1
SST
t(r − 1)
SSE
tr − 1
SST otal

Mean Squares
M ST = SST
t−1
SSE
M SE = t(r−1)

F-ratio
M ST /M SE

Degrees of Freedom Sums of Squares Mean Squares F-ratio
3
32
...
958
94
...
927
0
...
800

Using the ANOVA table:
• No model assumptions → table gives a descriptive decomposition of
variation
...

• Assuming treatments were randomly assigned → hypothesis tests, pvalues
...

Understanding Degrees of Freedom: Consider an experiment with t
treatments/groups :

Data
Group means 




y1,1
...
, y2,r
y¯2,·
=
y¯ =

tr
t

...




...




yt,1
...
COMPARING SEVERAL TREATMENTS

61

We can “decompose” each observation as follows:
yi,j = y¯ + (¯
yi − y¯) + (yi,j − y¯i )
This leads to
(yi,j − y¯)
=
(¯
yi − y¯)
+
(yi,j − y¯i )
total variation = between group variation + within group variation
All data can be decomposed this way, leading to the following vectors of
length tr :
Total
y1,1 − y¯
...

...

...

y2,1 − y¯
...

...

y2,r − y¯
...

...
− y¯
...
− y¯
...

...

(¯
y1
...
)
(¯
y2
...
)

...

...
− y¯
...

...
)
(y1,2 − y¯1
...

...

(y1,r − y¯1
...
)

...

...
)

...

yt,1 − y¯
...

...

yt,r − y¯
...
− y¯
...

...

(¯
yt
...
)

+
+
+
+
+

(yt,1 − y¯t
...

...

(yt,r − y¯t
...
COMPARING SEVERAL TREATMENTS

62

To understand this latest definition, consider x1 , x2 , x3 and x¯ = (x1 + x2 +
x3 )/3:
 


c1
x1 − x¯
 x2 − x¯  =  c2 
x3 − x¯
c3
How many degrees of freedom does this vector have?
How many components can vary independently ?

c1 + c2 =
=
=
=
=

x1 + x2 − 2(x1 + x2 + x3 )/3
x1 /3 + x2 /3 − 2x3 /3
x1 /3 + x2 /3 + x3 /3 − x3
x¯ − x3
−c3

So c1 , c2 , c3 can’t all be independently varied, only two at a time can be
arbitrarily changed
...
In general,


x1 − x¯
 ··· 
xm − x¯
is an m-dimensional vector in an m − 1 dimensional subspace, having m − 1
degrees of freedom
...
Note that
• dof = dimension of the space the vector lies in
• SS = squared length of the vector

CHAPTER 3
...
1
...
 
 
...

 
...
 
 
...

...

¯ trt = 
¯ ·· = 1¯
y=
y·· = 
 y
 y
 
...
 
 
...
 
 
...

 
...
 
 
...
Recall that
two vectors u and v are orthogonal/perpendicular/at right angles if
u·v ≡

m
X

ui v i = 0

i=1

We have already shown that b and c are orthogonal:
b·c =

rt
X

bl cl

l=1

=

t X
r
X

(¯
yt· − y¯·· )(yi,j − y¯i· )

i=1 j=1

=

t
X
i=1

=

t
X

(¯
yt· − y¯·· )

r
X

(yi,j − y¯i· )

j=1

(¯
yt· − y¯·· ) × 0

i=1

= 0
So the vector a is the vector sum of two orthogonal vectors
...
COMPARING SEVERAL TREATMENTS

¯ ·· )
a = (y − y
¯ trt )
c = (y − y

¯ ·· )
b = (¯
ytrt − y
Now recall
¯ ·· ||2 = SSTotal
• ||a||2 = ||y − y
• ||b||2 = ||¯
ytrt − y¯·· ||2 = SST
¯ trt ||2 = SSE
• ||c||2 = ||y − y
What do we know about right triangles?
||a||2
= ||b||2 + ||c||2
SSTotal = SST + SSE
So the ANOVA decomposition is an application of Pythagoras’ Theorem
...

• (¯
ytrt − y
This last bit means the degrees of freedom must add, so
¯ ·· ) = (t − 1) + t(r − 1) = tr − 1
df(y − y

64

CHAPTER 3
...
1
...
Otherwise it is said to be unbalanced
...
, t, j = 1,
...

How might the ANOVA decomposition work in this case?
Partitioning sources of variation, unequal sample sizes:
Total
y1,1 − y¯
...

...

...

y2,1 − y¯
...

...

y2,r2 − y¯
...

...
− y¯
...
− y¯
...

...

(¯
y1
...
)
(¯
y2
...
)

...

...
− y¯
...

...
)
(y1,2 − y¯1
...

...

(y1,r1 − y¯1
...
)

...

...
)

...

yt,1 − y¯
...

...

yt,rt − y¯
...
− y¯
...

...

(¯
yt
...
)

+
+
+
+
+

(yt,1 − y¯t
...

...

(yt,rt − y¯t
...
Might we take a different average?

CHAPTER 3
...
First, lets check orthogonality:
b·c =

ri
t X
X

(¯
yt· − y¯·· )(yi,j − y¯i· )

i=1 j=1

=

t
X

(¯
yt· − y¯·· )

i=1

=

t
X

ri
X

(yi,j − y¯i· )

j=1

(¯
yt· − y¯·· ) × 0 = 0

i=1

Note: c must be orthogonal to any vector that is constant “within a group
...
What about degrees of freedom?
P
• dof(c) = ti=1 (ri − 1) = N − t should be clear
• dof(a) = N − 1 should be clear
• dof(b) = ?
Note that in general,
t

1X
y¯i· 6= y¯··
t i=1
But in the vector b we have ri copies of (¯
y· − y¯·· ), and
1 X
P
ri y¯i· = y¯··
ri
and so the vector b does sum to zero
...

CHAPTER 3
...
of Freedom Sum of Squares

Mean Square

F-Ratio

Treatment

t−1

SST

MST= SST
t−1

MST/MSE

Noise

N −t

SSE

Total

SSTotal

N −1

MSE =

SSE
N −t

Now suppose the following model is correct:
yi,j = µi + i,j

V (i,j ) = σ 2

Does MSE still estimate σ 2 ?
M SE = SST /(N − t)
Pr1
Pt
¯1· )2 + · · · rj=1
(yt,j − y¯t· )2
j=1 (y1,j − y
=
(r1 − 1) + · · · + (rt − 1)
(r1 − 1)s21 + · · · + (rt − 1)s2t
=
(r1 − 1) + · · · + (rt − 1)
So MSE is a weighted average of a bunch of unbiased estimates of σ 2 , so it
is still unbiased
...
One can show
• E(MSE) = σ 2
• E(MST) = σ 2 + vτ2 , where
– vτ2 =

Pt

2
i=1 ri τi

t−1

– τ i = µi − µ
¯
– µ=

Pt
rµ
i=1
P i i
...

CHAPTER 3
...
1
...
However, due to resource constraints, rA = 4, rB = 6, rC =
6, rD = 8
...
For the second
and third we need a sampling model:
yi,j = µi + i,j
1,1
...
i
...
normal(0, σ)
This model implies
• independence of errors
• constant variance
• normally distributed data
Another way to write it:
yA,1 ,
...
i
...
normal(µA , σ)
yB,1 ,
...
i
...
normal(µB , σ)
yC,1 ,
...
i
...
normal(µC , σ)
yD,1 ,
...
i
...
normal(µD , σ)
So we are viewing the 4 samples under A as a random sample from the
population of coagulation times that would be present if all animals got A
(and similarly for samples under B,C and D)
...
COMPARING SEVERAL TREATMENTS

70
coagulation time
60
65

C

B

A
A
A
A

69

C
C
C

B
B
B
B
B

D
D
D
D
D
D
D
diet

Figure 3
...
0
76
...
571
Residuals 20 112
...
6

The F-stat is large, but how unlikely is it under H0 ?
Two viewpoints on null distributions:
• The 24 animals we selected are fixed
...

• The 24 animals we selected are random samples from a larger population of animals
...

Randomization tests correspond to the first viewpoint, sampling theory to
the second
...
COMPARING SEVERAL TREATMENTS

3
...
9

70

Sampling distribution of the F -statistic

Recall the χ2 distribution:
Y1 ,
...
i
...
normal(µ, σ) ⇒

1 X
(Yi − Y¯ )2 ∼ χ2n−1
2
σ

Also,

X1 ∼ χ2k1 
X2 ∼ χ2k2
⇒ X1 + X2 ∼ χ2k1 +k2

X1 ⊥ X2

(Null) distribution of SSE:
PP
(Yij −Y¯i· )2
σ2

=
∼
∼

1
σ2

P
(Y1j − Y¯1· )2 +
χ2r1 −1
+
2
χN −t

···
+
·
·
·
+
P
(recall (ri − 1) = N − t)

So SSE/σ 2 ∼ χ2N −t
...
COMPARING SEVERAL TREATMENTS

71

Introducing the F -distribution: If

X1 ∼ χ2k1 
X1 /k1
X2 ∼ χ2k2
⇒
∼ Fk1 ,k2

X2 /k2
X1 ⊥ X2
Fk1 ,k2 is the “F -distribution with k1 and k2 degrees of freedom
...
How to
determine Fcrit ?
Level-α testing procedure:
1
...
Construct ANOVA table;
3
...
trt, dof
...

So Fcrit is the 1−α quantile of an Ft−1,N −t distribution
...

Back to the example:
> anova(lm(ctime~diet))
Analysis of Variance Table
Response: ctime
Df Sum Sq Mean Sq F value
Pr(>F)
diet
3 228
...
0 13
...
658e-05 ***

72

0
...
7

CHAPTER 3
...
0

0
...
2

density
0
...
4

0
...
6

0
...
0

0

0
...
4

F(3,20)
F(3,10)
F(3,5)
F(3,2)

0

5

10
F

CHAPTER 3
...
0

0
...
4

0
...
5: Normal-theory and randomization distributions of the F -statistic
Residuals 20

112
...
6

Fobs<-anova(lm(ctime~diet))$F[1]
Fsim<-NULL
for(nsim in 1:1000) {
diet
...
sim))$F[1] )
}
> mean(Fsim>=Fobs)
[1] 2e-04
> 1-pf(Fobs,3,20)
[1] 4
...
1
...
We can explore this further by making treatment comparisons
...
COMPARING SEVERAL TREATMENTS

74

If H0 is rejected we
• estimate µi with y¯i ;
• estimate σi2 with
– s2i : if variances are very unequal, this might be a better estimate
...

Standard practice: Unless strong evidence to the contrary, we typically assume V (Yi,j ) = V (Yk,l ) = σ 2 , and use s2 ≡ M SE to estimate σ 2
...

Confidence intervals for treatment means: Similar to the one sample
case
...
As a
result,
Y¯i ± SE(Y¯i· ) × t1−α/2,N −t
is a 100 × (1 − α)% confidence interval for µi
...
COMPARING SEVERAL TREATMENTS

75

A handy rule of thumb: If θˆ is an estimator of θ, then in many situations
ˆ
θˆ ± 2 × SE(θ)
is an approximate 95% CI for θ
...
975,20 = qt(
...
1
p
p
√
• SE(Y¯i ) = s2 /ri = 5
...
37/ ri
Diet µdiet ) ri
C
68
6
B
66
6
A
61
4
D
61
8

3
...
11

SE(ˆ
µdiet )
0
...
97
1
...
84

95% CI
(65
...
0)
(63
...
0)
(58
...
5)
(59
...
8)

Power calculations for the F-test

Recall, the power of a test is the probability of rejecting H0 when it is false
...

Power(µ, σ, r) = Pr(reject H0 |µ, σ, r)
= Pr(Fobs > F1−α,t−1,N −t |µ, σ, r)
Sampling distributions:
Y1,1
...
i
...
normal(µ1 , σ)

...

Yt,1
...
i
...
normal(µt , σ)

Under this sampling model, F = F (Y) has a non-central F distribution
with

CHAPTER 3
...
In many texts, power is
expressed as a function of the quantity Φ
r P
r τi2 p
= λ/t
Φ=
σ2t
Lets try to understand what Φ represents:
P 2
τi /t
2
Φ =
2
σ /r
treatment variation
=
experimental uncertainty
= treatment variation × experimental precision
Note that “treatment variation” means “average squared treatment effect
size”
...

Power(µ, σ, r) = Pr(reject H0 |µ, σ, r)
= Pr(Fobs > F1−α,t−1,N −t |µ, σ, r)
= 1-pf( qf(1-alpha,t-1,N-t) , t-1 , N-t , ncp=lambda )

CHAPTER 3
...
0

F
...
5

4
...
4 0
...
6 0
...
8 0
...
0

t=4, alpha=0
...
between/var
...
6: Power

2

4

6
r

8

10

0
...
85
0
...
0

F
...
5

4
...
05, var
...
within=2

2

4

6
r

8

Figure 3
...
COMPARING SEVERAL TREATMENTS

3
...
0
76
...
571 4
...
0
5
...
But
what are the differences?

3
...
1

Contrasts

Differences between sets of means can be evaluated by estimating contrasts
...
We can estimate
them, and obtain standard errors for them
...

Parameter estimates: Let
Cˆ =

t
X
i=1

k i µi =

t
X

ki y¯i·

i=1

ˆ = C, so Cˆ is a unbiased estimator of C
...
COMPARING SEVERAL TREATMENTS

79

Standard errors:
t
X

ˆ =
V (C)

i=1
t
X

=

V (ki y¯i· )
ki2 σ 2 /ri

i=1

= σ

2

t
X

ki2 /ri

i=1

ˆ is
So an estimate of V (C)
s2C

=s

2

t
X

ki2 /ri

i=1

The standard error of a contrast is an estimate of its standard deviation
s
X k2
i
ˆ =s
SE(C)
ri
t-distributions for contrasts: Consider
Pt
ki y¯i·
Cˆ
= pi=1
P 2
ˆ
s
ki /ri
SE(C)
If the data are normally distributed , then under H0 : C =
Cˆ
ˆ
SE(C)

P

ki µi = 0,

∼ tN −t

Exercise: Prove this result
Hypothesis test:
• H0 : C = 0 versus H1 : C 6= 0
...

• Level-α test : Reject H0 if |C/SE(
C)|
ˆ
ˆ
• p-value: Pr(|tN −t > |C(y)/SE(
C(y))|)
= 2*(1-pt( abs(c_hat/se_c_hat),N-t))
...
COMPARING SEVERAL TREATMENTS

80

Example: Recall in the coagulation example µ
ˆA = 61, µ
ˆB = 66, and their
95% confidence intervals were (58
...
5) and (63
...
0)
...

Hypothesis test: H0 : C = 0
...
53
= 3
...
27,20)) = 0
...

Confidence intervals: Based on the normality assumptions, a 100 × (1 −
α)% confidence interval for C is given by
ˆ
Cˆ ± t1−α/2,N −t × SE(C)

3
...
2

Orthogonal Contrasts

What use are contrasts? Consider the data in Figure 3
...

> anova(lm(y~as
...
factor(x) 4 87
...
900 29
...
690e-05 ***
Residuals
10 7
...
748

In this case, the treatment levels have an ordering to them (this is not always
the case)
...

• They are orthogonal : Ci · Cj = 0
...
COMPARING SEVERAL TREATMENTS

81

20

●
●

12

grain yield
14
16
18

●
●
●
●

●
●
●

●
●
●

●
●
●

10

20

30
plant density

40

50

Figure 3
...
This is the
“sum to zero” part, i
...
Ci · 1 = 0 for each contrast
...
, µ5 are monotonically increasing or monotonically decreasing
...

• C2 is big if “there is a bump”, up or down, in the middle of the treatments, i
...
C2 is measuring the quadratic component of the relationship
...

You can produce these contrast coefficients in R:
> contr
...
L

...
C
^4
[1,] -6
...
5345225 -3
...
1195229
[2,] -3
...
2672612 6
...
4780914

CHAPTER 3
...
481950e-18 -0
...
893692e-16 0
...
162278e-01 -0
...
324555e-01 -0
...
324555e-01 0
...
162278e-01 0
...
This doesn’t
change their sum, and it doesn’t change the orthogonality
...
means<-tapply(y,x,mean)
> trt
...
hat<-trt
...
poly(5)
> c
...
L

...
C
^4
[1,] 3
...
741657 0
...
83666
> 3*c
...
L
...
C ^4
[1,] 43
...
3 2
...
hat^2)
[1] 87
...
Recall, we represented treatment variation as a
vector that lived in t − 1 dimensional space
...
20 43
...
75
Quad trt 1 42
...
00 56
...
30 0
...
40
Quart trt 1 2
...
10 2
...
60 21
...
28
Error
10 7
...
75
Total
14 95
...

CHAPTER 3
...
2
...
e
...
Should we perform hypothesis tests for
all comparisons?
The more hypotheses we test, the higher the probability that at least one
of them will be rejected, regardless of their validity
...

• Comparison-wise type I error rate : Pr(reject H0ij |H0ij is true )
...
Gather data
2
...
Reject each H0ij for which |tij | > t1−αC /2,N −t
The comparison-wise type I error rate is of course
P (|ti j| > t1−αC /2,N −t |H0ij ) = αC
The experiment-wise type I error rate is the probability that we say
differences between treatments exist when no differences exist:
P (|tij | > tcrit for some i, j |H0 ) ≥ αC
with equality only if there are two treatments total
...
What is the experiment-wise rate in
this analysis procedure?

CHAPTER 3
...
Then f 0 (0) = − log(1 − αC ) and
f (x) ≈ f (0) + xf 0 (0)
1 − (1 − αC )x ≈ (1 − 1) − x log(1 − αC )
≈ xαC for small αC
...
Therefore
P (one or more H0ij rejected |H0 ) ≤

X

P (H0ij rejected |H0 )

i,j

t
=
αC
2
Bonferroni: Bonferroni procedure for controlling experiment-wise type I
error rate is as follows:

1
...

2
...
COMPARING SEVERAL TREATMENTS
where αC = αE /

t
2

85

...

Fisher’s Protected LSD (Fisher’s least-significant-difference method)
Another approach to controlling type I error makes use of the F -test
...
Perform ANOVA and compute F -statistic
2
...

3
...

Under this procedure,
• Experiment-wise type I error rate is αE
• Comparison-wise type I error rate is
P (reject H0ij |H0ij ) = P (F > Fcrit and |tij | > tcrit |H0ij )
= P (|tij | > tcrit |F > Fcrit , H0ij )P (F > Fcrit |H0ij )
– If H0 is also true, this is less than αE , and is generally between
αC × αE and αE , depending on how many treatments there are
...

3
...
COMPARING SEVERAL TREATMENTS

86

A3: {i,j }’s are normally distributed
...
If in addition
H0: µi = µ for all i = 1,
...

then the noncentrality parameter is zero and F = M ST /M SE ∼ Ft−1,N −t
...

What should we do if the model assumptions are not correct?
We could have written
A0 : {i,j } ∼ i
...
d
...
We don’t do this because some violations of
assumptions are more serious than others
...
We will discuss A1 when we talk about
blocking
...

3
...
1

Detecting violations with residuals

Violations of assumptions can be checked via residual analysis
...

ˆij is called the residual for observation i, j
...
COMPARING SEVERAL TREATMENTS

87

Checking Normality Assumptions:
• Histogram:
Make a histogram of ˆij ’s
...
If there are enough observations, graphically compare the
histogram to a N (0, s2 ) distribution
...

• Normal probability, or qq-plot:
If ij ∼ N (0, σ 2 ) then the ordered residuals (ˆ(1) ,
...

How non-normal can a sample from a normal population look? You can
always check yourself by simulating data in R
...
9
...

Data: yij = population total in transect j of site i
...

Data description:
site sample mean
1
33
...
72
3
50
...
24
5
10
...
64
ANOVA:

sample median
17
10
5
2
2
4

sample std dev
50
...
35
107
...
39
19
...
01

CHAPTER 3
...
0 0
...
2 0
...
4 0
...
6

Histogram of y

88

Normal Q−Q Plot
●

2

●

Sample Quantiles
0
1

●
●
●
●
●●●●●
●●●

−1

● ● ●
●

●

●

−2

−1

0
y

1

2

−2

−1

Histogram of y

0
1
Theoretical Quantiles

2

Normal Q−Q Plot
Sample Quantiles
−1
0
1
2

Density
0
...
1 0
...
3 0
...
5

●

●●

−2

●

−2

−1

0
y

1

● ●

●

●

●
●●
●●●
●●●
●
●
●●●
●●●●●
●●
●●●●
●
●
●
●●
●●●
●●●●
●●
●●

●

2

−2

−1
0
1
Theoretical Quantiles

Histogram of y

2

Normal Q−Q Plot

0
...
1

Density
0
...
3

2

●
●● ●
●●●
●
●●
●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●●●
●●●
●●
● ●
●

−2

−1

0
y

1

2

−2

−1
0
1
Theoretical Quantiles

2

Figure 3
...
COMPARING SEVERAL TREATMENTS

site2

0
...
005
0
...
000 0
...
004 0
...
015

site1

89

0

100

200
300
crab[crab[, 1] == i, 2]

400

0

100

400

site4

0
...
000

0
...
04

Density
0
...
010

0
...
015

site3

200
300
crab[crab[, 1] == i, 2]

0

100

200
300
crab[crab[, 1] == i, 2]

400

100

200
300
crab[crab[, 1] == i, 2]

400

0
...
00

0
...
01

0
...
02 0
...
02 0
...
04

site5

0

0

100

200
300
crab[crab[, 1] == i, 2]

400

0

Figure 3
...
012

400

Density
0
...
008

Sample Quantiles
100 200 300

CHAPTER 3
...
000

0

●●

−100

0

100 200 300 400
fit1$res

●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●●●●●

−2 −1 0
1
2
Theoretical Quantiles

Figure 3
...
factor(crab[,1])))
Analysis of Variance Table
Response: crab[, 2]
Df Sum Sq Mean Sq F value Pr(>F)
as
...
9669 0
...

Residual diagnostic plots: See Figure 3
...

Conclusion: Data are clearly not normally distributed
...

(a) tabulate residual variance in each treatment:
Trt
1

...

Sample var
...
, ˆin )
s21

...

t

s2t

CHAPTER 3
...
Consider a randomization test, or converting the data to ranks for the overall
test of treatment effects
...
Fitted Value plots
For many types of data there is a mean-variance relationship: typically, groups with large means tend to have large variances
...
yˆij = y¯i· (fitted value)
...
24
10
...
64
33
...
64
68
...
39
19
...
01
50
...
44
125
...

These differences will be ‘large’ in group i, if group i has a ‘large’
variance; and ‘small’ in group i, if group i has a ‘small’ variance
...

92

400

CHAPTER 3
...
12: Fitted values versus residuals
Reject H0 : V (ij ) = σ 2 for all i, j if F0 > Ft−1,t(n−1),1−α
Crab data:
F0 =

14, 229
= 2
...
95 = 2
...
05 level
...

Crab data: So the assumptions that validate the use of the F-test are
violated
...
3
...
COMPARING SEVERAL TREATMENTS

93

was that if the noise = Xij1 + Xij2 + · · · was the result of the addition of
unobserved additive, independent effects then by the central limit theorem
ij will be approximately normal
...
Also
note that by the central limit theorem the errors should be approximately
normally distributed
...
82 2
...
91 1
...
01 1
...
75 2
...
16 2
...
30 2
...
COMPARING SEVERAL TREATMENTS

●
●

94

●
●

●

1

●
●

2

●
●

3

●
●

●
●

●
●
●

4

5

6

4

5

6

log crab population
−2 0 2 4 6

site

1

2

3
site

Figure 3
...
COMPARING SEVERAL TREATMENTS

95

●
●
●
●
●
●
●
●
● ●●●●●

−2 −1 0
1
2
Theoretical Quantiles

residual
0
2

●
●●
●
●
●
●
●●
●
●
●
●
●●
●●
●●
●●
●

●
●
●

●

●
●
●
●
●
●
●
●
●
●
●

●
●
●

●
●
●

●
●
●
●
●
●
●
●
●
●
●
●
●
●
●

●
●

●
●

●
●
●
●
●

−2

●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●

●

●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●

●● ●
●

−4

−4

Sample Quantiles
−2
0
2

4

●
●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●

4

Normal Q−Q Plot

●

1
...
5
2
...
14: Diagnostics after the log transformation
> anova(lm(log(crab[,2]+1/6)~as
...
factor(crab[, 1])
5 54
...
95 2
...
04604 *
Residuals
144 678
...
71
> anova(lm(log(crab[,2]+1/6)~as
...
factor(crab[, 1])
5 54
...
95 2
...
04604 *
Residuals
144 678
...
71

Other transformations: For data having multiplicative effects , we showed
that
σi ∝ µi ,

CHAPTER 3
...
In general, we may observe:
σi ∝ µαi i
...
the standard deviation of a group depends on the group mean
...
e
...

Consider the class of power transformations, transformations of the
∗
λ
form Yi,j
= Yi,j

...
So if we take λ = 1 − α then
i
we will have stabilized the variances to some extent
...

Estimation of α:
σi ∝ µαi ⇔ σi = cµαi
log σi = log c + α × log µi ,
so log si ≈ log c + α log y¯i·
Thus we may use the following procedure:
(1) Plot log si vs
...

(4) Analyze yij∗ = yij1−αˆ
...
COMPARING SEVERAL TREATMENTS

97

Here are some common transformations:
Mean-var
...

0

1

no transform!

σy ∝

1/2
µi
3/4
µi

∗
yij

yij
1/2
yij

1/2

square root

3/4

1/4

quarter power

yij

σy ∝ µi

1

0

log

log yij

3/2
σy ∝ µi
σy ∝ µ2i

3/2

-1/2

reciproc
...
root

2

-1

reciprocal

σy ∝

=

√

1/2

yij

1/4

−1/2

yij

1/yij

• All the mean-variance relationships here are examples of power-laws
...

• α = 1 is the multiplicative model discussed previously
...

λ

For λ = 0, it’s natural to define the transformation as:
yλ − 1
λ→0
λ
λ
y ln y
=
= ln y
1 λ=0

y ∗(0) = lim y ∗(λ) = lim
λ→0

Note that for a given λ 6= 0 it will not change the results of the ANOVA on
the transformed data if we transform using:
y∗ = yλ

or

y ∗(λ) =

yλ − 1
= ay λ + b
...
COMPARING SEVERAL TREATMENTS

98

To summarize the procedure: If the data present strong evidence of
nonconstant variance,
(1) Plot log si vs
...
If the relationship looks linear, then
(2) Fit a least squares line: lm( log si ∼ log y¯i· )
(3) The slope α
ˆ of the line is an estimate of α
...

This procedure is called the “Box-Cox” transformation (George Box and
David Cox first proposed the method in a paper in 1964)
...

A few words of caution:
• For variance stabilization via the Box-Cox procedure to do much good,
the linear relationship between means and variances should be quite
tight
...
e
...

• Don’t get too carried away with the Box-Cox transformations
...
47
0
...
53 don’t analyze yij = yij , just use yij
...
)
• Remember to make sure that you describe the units of the transformed
data, and make sure that readers of your analysis will be able to understand that the model is additive in the transformed data, but not
in the original data
...

• Try to think about whether the associated non-linear model for yij
makes sense
...
COMPARING SEVERAL TREATMENTS

99

• Don’t assume that the transformation is a magical fix: remember to
look at residuals and diagnostics after you do the transform
...

• Remember that the mean-variance relationship might not be cured by
a transform in the Box-Cox class
...
g
...
d
...
)
• Keep in mind that statisticians disagree on the usefulness of transformations: some regard them as a ‘hack’ more than a ‘cure’:
– It can be argued that if the scientist who collected the data had
a good reason for using certain units, then one should not just
transform the data in order to bang it into an ANOVA-shaped
hole
...
)
• The sad truth: as always you will need to exercise judgment while
performing your analysis
...

Example (Crab data): Looking at the plot of means vs
...
s suggests
α ≈ 1, implying a log-transformation
...

Instead we can use yij∗ = log(yij + 1/6)
...
) For the transformed
data this gives us a ratio of the largest to smallest standard deviation of
approximately 2 which is acceptable based on the rule of 3
...
39
9
...
84
10
...
01
12
...
39
33
...
44
50
...
35
68
...
86
2
...
14
3
...
68
4
...
22
2
...
54
3
...
92
4
...
COMPARING SEVERAL TREATMENTS

●

4
...
0

3
...
5

●

●
●
●

2
...
0

3
...
0

log_mean

Figure 3
...
COMPARING SEVERAL TREATMENTS
lm(formula = log_sd ~ log_mean)
Coefficients:
(Intercept)
log_mean
0
...
9839

101

Chapter 4
Multifactor Designs
4
...

• Treatments:
– Type ∈ {I, II, III}
– Delivery ∈ {A, B, C, D}
• Response: time to death, in minutes
...
But,
which Delivery to use for the experiment testing for Type effects?
which Type to use for the experiment testing for Delivery effects?
To compare different Type×Delivery combinations, we need to do experiments under all 12 treatment combinations
...
e
...
MULTIFACTOR DESIGNS

103

4 assigned to (I, B),

...

It might be helpful to visualize the design as follows:

Type
1
2
3

A
yI,A
yII,A
yIII,A

Delivery
B
C
yI,B
yI,C
yII,B yII,C
yIII,B yIII,C

D
yI,D
yII,D
yIII,D

This type of design is called a factorial design
...

Factors: Categories of treatments
Levels of a factor: the different treatments in a category
So in this case, Type and Delivery are both factors
...

4
...
1

Data analysis:

Lets first look at a series of plots:
Marginal Plots: Based on these marginal plots, it looks like (III, A) would
be the most effective combination
...
But the difference between types I and II seems to depend on delivery
...
Sometimes each group is called a cell
...
Lets take care
of this before we go any further: Plotting means versus standard deviations
on both the raw and log scale gives the relationship in Figure 4
...
Computing
the least squares line gives

CHAPTER 4
...
1: Marginal Plots
...
203
1
...
This suggests a reciprocal transformation, i
...
analyzing
yi,j = 1/ time to death
...
5 shows the mean-variance relationship for the transformed data
...
6)
...
Conversely, looking at Delivery, the experiment is a one
factor ANOVA with 4 treatment levels and 12 reps per treatment
...
MULTIFACTOR DESIGNS

105

B

2
...
5

6

3
...
5

10

4
...
5

A

I

II

III

I

III

D

2

3

4

3

5

4

6

5

7

8

6

9

7

10

C

II

I

II

III

I

Figure 4
...

II

III

106

2

4

6

8

10

12

CHAPTER 4
...
A II
...
B II
...
C II
...
D II
...
0

9

Figure 4
...

●

7

●

●

●
●

●
●
●

−2
...
0

●

log(sds)
−1
...
0

means
5 6

●

3
2

●

●
●

1
...
0
sds

3
...
8

1
...
6
log(means)

Figure 4
...

2
...
MULTIFACTOR DESIGNS

●
●
●

●

●

●

−3
...
2

●

●
●

0
...
04

0
...
4
−3
...
3

0
...
6

●

107

0
...
2

−1
...
4
−1
...
5: Mean-variance relationship for transformed data
...
34877 0
...
621 3
...
30628 0
...
20414 0
...
6401 0
...
45091 0
...
56862 0
...
531 1
...
08643 0
...
MULTIFACTOR DESIGNS

108

10
8

12

10
8

12

●

6
4
2

2

4

6

●

I

II

III

A

B

D

B

0
...
25

0
...
20

0
...
30

0
...
10

0
...
25

0
...
35

0
...
45

C

III

II

III

I

II

III

0
...
2 0
...
4 0
...
A

II
...
A

I
...
B

III
...
C

II
...
C

I
...
6: Plots of transformed poison data

II
...
D

CHAPTER 4
...
The third ANOVA is “valid,” but we’d like a more
specific result: we’d like to know which factors are sources of variation, and
the relative magnitude of their effects
...
e
...

4
...
2

Additive effects model

Yi,j,k = µ + ai + bj + i,j,k ,

i = 1,
...
, t2 , k = 1,
...
, at1 = additive effects of factor 1;
b1 ,
...

Notes:
1
...

2
...
In contrast, the additive model only has

CHAPTER 4
...

Parameter estimation and ANOVA decomposition:
yijk = y¯··· + (¯
yi·· − y¯··· ) + (¯
y·j· − y¯··· ) + (yijk − y¯i·· − y¯·j· + y¯··· )
ˆbj
= µ
ˆ +
a
ˆi
+
+
ˆijk
These are the least-squares parameter estimates, under the sum-to-zero
side conditions:
X
X
a
ˆi =
(¯
yi·· − y¯··· ) = n¯
y··· − n¯
y··· = 0
To obtain the set-to-zero side conditions, add a
ˆ1 and ˆb1 to µ
ˆ, subtract a
ˆ1
from the a
ˆi ’s, and subtract ˆb1 from the ˆbj ’s
...

You should be able to show that these vectors are orthogonal, and so
P P P
P P P 2
P P P ˆ2
P P P 2
ˆi
¯··· )2 =
ˆi +
i
j
k
i
j
k (yijk − y
i
j
ka
i
j
k bi +
SSTotal
=
SSA
+
SSB
+
SSE

CHAPTER 4
...
34877
dat$delivery 3 0
...
10214
---

Mean Sq F value
0
...
708
0
...
982
0
...
Does this
adequately represent what is going on in the data? What do we mean by
additive? Assuming the model is correct, we have:
E[Y |type=I, delivery=A] = µ + a1 + b1
E[Y |type=II, delivery=A] = µ + a2 + b1
This says the difference between Type I and Type II is a1 − a2 regardless of
Delivery
...
MULTIFACTOR DESIGNS

112

• The full model allows differences between Types to vary across levels
of Delivery
• The reduced/additive model says differences are constant across levels
of Delivery
...
, at1 = additive effects of factor 1;
b1 ,
...

(ab)ij = interaction terms = deviations from additivity
...

This is a full model: It fits a separate mean for each treatment combination:
E(Yijk ) = µij = µ + ai + bj + (ab)ij
Parameter estimation and ANOVA decomposition:
yijk = y¯··· + (¯
yi·· − y¯··· ) + (¯
y·j· − y¯··· ) + (¯
yij· − y¯i·· − y¯·j· + y¯··· ) + (yijk − y¯ij· )
ˆ
ˆ
= µ
ˆ +
a
ˆi
+
bj
+
(ab)
+
ˆijk
ij
Deciding between the additive/reduced model and the interaction/full model
ˆ ’s is large or
is tantamount to deciding if the variance explained by the (ab)
ij
not
...
MULTIFACTOR DESIGNS

4
...
3

113

Evaluating additivity:

Recall the additive model
yijk = µ + ai + bj + ijk
• i = 1,
...
, t2 indexes the levels of factor 1
• k = 1,
...

Parameter estimates are obtained via the following decomposition:
yijk = y¯··· + (¯
yi·· − y¯··· ) + (¯
y·j· − y¯··· ) + (yijk − y¯i·· − y¯·j· + y¯··· )
ˆbj
= µ
ˆ +
a
ˆi
+
+
ˆijk
Fitted value:
yˆijk = µ
ˆ+a
ˆi + ˆbj
= y¯i·· + y¯·j· − y¯···
Note that in this model the fitted value in one cell depends on data
from the others
...
e
...
, t2 , and
• y¯i1· − y¯i2· ≈ ˆb1 − ˆb2 for all i = 1,
...

This adequacy can be assessed by looking differences between the sample
means from each cell and the fitted values under the additive model
...
It is called an interaction
...
MULTIFACTOR DESIGNS

114

Interactions and the full model: The interaction terms also can be
derived by taking the additive decomposition above one step further: The
residual in the additive model can be written:
ˆA
¯i·· − y¯·j· + y¯···
ijk = yijk − y
= (yijk − y¯ij· ) + (¯
yij· − y¯i·· − y¯·j· + y¯··· )
I
ˆ
= ˆ + (ab)
ijk

ij

This suggests the following full model, or interaction model
yijk = µ + ai + bj + (ab)ij + ijk
with parameter estimates obtained from the following decomposition:
yijk = y¯··· + (¯
yi·· − y¯··· ) + (¯
y·j· − y¯··· ) + (¯
yij· − y¯i·· − y¯·j· + y¯··· ) + (yijk − y¯ij· )
ˆ ij
ˆ
= µ
ˆ +
a
ˆi
+
bj
+
(ab)
+
ˆijk
Fitted value:
ˆ ij
yˆijk = µ
ˆ+a
ˆi + ˆbj + (ab)
= y¯ij· = µ
ˆij
This is a full model for the treatment means: The estimate of the
mean in each cell depends only on data from that cell
...

Residual:
ˆijk = yijk − yˆijk
= yijk − y¯ij·
Thus the full model ANOVA decomposition partitions the variability among
the cell means y¯11· , y¯12· ,
...

ˆ = y¯ij· − (ˆ
• what is “leftover” : (ab)
µ+a
ˆi + ˆbj )
ij

CHAPTER 4
...

Df Sum Sq Mean Sq F value
pois$deliv 3 0
...
06805 27
...
34877 0
...
708
Residuals 42 0
...
00243

Df
pois$deliv
3
pois$type
2
pois$deliv:pois$type 6
Residuals
36

Sum Sq
0
...
34877
0
...
08643

Mean Sq F value
0
...
3431
0
...
6347
0
...
0904
0
...
10214 = 0
...
08643, that is SSEadd = SSABint + SSEint
• 42 = 6 + 36, that is dof (SSEadd ) = dof (SSABint ) + dof (SSEint )
• SSA, SSB, dof(A), dof(B) are unchanged in the two models
• M SEadd ≈ M SEint , but degrees of freedom are larger in the additive
model
...
MULTIFACTOR DESIGNS

116

2
– E(M SAB) = σ 2 + rτAB
> σ2
...
Under H0 : (ab)ij = 0 ,
FAB = M SAB/M SE ∼ F(t1 −1)×(t2 −1),t1 t2 (r−1)
Evidence against H0 can be evaluated by computing the p-value
...

We may then want to combine them to improve our estimate of σ 2
...
20414
0
...
01571
0
...
06805 28
...
376e-09 ***
0
...
6347 2
...
00262 1
...
3867
0
...
We may want to use the additive model
...
1
...
Now we want to compare means across levels
of one of the factors
...
So we have 4 × 4 = 16 observations for each level of
Type
...
047

CHAPTER 4
...
1

2 2 2 22

11 1

0
...
3
rate

0
...
5

Figure 4
...

• s12 = 0
...

• t-statistic = -1
...
112
...
e
...
For the above example, s12 =
...
00240 ≈ 0
...
65
...
7,
sM SE 2/(4 × 4)

p-value ≈ 0
...
The
relationship between the cell means model and the parameters in the interaction model are as follows:
µij = µ·· + (µi· − µ·· ) + (µ·j − µ·· ) + (µij − µi· − µj· + µ·· )
= µ + ai + bj + (ab)ij
and so

CHAPTER 4
...
1

11 1

118

2 2

2

1

0
...
3
rate

0
...
5

Figure 4
...

•

P

ai = 0

•

P

=0

•

P

i

j bj

i (ab)ij

= 0 for each j,

P

j (ab)ij

= 0 for each i

Suppose we are interested in the difference between F1 = 1 and F1 = 2
...
Note that a1 − a2 is a contrast of treatment

CHAPTER 4
...
MULTIFACTOR DESIGNS

120

• Under the assumption that the data are normally distributed
p
a
ˆ1 − a
ˆ2 ∼ normal(a1 − a2 , σ 2/(r × t2 ))
and is independent of M SE (why?)
• So

(ˆ
a1 − a
ˆ ) − (a1 − a2 )
(ˆ
a1 − a
ˆ2 ) − (a1 − a2 )
q2
=
∼ tν
2
SE(ˆ
a1 − a
ˆ2 )
M SE r×t
2

where ν are the degrees of freedom associated with our estimate of σ 2 ,
i
...
the residual degrees of freedom in our model
...
It is sometimes called the
least significant difference for comparing levels of Factor 1
...

Important note: The LSD depends on which factor you are looking at:
The comparison of levels of Factor 2 depends on
V (¯
y·1· − y¯·2· ) = V (¯
y·1· ) + V (¯
y·2· )
2
= σ /(r × t1 ) + σ 2 /(r × t1 )
= 2σ 2 /(r × t1 )

CHAPTER 4
...
20414
0
...
01571
0
...
06805 28
...
376e-09 ***
0
...
6347 2
...
00262 1
...
3867
0
...
Lets assume
there is no interaction term
...

Df Sum Sq Mean Sq F value
Pr(>F)
pois$deliv 3 0
...
06805 27
...
192e-10 ***
pois$type
2 0
...
17439 71
...
865e-14 ***
Residuals 42 0
...
00243

So there is strong evidence against the hypothesis that the additive effects
are zero for either factor
...
05,
LSDtype = t
...
018 ×
...
018 × 0
...
035
Type
I
II
III

Mean
0
...
23
0
...
2

0
...
4

0
...
1

0
...
2

0
...
4

0
...
MULTIFACTOR DESIGNS

I

II

III

A

B

C

D

Figure 4
...

effects of poison delivery: At αC = 0
...
975,42 × SE(ˆb1 − ˆb2 )
r
2
= 2
...
0024
4×3
= 2
...
02 = 0
...
19
0
...
29
0
...

Interpretation of estimated additive effects: If the additive model is
clearly wrong, can we still interpret additive effects? The full model is
yijk = µij + ijk

CHAPTER 4
...
, at1 } , {b1 ,
...

The additive model is
yijk = µ + ai + bj + ijk
Sometimes this is called the “main effects” model
...
Now,
how do we interpret this effect?

CHAPTER 4
...

• If additivity is correct, a
ˆ1 − a
ˆ2 can further be interpreted as the estimated difference in response between having factor 1 =1 and factor
1=2, for every level of factor 2 in the experiment
...
As we can see, this is not true:
the additive effects have a very definite interpretation under the full model
...

Dissecting the interaction: Sometimes if there is an interaction, we
might want to go in and compare individual cell means
...

y¯11· y¯12· y¯13· y¯14·
y¯21· y¯22· y¯23· y¯24·
A large interaction SS in the ANOVA table gives us evidence, for example,
that µ1j − µ2j varies across levels j of factor 2
...
10
...

Contrasts for examining interactions: Suppose we want to compare
the effect of (factor 1=1) to (factor 1=2) across levels of factor 2
...
MULTIFACTOR DESIGNS

●
●

2

●

●

2
...
2

2
...
3

2
...
4

2
...
1

2

3

4

●

2
...
2

2
...
3

2
...
4

2
...
1

2
...
2

2
...
3

2
...
4

2
...
1

Figure 4
...

CHAPTER 4
...

4
...

Typically, one hopes the e
...
’s are homogeneous or nearly so:
• scientifically: If the units are nearly homogeneous, then any observed
variability in response can be attributed to variability in factor levels
...

But what if the e
...
’s are not homogeneous?
Example: Let F2 be a factor that is a large source of variation/heterogeneity,
but is not recorded (age of animals in experiment, gender, field plot conditions, soil conditions)
...
If a factor
1
...
varies across experimental units

CHAPTER 4
...
If F2 is a known, potentially large source
of variation, we can control for it pre-experimentally with a block design
...

Objective: To have less variation among units within blocks than between
blocks
...
, 6: Level 4 is “standard”
• Response: Nitrogen uptake (ppm×10−2 )
• Experimental material: One irrigated field
Soil moisture is thought to be a source of variation in response
...
Field is divided into a 4 × 6 grid
...
Within each row or block, each of the 6 treatments are randomly
allocated
...
The experimental units are blocked into presumably more homogeneous groups
...
The blocks are complete, in that each treatment appears in each block
...
MULTIFACTOR DESIGNS

128

dry

wet
−−−−−−−−−−−−−− irrigation −−−−−−−−−−−−−−
Figure 4
...

1

5
37
...
18

1
34
...
89

3
42
...
22

3
49
...
85

6
50
...
99

2
46
...
57

3
52
...
61

1
36
...
65

4
40
...
9

4
39
...
29

5
40
...
91

1
39
...
89

4

treatment and data versus location

column
Figure 4
...
MULTIFACTOR DESIGNS

129

3
...

This design is a randomized complete block design
...
32 5 40
...
59
0
...
00 3 65
...
12
Error 108
...
20
Total 506
...
factor(c(trt))
))
Df Sum Sq Mean Sq F value Pr(>F)
as
...
316 40
...
3761 0
...

Residuals
18 305
...
945
#######

40

c(y)
45

50

130

35

35

40

c(y)
45

50

CHAPTER 4
...
factor(c(trt))

6

1

2
3
as
...
52

4
−3
...
3

6
−8
...
7

2

1
2
...
65

4
5
...
92

5
2
...
66

3

6
1
...
91

5
−1
...
34

2
2
...
39

2
−2
...
41

6
0
...
94

3
−3
...
69

1

2

3

4

5

6

row

2
−3
...
13: Marginal plots and residuals

CHAPTER 4
...
factor(c(trt)) + as
...
factor(c(trt)) 5 201
...
263 5
...
004191 **
as
...
004 65
...
1198 0
...
008
7
...
factor(c(trt)):as
...
factor(c(trt)):as
...
33
22
...
00
#######

• Can we test for interaction? Do we care about interaction in this case,
or just main effects? Suppose it were true that “in row 2, timing 6 is
significantly better than timing 4, but in row 3, treatment 3 is better
...

– If we don’t estimate block effects, we’ll have more dof for error
...

– If “row ” is a big source of variation, then ignoring it may lead to
an overly large MSE
...
MULTIFACTOR DESIGNS

132

Consider comparing the F-stat from a CRD with that from an RCB: According to Cochran and Cox (1957)
SSB + r(t − 1)M SErcbd

rt − 1

r(t − 1)
r−1
= M SB
+ M SErcbd
rt − 1
rt − 1

M SEcrd =

In general, the effectiveness of blocking is a function of M SEcrd /M SErcb
...
For the nitrogen example, this ratio
is about 2
...
3

Unbalanced designs

Example: Observational study of 20 fatal accidents
...
R =rainy (rainy/not rainy)
2
...
5

Lets naively compute sums of squares based on the decomposition:
yijk = y¯··· + (¯
yij· − y¯··· ) + (yijk − y¯ij· )
yijk = y¯··· + (¯
yi·· − y¯··· ) + (¯
y·j· − y¯··· ) + (¯
yij· − y¯i·· − y¯·j· + y¯··· ) + (yijk − y¯ij· )

CHAPTER 4
...
5) + 2(5 − 12
...
5)2 + 8(10 − 12
...
5)2 + 10 × (12 − 12
...
5)2 + 10 × (10 − 12
...
But look at the cell means :
• “no rain” - “rain” =5 regardless of interstate
• “interstate” - “two=lane” =10 regardless of rain
There is absolutely no interaction! The problem is that SSR, SSI, SSRI are
not orthogonal (as computed this way) and so SSM 6= SSR + SSI + SSRI
...

• Cell means show, for both road types, speeds were higher, by 5 mph
on avg , on clear days
...

• Marginal means for rain are dominated by interstate accidents
• Marginal means for no rain are dominated by two-lane accidents
We say that the marginal effects are unbalanced
...
MULTIFACTOR DESIGNS

134

Solution: Least-squares means
...
get estimates for each cell
2
...

1
2

1
y¯11·
y¯21·

...

2 ···
y¯12· · · ·
y¯22· · · ·

...

t2
y¯1t2 ·
y¯2t2 ·

µ
ˆ1·
µ
ˆ2·

t1 y¯t1 1· y¯t1 2· · · · y¯t1 t2 · µ
ˆt1 ·
µ
ˆ·1 µ
ˆ·2 · · · µ
ˆ·t2
Accident example:
interstate
rainy
15
not rainy
20
marginal mean
16
LS mean
17
...
5

marginal mean
13
12

LS mean
10
15

Comparison of means can be made in the standard way:

CHAPTER 4
...

Testing for interaction: Consider
H0 : Yijk = µ + ai + bj + ijk
H1 : Yijk = µ + ai + bj + (ab)ij + ijk
How do we evaluate H0 when the data are unbalanced? I’ll first outline the
procedure, then explain why it works
...
Compute SSEf = minµ,a,b,(ab) i j k (yijk − [µ + ai + bj + (ab)ij ])2
P P P
2
...
Compute SSI = SSEr − SSEf
...

Allowing for interaction improves the fit, and reduces error variance
...
If SSI is large, i
...
SSEr is much bigger than SSEf , this suggests the additive model does not fit well and the
interaction term should be included in the model
...

CHAPTER 4
...

A painful example: A small scale clinical trial was done to evaluate the
effect of painkiller dosage on pain reduction for cancer patients in a variety
of age groups
...

> table(trt,ageg)
ageg
trt 50 60 70 80
1 1 2 3 4
2 1 3 3 3
3 2 1 4 3
> tapply(y,trt,mean)
1
2
3
0
...
95 -2
...
200000 -1
...
920000 -0
...
15;

−2

0

2

137

−4

−4

−2

0

2

CHAPTER 4
...
14: Marginal plots for pain data
• compute the LS means and compare to marginal means
...
90 1
...
066667 0
...
40 -3
...
266667 0
...
15 -1
...
150000 -1
...
4395833 -0
...
0916667
> age_lsm
50
60
70
80
-1
...
0000000 -0
...
2305556

70

80

138

−4

−2

0

2

CHAPTER 4
...
50

3
...
60

1
...
70

2
...
15: Interaction plots for pain data
What are the differences between LS means and marginal means? Not as
extreme as in the accident example, but the differences can be explained by
looking at the interaction plot, and the slight imbalance in the design:
• The youngest patients (ageg=50) were imbalanced towards the higher
dose, so we might expect their marginal mean to be too low
...
2 to the LS mean = -1
...

• The oldest patient (ageg=80) were imbalanced towards the lower dose,
so we might expect their marginal mean to be too high
...
14 to the LS mean = -
...

Lets look at the main effects in our model:
> trt_coef<- trt_lsm-mean(cellmeans)
> age_coef<- age_lsm-mean(cellmeans)
> trt_coef
1
0
...
5826389 -1
...
MULTIFACTOR DESIGNS
50
60
-0
...
02569444

70
0
...
74375000

What linear modeling commands in R will get you the same thing?
> options(contrasts=c("contr
...
poly"))
> fit_full<-lm( y~as
...
factor(trt))
> fit_full$coef[2:4]
as
...
factor(ageg)2 as
...
90902778
-0
...
19097222
> fit_full$coef[5:6]
as
...
factor(trt)2
0
...
5826389

Note that the coefficients in the reduced/additive model are not the same:
> fit_add<-lm( y~as
...
factor(trt))
>
> fit_add$coef[2:4]
as
...
factor(ageg)2 as
...
7920935
-0
...
3070743
> fit_add$coef[5:6]
as
...
factor(trt)2
1
...
001899274

4
...
1

Non-orthogonal sums of squares:

Consider the following ANOVA table obtained from R:
Df Sum Sq Mean Sq F value Pr(>F)
as
...
355
4
...
9606 0
...
factor(trt)
2 28
...
127 3
...
06613
...
230
4
...
factor(trt)+as
...
factor(trt)
2 31
...
794 3
...
0498 *
as
...
021
3
...
7207 0
...
230
4
...
MULTIFACTOR DESIGNS

140

Where do these sums of squares come from? What do the F-tests represent? By typing “?anova
...
That is, the
reductions in the residual sum of squares as each term of the formula
is added in turn are given in as the rows of a table, plus the residual
sum of squares
...
For example, let
• A be the main effects of factor 1
• B be the main effects of factor 2
• C be their interaction
...
Calculate SS0 = residual sum of squares from the model
(0) yijk = µ + ijk
1
...
Calculate SS2 = residual sum of squares from the model
(AB) yijk = µ + ai + bj + ijk
3
...
MULTIFACTOR DESIGNS

141

This is actually what R presents in an ANOVA table:
>
>
>
>

ss0<-sum(
ss1<-sum(
ss2<-sum(
ss3<

lm( y~1 )$res^2 )
lm( y~as
...
factor(ageg)+as
...
3554
>
> ss1-ss2
[1] 28
...
75015
> ss3
[1] 57
...
factor(ageg)*as
...
factor(ageg)
3 13
...
452 1
...
27688
as
...
254 14
...
4239 0
...
factor(ageg):as
...
750
8
...
8054 0
...
480
3
...

• In an unbalanced design, the estimates of one set of parameters depends
on whether or not you are estimating the others, i
...
they are not
orthogonal, and in general SSA 6= SSA|B , SSB 6= SSB|A
...

The bottom line: For unbalanced designs, there is no “variance due to
factor 1” or “variance due to factor 2”
...
This is
essentially because of the non-orthogonality, and so the part of the variance

CHAPTER 4
...
This will become more clear when you get to regression next
quarter
...
4

Analysis of covariance

Example(Oxygen ventilation): Researchers are interested in measuring
the effects of exercise type on maximal oxygen uptake
...

Design: Six subjects randomly selected to a 12-week step aerobics program,
the remaining to 12-weeks of outdoor running on flat terrain
...

y = change in O2 uptake
Initial analysis: CRD with one two-level factor
...
The first panel of Figure 4
...
The second thing to do is a twosample t-test:

y¯ − y¯ 7
...
767)
A
B
= 2
...
907) = 0
...
239 ×
...

However, a plot of residuals versus age of the subject indicates a couple of
causes for concern with our inference:
1
...

2
...

CHAPTER 4
...
16: Oxygen uptake data
Question: Is the observed difference in y¯A , y¯B due to
• treatment?
• age?
• both?
A linear model for ANCOVA: Let yi,j be the response of the jth subject
in treatment i:
yi,j = µ + ai + b × xi,j + i,j
This model gives a linear relationship between age and response for each
group:
if i = A, Yi =
i = B, Yi =

intercept
slope
(µ + aA ) + b × xi,j
(µ + aB ) + b × xi,j

+
+

error
i,j
i,j

Unbiased parameter estimates can be obtained by minimizing the residual
sum of squares:
X
(ˆ
µ, a
ˆ, ˆb) = arg min
(yi,j − [µ + ai + b × xi,j ])2
µ,a,b

i,j

CHAPTER 4
...
17: ANOVA and ANCOVA fits to the oxygen uptake data
The fitted model is shown in the second panel of Figure 4
...

Variance decomposition Consider the following two ANOVAs:
> anova(lm(o2_change~grp))
Df Sum Sq Mean Sq F value Pr(>F)
grp
1 328
...
97 8
...
01565 *
Residuals 10 389
...
93

> anova(lm(o2_change~grp+age))
Df Sum Sq Mean Sq F value
Pr(>F)
grp
1 328
...
97 42
...
0001133 ***
age
1 318
...
91 40
...
0001274 ***
Residuals 9 70
...
82

The second one decomposes the variation in the data that is orthogonal to
treatment (SSE from the first ANOVA) into a parts that can be ascribed to
age (SS age in the second ANOVA), and everything else (SSE from second
ANOVA)
...

Now consider two other ANOVAs:

CHAPTER 4
...
09 576
...
519 8
...
18
14
...
09 576
...
6594 1
...
79
71
...
1788
0
...
39
7
...

Blocking, ANCOVA and unbalanced designs: Suppose we have a factor of interest (say F 1) that we will randomly assign to experimental material,
but it is known that there is some nuisance factor (say F 2 ) that is suspected
to be a large source of variation
...
As a result, we can separately estimate the effects
of F 1 and F 2
...

– ANCOVA allows for more precise control of variation due to F 2
...
This is primarily a problem for experiments with small sample sizes:
The probability of design imbalance decreases as a function of sample size
(as does the correlation between the parameter estimates) as long as F 1 is
randomly assigned
...
1

Nested Designs

Example(Potato): Sulfur added to soil kills bacteria, but too much sulfur
can damage crops
...

Factors of interest:
1
...
Sulfur additive ∈ {low,high}
Experimental material: Four plots of land
...

Experimental Design: A Split-plot design
1
...

2
...
Each potato type was
randomly assigned to two subplots per plot
...
NESTED DESIGNS

147

L

A B
A B

H

B A
A B

H

B A
A B

L

B A
A B

Randomization:
Sulfur type was randomized to whole plots;
Potato type was randomized to subplots
...

• 8 responses for each potato type
• 8 responses for each sulfur type
• 4 responses for each potato×type combination
> fit
...
full)
Df Sum Sq Mean Sq
type
1 1
...
48840
sulfur
1 0
...
54022
type:sulfur 1 0
...
00360
Residuals
12 1
...
11070

fit
...
4459 0
...
8803 0
...
0325 0
...
add)
Df Sum Sq Mean Sq F value
Pr(>F)
type
1 1
...
48840 14
...
00216 **
sulfur
1 0
...
54022 5
...
03893 *
Residuals 13 1
...
10246

5
...
5

148

4
...
5

5
...
5

CHAPTER 5
...
5

5
...
5

sulfur

high
...
A

high
...
1: Potato data
...
B

CHAPTER 5
...
4

Sample Quantiles
0
...
2 0
...
add$res
0
...
2

●

●

● ●

●

●
●
●

●

●
●

−0
...
4

●

−2

●

●
●

●

−1
0
1
Theoretical Quantiles

2

●

●

● ●●

●

●

●

●

●

4
...
8 5
...
2
fit
...
4

Figure 5
...

Randomization test: Consider comparing the observed outcome to the
population of other outcomes that could’ve occurred under
• different treatment assignments;
• no treatment effects
...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...
NESTED DESIGNS

150

Number of possible treatment assignments:

• 42 = 6 ways to assign sulfur
• For each sulfur assignment there are

4 4
2

= 1296 type assignments

So we have 7776 possible treatment assignments
...
t
...
s
...
sim<-rep( sample(c("low","low","high","high")),rep(4,4))
t
...
sim<-anova( lm(y~as
...
sim)+as
...
sim)) )
F
...
null<-c(F
...
null,fit
...
s
...
s
...
sim[2,4])
}

> mean(F
...
null>=F
...
obs)
[1] 0
...
s
...
s
...
352

What happened?
rand
anova1
≈ F1,13 ⇒ prand
Ftype
type ≈ ptype
rand
anova1
Fsulfur
6≈ F1,13 ⇒ prand
sulfur 6≈ psulfur

The F-distribution approximates a null randomization distribution if treatments are randomly assigned to units
...

• The precision of an effect is related to the number of independent
treatment assignments made

151

0
...
00

0
...
05
0
...
2 0
...
4

0
...
NESTED DESIGNS

0

5

10
15
F
...
null

20

0

5

10
F
...
null

15

Figure 5
...
It
is difficult to tell the difference between sulfur effects and field effects
...

Note:
• From the point of view of type alone, the design is a RCB
...
We have 2 observations of each
type per block
...

– Degrees of freedom breakdown for the sub-plot analysis:
Source
dof
block=whole plot 3
type
1
subplot error
11
subplot total
15
• From the point of view of sulfur alone, the design is a CRD
...
NESTED DESIGNS

152

– Each whole plot is an experimental unit
...

– Degrees of freedom breakdown for the whole-plot analysis:
Source
dof
sulfur
1
whole plot error 2
whole plot total 3
• Recall, degrees of freedom quantify the precision of our estimates and
the shape of the null distribution
...

• We compare the MS of a factor to the variation among the experimental
units of that factor’s level
...

• The level of an interaction is the level of the smallest experimental unit
involved
...
factor(field) ) )
Df Sum Sq Mean Sq F value
Pr(>F)
type
1 1
...
48840 54
...
326e-05 ***
as
...
59648 0
...
5575 0
...
00360 0
...
1323 0
...
27210 0
...

CHAPTER 5
...
factor(field)))
Df Sum Sq Mean Sq F value Pr(>F)
sulfur
1 0
...
54022 3
...
07936
...
factor(field) 2 1
...
52813 3
...
05989
...
76410 0
...
e
...

MSS<-anova(lm(y~sulfur+as
...
factor(field)))[2,3]
F
...
sulfur
[1] 1
...
sulfur,1,2)
[1] 0
...

5
...
1

Mixed-effects approach

What went wrong with the normal sampling model approach? What is wrong
with the following model?
yijk = µ + ai + bj + (ab)ij + ijk
where
• i indexes sulfur level, i ∈ {1, 2} ;
• j indexes type level, i ∈ {1, 2} ;
• k indexes reps, i ∈ {1,
...
i
...
normal

154

l
h
hh
h

l
l
l

l
l

h
h
h
h

−0
...
0 0
...
4

CHAPTER 5
...
0

1
...
0

2
...
0

l
3
...
0

Figure 5
...
What about independence? Figure 5
...
The figure indicates that residuals are
more alike within a field than across fields, and so observations within a field
are positively correlated
...

dependence within whole-plots
• affects the amount of information we have about factors applied at
the whole-plot level: within a given plot, we can’t tell the difference
between plot effects and whole-plot factor effects
...

If residuals within a whole-plot are positively correlated, the most intuitively straightforward way to analyze such data (in my opinion) is with a
hierarchical mixed-effects model:
yijkl = µ + ai + bj + (ab)ij + γik + ijkl

CHAPTER 5
...
e
...

The index k represents whole-plot reps k = 1,
...
e
...
, r2 ,
{ijkl } ∼ normal(0, σs )
Now every subplot within the same wholeplot has something in common, i
...

γik
...
To use this command, you need the nlme package:
library(nlme)
fit
...
factor(field))
>summary(fit
...
Error DF
t-value p-value
(Intercept) 4
...
2599650 11 18
...
0000
typeB
0
...
0791575 11 7
...
0000
sulfurlow
-0
...
3633602 2 -1
...
4183
> anova(fit
...
2946 <
...
3848 <
...
0229 0

Title: Statistics
Description: This is a statistics PDF.

Buy These Notes Preview

Notesale: Turn your study into money

Already a Member? >

Search for notes by fellow students, in your own course and all over the country.

My Basket

Document Preview