Title: anova
Description: everything about anova



The one-way analysis of variance is used to test
the claim that three or more population means are
equal
This is an extension of the two independent
samples t-test
One-way ANOVA – An analysis of variance
procedure using one dependent and one
independent variable
...

◦ All the populations from which the sample values are
obtained have the same unknown population variance;
that is, for k populations,

$\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$
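A minimal sketch of how the equal-variance assumption can be looked at in practice: compute each group's sample variance and, if they look similar, pool them into one estimate of the common variance. The data and the function name `pooled_variance` are hypothetical and only for illustration.

```python
from statistics import variance

def pooled_variance(groups):
    """Weighted average of the group sample variances (weights n_i - 1)."""
    numerator = sum((len(g) - 1) * variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)
    return numerator / denominator

# Hypothetical data: three groups of three observations each
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]

for g in groups:
    print(variance(g))          # sample variance of each group

print(pooled_variance(groups))  # single estimate of the common variance
```

The pooled estimate is the same quantity as SSE divided by its degrees of freedom, which is the mean square error used later in the ANOVA table.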

The response variable is the variable you’re
comparing
The factor variable is the categorical variable
being used to define the groups
◦ We will assume k samples (groups)

The one-way is because each value is classified in
exactly one way
◦ Examples include comparisons by gender, race, political
party, color, etc
...
= µk
— All population means
are equal
— No treatment effect
Ha: Not All µi Are Equal
— At least 2 pop
...
≠ µk is
Wrong

[Figure: density curves f(X) plotted against X for the case µ1 = µ2 = µ3, annotated with the mean square (variance) within and the mean square among.]

As production manager, you want
Mach1 Mach2 Mach3
to see if three filling machines
25
...
40 20
...

You assign 15 similarly trained and 26
...
80 22
...
10 23
...
75
machine, to the machines
...
74 22
...
60

...
10 21
...
40

X

µ1 = µ 2 µ3

The summary statistics for the three filling machines
are shown in the table below

The null hypothesis is that the means are all equal

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$

Row

Mach 2

Mach 3

5

5

5

124
...
05

The alternative hypothesis is that at least one of the
means is different

102
...

If the alternative hypothesis is true, at
least some of the sample means would
differ
...


Variation
◦ Variation is the sum of the squares of the
deviations between each value and the mean of
the values
As long as the values are not identical, there
will be variation
Abbreviated as SS for Sum of Squares


Are all of the values identical?
◦ No, so there is some variation in the data
◦ This is called the total variation
◦ Denoted SS(Total) for the total Sum of
Squares (variation)
◦ Sum of Squares is another name for
variation

Are all of the values within each group
identical?
◦ No, there is some variation within the
groups
◦ This is called the within group variation
◦ Sometimes called the error variation
◦ Denoted SS(E) for Sum of Squares
(variation) within the groups

ANOVA measures two sources of variation in the
data and compares their relative sizes

Are all of the sample means identical?
◦ No, so there is some variation between the
groups
◦ This is called the between group variation
◦ Sometimes called the variation due to the
factor
◦ Denoted SS(A) for Sum of Squares (variation)
between the groups
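The three kinds of variation described above can be computed directly from their definitions. This is a small sketch on hypothetical data (the numbers and the function name `sums_of_squares` are made up for illustration); it also shows that the between and within pieces add up to the total.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (SS_total, SS_between, SS_within) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Total: every value against the overall mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Between: each group mean against the overall mean, once per observation
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within: every value against the mean of its own group
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_total, ss_between, ss_within

# Hypothetical data: three groups of three observations each
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
ss_total, ss_between, ss_within = sums_of_squares(groups)
print(ss_total, ss_between, ss_within)
```

Here most of the total variation comes from the differences between the group means, which is the situation in which ANOVA tends to reject the null hypothesis.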

Variation is described as a Sum of Squares
Total variation is partitioned as follows:
SS(Total) = SS(Within) + SS(Between)

• variation BETWEEN groups
• for each data value look at the difference
between its group mean and the overall mean: $(\bar{x}_i - \bar{x})^2$

• variation WITHIN groups
• for each data value we look at the difference
between that value and the mean of its group: $(x_{ij} - \bar{x}_i)^2$

Here is the basic one-way ANOVA table

Source              SS    df    MS    F    p
Between (Factor)
Within (Error)
Total


“F” means “F test statistic”

One-way Analysis of Variance
Source
Factor
Error
Total

DF
2
12
14

SS
2510
...
2
2671
...
3
13
...
44

One-way Analysis of Variance

P
0
...
5
161
...
7

MS
1255
...
4

F
93
...
000

“Factor” means “Variability between groups” or “Variability due to
the factor of interest”

“DF” means “degrees of freedom”

“Error” means “Variability within groups” or “unexplained random
variation”
“Total” means “Total variation from the grand mean”

“SS” means “sums of squares”
“MS” means “mean square”

One-way Analysis of Variance

Source   DF    SS            MS    F         P
Factor   a-1   SS(Between)   MSA   MSA/MSE
Error    n-a   SS(Error)     MSE
Total    n-1   SS(Total)

n-1 = (a-1) + (n-a)
SS(Total) = SS(Between) + SS(Error)

MSA = SS(Between)/(a-1)
MSE = SS(Error)/(n-a)

$SST = \sum_{obs} (x_{ij} - \bar{x})^2 = \sum\sum x_{ij}^2 - \frac{(\sum\sum x_{ij})^2}{n}$

$SSE = \sum_{obs} (x_{ij} - \bar{x}_i)^2$

$SSA = \sum_{obs} (\bar{x}_i - \bar{x})^2 = \sum_i \frac{(\sum_j x_{ij})^2}{n_i} - \frac{(\sum\sum x_{ij})^2}{n}$

$SST = SSA + SSE; \quad MS = \frac{SS}{DF}; \quad F = \frac{MSA}{MSE}$
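The computational (shortcut) forms of SST and SSA only need sums and sums of squares, which is handy for hand calculation. The sketch below applies them to hypothetical data; the function name `shortcut_sums_of_squares` and the numbers are assumptions for illustration, not values from the notes.

```python
def shortcut_sums_of_squares(groups):
    """Return (SST, SSA, SSE) using the sum-based shortcut formulas."""
    n = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n              # (sum of all x)^2 / n
    sst = sum(x * x for g in groups for x in g) - correction
    ssa = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = sst - ssa                                # from SST = SSA + SSE
    return sst, ssa, sse

# Hypothetical data: three groups of three observations each
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
sst, ssa, sse = shortcut_sums_of_squares(groups)
print(sst, ssa, sse)
```

SSE is obtained by subtraction here, which mirrors the way the error sum of squares is usually filled in when the table is completed by hand.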

124
...
052 102
...
65)
= ∑
+
+
−
5
5 
15
 5

= 25
...
312 + 24
...
+ 20
...


= 7783
...
162

= 7794
...
162

= 47
...
2172

2


Source

SST = SSA + SSE
SSE = SST − SSA

SS

df

MS

F

p

Between
47
...
2172 − 47
...
0532

Total

11
...
Five of the six numbers could be anything, but
once the first five are known, the last one is fixed so
the sum is 240
...
2172

The between group df is one less than the
number of groups
◦ We have three groups, so df(A) = 2

The within group df is the sum of the df's
of the individual groups
◦ The sample sizes are 5, 5, and 5
◦ df(E) = 4 + 4 + 4 = 12 or df(E) = 15 - 3 = 12

The total df is one less than the sample size
◦ df(Total) = 15 – 1 = 14
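The degrees-of-freedom rules above can be written as a short function. This is a sketch; the function name `anova_degrees_of_freedom` is hypothetical, and the only input is the list of group sample sizes.

```python
def anova_degrees_of_freedom(sample_sizes):
    """Return (df_between, df_within, df_total) for a one-way ANOVA."""
    k = len(sample_sizes)            # number of groups
    n = sum(sample_sizes)            # total sample size
    df_between = k - 1
    df_within = sum(size - 1 for size in sample_sizes)   # same as n - k
    df_total = n - 1
    return df_between, df_within, df_total

# Three groups of 5, as in the example in the text
print(anova_degrees_of_freedom([5, 5, 5]))  # (2, 12, 14)
```

The between and within degrees of freedom always add up to the total, just as the sums of squares do.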

Filling in the degrees of freedom gives this …

Source

SS

Between
47
...
0532 15 - 3 = 12

MS

F

Variances
p

◦ The variances are also called mean squares
and abbreviated MS, often with an accompanying
label, as in MS(A) or MS(E)
◦ Each is an average squared deviation from the mean
and is found by dividing the variation by the degrees
of freedom
◦ MS = SS / df

58
...
1640

3-1=2

23
...
0532 15 - 3 = 12

Within
(Error)

F

F test statistic
◦ An F test statistic is the ratio of two sample
variances
◦ The MS(A) and MS(E) are two sample
variances and that’s what we divide to find F
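As a sketch of the last two steps of the table, the function below divides each sum of squares by its degrees of freedom and then takes the ratio of the two mean squares. The function name `f_statistic` and the input numbers are hypothetical, chosen so the arithmetic is easy to follow.

```python
def f_statistic(ss_between, ss_within, df_between, df_within):
    """Return (MS(A), MS(E), F) from the sums of squares and their df."""
    ms_between = ss_between / df_between   # MS(A) = SS(A) / df(A)
    ms_within = ss_within / df_within      # MS(E) = SS(E) / df(E)
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical values: SS(A) = 42 on 2 df, SS(E) = 6 on 6 df
print(f_statistic(42.0, 6.0, 2, 6))
```

A large F means the variation between groups is large relative to the variation within groups.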
...
9211

F=
58
...
1640

3-1=2

11
...
2172

Within (Error)

Total

df

There is a “family” of F
Distributions
...

F cannot be negative, and it is a
continuous distribution
...

Its values range from 0 to ∞
As F → ∞ the curve approaches
the X-axis
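To make the shape of the F distribution concrete, here is a rough sketch that evaluates its density and approximates the right-tail area (the p-value for an observed F) with Simpson's rule. It uses only the standard library; the function names are hypothetical, and the sketch assumes the numerator degrees of freedom are at least 2 so the density is finite at 0. A statistics package would normally be used instead.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) df; assumes d1 >= 2."""
    a, b = d1 / 2, d2 / 2
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (math.exp(-log_beta) * (d1 / d2) ** a
            * x ** (a - 1) * (1 + d1 * x / d2) ** (-(a + b)))

def f_upper_tail(f_value, d1, d2, steps=20000):
    """Approximate P(F > f_value) by Simpson's rule on [0, f_value].

    steps must be even. Accuracy is best when d1 is 2 or at least 4,
    where the density is smooth at 0.
    """
    h = f_value / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f_value, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1 - total * h / 3

# Hypothetical observed F of 21.0 with 2 and 6 degrees of freedom
print(f_upper_tail(21.0, 2, 6))
```

Only the right tail is used because a large F is the evidence against equal means, which is why the test is always one-tailed.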
...
60

15 - 1 = 14

23
...
9211

H0: µ1 = µ2 = µ3
Ha: Not all means are equal
Test Statistic:
MST
23
...
05
F=
=
= 25
...
9211
ν1 = 2 ν2 = 12

If means are equal, F =
MST / MSE ≈ 1
...
05

Conclusion:

Always One-Tail!
...
05

0

3
...
9597

Level
1
2
3

N
5
5
5

SS
47
...
053
58
...
582
0
...
01%

Mean
24
...
610
20
...
032
0
...
959

F
25
...
000

R-Sq(adj) = 77
...
8
22
...
0
25
...
960

There is enough evidence to support the claim
that there is a difference among the means of the
three filling machines
...
Five specimens were annealed at each
of four temperatures
...
The results are
presented in the following table
...
72

20
...
63

18
...
89

800

16
...
04

18
...
28

20
...
66

17
...
49

18
...
58

900

16
...
49

16
...
53

13
...
517

Level
750
800
850
900

N
5
5
5
5

DF
3
16
19

SS
58
...
84
95
...
55
2
...
42%

Mean
19
...
992
16
...
270

StDev
1
...
924
1
...
439

F
8
...
001

R-Sq(adj) = 54
...
0
16
...
0
20
...
517


Confidence interval for each mean, µi

$\bar{x}_i \pm t_{\alpha/2,\; n-a} \sqrt{\frac{MSE}{n_i}}$
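A minimal sketch of the interval for a single group mean, assuming the t critical value is looked up in a table and passed in. The function name `mean_confidence_interval` and the example numbers are hypothetical; 2.447 is the two-sided 95% t value for 6 error degrees of freedom.

```python
import math

def mean_confidence_interval(group_mean, n_i, mse, t_critical):
    """Interval group_mean +/- t * sqrt(MSE / n_i) for one group mean."""
    half_width = t_critical * math.sqrt(mse / n_i)
    return group_mean - half_width, group_mean + half_width

# Hypothetical group: mean 7.0 from 3 observations, MSE = 1.0 on 6 df
low, high = mean_confidence_interval(7.0, 3, 1.0, 2.447)
print(low, high)
```

The interval uses MSE, the pooled estimate from all groups, rather than the variance of the one group alone.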

When the null hypothesis is rejected, it may
be desirable to find which mean(s) is (are)
different
...

MSE = [SSE/(n - k)]

Two means are considered different if the
confidence interval for the difference
between the corresponding sample means
does not contain 0
...

How do we calculate the confidence
intervals?

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Machine
Individual confidence level = 97
...
9381
-5
...
3200
-4
...
7019
-2
...
0
-2
...
0
2
...
6381

Center
-2
...
4019

----+---------+---------+---------+----(------*-----)
----+---------+---------+---------+-----5
...
5
0
...
5


The standard two-way ANOVA tests are valid under the following
conditions:

Only two classification factors are considered

[Table layout: rows are the levels 1, 2, ..., i of Factor A; columns are the levels 1, 2, ..., j of Factor B]

◦ The design must be complete
Observations are taken on every possible treatment
◦ The design must be balanced
The number of replicates is the same for each treatment
◦ The number of replicates per treatment, k, must be at least 2
◦ Within any treatment, the observations $x_{ij1}, \ldots, x_{ijk}$
are a simple random sample from a normal population
◦ The sample observations are independent of each other (the
samples are not matched or paired in any way)
◦ The population variance is the same for all treatments
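The design conditions (complete, balanced, at least two replicates) can be checked mechanically before running the analysis. The sketch below assumes the data are stored as a dictionary from (level of A, level of B) to a list of observations; that layout, the function name `check_design`, and the numbers are all hypothetical.

```python
def check_design(cells, levels_a, levels_b):
    """Return (complete, balanced, replicated) for a two-factor layout.

    cells maps (level_a, level_b) -> list of observations.
    """
    treatments = [(a, b) for a in levels_a for b in levels_b]
    # Complete: every treatment combination has at least one observation
    complete = all(len(cells.get(t, [])) > 0 for t in treatments)
    counts = {len(cells[t]) for t in treatments if t in cells}
    # Balanced: complete, and every treatment has the same replicate count
    balanced = complete and len(counts) == 1
    # Replicated: balanced with at least 2 observations per treatment
    replicated = balanced and min(counts) >= 2
    return complete, balanced, replicated

# Hypothetical 2 x 2 layout with two replicates per treatment
cells = {
    ("A", 1): [10.0, 11.0], ("A", 2): [12.0, 13.0],
    ("B", 1): [9.0, 9.5],   ("B", 2): [14.0, 15.0],
}
print(check_design(cells, ["A", "B"], [1, 2]))
```

The normality, independence, and equal-variance conditions cannot be checked this way; they depend on how the data were collected and on diagnostic plots.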
...

1 a b 2
∑∑ xij
...
2j
...

∑ xi
...

abn

A chemical engineer is studying the effects of various reagents and
catalysts on the yield of a certain process
...
4 runs of the process were
made for each combination of 3 reagents and 4 catalysts
...


$MSAB = \frac{SSAB}{(a-1)(b-1)}$

$MSE = \frac{SSE}{ab(n-1)}$
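A small sketch of the interaction and error mean squares, where a and b are the numbers of levels of the two factors and n is the number of replicates per treatment. The function name `two_way_mean_squares` and the sums of squares passed in are hypothetical; only the layout (3 reagents, 4 catalysts, 4 runs each) comes from the example in the text.

```python
def two_way_mean_squares(ss_ab, ss_e, a, b, n):
    """Return (MSAB, MSE, df_AB, df_E) for a balanced two-way design."""
    df_ab = (a - 1) * (b - 1)      # interaction degrees of freedom
    df_e = a * b * (n - 1)         # error degrees of freedom
    return ss_ab / df_ab, ss_e / df_e, df_ab, df_e

# Layout from the text: 3 reagents, 4 catalysts, 4 runs per combination.
# The sums of squares (60.0 and 72.0) are made-up illustrative values.
print(two_way_mean_squares(60.0, 72.0, 3, 4, 4))
```

With this layout the interaction has 6 degrees of freedom and the error has 36, regardless of which factor is labelled A.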

Reagent
Catalyst

1

2

3

A

86
...
4
86
...
5

93
...
2
94
...
1

77
...
6
89
...
7

B

71
...
1
80
...
4

74
...
1
71
...
1

87
...
7
78
...
1

C

65
...
4
76
...
7

66
...
1
76
...
1

72
...
8
83
...
8

D

63
...
4
77
...
2

73
...
6
84
...
9

79
...
7
80
...
9
