Title: mathematics
Description: basic and discrete statistics and a few coding and insurance notes


STA 2200 PROBABILITY AND STATISTICS II
Purpose At the end of the course the student should be able to handle problems involving
probability distributions of a discrete or a continuous random variable
...

DESCRIPTION
Random variables: discrete and continuous, probability mass, density and distribution
functions, expectation, variance, percentiles and mode
...
Moment generating function and transformation Change of variable technique for
univariate distribution
...
Statistical inference including one sample normal and t tests
...
,
Prentice Hall, 2003 ISBN 0-13-177698-3
2) J Crawshaw & J Chambers A Concise Course in A-Level statistics, with worked examples,
3rd ed
...
Appl
...
) [0266-4763; 1360-0532]
2) Statistics (Statistics) [0233-1888]
Further Reference Text Books And Journals:
a) HJ Larson Introduction to Probability Theory and Statistical Inference(Probability and
Mathematical Statistics) 3rd ed
...
M
...
O
...
M
...
JKUAT Press, 2005
c) I Miller & M Miller John E Freund’s Mathematical Statistics with Applications, 7th ed
...
Sci
...
RANDOM VARIABLES
1
...
Such a quantity whose value is determined by the outcome
of a random experiment is called a random variable
...

A discrete random variable is a function whose range is finite or countably infinite, i.e. it can only
assume values in a finite or countably infinite set of values
...
(There are uncountably many real
numbers in an interval of positive length
...
2 Discrete Random Variables and Probability Mass Function
Consider the experiment of flipping a fair coin three times
...
X= number of tails that appear in 3 flips of a fair coin
...

Now, what are the possible values that X takes on and what are the probabilities of X taking a
particular value?
From the above we see that the possible values of X are the 4 values
X = 0, 1, 2, 3
i.e. the sample space is a disjoint union of the 4 events {X = j} for j = 0, 1, 2, 3
Specifically in our example:
{X = 0} = {HHH}
{X = 1} = {HHT, HTH, THH}
{X = 2} = {TTH, HTT, THT}
{X = 3} = {TTT}
Since for a fair coin we assume that each element of the sample space is equally likely (with
probability 1/8), we find the probabilities for the various values of X, called the probability
distribution of X or the probability mass function (pmf)
...

We can say that this pmf places mass 3/8 on the value X = 2
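The pmf of the coin example can be reproduced by brute-force enumeration. The following Python sketch (the helper name `tails_pmf` is ours, not from the notes) lists all 8 equally likely outcomes of three flips and counts the tails in each.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def tails_pmf(n_flips=3):
    """Return {x: P(X = x)}, where X is the number of tails in n_flips fair flips."""
    outcomes = list(product("HT", repeat=n_flips))        # all equally likely outcomes
    counts = Counter(outcome.count("T") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

pmf = tails_pmf()
for x, p in pmf.items():
    print(x, p)
```

Exact fractions are used so that the total mass can be compared with 1 without rounding error.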
...

The total mass (i
...
total probability) must add up to 1
...
The mass function $P(X = x)$ (or just $p(x)$) has the following properties:

$0 \le p(x) \le 1$ and $\sum_{\text{all } x} p(x) = 1$

More generally, let X have the following properties
i) It is a discrete variable that can only assume values x1 , x2 ,
...
$P(X = x_n) = p_n$
Then X is a discrete random variable if $0 \le p_i \le 1$ and $\sum_{i=1}^{n} p_i = 1$

Remark: We denote random variables with capital letters while realized or particular values
are denoted by lower case letters
...
Find the pmf of the random variable ‘the sum of the scores facing down
...
1 0
...
3 0
...
1  0
...
3  0
...
2

8
1
16

all w

P 3  w  0  P(W  3)  P(W  2)  P(W  1)  0
...
15  0
...
35
P 1  w  1  P(W  0)  0
...
A die is loaded such that the probability of a face showing up is proportional to the face
number
...

2
...
Write down the
probability distribution of X hence compute P X  15 and P3  X  30

3

3
...
Show that X is a discrete random variable
...
The pmf of a discrete random variable X is given by $P(X = x) = kx$ for $x = 1, 2, 3, 4, 5, 6$
Find the value of the constant k, P X  4 and P3  X  6
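The constant in an exercise of this kind follows from the requirement that the probabilities sum to 1. The sketch below works this out for p(x) = kx on x = 1, ..., 6. Because the inequality signs in the two requested probabilities did not survive in this copy, the helper `prob` (a name of our own) accepts any event as a predicate; the event used in the last line is purely illustrative.

```python
from fractions import Fraction

support = range(1, 7)                      # x = 1, 2, ..., 6
k = Fraction(1, sum(support))              # sum of k*x over the support must equal 1
pmf = {x: k * x for x in support}

def prob(event):
    """P(event) for a predicate on x, e.g. prob(lambda x: x <= 4)."""
    return sum(p for x, p in pmf.items() if event(x))

print(k)                                   # normalising constant
print(prob(lambda x: x % 2 == 0))          # illustrative event: X is even
```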
5
...
Let N represent the number of tosses required to
realize a head
...
A discrete random variable Y has a pmf given by P(Y  y )  c 34  for y  0 ,1, 2 ,
...
Verify that f(x) 
for x  0 ,1, 2 ,
...

k (k  1)
8
...

x
a) f(x)  cx for x 1, 2 , 3, 4 , 5
c) f(x)  c 16  for x  0 ,1, 2 , 3
...
k
d) f(x)  c2 x for x  for x  0 ,1, 2 ,
...
A coin is loaded so that heads is three times as likely as the tails
...

1
...
The sample space is uncountable
...
Let T denote the time that lapses before the 1st arrival; then T is a continuous
random variable that assumes values in the interval $[0, \infty)$
Definition: A random variable X is continuous if there exists a nonnegative function f so that, for
every interval B, $P(X \in B) = \int_B f(x)\,dx$
...

Definition: Let X be a continuous random variable that assumes values in the interval
$(-\infty, \infty)$. Then f(x) is said to be a probability density function (pdf) of X if it satisfies the
following conditions:

$f(x) \ge 0$ for all x, $P(a \le X \le b) = \int_a^b f(x)\,dx \le 1$ and $\int_{-\infty}^{\infty} f(x)\,dx = 1$

The support of a continuous random variable is the smallest interval containing all values of x
where f(x) > 0
...
Instead, we talk about the
probability of the random variable assuming a value within a given interval
...

Example 1
Let X be a continuous random variable
...
Therefore f(x)is indeed a

0

pdf of X
...
01x for x  0
f(x)  
Find  hence compute P50  X  150 and PX  100
0 , elsewhere
Solution





f(x)  0 for all x in 0  x   Thus 1    e 0
...
01x
0



Now P50  X  150  0
...
01x dx   e 0
...
01 e 0
...
01x
100

0



100
0



150
50




0

 100    0
...

 e 0
...
5  0
...
6321206

Example 3
A continuous random variable X has a probability density function given by
0
...
5 x  c , 2  x  3 Find c hence compute P1  X  2
...

 0 , elsewhere

Solution



$\int_{\text{all } x} f(x)\,dx = 1 \Rightarrow \int_0^2 \tfrac14\,dx + \int_2^3 \left(\tfrac12 x + c\right)dx = \tfrac12 + \tfrac54 + c = 1$

$\Rightarrow c = -\tfrac34$

P1  X  2
...
5  

2

1
1 4

dx  

2
...
5

 14  163  167

Exercise

cx, 0  x  1
1) Suppose that the random variable X has p
...
f
...
4 Distribution Function of a Random Variable
Definition: For any random variable X, we define the cumulative distribution function (CDF), F(x), as

$$F(x) = P(X \le x) = \begin{cases} \sum_{t=-\infty}^{x} f(t), & \text{if X is discrete} \\ \int_{-\infty}^{x} f(t)\,dt, & \text{if X is continuous} \end{cases}$$

for every x
...

F (x) is a right continuous function of x
...
d
...
of X is F(x) and the p
...
f
...

 0 , elsewhere
Determine the cdf of X hence compute P X  3
Solution
F(x) 

x

x

 f(t)   (x  1) 

t  

0
 x  3)
F(x)   x ( 40
1


1
20

t 1

1
20

(2  3 
...
Obtain the cdf of X hence compute P X  23 
0
,
elsewhere

6

Solution
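$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x \tfrac12 t\,dt = \left[\tfrac14 t^2\right]_0^x = \tfrac14 x^2$, thus

$$F(x) = \begin{cases} 0, & x < 0 \\ \tfrac{x^2}{4}, & 0 \le x \le 2 \\ 1, & x > 2 \end{cases}$$

$P\left(X > \tfrac23\right) = 1 - P\left(X \le \tfrac23\right) = 1 - \tfrac14\left(\tfrac23\right)^2 = \tfrac89$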

Example 3
A continuous random variable X has a probability density function given by
0
...
5 x  c , 2  x  3 Find the cdf of X hence compute P1
...
5
...
(reason for introducing k)
$\Rightarrow F(2) = \tfrac12 = k \Rightarrow$

$$F(x) = \begin{cases} 0, & x < 0 \\ \tfrac{x}{4}, & 0 \le x < 2 \\ \tfrac{x^2 - 3x}{4} + 1, & 2 \le x \le 3 \\ 1, & x > 3 \end{cases}$$
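As a check on the piecewise cdf of Example 3, the sketch below codes it with exact fractions and recovers the probability 7/16 obtained earlier by direct integration. It assumes the density is 1/4 on [0, 2) and x/2 - 3/4 on [2, 3], which is what the surviving solution steps indicate; the function name `cdf` is ours.

```python
from fractions import Fraction

def cdf(x):
    """CDF of Example 3, assuming f(x) = 1/4 on [0, 2) and x/2 - 3/4 on [2, 3]."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    if x < 2:
        return x / 4
    if x <= 3:
        return (x * x - 3 * x) / 4 + 1
    return Fraction(1)

# P(1 <= X <= 2.5) = F(2.5) - F(1)
p = cdf(Fraction(5, 2)) - cdf(1)
print(p)
```

The two branches agree at x = 2 (both give 1/2), which is the continuity condition used to fix the constant k.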

Exercise
C x , 0  x  4
1
...
The pdf of a random variable X is given by g(x)  
Find the value of
0, elsewhere

the constant k, the cdf of X and the value of m such that G(x)  12
3
...
If the cdf of a random variable Y is given by F(x)  1  2 for Y  3 and F(x)  0 for Y  3 ,
y
find P X  5 , P X  8 and the pdf of X

1
...
Eg Y  2 X  3 , Y  X 3 , Y  X  c etc
For a 1-1 relationship between X and Y, e.g. Y = 2X + 3, f(x) and g(y) yield exactly the
same probabilities; only the random variable and the set of values it can assume change
...
Now
$P(Y = 1) = P(X^2 = 1) = P(X = 1) = \tfrac16$
$P(Y = 4) = P(X^2 = 4) = P(X = 2) = \tfrac13$ and
$P(Y = 9) = P(X^2 = 9) = P(X = 3) = \tfrac12$
In some cases several values of X will give rise to the same value of Y
...

2

1
2

Example 2
 x 1
for x  0 , 1, 2 , 3 , 4

2
Give the pmf of a r
...
Suppose the pmf of a r
...
Let the pmf of a r
...
Suppose the pmf of a r
...

4
...
v X be given by f(x)   2
, determine the pmf of

0
,
elsewhere

1 if x is odd
Y
0 if x is even

x

 1  5  for x  0 ,1, 2 , 3 ,
...
Suppose the r
...
6 Change of Variable Technique

...
r
...
v X, then a r
...
v X has a pdf given by f(x)  

...
v X has pdf f(x)  
 0 otherwise
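determine the pdf of $Y = 8X^3$
Solution
$Y = 8X^3 \Rightarrow X = \tfrac12 Y^{1/3}$, so $\dfrac{dx}{dy} = \tfrac16 y^{-2/3}$ and

$g(y) = f(x)\left|\dfrac{dx}{dy}\right| = 24\left(\tfrac12 y^{1/3}\right)^2 \cdot \tfrac16 y^{-2/3} = 1$ for $0 \le y \le 1$, and $g(y) = 0$ elsewhere

NB: $y = 8x^3$ is the cdf of X, which is why Y is uniform on (0, 1)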
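The change-of-variable formula g(y) = f(x(y)) |dx/dy| can be checked numerically. The sketch below assumes the density of X is 24x^2 on [0, 1/2], which is inferred from the factor 24 in the solution rather than read from the (elided) problem statement; the names `f` and `g` follow the notes.

```python
def f(x):
    """Assumed density of X: 24 x^2 on [0, 1/2], 0 elsewhere."""
    return 24.0 * x * x if 0.0 <= x <= 0.5 else 0.0

def g(y):
    """Density of Y = 8 X^3 via g(y) = f(x(y)) * |dx/dy|."""
    if not 0.0 < y <= 1.0:
        return 0.0
    x = 0.5 * y ** (1.0 / 3.0)                 # inverse transformation x = (1/2) y^(1/3)
    dx_dy = (1.0 / 6.0) * y ** (-2.0 / 3.0)    # derivative of the inverse
    return f(x) * abs(dx_dy)

values = [g(y) for y in (0.1, 0.25, 0.5, 0.9)]
print(values)                                   # each value should be close to 1
```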
Exercise:

5 x 4 , 0  x  1
1
...
v X with f(x)  
, determine the pdf of Y  2 ln x and its range
...
A r
...
The probability density function of X is given by f ( x)   ( x  4)
 0 otherwise

Obtain the probability density function of Y  tan1 ( 2x )

9

4
...
v X with pdf is as below to a uniform R
...
7 Expectation and Variance of a Random Variable
1
...
1 Expected Values
One of the most important things we'd like to know about a random variable is: what value
does it take on average? What is the average price of a computer? What is the average value
of a number that rolls on a die? The value is found as the average of all possible values,
weighted by how often they occur (i
...
probability)
Definition: Let X be a random variable with probability distribution p( X  x)
...

  xp ( X  x)dx if X is continuous
 
Theorem: Let X be a r
...
with probability distribution p(X=x) and let g(x) be a real-valued
function of X
...

  g ( x) p( X  x)dx if X is continuous
 
Theorem: Let X be a r
...
with probability distribution p( X  x)
...

(ii) $E[aX + b] = a\mu + b$, where a and b are arbitrary constants
(iii) $E[k\,g(X)] = k\,E[g(X)]$, where g(x) is a function of X
(iv) $E[a\,g_1(X) + b\,g_2(X)] = a\,E[g_1(X)] + b\,E[g_2(X)]$ and in general $E\left[\sum_{i=1}^{n} c_i\,g_i(X)\right] = \sum_{i=1}^{n} c_i\,E[g_i(X)]$, where the $g_i(x)$
are functions of X
...

(i) $E[c] = \int_{\text{all } x} c\,P(X = x)\,dx = c\int_{\text{all } x} P(X = x)\,dx = c(1) = c$

(ii) $E[aX + b] = \int_{\text{all } x} (ax + b)\,P(X = x)\,dx = \int_{\text{all } x} ax\,P(X = x)\,dx + \int_{\text{all } x} b\,P(X = x)\,dx$
$= a\int_{\text{all } x} x\,P(X = x)\,dx + b\int_{\text{all } x} P(X = x)\,dx = a\mu + b$

(iii) $E[k\,g(X)] = \int_{\text{all } x} k\,g(x)\,P(X = x)\,dx = k\int_{\text{all } x} g(x)\,P(X = x)\,dx = k\,E[g(X)]$

(iv) $E[a\,g_1(X) + b\,g_2(X)] = \int_{\text{all } x} \left[a\,g_1(x) + b\,g_2(x)\right]P(X = x)\,dx = \int_{\text{all } x} a\,g_1(x)\,P(X = x)\,dx + \int_{\text{all } x} b\,g_2(x)\,P(X = x)\,dx$
$= E[a\,g_1(X)] + E[b\,g_2(X)] = a\,E[g_1(X)] + b\,E[g_2(X)]$, i.e. from part (iii)

1
...
2 Variance and Standard Deviation
Definition: Let X be a r
...
The units for variance are square units
...
It’s actually the positive square root
of Var(X)
...

Theorem: $Var(X) = E(X - \mu)^2 = E(X^2) - \mu^2$
Proof:
$Var(X) = E(X - \mu)^2 = E\left(X^2 - 2\mu X + \mu^2\right) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2$, since
$E(X) = \mu$

Theorem: $Var(aX + b) = a^2\,Var(X)$
Proof:
Recall that $E[aX + b] = a\mu + b$, therefore

$Var(aX + b) = E\left[(aX + b) - (a\mu + b)\right]^2 = E\left[a(X - \mu)\right]^2 = E\left[a^2(X - \mu)^2\right] = a^2 E(X - \mu)^2 = a^2\,Var(X)$
Remark
(i) The expected value of X always lies between the smallest and largest values of X
...

x          0     1     2     3
P(X=x)     1/8   1/4   3/8   1/4
Solution
x              0     1     2     3     Total
p(X = x)       1/8   1/4   3/8   1/4   1
x p(X = x)     0     1/4   3/4   3/4   7/4
x^2 p(X = x)   0     1/4   3/2   9/4   4
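The tabular calculation of the mean and variance can be reproduced with exact fractions. This is a small sketch using the pmf given in the example; the variable names are ours.

```python
from fractions import Fraction

# pmf from the example: P(X = 0) = 1/8, P(X = 1) = 1/4, P(X = 2) = 3/8, P(X = 3) = 1/4
pmf = {0: Fraction(1, 8), 1: Fraction(1, 4), 2: Fraction(3, 8), 3: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())             # E(X)
second_moment = sum(x * x * p for x, p in pmf.items())  # E(X^2)
variance = second_moment - mean ** 2                  # Var(X) = E(X^2) - mu^2
std_dev = float(variance) ** 0.5

print(mean, second_moment, variance, round(std_dev, 4))
```

In decimals this gives a mean of 1.75 and a variance of 0.9375.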

E ( X )     xp ( X  x)  1
...
752  0
...
v X is as shown below, find the mean and standard
deviation of; a) X b) Y  12 X  6
...
6833

 242
...
Suppose X has a probability mass
2
...
01 0
...
4 0
...
04
P(X=x) 0
...
2 0
...
1 0
...
Let X be a random variable with P(X = 1) = 0
...
3, and P(X = 3) = 0
...
What is the
expected value and standard deviation of; a)X b) Y  5 X  10 ?
4
...
3
d 0
...
Also find
the mean and variance of Y  10 X  25
5
...

6
...
,8
7
...
For a discrete random variable Y the probability distribution is f(y)   10
,
0
,
elsewhere

calculate E (Y ) and var(Y)
kx for x  1, 2 , 3 , 4
9
...
A team of 3 is to be chosen from 4 girls and 6 boys
...
A fair six-sided die has ‘1’ on one face, ‘2’ on two of its faces and ‘3’ on the remaining
three faces
...
If T is the total score write down the probability
distribution of T hence determine; a) the probability that T is more than 4
...
The pdf of a continuous r
...
hence
 0 , elsewhere
Compute P(1 0  r  2) , E(X) and Var(X)
...
A continuous r
...
A continuous r
...
Also find the mean and the variance of X
 d 2 if x  1
15
...
An archer shoots an arrow at a target
...
d
...
is given by
find

 0 , if x  3
the value of the constant k
...
A continuous r
...
Also find the mean and the variance of X
e  x for x  0
18
...
v X has the pdf given by f(x)  
, find the mean and
 0 , elsewhere
3x

standard deviation of; a) X b) Y  e 4

1
...
That is the value of x for which $f'(x) = 0$
...
More formally, m should satisfy FX (m) = 0
...

Note: If there is a value m such that the graph of y= f(x) is symmetric about x=m, then both
the expected value and the median of X are equal to m
...
25 and FX (Q3 ) = 0
...
75 - 0
...
5 , so the quartiles give
an estimate of how spread-out the distribution is
...
01n or n 100 , that is, the probability that X is
smaller than xn is n%
...

Solution
On the interval $0 \le x \le 1$, the cdf of X is given by $F(x) = x^2$, thus
a) At lower quartile Q l , F(Ql )  Ql  0
...
25  0
...
5  m  0
...
75  Q3  0
...
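Quartiles and the median are found by solving F(x) = q. For the cdf F(x) = x^2 on [0, 1] used in this solution the inverse is a square root, as the sketch below shows; the function name `quantile` is ours.

```python
import math

def quantile(q):
    """Solve F(x) = x**2 = q for x on [0, 1]."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    return math.sqrt(q)

q1, median, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
iqr = q3 - q1                     # inter-quartile range
print(q1, round(median, 4), round(q3, 4), round(iqr, 4))
```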

k (1  x) for 0  x  1
A continuous r
...

 0 , elsewhere

2
...
1 Discrete Distribution
The discrete distributions that we will look at include the Bernoulli, binomial,
Poisson, geometric and hypergeometric
2
...
1 Bernoulli distribution
Definition: A Bernoulli trial is a random experiment in which there are only two possible
outcomes - success and failure
...

 Checking items from a production line: success = not defective, failure = defective
...

A Bernoulli random variable X takes the values 0 and 1 and P( X  1)  p and
P( X  0 )  1  p
Definition: A r
...
It can be easily checked that
the mean and variance of a Bernoulli random variable are $\mu = p$ and $\sigma^2 = p(1 - p)$
2
...
2 Binomial Distribution
Consider a sequence of n independent, Bernoulli trials, with each trial having two possible
outcomes, success or failure
...
Let X
denote the number of successes in n trials
...
n

14

We abbreviate this as X ~ Bin ( n , p) read as “X follows a binomial distribution with
parameters n and p ”
...

The mean and variance of a Binomial random variable are respectively given by
$\mu = np$ and $\sigma^2 = np(1 - p)$
Let’s check to make sure that if X has a binomial distribution, then $\sum_{x=0}^{n} P(X = x) = 1$
...
The probability of heads on any toss is 0.3
...
Calculate: (i) P( X  2 ) (ii) P( X  3) (iii) P(1  X  5 )
Solution
If we call heads a success then X has a binomial distribution with parameters n = 6 and p = 0.3
...
3)2 (0
...
324135
(ii) P( X  3 )6 C3  (0
...
7)3  0
...
324  0
...
059  0
...
578
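The binomial probabilities of Example 1 can be recomputed directly from the pmf. The sketch below uses n = 6 and p = 0.3 as in the solution and reads part (iii) as P(1 < X <= 5), which is an assumption since the inequality signs are lost in this copy; `binom_pmf` is our own helper name.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 6, 0.3
p_two = binom_pmf(2, n, p)                                   # part (i)
p_three = binom_pmf(3, n, p)                                 # part (ii)
p_2_to_5 = sum(binom_pmf(x, n, p) for x in range(2, 6))      # part (iii), assumed 1 < X <= 5
total = sum(binom_pmf(x, n, p) for x in range(n + 1))        # should equal 1

print(round(p_two, 6), round(p_three, 5), round(p_2_to_5, 4))
```

Under that reading the unrounded sum is about 0.579; a figure of 0.578 results if each term is truncated to three decimals before adding.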
Example 2
A quality control engineer is in charge of testing whether or not 90% of the DVD players
produced by his company conform to specifications
...
The day's production is
acceptable provided no more than 1 DVD player fails to meet specifications
...

a) What is the probability that the engineer incorrectly passes a day's production as
acceptable if only 80% of the day's DVD players actually conform to specification?
b) What is the probability that the engineer unnecessarily requires the entire day's
production to be tested if in fact 90% of the DVD players conform to specifications?
Solution
a) Let X denote the number of DVD players in the sample that fail to meet specifications
...
2
P( X  1)  P( X  0 )  P( X  1) 12 C0  (0
...
8)12 12C1  (0
...
8)11

 0
...
206  0
...
1
...
1)0 (0
...
1)1 (0
...
659
So P(X  1)  0
...
If the probability of a bit being
corrupted over this channel is 0.1 and such errors are independent, what is the probability that
no more than 2 bits in a packet are corrupted?
If 6 packets are sent over the channel, what is the probability that at least one packet will
contain 3 or more corrupted bits?
Let X denote the number of packets containing 3 or more corrupted bits
...
Then in the first question, we want
P(C  2 )  P(C  0 )  P(C  1)  P(C  2 )
12 C0 (0
...
9)12 12C1 (0
...
9)1112C2 (0
...
9)10

 0
...
377  0
...
889
...
889 = 0
...

Let X be the number of packets containing 3 or more corrupted bits
...
111
...
111)0 (0
...
494
...
111) = 0
...
111) (0
...
77
So the probability that X exceeds its mean by more than 2 standard deviations is
P( X    2 )  P( X  2
...

Now P( X  3 )  1  P( X  2 )  1  P( X  0 )  P( X  1)  P( X  2 )



 1 - 6 C0  (0
...
889)6  6 C1  (0
...
889)5  6 C2  (0
...
889)4
 1 - (0
...
3698  0
...
032



Exercise
1
...
What is the probability that exactly 6 heads will occur
...
If 3% of the electric bulbs manufactured by a company are defective find the probability
that in a sample of 100 bulbs exactly 5 bulbs are defective
...
An oil exploration firm is formed with enough capital to finance 10 explorations
...
1
...

4
...
She had 25 free throws in last
week’s game
...
What is the probability that she
made at least 5 hits?
5
...
This coin is tossed 3 times
...
According to the 2009 current Population Survey conducted by the U
...
Census Bureau,
40% of the U
...
population 25 years old and above have completed a bachelor’s degree or
more
...

7
...
Is he incredibly lucky or
unusual?
8
...
6, what’s
the probability that in a group of 8 cases you have; (a) fewer than 2 smokers? (b) more
than 5? (c) What are the expected value and variance of the number of smokers?
9
...
Calculate the probability that in a sample of 100 disk drives, not more than
three will malfunction
10
...

a) What is the expected number and the standard deviation of cars on Thika super
highways that will do over 17 km per litre
...
1
...
Other such random events
where the Poisson distribution can apply include:
 the number of hits to your web site in a day
 the number of calls that arrive in each day on your mobile phone
 the rate of job submissions in a busy computer centre per minute
...

Poisson probabilities are useful when there are a large number of independent trials with a
small probability of success on a single trial and the variables occur over a period of time
...
The
formula for the Poisson probability mass function is $P(X = x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$, x = 0, 1, 2,
...
 is the shape parameter which indicates the average number of
events in the given time interval
...
We will
need to recall that $e^{\lambda} = 1 + \lambda + \dfrac{\lambda^2}{2!} + \dfrac{\lambda^3}{3!} + \dfrac{\lambda^4}{4!} + \cdots$
...
Instead, it uses the fixed interval of time or space in which
the number of successes is recorded
...

Determine the probability that in any one-minute interval there will be
a) 0 jobs; b) exactly 2 jobs; c) at most 3 arrivals
...
1353353
b) Exactly 3 job arrivals: P( X  3) 

23 e 2
 0
...
8571
 1 2 3! 
e) more than 3 arrivals: $P(X > 3) = 1 - P(X \le 3) = 1 - 0$
...
1429
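The figures in this example can be reproduced from the Poisson pmf. The sketch below assumes a rate of 2 job arrivals per minute, which is consistent with the factor e^(-2) in the solution; the helper names are ours.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 2.0                                  # assumed rate: 2 jobs per minute
p_zero = poisson_pmf(0, lam)               # no jobs
p_three = poisson_pmf(3, lam)              # exactly 3 jobs
p_at_most_3 = poisson_cdf(3, lam)          # at most 3 arrivals
p_more_than_3 = 1.0 - p_at_most_3          # more than 3 arrivals

print(round(p_zero, 7), round(p_three, 4), round(p_at_most_3, 4), round(p_more_than_3, 4))
```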
Example 2
If there are 500 customers per eight-hour day in a check-out lane, what is the probability that
there will be exactly 3 in line during any five-minute period?
Solution
The expected value during any one five minute period would be 500 / 96 = 5
...
The
96 is because there are 96 five-minute periods in eight hours
...
2
customers in 5 minutes and want to know the probability of getting exactly 3
...
1288 (approx)
3!
Example 3
If new cases of West Nile in New England are occurring at a rate of about 2 per month, then
what’s the probability that exactly 4 cases will occur in the next 3 months?
Solution
X ~ Poisson($\lambda$ = 2/month)
$P(X = 4 \text{ in 3 months}) = \dfrac{(2 \cdot 3)^4 e^{-(2 \cdot 3)}}{4!} = \dfrac{6^4 e^{-6}}{4!}$
 13
...
Calculate the Poisson probability when λ (the average rate of success) is 3 and X (the Poisson
random variable) is 6
...
Customers arrive at a checkout counter according to a Poisson distribution at an average
of 7 per hour
...
A manufacturer of television sets knows that on average 5% of its product is defective
...
What is the probability that the TV set will fail to meet the guaranteed
quality?
4
...
Find the probability that in a given year there will be fewer than 3
accidents
...
Suppose that the chance of an individual coal miner being killed in a mining accident
during a year is 1
...
Use the Poisson distribution to calculate the probability that in
a mine employing 350 miners there will be at least one accident in a year
...
The number of road construction projects that take place at any one time in a certain city
follows a Poisson distribution with a mean of 3
...
(0
...
The number of road construction projects that take place at any one time in a certain city
follows a Poisson distribution with a mean of 7
...
(0
...
The number of traffic accidents that occur on a particular stretch of road during a month
follows a Poisson distribution with a mean of 7
...
Find the probability that less than three
accidents will occur next month on this stretch of road
...
018757)
9
...
Find the probability of observing exactly
three accidents on this stretch of road next month
...
052129)
10
...
8
...
(0
...
Suppose the number of babies born during an 8-hour shift at a hospital's maternity wing
follows a Poisson distribution with a mean of 6 an hour
...
(0
...
The university police department must write, on average, five tickets per day to keep
department revenues at budgeted levels
...
8 tickets per day
...

(0
...
A taxi firm has two cars which it hires out day by day
...
5
...
If calls to your cell phone are a Poisson process with a constant rate =0
...
The average number of defects per wafer (defect density) is 3
...
What is the probability that the
redundancy will not be sufficient if the defects follow a Poisson distribution?
16
...
0001
a) What is the probability that no error will occur in 20 minutes?
b) How long would the program need to run to ensure that there will be a 99
...

 The sum of independent Poisson variables is a further Poisson variable with mean
equal to the sum of the individual means
...

2
...
4 Geometric Distribution
Suppose a Bernoulli trial with success probability p is performed repeatedly until the first
success appears. We want to find the probability that the first success occurs on the yth trial
...
The sample space
S = {s, fs, ffs, fffs, ffffs, …}
...
What is
the probability of a sample point, say $P(fffs) = P(Y = 4)$? Since successive trials are
independent (this is implicit in the statement of the problem), we have
$P(fffs) = P(Y = 4) = q^3 p$ where $q = 1 - p$ and $0 \le p \le 1$
Definition: A r
...
Y is said to have a geometric probability distribution if and only if
 pq y 1for y  1, 2 , 3
...

otherwise
0
This is abbreviated as Y ~ Geo(p)
...
To be sure everything is consistent, we should check that the probabilities of all the
sample points add up to 1
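The consistency check can be illustrated numerically: the partial sums of the geometric pmf equal 1 - q^n, which tends to 1 as n grows. The value p = 0.25 below is an arbitrary illustrative choice, and the helper names are ours.

```python
def geom_pmf(y, p):
    """P(Y = y) = p * q**(y - 1) for y = 1, 2, 3, ..."""
    return p * (1 - p) ** (y - 1)

def geom_cdf(y, p):
    """P(Y <= y) = 1 - q**y, the closed form of the partial sum."""
    return 1 - (1 - p) ** y

p = 0.25                                            # illustrative success probability
partial_10 = sum(geom_pmf(y, p) for y in range(1, 11))
partial_200 = sum(geom_pmf(y, p) for y in range(1, 201))

print(partial_10, geom_cdf(10, p))                  # the two values agree
print(partial_200)                                  # very close to 1
```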
...
P is s 
1 r
The cdf of a geometric distribution is given by
$F(y) = P(Y \le y) = P(Y = 1) + P(Y = 2) + P(Y = 3) +$
...

a) Find the probability that her first hit is on the second shot
b) Find the mean and standard deviation of the number of shots required to realize the
1st hit
Solution
Let X be the random variable ‘the number of shots required to realize the 1st hit’
x 1
x ~ Geo(0
...
71  0
...

a) P( X  2)  p1     0
...
3  0
...
pq y 1 

b)  

1





1- p
1  0
...
78
 1
...
7
0
...
They have determined that 4% of the applicant pool are fluent in Farsi
...
04

b) P( X  25)  1     1  0
...
What is the probability that 5 accounts are audited before an account in error is found?
What is the probability that the first account in error occurs in the first five accounts audited?
Solution
P(Y  5)  0
...
97) 4  0
...
975  0
...
Over a very long period of time, it has been noted that on Fridays 25% of the customers
at the drive-in window at the bank make deposits
...

2
...
Find the probability that the fourth person orders a diet drink
...

3
...
)
4
...
a) How many cabs can you expect to pass you for you to find one that
is free and b) what is the probability that more than 10 cabs pass you before you find
one that is free
...
An urn contains N white and M black balls
...
If we assume that each selected ball is replaced before the
next one is drawn, what is;
a) the probability that exactly n draws are needed?
b) the probability that at least k draws are needed?
c) the expected value and Variance of the number of balls drawn?
6
...
He then receives $2n ,
where n is the number of tosses
...
00 in one play of the game?
b) If the player must pay $5
...
An oil prospector will drill a succession of holes in a given area to find a productive
well
...
2
...
A well-travelled highway has its traffic lights green for 82% of the time
...

9
...
The probability of success is 0
...

a) What is the probability that the 3rd hole drilled is the first to yield a productive well?
b) If the prospector can afford to drill at most 10 wells, what is the probability that he will
fail to find a productive well?
2
...
5 The negative binomial distribution
Suppose a Bernoulli trial is performed until the rth success is realized
...

If r=1 the negative binomial distribution reduces to a geometric distribution
...
1
...
g
...
A sample of n elements is randomly
selected from the population
...
v
...

Definition: A r
...
Y is said to have a hyper geometric probability distribution if and only if
C C
P(Y  y)  r y N  D n  y for y  1, 2, 3,
...
v with a hyper geometric distribution, then   E (Y ) 
N
Example 1
Boxes contain 2000 items of which 10% are defective
...
3398  0
...
1975  0
...
0653

Example 2
How many ways can 3 men and 4 women be selected from a group of 7 men and 10 women?
Solution
C C
7350
The answer is 7 2 10 4 =
= 0
...
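The count quoted in this solution, 7350, is the product of two binomial coefficients: the number of ways to choose 3 of the 7 men times the number of ways to choose 4 of the 10 women. A minimal check (variable names are ours):

```python
from math import comb

ways_men = comb(7, 3)        # choose 3 men out of 7
ways_women = comb(10, 4)     # choose 4 women out of 10
ways = ways_men * ways_women

print(ways_men, ways_women, ways)
```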

This can be extended to more than two groups and called an extended hypergeometric
problem
...
A bottle contains 4 laxative and 5 aspirin tablets
...
Find the probability that; a) exactly one, b) at most 1 c) at least 2 are laxative
tablet
...
What is the probability of getting at most 2 diamonds in the 5 cards selected without
replacement from a well shuffled deck?
3
...
She mixed up the letters and delivered 10 letters at random to
computing department
...
In a class there are 20 students
...
One day prefects checked at random on 10 lockers
...
A box holds 8 green, 4 white and 8 red beads
...
What is the probability that 3 red, 2 green and 1 white beads
are drawn?

2
...
2
...
Often referred to as the rectangular distribution because the graph of
the pdf has the form of a rectangle, making it the simplest kind of density function
...
The total area is equal to 1
...

The expected value and the variance of X are given by $\mu = \dfrac{a + b}{2}$ and $\sigma^2 = \dfrac{(b - a)^2}{12}$
respectively
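These formulas are easy to evaluate exactly. The sketch below applies them to a uniform distribution on [80, 120] (the take-off time range mentioned in the example that follows); the sub-interval [90, 100] is only an illustrative choice, and the function names are ours. Integer endpoints are assumed.

```python
from fractions import Fraction

def uniform_mean(a, b):
    """mu = (a + b) / 2 for X ~ U(a, b)."""
    return Fraction(a + b, 2)

def uniform_var(a, b):
    """sigma^2 = (b - a)^2 / 12 for X ~ U(a, b)."""
    return Fraction((b - a) ** 2, 12)

def uniform_prob(a, b, lo, hi):
    """P(lo <= X <= hi): length of the overlap with [a, b] divided by (b - a)."""
    lo, hi = max(lo, a), min(hi, b)
    if hi <= lo:
        return Fraction(0)
    return Fraction(hi - lo, b - a)

a, b = 80, 120
print(uniform_mean(a, b), uniform_var(a, b), uniform_prob(a, b, 90, 100))
```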
...
From past experience he feels that take off
time is uniformly distributed between 80 and 120 minutes after check in
...
b) the
waiting time will be between 1
...
5
ba
40
3
 1
...
5  x    1
...
51  0 
But  

40
 1
...
5  x    1
...
Uniform: The amount of time, in minutes, that a person must wait for a bus is uniformly
distributed between 0 and 15 minutes, inclusive
...
5 minutes? What is the probability that will be between 0
...
Slater customers are charged
for the amount of salad they take
...

Let x = salad plate filling weight, find the expected Value and the Variance of x
...
The average number of donuts a nine-year old child eats per month is uniformly
distributed from 0
...
Determine the probability that a randomly
selected nine-year old child eats an average of;
a) more than two donuts
b) more than two donuts given that his or her amount is more than 1
...

4
...
Suppose that
none of these plane tickets are completely sold out and they always have room for
passengers
...
45 AM and (
...
Determine the probability that he waits for
a) At most 10 minutes
b) At least 15 minutes
2
...
2 Exponential Distribution
The exponential distribution is often concerned with the amount of time until some specific
event occurs
...
Other examples include the length, in minutes, of long
distance business telephone calls, and the amount of time, in months, a car battery lasts
...
Values for an exponential random variable occur in the following
way
...
For example, the amount of money
customers spend in one trip to the supermarket follows an exponential distribution
...

The exponential distribution is widely used in the field of reliability
...
01e 0
...
Determine
 0 otherwise
the probability that the battery; a) Fails before 25 hours
...
c) life exceeds 120 hours
...

Solution
25

a)

P(T  25)  F(25)   e 0
...
01( 25)  0
...
01t dt  e 0
...
50  0
...
01t dt  e 1
...
3012

d)  

120


1
 100  P(T  100)   e 0
...
3679
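The exponential probabilities in this solution come from the cdf F(t) = 1 - e^(-λt). The sketch below assumes a failure rate of λ = 0.01 per hour, which is what the surviving fragments of the density suggest; the names `RATE`, `survival` and `cdf` are ours.

```python
import math

RATE = 0.01                          # assumed failure rate per hour

def survival(t, rate=RATE):
    """P(T > t) = exp(-rate * t)."""
    return math.exp(-rate * t)

def cdf(t, rate=RATE):
    """P(T <= t) = 1 - exp(-rate * t)."""
    return 1.0 - survival(t, rate)

p_fails_before_25 = cdf(25)          # battery fails before 25 hours
p_exceeds_120 = survival(120)        # life exceeds 120 hours
p_exceeds_mean = survival(1 / RATE)  # life exceeds the mean of 100 hours

print(round(p_fails_before_25, 4), round(p_exceeds_120, 4), round(p_exceeds_mean, 4))
```

The last value, e^(-1), does not depend on the rate: an exponential lifetime exceeds its own mean with probability about 0.3679.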
100
0
...
Jobs are sent to a printer at an average of 3 jobs per hour
...
The time required to repair a machine is an exponential random variable with rate
λ= 0
...

4
...

6
...

8
...

b) what is the probability that the repair time will take at least 4 hours given that
the repair man has been working on the machine for 3 hours?
Buses arrive at a bus stop according to an exponential distribution with rate
λ = 4 buses/hour
...
What
is the expected time of the next bus?
Break downs occur on an old car with rate λ= 5 break-downs/month
...

a) What is the probability that he will return home safely on his car
...

Suppose that the amount of time one spends in a bank is exponentially distributed with
mean 10 minutes
...
2
...
The
time is known to have an exponential distribution with the average amount of time equal
to 4 minutes
...

b) Half of all customers are finished within how long? (Find median)
c) Which is larger, the mean or the median?
On the average, a certain computer part lasts 10 years
...

a) What is the probability that a computer part lasts more than 7 years?
b) On the average, how long would 5 computer parts last if they are used one after
another?
c) Eighty percent of computer parts last at most how long?
d) What is the probability that a computer part lasts between 9 and 11 years?
Suppose that the length of a phone call, in minutes, is an exponential random variable
with decay parameter = 1/12
...
Let X = the
length of a phone call, in minutes
...
2
...
Thus if the results holds for
  k then they must also hold for   k  1
...
For
example, an insurance company observes that large commercial fire claims occur randomly
in time with a mean of 0
...
Not only in real life, the Gamma
distribution is also widely used in many scientific areas, like Reliability Assessment, Queuing
Theory, Computer Evaluations, or biological studies
...
Then the total time has a Gamma
distribution with parameters  and 
...

Some examples of gamma distributions are plotted below
...

Remarks
a) If   1 then we have the standard gamma distribution
...

26

  x  1 t
t e dt and it’s computation is not trivial
...
Var (X)   2  
The cdf, F(x) is of the form F(x) 

2

Definition: Let  be a positive integer
...
Var (X)   2  2
Example 1 Suppose the reaction time of a randomly selected individual to a certain
stimulus has a standard gamma distribution with   2 sec
...
Therefore $f(x) = x e^{-x}$, $x \ge 0$ and $f(x) = 0$ elsewhere

$P(3 \le X \le 5) = \int_3^5 x e^{-x}\,dx = \left[-(x + 1)e^{-x}\right]_3^5 = 4e^{-3} - 6e^{-5} = 0$
...
09158

0
0

Example 2 Suppose the survival time X in weeks of a randomly selected male mouse
exposed to 240 rads of gamma radiation has a gamma distribution with   8 and   15
...

b) What is the probability that a mouse survives (i) between 60 and 120 weeks
...
4264 weeks
$E(X) = \alpha\beta = (8)(15) = 120$ weeks
120
8 1
1
x
b)
P(60  X  120)  
x 7 e 15 dx  
y 7 e  y dy
8
60 15  (8)
4  (8)

a)

e y
  y  7 y  42 y  210 y  840 y  2520 y  5040 y  5040
7!



7

5

4

4

3

2



8

4

8

261104e  6805296e
 0
...
9989
7!
27

3

2





2

Question The time between failures of a laser machine is exponentially distributed with a
mean of 25,000 hours
...
2
...
Simply $\Gamma\left(\tfrac12\right) = \sqrt{\pi}$ since $\int_0^{\infty} e^{-x^2}\,dx = \tfrac12\sqrt{\pi}$

Beta Distribution
Definition: A random variable X is said to have a standard beta distribution with parameters $\alpha$ and $\beta$ if its probability density function is given by
$f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$, $0 < x < 1$, and $f(x) = 0$ elsewhere. We denote this as $X \sim \text{Beta}(\alpha, \beta)$
Theorem: If X has a standard beta distribution with parameters $\alpha$ and $\beta$, then
$E(X) = \mu = \frac{\alpha}{\alpha+\beta}$
and
...
MOMENTS AND MOMENT-GENERATING FUNCTIONS
Definition: The kth moment of a r
...
X taken about zero, or about the origin, is defined to be $E(X^k)$ and denoted by $\mu'_k$
...
v
...

Definition: The moment-generating function (mgf), m(t), for a r
...
X is defined to be $M_X(t)$, or simply $M(t) = E\left(e^{tX}\right)$
We say that an mgf for X exists if there is $b > 0$ such that $M(t) < \infty$ for $|t| < b$
...

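$M(t) = E\left(e^{tX}\right) = \sum_{\text{all } x}\left[1 + \frac{tx}{1!} + \frac{(tx)^2}{2!} + \frac{(tx)^3}{3!} + \frac{(tx)^4}{4!} + \cdots\right]P(X = x)$
...
$= \sum_{\text{all } x} P(X = x) + t\sum_{\text{all } x} x\,P(X = x) + \frac{t^2}{2!}\sum_{\text{all } x} x^2 P(X = x) + \frac{t^3}{3!}\sum_{\text{all } x} x^3 P(X = x) + \cdots$
$= 1 + tE(X) + \frac{t^2}{2!}E(X^2) + \frac{t^3}{3!}E(X^3) + \cdots$
...
$M'(t) = \mu'_1 + \frac{2t}{2!}\mu'_2 + \frac{3t^2}{3!}\mu'_3 + \cdots$
...
$\Rightarrow M''(0) = \mu'_2 = E(X^2)$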
Remark: The mgf of a particular distribution is unique and we can recognize the pdf if we
are given the mgf
...
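v Y is given by $M(t) = \frac16 e^{t} + \frac13 e^{2t} + \frac12 e^{3t}$. Find the mean and variance of Y
Solution
$E(Y) = M'(0) = \left.\left[\frac16 e^{t} + \frac23 e^{2t} + \frac32 e^{3t}\right]\right|_{t=0} = \frac16 + \frac23 + \frac32 = \frac73$
$E(Y^2) = M''(0) = \left.\left[\frac16 e^{t} + \frac43 e^{2t} + \frac92 e^{3t}\right]\right|_{t=0} = \frac16 + \frac43 + \frac92 = 6$
$Var(Y) = E(Y^2) - \mu^2 = 6 - \left(\frac73\right)^2 = \frac59$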
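Because an mgf determines its distribution, an mgf of the form $\frac16 e^{t} + \frac13 e^{2t} + \frac12 e^{3t}$ can be read as a pmf that puts probability 1/6, 1/3 and 1/2 on the values 1, 2 and 3. The sketch below uses that reading (an interpretation, not a step shown in the notes) to recompute the mean and variance with exact fractions.

```python
from fractions import Fraction

# pmf read off the mgf: the coefficient of e^{kt} is P(Y = k)
pmf = {1: Fraction(1, 6), 2: Fraction(1, 3), 3: Fraction(1, 2)}

def raw_moment(pmf, k):
    """k-th moment about the origin, E(Y^k)."""
    return sum(p * y ** k for y, p in pmf.items())

mean = raw_moment(pmf, 1)
variance = raw_moment(pmf, 2) - mean ** 2
print(mean, raw_moment(pmf, 2), variance)  # 7/3 6 5/9
```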

Example 2 Find the mgf of a r
...

Example 3 Find the mgf of a r
...
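$M(t) = E\left(e^{tX}\right) = \sum_{x=0}^{\infty} e^{tx}\,\frac16\left(\frac56\right)^{x} = \frac16\sum_{x=0}^{\infty}\left(\frac56 e^{t}\right)^{x} = \frac{1/6}{1 - \frac56 e^{t}} = \frac{1}{6 - 5e^{t}}$
$M'(t) = 5e^{t}\left(6 - 5e^{t}\right)^{-2} \Rightarrow E(X) = \left.5e^{t}\left(6 - 5e^{t}\right)^{-2}\right|_{t=0} = 5$ and
$M''(t) = 5e^{t}\left(6 - 5e^{t}\right)^{-2} + 50e^{2t}\left(6 - 5e^{t}\right)^{-3} \Rightarrow E(X^2) = \left.\left[5e^{t}\left(6 - 5e^{t}\right)^{-2} + 50e^{2t}\left(6 - 5e^{t}\right)^{-3}\right]\right|_{t=0} = 55$
$Var(X) = E(X^2) - \mu^2 = 55 - 5^2 = 30$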
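The derivatives of an mgf at t = 0 can also be approximated numerically, which gives an independent check on hand differentiation. The sketch below assumes the mgf $M(t) = 1/(6 - 5e^{t})$ and uses central differences with a hypothetical step size `h`; it is a rough check, not part of the notes' method.

```python
import math

def mgf(t):
    """M(t) = 1 / (6 - 5 e^t), valid while (5/6) e^t < 1."""
    return 1.0 / (6.0 - 5.0 * math.exp(t))

def first_two_moments(m, h=1e-4):
    """Approximate M'(0) and M''(0) by central differences."""
    m1 = (m(h) - m(-h)) / (2 * h)
    m2 = (m(h) - 2 * m(0.0) + m(-h)) / (h * h)
    return m1, m2

mean, second_moment = first_two_moments(mgf)
variance = second_moment - mean ** 2
print(round(mean, 3), round(second_moment, 3), round(variance, 3))
```

The three printed values should be close to 5, 55 and 30.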
Exercise
2
1) The mgf of a r
...
v X has a gamma distribution with parameters , Find the mgf of X hence obtain the
mean and variance of X

3
...

The mgf about $X = a$ is given by $M_{X,a}(t) = E\left(e^{t(X-a)}\right) = e^{-at}E\left(e^{tX}\right) = e^{-at}M_X(t)$



4



 

NORMAL DISTRIBUTION

4
...
It is widely used in statistical inference
...

Definition A r
...

4
...
1 Properties of normal distribution
1) The normal distribution curve is bell-shaped and symmetric, about the mean
2) The curve is asymptotic to the horizontal axis at the extremes
...

4) The mean can be any numerical value: negative, zero, or positive
5) The standard deviation determines the width of the curve: larger values result in wider,
flatter curves
6) Probabilities for the normal random variable are given by areas under the curve
...
5 to the left of the mean and 0
...

7) It has inflection points at $\mu - \sigma$ and $\mu + \sigma$
...
26% of values of a normal random variable are within ±1 standard deviation of its
mean
...
6826
b) 95
...
i.e. $P(\mu - 2\sigma \le X \le \mu + 2\sigma) = 0$
...
72% of values of a normal random variable are within ±3 standard deviations of its
mean
...
9972
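The one, two and three standard deviation probabilities can be reproduced with Python's standard library. This is a minimal sketch; `within_k_sd` is an illustrative helper name, and the values may differ from table-based figures in the last digit because tables are rounded.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within_k_sd(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal X."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))
```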

4
...
Therefore, the density of Z, which is usually denoted $\phi(z)$, is given by
$\phi(z) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac12 z^2\right)$ for $-\infty < z < \infty$
The cumulative distribution function of a standard normal random variable is denoted $\Phi(z)$, and is given by
$\Phi(z) = \int_{-\infty}^{z}\phi(t)\,dt$
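The density and cdf of the standard normal can be sketched in a few lines. The code below evaluates $\Phi$ through the error function, using the identity $\Phi(z) = \frac12\left[1 + \operatorname{erf}(z/\sqrt2)\right]$, and also approximates the defining integral by the trapezoidal rule. The lower limit of -10 and the number of steps are arbitrary choices made for illustration.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def Phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def Phi_by_integration(z, lower=-10.0, n=20000):
    """Trapezoidal approximation of the integral of phi from `lower` to z."""
    h = (z - lower) / n
    inner = sum(phi(lower + i * h) for i in range(1, n))
    return h * (0.5 * phi(lower) + inner + 0.5 * phi(z))

print(round(Phi(0), 4), round(Phi(1.75), 4), round(Phi_by_integration(1.75), 4))
```

Either function plays the role of the printed table: given z, it returns the proportion of the distribution lying below z.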

4
...
1 Computing Normal Probabilities
It is very important to understand how the standardized normal distribution works, so we will
spend some time here going over it
...
but the values of $\Phi(z)$ have been exhaustively tabulated
...

Table 1 below reports the cumulative normal probabilities for normally distributed variables
in standardized form (i
...
Z-scores)
...
For a given
value of Z, the table reports what proportion of the distribution lies below that value
...
5 ; half the area of the standardized normal curve lies to the left
of Z  0
...
696  z  1
...
345  z  1
...
65, -1
...
0, -1
...
02, -1
...
43)
c) P(0
...
75)
Solution
a) Look up and report the value for $\Phi(z)$ from the standard normal probabilities table
P(Z  1
...
65)  0
...
65)  0
...
0)  0
...
0)  0
...
02)  (1
...
1515 P(Z  -1
...
65)  0
...
365  z  1
...
75) - (0
...
9599 - 0
...
3249
d) P(-0
...
865)  (1
...
696)  0
...
2432  0
...
7257

e) P(-2
...
65)  (1
...
345)  0
...
0095  0
...
43)  P(-1
...
43)  2(1
...
9236)  1  0
...
6026, 0
...
3446
c) P(-0
...
2665
b) P(Z  t) = 0
...
7265, 0
...
9972 , 0
...
9750
Solution
Here we find the probability value in Table I, and report the corresponding value for Z
...
6026  t  0
...
950  t  1
...
3446  t  0
...
4026  ( t) = 0
...
25
P(Z  t)  0
...
2735  t  - 0
...
5446  ( t) = 0
...
11
c) P(-0
...
28) = 0
...
3897  0
...
40
d) P(-t  z  t )  2 (t) -1 = 0
...
9986  t = 2
...
9505  (t)  0
...
96
P(-t  z  t )  2 (t) -1 = 0
...
9875  t = 2
...
Given $Z \sim N(0, 1)$, find;
a) P(Z  z) if
z = 1
...
89, 1
...
53
b) P(Z  z) for z = 1
...
15
c) P(0  z  1
...
396  z  1
...
96  z  1
...
33)
2
...
973, 0
...
4634

b) P(Z  a) = 0
...
9545, 0
...
21  z  t )  0
...
9544 , 0
...
3750

d)

4
...
Then $X \sim N(\mu, \sigma^2)$. But we know that
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac12\left(\frac{x-\mu}{\sigma}\right)^2\right]$ from which the claim follows
...
It is also easily shown that the cumulative distribution function satisfies
$F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right)$
and so the cumulative probabilities for any normal random variable can be calculated using the tables for the standard normal distribution
...
Standardization can be accomplished using the formula for a z-score: $Z = \frac{X-\mu}{\sigma} \sim N(0, 1)$
...

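Let $X \sim N(\mu, \sigma^2)$; then $P(a \le X \le b) = P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$, where $Z = \frac{X-\mu}{\sigma} \sim N(0, 1)$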
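The standardization rule translates directly into code. The sketch below is a minimal illustration: `normal_interval_prob` is a hypothetical helper, and the numbers in the usage line are made up so that the z-scores run from -1 to 2.

```python
from statistics import NormalDist

def normal_interval_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), via standardization."""
    std_normal = NormalDist()
    z_low = (a - mu) / sigma
    z_high = (b - mu) / sigma
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Hypothetical values chosen so that z runs from -1 to 2
p = normal_interval_prob(40, 70, mu=50, sigma=10)
print(round(p, 4))  # 0.8186
```

A table lookup gives 0.9772 - 0.1587 = 0.8185 for the same z-scores; the small difference comes from rounding in the table.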
Example 1 A r
...
9772 - 0
...
8185
Example 2 Suppose $X \sim N(30, 16)$
...
5)  0
...
25  PZ  2
...
25)  0
...
25)  0
...
5  0
...
If $GRE \sim N(500, 100^2)$, how high does your GRE score have to be to qualify for a scholarship?
Solution
Let X = GRE
...
05; this is too hard to solve as it stands, so instead compute $Z = \frac{X-500}{100} \sim N(0, 1)$ and find z for the problem,
$P(Z \ge z) = 1 - \Phi(z)$ = 0
...
95  z  1
...
645)  66
...
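Finding a cutoff score is an inverse lookup: instead of reading a probability for a given z, we read the z for a given probability. The sketch below assumes the scholarship goes to the top 5% of scores, which is what the 0.95 and 1.645 figures in the solution suggest; `top_share` is an illustrative name.

```python
from statistics import NormalDist

gre = NormalDist(mu=500, sigma=100)   # GRE ~ N(500, 100^2), as in the example
top_share = 0.05                      # assumed cutoff: top 5% of scores

z = NormalDist().inv_cdf(1 - top_share)   # z with Phi(z) = 0.95
cutoff = gre.inv_cdf(1 - top_share)       # same as 500 + 100 * z
print(round(z, 3), round(cutoff, 1))
```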


Example 4
Family income is believed to be normally distributed with a mean of $25000 and a standard deviation of $10000
...
What percentage of the population will benefit from
the law?
Solution
Let X = Family income
...
, so
$X \sim N(25000, 10000^2) \Rightarrow Z = \frac{X - 25000}{10000} \sim N(0, 1)$
P(X ≤ 10,000) = P(Z ≤ −1
...
5)  0
...

Hence, slightly below 7% of the population lives in poverty
...
5  Z  0
...
5) 1  2  0
...
383
Thus, about 38% of the taxpayers will benefit from the new law
...
Find; a) P( X  140) b) P( X  120) c) P(130  X  135)
2) The random variable X is normally distributed with mean 500 and standard deviation 100
...
Use graphs with labels to illustrate your answers
...
The speeds are normally
distributed with a mean of 90 km/hr and a standard deviation of 10 km/hr
...
John
owns one of these computers and wants to know the probability that the length of time
will be between 50 and 70 hours
5) Entry to a certain University is determined by a national test
...
Tom wants to
be admitted to this university and he knows that he must score better than at least 70% of
the students who took the test
...
Will he be admitted to
this university?
6) A large group of students took a test in Physics and the final grades have a mean of 70
and a standard deviation of 10
...
09) where measurements are in cm
...
5 cm or larger than 4
...

Out of 500 bolts how many would be accepted? Ans 430
8) Suppose IQ ~ N(100,22
...
a woman wants to form an Egghead society which only
admits people with the top 1% IQ score
...
9
9) A manufacturer does not know the mean and standard deviation of ball bearing he is
producing
...
4 cm and those
under 1
...
Out of 1,000 ball bearings, 8% are rejected as too small and
5
...
What is the mean and standard deviation of the ball bearings produced?
Ans mean=2
...
2


4
...

4
...
1 Introduction
Suppose a fair coin is tossed 10 times. What is the probability of observing: a) exactly 4 heads b) at most 4 heads?
Solution
Let X be the r
...
5

 P( X  x)10Cx  0
...
5 10Cx  12  for x = 0,1,2,
...
2051
b) P(X ≤ 4) = (10C0 + 10C1 + 10C2 + 10C3 + 10C4) × 0
...
3770 (that is, 193/512)
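Exact binomial probabilities for the ten coin tosses are easy to compute directly. The sketch below is a minimal illustration using `math.comb`; the function names are hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x), summing the pmf from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

print(round(binom_pmf(4, 10, 0.5), 4))  # 0.2051
print(round(binom_cdf(4, 10, 0.5), 4))  # 0.377
```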

4
...
2 Normal approximation:
Many interesting problems can be addressed via the binomial distribution
...
Eg: Compute P( X  12) for 25 tosses of a fair coin
...
Fortunately, as n becomes large, the binomial distribution
becomes more and more symmetric, and begins to converge to a normal distribution
...
Hence, the
normal distribution can be used to approximate the binomial distribution
...
The histogram looks bell-shaped, as long as the number of trials is not too small

In general, the distribution of a binomial random variable may be accurately approximated by that of a normal random variable, as long as $np \ge 5$ and $nq \ge 5$, and assuming that a

...
is made to account for the fact that we are using a continuous
distribution (the normal) to approximate a discrete one (the binomial)
...
Why are these
reasonable choices of μ, σ2?
4
...
3 Continuity Correction
In the binomial, $P(X \le a) + P(X \ge a + 1) = 1$ whenever a is an integer
...


The usual way to solve this problem is to associate 1/2 of the interval from a to a + 1 with
each adjacent integer
...
This adjustment is called a continuity correction
...
5  np
P( X  x)  P( X  x  1)  P( X  x  0
...
5 np
P( X  x)  P( X  x  1)  P( X  x  0
...
5  X  b  0
...
5 np
npq

...
5  np

...
5  X  x  0
...
The equalities which hold in the binomial distribution do
not hold in the normal distribution, because there is a gap between consecutive values of a
...

For example, in the binomial, $P(X \le 6) = P(X < 7)$ since 6 is the next possible value of X that is less than 7
...
5)
...
In the
normal, we approximate this by finding P( X  5
...
v is less than or equal to 4
...
5
...

The mean of the normal is $\mu = np = 5$ and the standard deviation is

  10(0
...
5)  1
...
v is less than or equals to 4
...
ie
P(X ≤ 4) ≈ P(X < 4
...
5
...
3162)  0
...
377
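The continuity correction can be compared with the exact binomial answer for the ten-toss case. The sketch below approximates $P(X \le x)$ by $P(Y \le x + 0.5)$ with $Y \sim N(np, npq)$; the function names are illustrative, not from the notes.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf_exact(x, n, p):
    """Exact P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

def binom_cdf_normal(x, n, p):
    """P(X <= x) approximated by P(Y <= x + 0.5), Y ~ N(np, npq)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(x + 0.5)

exact = binom_cdf_exact(4, 10, 0.5)
approx = binom_cdf_normal(4, 10, 0.5)
print(round(exact, 4), round(approx, 4))
```

Even with n as small as 10, the two values differ by only about 0.001 here.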
Example 1 Suppose 50% of the population approves of the job the governor is doing, and
that 20 individuals are drawn at random from the population
...
What is the probability that;
...
5 $\Rightarrow \mu = np = 10$ and $\sigma^2 = npq = 5$. Since $np \ge 5$ and $nq \ge 5$, it is probably safe to assume that $X \sim N(10, 5)$

P
 1

...
118
a) P( X  7)  P(6
...
5)  P 6
...
5510  
Z



  






std normal
Binomial
Normal





std normal

 (1
...
565)  0
...
0588  0
...
5)  PZ  1
...
1318


  
Binomial

c)
d)

Normal

std normal

P( X  11)  P( X  11
...
6708  1  0
...
7488  0
...
5)  PZ  0
...
6708  0
...
What are the odds that the
Democrats will win 19 or more races? Use the normal approximation to the binomial
Solution
Note that n  25 , p  0
...

Using the normal approximation to the binomial,
P( X  19)  P( X  18
...
4289  1  1
...
9235  0
...

Example 3 Tomorrow morning Iberia flight to Madrid can seat 370 passengers
...
90 that a given ticket-holder will show up
for the flight
...
How confident
can Iberia be that no passenger will need to be
...
(denied boarding)?
Solution:
We will assume that the number (X) of passengers showing up for the flight has a binomial
distribution with mean   400  0
...
9  0
...
5)  PZ  1
...
9599 So the probability that
 

 
Binomial

Normal

std normal

nobody gets bumped is approximately 0
...
(Almost 96%)
...
A coin is loaded such that heads is thrice as likely as the tails
...

2
...
If a random sample of 200 customers is
selected, what is the approximate probability that;
a) at least 75 pay with a credit card?
b) not more than 70 pay with a credit card?
c) between 70 and 75 customers, inclusive, pay with a credit card?
3
...
3
...
Crafty Computers limited produces PCs
...
25
...
What is the probability that between 70 and 80 PCs inclusive have a virus? Would you advise the director JKUAT ICSIT to buy computers from this company in future?
5
...
Based on past experience the airline feels that each dessert
is equally likely to be chosen
...

7
...

9
...

A baseball player has a long term batting average of 0
...
What is the chance he gets an
average of 0
...
We know that about 12%
of Americans are black, what is the probability that the sample contains 170 or fewer
blacks?
Let T be the lifetime in years of new bus engines
...

 x 3 for t > 1
a) Find the value of d and the mean and median of T
...
By using a normal approximation to the binomial, find the probability that at most 10 of the engines last for 4 years or more
...
5 Sums of Independent Random Variables
Theorem 1 If X 1 , X 2 ,
...
v each with mean  and

2
1 n
variance    then r
...
v, then



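$X_1 + X_2 \sim N\left(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2\right)$
Proof
Let the mgf of $X_1$ be $M_1(t) = e^{t\mu_1 + \frac12 t^2\sigma_1^2}$ and for $X_2$ be $M_2(t) = e^{t\mu_2 + \frac12 t^2\sigma_2^2}$
Let $Y_1 = X_1 + X_2$, so by independence $Y_1$ has mgf $M_{Y_1}(t) = M_1(t)\,M_2(t) = e^{t\mu_1 + \frac12 t^2\sigma_1^2}\,e^{t\mu_2 + \frac12 t^2\sigma_2^2} = e^{t(\mu_1+\mu_2) + \frac12 t^2(\sigma_1^2+\sigma_2^2)}$, which is the mgf of a normal r
...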







Theorem 3 If X1 , X 2 ,
...
vs and each X i ~ N i ,  i2 , then the r
...
 X n ~ N 1  2 
...
 
Proof is by induction (left as exercise to the learner)
2
1

2
2

2
n



Example If $X \sim N(60, 16)$ and $Y \sim N(70, 9)$ are 2 independent r
...
9772
b) $P(120 \le X + Y \le 135) = P\left(\frac{120-130}{5} \le Z \le \frac{135-130}{5}\right) = \Phi(1) - \Phi(-2)$ = 0
...
0228  0
...
6  PZ  0
...
6)  0
...
4)  (1
...
6554 - 0
...
6006
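Sums and differences of independent normal variables can be checked with `statistics.NormalDist`, which treats the operands of `+` and `-` as independent and adds their variances. The sketch below assumes $X \sim N(60, 16)$ and $Y \sim N(70, 9)$; note that `NormalDist` takes the standard deviation, not the variance.

```python
from statistics import NormalDist

X = NormalDist(60, 4)   # variance 16, so standard deviation 4
Y = NormalDist(70, 3)   # variance 9, so standard deviation 3

S = X + Y               # sum of independent normals: N(130, 25)
D = Y - X               # difference: N(10, 25)

p_sum = S.cdf(135) - S.cdf(120)   # P(120 <= X + Y <= 135)
print(S.mean, S.stdev, round(p_sum, 4))
print(D.mean, D.stdev)
```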

Exercise
1
...
v, Find (a) P X Y  142
(b) P134  X  Y  166 (c) PY  X  4 (d) P12  Y  X  24
2
...
Njoroge walks to the library to read a newspaper
...

Total time spent in the library is also normally distributed with mean 25 minutes and
standard deviation 12 minutes
...

b) he spends more time walking than in the library

4
...
A common estimator for μ is the sample mean $\bar{x}$
...
Since the particular individuals included in our sample are
n i 1
random, we would observe a different value of x if we repeated the procedure
...
Its value is determined partly by which people are randomly chosen
to be in the sample
...

Many possible samples, many possible $\bar{x}$'s

[Figure: histograms of ten different random samples, each panel titled with its own sample mean]

We only see one!
We will have a better idea of how good our one estimate is if we have good knowledge of how $\bar{x}$ behaves; that is, if we know the probability distribution of $\bar{x}$
...
6
...
The mean of the sample means will be the mean of the population
2
...

3
...

4
...

5
...
Some books define sufficiently large as at least 30 and others as at least 25
...
At their weekly
partners meeting each reported the number of hours they charged clients for their services last
week
...
F26 (eg, Mr
...
i.e. there are $\binom{6}{2} = 15$ possible samples
...
g
...
If we divide individual frequencies by total
frequency (ie 15) we get “relative frequency” or probability
...
distribution
...
066667
...

Note the shape is similar to Normal distribution

The sampling distribution is simply this probability distribution defined over all possible
samples of size n from the population of size N
...
g
...
g
...
Then the sampling distribution can only be imagined
...
Now the random variable is $\bar{x}$; it is no longer just X
...

Sampling Distribution of the Sample Means:- Distribution obtained by using the means
computed from random samples of a specific size
...

Standard Error of the Mean:- The standard deviation of the sampling distribution of the sample means
...

4
...
2 The Mean and Standard Deviation of $\bar{x}$
What are the mean and standard deviation of $\bar{x}$?
Let’s be more specific about what we mean by a sample of size n
...
, X n with common mean  and common standard deviation 
...
6
...
, X n are normally distributed, then x is also normally distributed
...
If X 1 , X 2 ,
...

In brief if X 1 , X 2 ,
...
Then for a sufficiently large n, the sampling distribution of $\bar{X}$ is approximately Normal with mean $\mu$ and variance $\sigma^2/n$
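A small simulation makes the statement about the sampling distribution concrete. The sketch below draws repeated samples from a skewed population and looks at the mean and variance of the resulting sample means. The exponential population, the sample size of 30, the number of repetitions and the seed are all arbitrary choices made for illustration.

```python
import random
from statistics import fmean, pvariance

def sample_means(draw, n, reps, seed=0):
    """Means of `reps` independent samples of size n drawn with `draw(rng)`."""
    rng = random.Random(seed)
    return [fmean(draw(rng) for _ in range(n)) for _ in range(reps)]

# Hypothetical skewed population: exponential with mean 1 and variance 1
means = sample_means(lambda rng: rng.expovariate(1.0), n=30, reps=5000)
print(round(fmean(means), 3), round(pvariance(means), 4))  # near 1 and 1/30
```

The mean of the sample means should sit close to the population mean, and their variance close to $\sigma^2/n$, even though the population itself is far from normal.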
...
A moron is a person with IQ less than 80
...
Let idiot be defined as one with an IQ less than 90
...
(Hint this random variable is for a single person X)
If a sample of 25 students is available, what is the probability that the average IQ exceeds 105? What is the probability that the average IQ exceeds 115 (Hint this random variable is for an average over 25 persons or $\bar{X}$)
Solution





IQ = $X \sim N(110, 10^2)$, and therefore for a sample of 25 people average IQ = $\bar{X} \sim N(110, 4)$
The probability that a randomly chosen person is a moron is given by
110
  (3)  0
...
0228
P( X  90)  PZ  9010
The probability that the average IQ exceeds 105 is $P(\bar{X} > 105)$
The random variable under consideration here is the average
...
Standard deviation of the sampling distribution =
$\sigma/\sqrt{n} = 10/\sqrt{25} = 2$
...
5)  (2
...
9938

due to symmetry

We now find the probability that the average IQ exceeds 115, i.e.

$P(\bar{X} > 115) = P\left(Z > \frac{115-110}{2}\right)$ = P(Z > 2
...
5) = 0
...
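The two probabilities for the average IQ of 25 students can be reproduced as follows. The sketch assumes the population values read from the solution, mean 110 and standard deviation 10, so that the sample mean has standard deviation 10/√25 = 2.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 110, 10, 25               # assumed population values and sample size
xbar = NormalDist(mu, sigma / sqrt(n))   # sampling distribution of the mean

p_above_105 = 1 - xbar.cdf(105)          # P(Z > -2.5)
p_above_115 = 1 - xbar.cdf(115)          # P(Z > 2.5)
print(round(p_above_105, 4), round(p_above_115, 4))  # 0.9938 0.0062
```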
If a random
sample of 50 employees is taken, what is the probability that their average salary is;
a) less than $45,000?
b) between $45,000 and $65,000?
c) more than $70,000
2) Library usually has 13% of its books checked out
...
ANS= 0
...
02 cm
...
98 and 5
...
96 and 5
...
Find the probability that in a
random sample of 4 instrument produced by this machine, the average length of life
a) less than 10
...
b) between 11and 13 months
...
What is the probability that a
car can be assembled at this plant in a period of time
a) less than 19
...
1 Introduction
In research, one always has some fixed ideas about certain population parameters based on, say, prior experiments, surveys or experience
...
There is
therefore a need to ascertain whether these ideas /claims are correct or not
...

We then decide whether our sample observations (statistic) have come from a postulated
population or not
...
68
On the basis of observation data, one then performs a test to decide whether the postulated
hypothesis should be accepted or not
...

Null Hypothesis ( denoted H0 ): Statement of zero or no change and is the hypothesis which
is to be actually tested for acceptance or rejection
...
If the original claim does not include equality (<, not
equal, >) then the null hypothesis is the complement of the original claim
...
The decision is based on the null hypothesis
...
S
...
it Challenges the status quo
...
The type of test (left, right, or
two-tail) is based on the alternative hypothesis
...
2 The Hypothesis Testing Process
Claim: The population mean age is 50
...
Suppose the sample mean age was $\bar{x} = 20$
...
If the null hypothesis were true, the probability of getting such a different sample mean would be very small, so you reject the null hypothesis
...

[Figure: sampling distribution of the sample mean, annotated "It is unlikely that you would get a sample mean of this value if in fact this were the true population mean"]


If the sample mean is close to the assumed population mean, the null hypothesis is not
rejected
...

How far is “far enough” to reject H0? The critical value of a test statistic creates a “line in the
sand” for decision making -- it answers the question of how far is far enough
...
2
...
These errors are of two types:
Type I error; Mistake of rejecting the null hypothesis when it is true (saying false when
true)
...

The probability of a Type I error (denoted α) is called the level of significance of the test and is set by the researcher in advance
...
05 and α = 0
...
If no level of
significance is given, use α = 0
...
The level of significance is the complement of the level of
confidence in estimation
...
The probability of a Type II Error is denoted by β
Remarks

1) The confidence coefficient (1-α) is the probability of not rejecting H0 when it is true
...

3) The power of a statistical test (1-β) is the probability of rejecting H0 when it is false
Possible Hypothesis Test Outcomes

Decision           | Actual situation: H0 true      | Actual situation: H0 false
Do Not Reject H0   | No Error (Probability 1 - α)   | Type II Error (Probability β)
Reject H0          | Type I Error (Probability α)   | No Error (Probability 1 - β)

5
...
1 Relationship between Type I & Type II Error
Type I and Type II errors cannot happen at the same time
- A Type I error can only occur if H0 is true
- A Type II error can only occur if H0 is false
If the Type I error probability (α) increases, then the Type II error probability (β) decreases, for a fixed sample size
5
...
3 Level of Significance and the Rejection Region
Critical region: Set of all values which would cause us to reject H0
Critical value(s): The value(s) which separate the critical region from the non-critical region
...

H0: μ = 3
H1: μ ≠ 3

This is a two-tail test because there is a rejection region in both tails
Test statistic: Sample statistic used to decide whether to reject or fail to reject the null
hypothesis
Probability Value (P-value): The probability of getting the results obtained if the null
hypothesis is true
...
If the level of significance is the area beyond the critical values,
then the probability value is the area beyond the test statistic
...
It is either "reject the null hypothesis"
or "fail to reject the null hypothesis"
...


Conclusion: A statement which indicates the level of evidence (sufficient or insufficient), at
what level of significance, and whether the original claim is rejected (null) or supported
(alternative)
...
2
...

Here are the steps to performing hypothesis testing
a) Write the null and alternative hypothesis
...

c) Specify the level of significance, α, and find the critical value using the tables
d) Compute the test statistic
e) Make a decision to reject or fail to reject the null hypothesis
...
If the given claim
contains equality, or a statement of no change from the given or accepted condition, then it is
the null hypothesis, otherwise, if it represents change, it is the alternative hypothesis
...
s
...
v
...
s
...
v
...
s
...
v
...
s
...
v
...
If
the test statistic falls into the rejection region, reject the null hypothesis
...
Conclusions
are based on the original claim, which may be the null or alternative hypotheses
...
3 Approaches to Hypothesis Testing
There are three approaches to hypothesis testing, namely the Classical Approach, the p-value approach and the confidence interval approach
5
...
1 The Classical Approach
The Classical Approach to hypothesis testing is to compare a test statistic and a critical value
...

The Classical Approach also has three different decision rules, depending on whether it is a
left tail, right tail, or two tail test
...

5
...
2 P-Value Approach
The P-Value Approach, short for Probability Value, approaches hypothesis testing in a different manner
...

The level of significance (alpha) is the area in the critical region
...

The p-value is the area to the right or left of the test statistic
...

If the test statistic is in the critical region, then the p-value will be less than the level of
significance
...
This rule
always holds
...

You will fail to reject the null hypothesis if the p-value is greater than or equal to the level of
significance
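The p-value decision rule can be sketched for a z statistic as follows. This is a minimal illustration under the usual convention that a two-tail p-value doubles the tail area beyond the test statistic; the function names and the `tail` argument are hypothetical.

```python
from statistics import NormalDist

def p_value(z_stat, tail="two"):
    """P-value of a z statistic; tail is 'left', 'right' or 'two'."""
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z_stat)
    if tail == "right":
        return 1 - std_normal.cdf(z_stat)
    return 2 * (1 - std_normal.cdf(abs(z_stat)))

def decide(p, alpha=0.05):
    """Reject H0 only when the p-value is below the level of significance."""
    return "reject H0" if p < alpha else "fail to reject H0"

p = p_value(-2.0)
print(round(p, 4), decide(p))
```

With this rule a p-value of 0.06 leads to "fail to reject H0" at the 0.05 level, since 0.06 is not below 0.05.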
...
However, many statistical packages will give the p-value but not the critical value
...

Another benefit of the p-value is that the statistician immediately knows at what level the
testing becomes significant
...
06 would be rejected at an 0
...
05 level of significance
...

Here are a couple of statements to help you keep the level of significance and the probability value straight
...
It does not depend on
the sample at all
...

It is the probability at which we consider something unusual
...
It depends on the sample
...
It is the probability of getting the results we obtained if
the null hypothesis is true
...
3
...

a) If the hypothesized value of the parameter lies within the confidence interval with a 1 − alpha level of confidence, then the decision at an alpha level of significance is to fail to reject the null hypothesis
...

However, it has a couple of problems
...

 It requires that you compute the confidence interval first
...

5
...

This is true not only for means, but all of the testing we're going to be doing
...

The test statistic and the critical values depend on whether σ is known or unknown
...
4
...
The test statistic is
the standard formula you've seen before
...

Example Test at the 5% level the claim that the true mean number of TV sets in US homes is equal to
3
...
84 (σ = 0
...

Determine the critical values
- For α = 0
...
96
Compute the test statistic $Z_{STAT}$

so the test statistic is: $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} =$

2
...
8/ 100

 2
...
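The classical two-tail z test can be sketched in code. The numbers in the usage line (sample mean 2.84, σ = 0.8, n = 100, hypothesized mean 3) are read from the partly truncated example and should be treated as a reconstruction; `z_test_two_tail` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tail(xbar, mu0, sigma, n, alpha=0.05):
    """Return (z statistic, critical value, reject?) for H0: mu = mu0."""
    z_stat = (xbar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z_stat, z_crit, abs(z_stat) > z_crit

# Values as they appear to read in the example (a reconstruction)
z_stat, z_crit, reject = z_test_two_tail(2.84, 3, 0.8, 100)
print(round(z_stat, 2), round(z_crit, 2), reject)
```

Under these assumed values the statistic falls just beyond the lower critical value, so H0 would be rejected at the 5% level.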
A simple random sample of 10 people from a certain population has a mean age of 27
...
Let α =
...

2
...
The principal of the school thinks that the
average IQ of students at Bon Air is at least 110
...
Among the sampled students, the average IQ is
108
...
Central bank believes that if consumer confidence is too high, the economy risks over
heating
...
In either case,
the bank may choose to intervene by altering interest rates
...
We may assume the measure is normally distributed with standard
deviation 10
...
Which returned a sample mean of 54
for the index
...
05
...
A manager will switch to a new technology if the production process exceeds 80 units per
hour
...
Past experience has shown that the
standard deviation is 8
...
4
...
The test statistic is very similar to that for the z-score, except that sigma has been replaced by s and z has been replaced by t
...
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
The critical value is obtained from the t-table
...
Ten 100kg bags are examined
...
Is there reason to believe that the machine is defective at 5% level
of significance?

Solution
Hypothesis H0: μ = 12

H1: μ ≠ 12

(This is a two-tail test)

σ is unknown so this is a t test
...
05 and 9 degrees freedom
t9, 0
...
262 ie reject H0: μ = 12 if tc  2
...
5 and s  1
...
5  12

 1
...
0801 / 10

Decision since tc  1
...
262 , we fail to reject H0 and conclude that the machine is not
defective
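The t statistic for this kind of test is a one-line computation. In the sketch below the sample mean 11.5 and standard deviation 1.0801 are reconstructed from the truncated example, and the critical value 2.262 is taken from the t-table for 9 degrees of freedom rather than computed, since the Python standard library has no t distribution.

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    """t = (xbar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (xbar - mu0) / (s / sqrt(n))

T_CRIT_9_DF = 2.262   # two-tail critical value at the 5% level, 9 df (t-table)

# xbar = 11.5 and s = 1.0801 are reconstructed from the truncated example
t_c = t_statistic(11.5, 12, 1.0801, 10)
reject = abs(t_c) > T_CRIT_9_DF
print(round(t_c, 3), reject)
```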
...

On the basis of this data, test whether the average profit is greater than 30M KSH at 1% level
of significance
Solution
Hypothesis H0: μ = 30 H1: μ > 30 (This is a 1-tail test)
σ is unknown so this is a t test
...
01 and 9 degrees freedom

t9, 0
...
82, i.e. reject H0: μ = 30 if tc > 2
...
415 and s  3
...
415  30

 0
...
6601 / 10

Decision since tc  0
...
82, we do not reject H0 and conclude that there is insufficient evidence that the average profit is greater than 30M KSH
...
Identify the critical t value for each of the following tests:
a
...
05 and 11 degrees of freedom
b
...
01 and n=17
2
...
0 and s  2 Do the following hypothesis tests
...
H0: μ =8
...
7 at =0
...
7 H1: μ ≠ 8
...
05
3
...
6 degrees
Fahrenheit
...
The body
temperatures of n = 130 healthy adults were measured (half male and half female)
...
249 with a standard deviation s = 0
...
Do these statistics contradict the belief that the average body
temperature is 98
...
A study is to be done to determine if the cognitive ability of children living near a lead
smelter is negatively impacted by increased exposure to lead
...
From a pilot study, the mean and standard deviation
were estimated to be $\bar{x} = 89$ and s = 14
...
Test at 5% level whether there is a
negative impact
...
The average cost of a hotel room in New York is said to be $168 per night
...
5
and s  $15
...
Test the appropriate hypotheses at α = 0
...

6
...
1 21
...
7 12
...
8 11
...
2 11
...
6 10
...
2
An earlier study reported that the mean shoot length is 15cm
...

7
...
2
...
Can we conclude that the BMI is not 35? Let α =
...

subject | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14
BMI     | 23 | 25 | 21 | 37 | 39 | 21 | 23 | 24 | 32 | 57 | 23 | 26 | 31 | 45
8
...
How should a hypothesis test be stated? If there is
strong evidence that the mean return on the investment is below 10% this will give a
cautionary warning to a potential investor
...
82 x and s = 2
...
We know the distance that an athlete can jump is normally distributed but we do not
know the standard deviation
...
48 7
...
97 5
...
48 7
...
49
7
...
51 5
...
13 6
...
19 6
...
93 Test whether these values are consistent with
a mean jump length of 7m
...
The manufacturing process should give a weight of 20 ounces
...
356
(ounces) and s = 0
