Descriptive Statistical Tools
Outline of Discussion
Lesson Proper
◦ Descriptive Statistics
Data Organization
Data Analysis
Statistical Measures
Case Study
Descriptive Statistics
What is Descriptive Statistics?
◦ Also known as deductive statistics
◦ Deals with gathering, classification, and
presentation of data
◦ Summarizes values to describe group
characteristics of data
Descriptive Statistics
Data Presentation
◦ Presenting data in tabular or graphical form is
not enough to get all the relevant information
◦ Data must be organized and analysis must be
readily made
Data organization tools
Frequency distribution table, histogram, ogive
Data analysis tools
Stem-and-leaf diagram, boxplot, time-series plot, probability
plot, scatter plot
Descriptive Statistics
Data Presentation
◦ The common statistics included may also not
be adequate to describe data
◦ There are many more measures used
Measures of central tendency
Mean, trimmed mean, median, mode
Measures of dispersion
Standard deviation, variance, range, interquartile range, mean
absolute deviation, coefficient of dispersion
Measures on individual data points
Standard deviation unit, standard score
Measures of location
Percentile
Measures of skewness and kurtosis
Coefficient of skewness, coefficient of peakedness
Measure of linear relationship
Correlation coefficient, Pearson’s r coefficient
Descriptive Statistics
Data Organization
◦ Frequency Distribution Table
How to group data
Original data presentation
Number of cells
Cell width
Cell boundaries
Long Exam 1 Scores
[Table: Long Exam 1 scores and their frequency distribution table]
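The grouping steps above (number of cells, cell width, cell boundaries) can be sketched in Python. This is a minimal illustration, not the lecturer's procedure: the function name `frequency_table` is hypothetical, the starting boundary 42, width 5, and 8 cells are taken from the cell boundaries used for the ogive, and the scores are the rounded values listed with the probability plot. Because those scores are rounded, a few counts can differ from the frequencies tabulated in the notes.

```python
# Rounded Long Exam 1 scores as listed in the probability plot slide (n = 53).
scores = [
    71, 49, 70, 46, 65, 61, 70, 61, 53, 49, 59, 51, 46, 63, 63, 46, 60, 55,
    43, 68, 57, 50, 76, 65, 64, 63, 68, 76, 52, 76, 67, 58, 55, 67, 66, 52,
    57, 73, 62, 59, 72, 60, 43, 55, 60, 58, 71, 56, 79, 53, 63, 78, 56,
]


def frequency_table(data, start, width, cells):
    """Return (lower, upper, frequency, relative frequency) for each cell.

    Cells are [lower, upper), except the last cell, which also includes
    its upper boundary.
    """
    counts = [0] * cells
    upper_limit = start + width * cells
    for x in data:
        if not start <= x <= upper_limit:
            raise ValueError(f"{x} lies outside the cell boundaries")
        # A value equal to the final upper boundary goes to the last cell.
        index = min(int((x - start) // width), cells - 1)
        counts[index] += 1
    rows = []
    for i, f in enumerate(counts):
        lower = start + i * width
        rows.append((lower, lower + width, f, f / len(data)))
    return rows


rows = frequency_table(scores, start=42, width=5, cells=8)
for lower, upper, f, rel in rows:
    print(f"{lower}-{upper}: f = {f}, rel f = {rel:.4f}")
```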
Descriptive Statistics
Data Organization
◦ Cumulative Frequency Distribution Table
Can answer the following
...
[Table: cumulative frequency distribution of the Long Exam 1 scores]
Descriptive Statistics
Data Organization
◦ Ogive
Line graph of CFD
For ≤ CFD, x-axis is upper cell boundaries
For ≥ CFD, x-axis is lower cell boundaries
Y-axis is cumulative frequency
Y-axis can also be relative frequency
Cumulative Frequency Distribution Table
Cell Boundaries   fi   xi     rel fi   ≤ CFD   ≥ CFD   rel CFD
[42 - 47)         5    44.5   0.0943   5       53      0.0943
[47 - 52)         6    49.5   0.1132   11      48      0.2075
[52 - 57)         8    54.5   0.1509   19      42      0.3585
[57 - 62)         11   59.5   0.2075   30      34      0.566
[62 - 67)         10   64.5   0.1887   40      23      0.7547
[67 - 72)         7    69.5   0.1321   47      13      0.8868
[72 - 77)         4    74.5   0.0755   51      6       0.9623
[77 - 82]         2    79.5   0.0377   53      2       1
[Figure: Ogive for relative CFD]
[Figure: Ogive for ≥ CFD; x-axis: lower cell boundaries (42 to 77), y-axis: cumulative frequency (0 to 60)]
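As a check on the cumulative columns, the following sketch derives rel fi, ≤ CFD, ≥ CFD, and rel CFD from the cell frequencies 5, 6, 8, 11, 10, 7, 4, 2 given in the notes, then pairs them with the boundaries used on the ogive x-axis. The variable names are illustrative only.

```python
from itertools import accumulate

# Cell boundaries and frequencies as given in the notes.
boundaries = [42, 47, 52, 57, 62, 67, 72, 77, 82]
f = [5, 6, 8, 11, 10, 7, 4, 2]
n = sum(f)  # 53 data points

rel_f = [round(fi / n, 4) for fi in f]
le_cfd = list(accumulate(f))                    # running total from the first cell
ge_cfd = list(accumulate(reversed(f)))[::-1]    # running total from the last cell
rel_cfd = [round(c / n, 4) for c in le_cfd]

# Ogive coordinates: "<=" uses upper boundaries, ">=" uses lower boundaries.
le_ogive = list(zip(boundaries[1:], le_cfd))
ge_ogive = list(zip(boundaries[:-1], ge_cfd))

print(le_ogive)
print(ge_ogive)
```

Reading the ≤ ogive at x = 62, for example, gives 30 scores below 62, while the ≥ ogive at the same x gives 23 scores of 62 or more.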
...
Get quartile 3 (75th percentile)
V Quartile 1 = 68, W Quartile 1 = 68
...
5
V Quartile 3 = 68, W Quartile 3 = 68
...
Y-axis: value of variable
Descriptive Statistics
Data Analysis
◦ Time-series Plot
How to interpret a time-series plot
Example: Average weight of manufactured 100g potato chip packs per day
Mean is around 100g
No pattern, meaning the variation is random
Considering acceptance limits, the few packs weighing less than 97g or more than 103g are rejected
[Figure: Time-series Plot; x-axis: Day (1 to 29), y-axis: Average Weight (92 to 106)]
Descriptive Statistics
Data Analysis
◦ Time-series Plot
How to interpret a time-series plot
Example: Average weight of manufactured 100g potato chip
packs per day of another company
Mean is around 100g (at first)
Downward trend (something might be wrong with manufacturing)
Considering acceptance limits, many are being rejected
[Figure: Time-series Plot; x-axis: Day (1 to 29), y-axis: Average Weight (88 to 104)]
Descriptive Statistics
Data Analysis
◦ Time-series Plot
How to interpret a time-series plot
Example: Monthly sales of jackets of an apparel store
Mean is around 80 units
Cyclic pattern (there might be a reason to this cycle)
Monthly sales of jackets increase when nearing the 12th and 24th months
(because people buy more jackets during the "Ber" months, September to December)
[Figure: Time-series Plot; x-axis: Month (1 to 29), y-axis: Sales (0 to 120)]
Descriptive Statistics
Data Analysis
◦ Time-series Plot
How to interpret a time-series plot
Example: Price of ABS-CBN shares over 20 years
Descriptive Statistics
Data Analysis
◦ Time-series Plot
How to interpret a time-series plot
Example: Price of TEL (PLDT) shares over 5 years
Descriptive Statistics
Data Analysis
◦ Probability Plot
How to construct a probability plot
Normal probability plot is most commonly used
Long Exam 1 Scores (rounded off)
71, 49, 70, 46, 65, 61, 70, 61, 53, 49, 59, 51, 46, 63, 63, 46, 60, 55,
43, 68, 57, 50, 76, 65, 64, 63, 68, 76, 52, 76, 67, 58, 55, 67, 66, 52,
57, 73, 62, 59, 72, 60, 43, 55, 60, 58, 71, 56, 79, 53, 63, 78, 56
Original data in tabular form
Data is sorted in increasing order
Each data point is plotted against:
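The construction steps can be sketched as follows. The quantity each point is plotted against is not legible in these notes, so this sketch assumes one common convention: the i-th smallest value is paired with the standard normal quantile at cumulative probability (i - 0.5)/n. Other plotting positions exist, and the course may use a different one.

```python
from statistics import NormalDist

# Rounded Long Exam 1 scores from the notes (n = 53).
scores = [
    71, 49, 70, 46, 65, 61, 70, 61, 53, 49, 59, 51, 46, 63, 63, 46, 60, 55,
    43, 68, 57, 50, 76, 65, 64, 63, 68, 76, 52, 76, 67, 58, 55, 67, 66, 52,
    57, 73, 62, 59, 72, 60, 43, 55, 60, 58, 71, 56, 79, 53, 63, 78, 56,
]

ordered = sorted(scores)  # step 1: sort in increasing order
n = len(ordered)

# ASSUMPTION: plotting position (i - 0.5) / n for the i-th smallest value.
positions = [(i - 0.5) / n for i in range(1, n + 1)]

# Step 2: expected standard normal quantile for each position.
z = [NormalDist().inv_cdf(p) for p in positions]

# Step 3: the points to plot are (theoretical quantile, observed value).
points = list(zip(z, ordered))
for zi, xi in points[:3]:
    print(f"z = {zi:.3f}, score = {xi}")
```

If the scores are close to normal, these points fall near a straight line when plotted.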
Descriptive Statistics
Data Analysis
◦ Probability Plot
How to interpret a probability plot
If the points lie approximately on a straight line, the assumed distribution fits the data
Since this is a normal probability plot, the distribution is approximately normal
Descriptive Statistics
Data Analysis
◦ Probability Plot
How to interpret a probability plot
If the points curve downward, the distribution is positively
skewed, because both the small and the large values are larger than
expected
Descriptive Statistics
Data Analysis
◦ Scatter Plot
How to construct a scatter plot
The values of two variables are plotted against each other
Descriptive Statistics
Data Analysis
◦ Scatter Plot
How to interpret a scatter plot
If the points form a diagonal line, variables are correlated
Perfect diagonal line means a correlation of one in magnitude (+1 if rising, -1 if falling)
Descriptive Statistics
Data Analysis
◦ Scatter Plot
How to interpret a scatter plot
If the points form a diagonal line, variables are correlated
Horizontal pattern (y does not change with x) means no linear relationship, so correlation is taken as zero
Descriptive Statistics
Data Analysis
◦ Scatter Plot
How to interpret a scatter plot
Later, correlation measures such as Pearson’s r coefficient will
be discussed
Descriptive Statistics
Statistical Measures
◦ Measures of Central Tendency
Describes the tendency of sample data to cluster
around a particular value
Mean
Median
Mode
Descriptive Statistics
Statistical Measures
◦ Measures of Central Tendency
Mean
First moment about the origin
Average value of data
k% trimmed mean
Mean after eliminating the (k/2)% highest and (k/2)% lowest
data points
Less affected by extreme values
Descriptive Statistics
Statistical Measures
◦ Measures of Central Tendency
Median
Divides the data set into two equal halves
Less affected by extreme values (depends on the order of values,
not on their magnitude or “weight”)
50th percentile (Quartile 2)
Descriptive Statistics
Statistical Measures
◦ Measures of Central Tendency
Mode
Most frequently occurring data point
Unimodal distribution: one mode/peak
Bimodal distribution: two modes/peaks
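The three measures, plus the trimmed mean, can be compared on a small data set. The sketch below borrows the ten sleep-hour values from the quiz later in these notes. The helper `trimmed_mean` is hypothetical, and it assumes the number of points cut from each end is rounded down when (k/2)% of n is not a whole number.

```python
from statistics import mean, median, multimode

# Sleep hours per day, from the quiz question in these notes.
sleep = [9, 7, 8, 6, 11, 6, 5, 6, 9, 4]


def trimmed_mean(data, k):
    """k% trimmed mean: drop the (k/2)% lowest and (k/2)% highest points.

    ASSUMPTION: the count dropped from each end is rounded down.
    """
    ordered = sorted(data)
    cut = int(len(ordered) * (k / 2) / 100)
    kept = ordered[cut:len(ordered) - cut]
    return mean(kept)


print("mean:", mean(sleep))                         # average of all values
print("20% trimmed mean:", trimmed_mean(sleep, 20)) # drops the 4 and the 11
print("median:", median(sleep))                     # middle of the sorted data
print("mode(s):", multimode(sleep))                 # most frequent value(s)
```

Here the 20% trimmed mean discards one value from each end (4 and 11), which is why it moves slightly toward the median.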
Descriptive Statistics
Statistical Measures
◦ Measures of Variability/Dispersion
Describes the variability or scattering of data
Used to gauge the reliability or accuracy of averages
(e.g., lower variability means data are closer to the average)
Range
Interquartile range
Standard deviation
Variance
Mean absolute deviation
Coefficient of dispersion
Descriptive Statistics
Statistical Measures
◦ Measures of Variability/Dispersion
Range
Difference between smallest and largest value
Interquartile Range
Difference between Quartile 3 (75th percentile) and
Quartile 1 (25th percentile)
Descriptive Statistics
Statistical Measures
◦ Measures of Variability/Dispersion
Example:
Determine the 64th percentile, range, and interquartile range
of the following data set
Quiz Question
Number of hours of sleep per day
9, 7, 8, 6, 11, 6, 5, 6, 9, 4
Hint: Arrange first in increasing order: 4, 5, 6, 6, 6, 7, 8, 9, 9, 11
Hint: position of the Xth percentile = (X/100)*(n+1), range = max – min, interquartile
range = 75th percentile – 25th percentile
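One way to work the quiz is sketched below. It reads the hint as giving the position of the Xth percentile in the sorted data, (X/100)(n + 1), and assumes linear interpolation between neighbouring values when that position is not a whole number. The notes do not state an interpolation rule, so treat that part as an assumption; the function name `percentile` is illustrative.

```python
def percentile(data, x):
    """Xth percentile using the (n + 1) position rule.

    ASSUMPTION: fractional positions are resolved by linear interpolation
    between the two neighbouring sorted values.
    """
    ordered = sorted(data)
    n = len(ordered)
    position = (x / 100) * (n + 1)   # 1-based position in the sorted data
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


sleep = [9, 7, 8, 6, 11, 6, 5, 6, 9, 4]

p64 = percentile(sleep, 64)           # position 7.04, between 8 and 9
data_range = max(sleep) - min(sleep)  # 11 - 4
q1 = percentile(sleep, 25)            # position 2.75, between 5 and 6
q3 = percentile(sleep, 75)            # position 8.25, between 9 and 9
iqr = q3 - q1

print(p64, data_range, q1, q3, iqr)
```

Under these assumptions the 64th percentile is about 8.04 hours, the range is 7 hours, and the interquartile range is 9 - 5.75 = 3.25 hours.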
Descriptive Statistics
Statistical Measures
◦ Measures of Variability/Dispersion
Variance
Second moment about the mean
Average squared deviation from the mean
Never negative (zero only when all data points are equal)
Sum the squared differences of each data point from the mean, then divide
by the total number of data points minus one
Unit is the square of the variable's unit
Descriptive Statistics
Statistical Measures
◦ Measures of Variability/Dispersion
Standard deviation
Most commonly used measure of variability/dispersion
Typical deviation of data from the mean
Never negative (zero only when all data points are equal)
Square root of variance
Unit is same as that of the variable
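The two definitions can be checked numerically. The sketch below follows the description in the notes (sum of squared differences from the mean, divided by n - 1) and compares the result with Python's `statistics.variance`, which uses the same sample formula. The sleep-hour data come from the quiz in these notes; the function name `sample_variance` is illustrative.

```python
from math import sqrt
import statistics

sleep = [9, 7, 8, 6, 11, 6, 5, 6, 9, 4]


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** 2 for x in data) / (n - 1)


var = sample_variance(sleep)   # unit: hours squared
sd = sqrt(var)                 # unit: hours, same as the data

print(f"variance = {var:.4f}, standard deviation = {sd:.4f}")
print("matches statistics.variance:",
      abs(var - statistics.variance(sleep)) < 1e-9)
```

For this data the squared deviations sum to 40.9, so the variance is 40.9/9 and the standard deviation is its square root.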
Descriptive Statistics
Statistical Measures
◦ Relative measure of Variability/Dispersion
Coefficient of dispersion
Used to compare different populations
The lesser the value, the more consistent the data
s is the sample standard deviation
xbar is the sample mean
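The formula itself did not survive in these notes. Given the two symbols defined, the sketch below assumes the usual form, the sample standard deviation divided by the sample mean (s / xbar), sometimes quoted as a percentage. The first data set is the sleep-hour quiz data from these notes; the second is a made-up set with the same mean, included only to show the comparison.

```python
from statistics import mean, stdev


def coefficient_of_dispersion(data):
    """ASSUMED form: sample standard deviation divided by sample mean."""
    return stdev(data) / mean(data)


set_1 = [9, 7, 8, 6, 11, 6, 5, 6, 9, 4]   # sleep hours from the notes
set_2 = [7, 7, 8, 7, 6, 7, 8, 7, 7, 7]    # HYPOTHETICAL data, same mean of 7.1

cd_1 = coefficient_of_dispersion(set_1)
cd_2 = coefficient_of_dispersion(set_2)

print(f"set 1: {cd_1:.4f}, set 2: {cd_2:.4f}")
print("more consistent:", "set 2" if cd_2 < cd_1 else "set 1")
```

Because the result is unitless, it can also compare groups measured on different scales, which the standard deviation alone cannot do.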
Descriptive Statistics
Statistical Measures
◦ Relative measure of Variability/Dispersion
Example:
Determine which of two friends, A and B, has more
consistent sleeping hours
Quiz Question
Sleeping hours
Friend A
Friend B
xbar
9
6
...
5
Hint: Solve for each friend’s coefficient of dispersion (also called coefficient of variability)
Descriptive Statistics
Statistical Measures
◦ Measures on Individual Data Points
Standard deviation unit
Distance of a point from the mean, measured in standard deviations
The smaller the absolute value, the closer the point is to the mean
s is the sample standard deviation
xi is the data point of interest
xbar is the sample mean (or any point of reference)
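The formula is not legible in these notes. From the symbols defined, the sketch below assumes the standard score form (xi - xbar) / s and applies it to the sleep-hour quiz data from these notes. The function name `standard_score` is illustrative.

```python
from statistics import mean, stdev


def standard_score(xi, data):
    """ASSUMED form: (xi - xbar) / s, with s the sample standard deviation."""
    return (xi - mean(data)) / stdev(data)


sleep = [9, 7, 8, 6, 11, 6, 5, 6, 9, 4]

z_high = standard_score(11, sleep)   # the longest night of sleep
z_low = standard_score(4, sleep)     # the shortest night of sleep
z_near = standard_score(7, sleep)    # a value close to the mean of 7.1

print(f"{z_high:.2f}, {z_low:.2f}, {z_near:.2f}")
```

A positive score means the point lies above the mean and a negative score means it lies below; 11 hours sits a little under two standard deviations above the mean.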
Descriptive Statistics
Statistical Measures
◦ Measure of Symmetry
Coefficient of skewness
Determines the symmetry of a distribution
Third moment about the mean
xbar is the sample mean
n is the total number of data points
s is the sample standard deviation
a3 = 0; symmetric data set
a3 < 0; data set is skewed to the left
a3 > 0; data set is skewed to the right
Descriptive Statistics
Statistical Measures
◦ Measure of Symmetry
Coefficient of skewness
Descriptive Statistics
Statistical Measures
◦ Measure of Kurtosis
Coefficient of peakedness
Describes how sharply peaked a unimodal distribution is
xbar is the sample mean
n is the total number of data points
s is the sample standard deviation
a4 = 3; data is mesokurtic (normal)
a4 > 3; data is leptokurtic (high peakedness)
a4 < 3; data is platykurtic (low peakedness)
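The formulas for a3 and a4 are not legible in these notes. The sketch below assumes the common moment-ratio forms, a3 = Σ(xi - xbar)^3 / (n s^3) and a4 = Σ(xi - xbar)^4 / (n s^4), using the sample standard deviation s as the notes define it. Textbooks differ on whether s uses n or n - 1 and on small-sample corrections, so values can differ slightly from other software. The function name `moment_coefficient` is illustrative.

```python
from statistics import mean, stdev


def moment_coefficient(data, order):
    """ASSUMED form: sum((x - xbar)**order) / (n * s**order)."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 divisor)
    return sum((x - xbar) ** order for x in data) / (n * s ** order)


def describe_kurtosis(a4):
    """Classify a4 against the normal-distribution reference value of 3."""
    if a4 > 3:
        return "leptokurtic"
    if a4 < 3:
        return "platykurtic"
    return "mesokurtic"


symmetric = [1, 2, 3, 4, 5]
sleep = [9, 7, 8, 6, 11, 6, 5, 6, 9, 4]   # quiz data from the notes

a3_symmetric = moment_coefficient(symmetric, 3)
a3_sleep = moment_coefficient(sleep, 3)
a4_sleep = moment_coefficient(sleep, 4)

print("a3 (symmetric):", a3_symmetric)
print("a3 (sleep):", round(a3_sleep, 3))
print("a4 (sleep):", round(a4_sleep, 3), describe_kurtosis(a4_sleep))
```

The symmetric set gives a3 = 0, while the sleep data give a positive a3 because the single 11-hour value stretches the right tail.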
Descriptive Statistics
Statistical Measures
◦ Measure of Kurtosis
Coefficient of peakedness
Descriptive Statistics
Statistical Measures
◦ Measure of Linear Relationship
Correlation coefficient
Determines the linearity between variables of a population
Pearson’s r coefficient
Determines the linearity between variables of a sample
Nonzero correlation coefficient means there is a linear relationship
Zero correlation coefficient means either they are independent of each
other or their relationship is nonlinear
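Excel scatter plot trendlines report r2 (the coefficient of determination), the square of r: the fraction of variation explained by the line, without the sign of r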
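A minimal sketch of Pearson's r, using the standard sample formula (the sum of products of deviations divided by the square root of the product of the sums of squared deviations). The three small data sets are made up for illustration: a rising line, a falling line, and a curve whose linear correlation is zero even though y depends entirely on x.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r for paired samples xs and ys of equal length."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL illustration data.
x = [-2, -1, 0, 1, 2]
rising = [2 * xi + 1 for xi in x]    # perfect line with positive slope
falling = [-3 * xi + 4 for xi in x]  # perfect line with negative slope
curved = [xi ** 2 for xi in x]       # nonlinear: y = x squared

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_curved = pearson_r(x, curved)

print(r_rising, r_falling, r_curved)
print("r squared for the falling line:", r_falling ** 2)
```

The last line shows why r2 alone hides direction: the falling line has r near -1 but r2 near 1.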
Summary
Data Organization
◦ Frequency Distribution Table
◦ Histogram
◦ Cumulative Frequency Distribution Table
◦ Ogive
◦ Pareto Chart
Summary
Data Analysis
◦ Stem and Leaf Diagram
◦ Boxplot
◦ Time-series Plot
◦ Probability Plot
◦ Scatter Plot
CASE STUDY
For each data set, get the following:
◦ Mean
◦ Median
◦ Mode
◦ 25th & 75th Percentile
◦ Skewness
◦ Kurtosis
◦ Histogram
◦ Ogives
Which data sets are skewed?
Which data sets are leptokurtic?
For each data set, which measure of central tendency is more appropriate to use when drawing conclusions?
Suppose that Data Set 2 is data on arm’s reach for Filipinos ... 95% of Filipinos must be able to reach this emergency button; how far should the emergency button be?
A hazardous chemical shelf is to be installed ... You wish that only 5% of Filipinos will be able to reach the hazardous chemical easily; what is the best height to use?
Data Set 4 are lengths of steel rods ... 24 and anything greater than 122