
Title: Notes, Statistics 1: definitions, calculating variation, position ...
Description: Statistics notes, part 1

Good Luck

Contents:
1-Definitions:
2-Measures of Central Tendency:
3-Measures of Variation:
4-Measures of Position:

1-Definitions:
Statistics
Collection of methods for planning experiments, obtaining data, and then organizing, summarizing,
presenting, analyzing, interpreting, and drawing conclusions
...


Population
All subjects possessing a common characteristic that is being studied
...

Parameter
Characteristic or measure obtained from a population
...

Descriptive Statistics
Collection, organization, summarization, and presentation of data
...
Performing hypothesis testing, determining
relationships between variables, and making predictions
...

Quantitative Variables
Variables which assume numerical values
...
Usually obtained by counting
...
Usually obtained by measurement
...

Ordinal Level
Level of measurement which classifies data into categories that can be ranked
...

Interval Level
Level of measurement which classifies data that can be ranked and differences are meaningful
...

Ratio Level
Level of measurement which classifies data that can be ranked, differences are meaningful, and there is
a true zero
...

Random Sampling
Sampling in which the data is collected using chance methods or random numbers
...

Convenience Sampling
Sampling in which data that is readily available is used
...

Each of these strata is then sampled using one of the other sampling techniques
...
Some of these groups are
randomly selected, and then all of the elements in those groups are selected
...
This can either be a population mean (denoted by
mu) or a sample mean (denoted by x bar)
Median
The midpoint of the data after being ranked (sorted in ascending order)
...

Mode
The most frequent number
Skewed Distribution
The majority of the values lie together on one side, with very few values (the tail) on the other side
...
In a negatively
skewed distribution, the tail is to the left and the mean is smaller than the median
...
In a symmetric distribution, the mean is
the median
...
This sum is divided by the total of the
weights
...
(Max + Min) / 2
Range
The difference between the highest and lowest values
...
It is the sum of the squares of the
deviations from the mean divided by the population size
...


Sample Variance

Unbiased estimator of a population variance
...
The units on
the variance are the units of the population squared
...
The population standard deviation is the square root of the population
variance and the sample standard deviation is the square root of the sample variance
...
The units of the standard
deviation are the same as the units of the population/sample
...
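
The variance and standard deviation definitions above can be sketched in Python. This is only an illustration: the function names and the data are made up, and the n - 1 divisor for the sample variance is the usual textbook form of the unbiased estimator, assumed here because the formula itself is not shown in these extracts.

```python
import math

def population_variance(values):
    """Sum of squared deviations from the mean, divided by the population size."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Same sum of squares, divided by n - 1 (assumed unbiased-estimator form)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]      # made-up values for illustration

pop_var = population_variance(data)   # 32 / 8 = 4.0
samp_var = sample_variance(data)      # 32 / 7
pop_sd = math.sqrt(pop_var)           # back in the original units
samp_sd = math.sqrt(samp_var)
```

Taking the square root is what brings the result back to the units of the data, which is why the standard deviation is easier to interpret than the variance.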
We won't work with the Coefficient of
Variation in this course
...
Chebyshev's theorem can be applied to any distribution regardless of its shape
...
Approximately 68% lies within 1 standard
deviation of the mean; 95% within 2 standard deviations; and 99
...

Standard Score or Z-Score
The value obtained by subtracting the mean and dividing by the standard deviation
...

Percentile
The percent of the population which lies below that value
...

Quartile
Either the 25th, 50th, or 75th percentiles
...

Decile
Either the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, or 90th percentiles
...
The lower hinge is the first
Quartile unless the remainder when dividing the sample size by four is 3
...
The upper hinge is the 3rd Quartile
unless the remainder when dividing the sample size by four is 3
...
Some
textbooks, and the TI-82 calculator, define the five values as the minimum, first Quartile, median, third
Quartile, and maximum
...

InterQuartile Range (IQR)
The difference between the 3rd and 1st Quartiles
...

Mild Outliers
Values which lie between 1
...
0 times the InterQuartile Range below the 1st Quartile or above the 3rd
Quartile
...

Extreme Outliers
Values which lie more than 3
...
Note, some texts use hinges instead of Quartiles
...
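
A rough sketch of how values could be sorted into mild and extreme outliers. The multipliers 1.5 and 3.0 are the commonly used fences and are an assumption here, since the exact numbers are cut off in these extracts. The quartiles are passed in by the caller because textbooks differ on how to compute them; the function name and data are made up.

```python
def classify_outliers(values, q1, q3, mild_factor=1.5, extreme_factor=3.0):
    """Split values into (mild, extreme) outliers using fences built from the IQR."""
    iqr = q3 - q1
    mild_low, mild_high = q1 - mild_factor * iqr, q3 + mild_factor * iqr
    ext_low, ext_high = q1 - extreme_factor * iqr, q3 + extreme_factor * iqr
    mild, extreme = [], []
    for x in values:
        if x < ext_low or x > ext_high:
            extreme.append(x)      # beyond the outer fences
        elif x < mild_low or x > mild_high:
            mild.append(x)         # between the inner and outer fences
    return mild, extreme

data = [1, 10, 11, 12, 13, 14, 15, 22, 40]   # made-up values
mild, extreme = classify_outliers(data, q1=11, q3=15)
# IQR = 4, inner fences at 5 and 21, outer fences at -1 and 27
```

With these inputs, 1 and 22 come out as mild outliers and 40 as an extreme outlier. Passing hinges instead of quartiles works the same way.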
The arithmetic mean, the median, midrange, or mode
...

Mean
This is what people usually intend when they say "average"

Population Mean:

Sample Mean:

Frequency Distribution:
The mean of a frequency distribution is also the weighted mean
...
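
To see why the mean of a frequency distribution is a weighted mean, here is a small sketch where the frequencies act as the weights. The function name and the numbers are invented for illustration.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the total of the weights."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

scores = [1, 2, 3, 4]        # distinct data values
freqs = [2, 3, 4, 1]         # how often each value occurs (the weights)

fd_mean = weighted_mean(scores, freqs)

# Expanding the distribution back into raw data gives the same mean.
raw = [s for s, f in zip(scores, freqs) for _ in range(f)]
raw_mean = sum(raw) / len(raw)
```

Both calculations give 2.4, since 24 / 10 is the same sum divided by the same count.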
The median is the number in the middle
...
5 * (n + 1)
Raw Data
The median is the number in the "depth of the median" position
...
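
A minimal sketch of finding the median of raw data by its depth. It assumes the depth of the median is 0.5 * (n + 1), counted from 1 in the ranked data, and that a depth ending in .5 means taking the midpoint of the two neighbouring values. The function name is made up.

```python
def median_by_depth(values):
    """Median of raw data, located by the 'depth of the median' position."""
    ranked = sorted(values)
    n = len(ranked)
    depth = 0.5 * (n + 1)          # 1-based position in the ranked data
    lower = int(depth)             # whole-number part of the depth
    if depth == lower:
        return ranked[lower - 1]   # depth is a whole position
    # depth ends in .5: midpoint of the two values around it
    return (ranked[lower - 1] + ranked[lower]) / 2

odd_case = median_by_depth([7, 1, 5, 3, 9])    # depth 3 -> 5
even_case = median_by_depth([4, 1, 3, 2])      # depth 2.5 -> 2.5
```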

Ungrouped Frequency Distribution

Find the cumulative frequencies for the data
...
If the depth of the median is exactly 0
...

Grouped Frequency Distribution
This is the tough one
...
Some textbooks have you simply take the
midpoint of the class
...
The
correct process is to interpolate
...


Multiply this proportion by the class width and add it to the lower boundary of the median class
...
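
The interpolation for a grouped frequency distribution might be sketched like this. It assumes each class is given as (lower boundary, upper boundary, frequency) in ascending order, and that the median class is the first class whose cumulative frequency reaches half the sample size. The function name and the classes are hypothetical.

```python
def grouped_median(classes):
    """Interpolated median; classes is a list of (lower, upper, frequency)."""
    n = sum(freq for _, _, freq in classes)
    half = n / 2
    cumulative = 0                 # cumulative frequency of the previous classes
    for lower, upper, freq in classes:
        if cumulative + freq >= half:
            # proportion of the way into the median class
            proportion = (half - cumulative) / freq
            return lower + proportion * (upper - lower)
        cumulative += freq
    raise ValueError("no classes supplied")

classes = [(0.5, 10.5, 3), (10.5, 20.5, 5), (20.5, 30.5, 2)]
med = grouped_median(classes)
# n = 10, half = 5, median class is 10.5-20.5, proportion = (5 - 3) / 5 = 0.4
```

Here the median lands 40% of the way into the class of width 10, so it is 10.5 + 4 = 14.5.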
There may be no mode if no one value appears more than any
other
...

For grouped frequency distributions, the modal class is the class with the largest frequency
...

Summary
The Mean is used in computing other statistics (such as the variance) and does not exist for open-ended
grouped frequency distributions (1)
...

The Median is the center number and is good for skewed distributions because it is resistant to change
...
The mode can be used with nominal data whereas the
others can't
...

The Midrange is not used very often
...

Property                     Mean     Median   Mode     Midrange
Always Exists                No (1)   Yes      No (2)   Yes
Uses all data values         Yes      No       No       No
Affected by extreme values   Yes      No       No       Yes

Find out what proportion of the distance into the median class the median lies by dividing the sample size by 2,
subtracting the cumulative frequency of the previous class, and then dividing all that by the frequency of
the median class
...


Mode

The mode is the most frequent data value
...
There may also be two modes (bimodal), three modes (trimodal), or more than three modes (multimodal)
...
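
A small sketch of finding the mode or modes. It follows the convention that there is no mode when no value appears more often than any other, and returns every value tied for the highest frequency otherwise. The function name is made up.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of the most frequent values ([] if there is no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Every distinct value equally frequent: no value stands out, so no mode.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)

one_mode = modes([1, 2, 2, 3])          # [2]
bimodal = modes([1, 1, 2, 2, 3])        # [1, 2]
no_mode = modes([1, 2, 3])              # []
nominal = modes(["red", "blue", "red"]) # works for nominal data too
```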

Midrange
The midrange is simply the midpoint between the highest and lowest values
...
It is often not appropriate for skewed distributions such as salary
information
...

The Mode is used to describe the most typical case
...
The mode may or may not exist and there may be more than one value for the mode (2)
...
It is a very rough estimate of the average and is greatly affected by
extreme values (even more so than the mean)
...
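
To illustrate how strongly one extreme value moves the midrange compared with the mean, here is a quick sketch with invented salary figures (in thousands).

```python
def midrange(values):
    """Midpoint between the highest and lowest values."""
    return (max(values) + min(values)) / 2

def mean(values):
    return sum(values) / len(values)

salaries = [30, 32, 35, 38, 40]      # hypothetical salaries, in thousands
with_outlier = salaries + [400]      # add one extreme value

mean_shift = mean(with_outlier) - mean(salaries)              # 35 -> about 95.8
midrange_shift = midrange(with_outlier) - midrange(salaries)  # 35 -> 215
```

The single extreme salary moves the midrange by 180 but the mean by only about 61, which matches the remark that the midrange is affected even more than the mean.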
It is simply the highest value minus the lowest value
...

Variance
"Average Deviation"
The range only involves the smallest and largest numbers, and it would be desirable to have a statistic
which involved all of the data values
...
So, the average deviation will always be zero
...
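
A short check that the deviations from the mean always add up to zero. Exact fractions are used so that rounding does not hide the result; the function name and data are made up.

```python
from fractions import Fraction

def deviations_from_mean(values):
    """Exact deviations (x - mean) for each data value."""
    exact = [Fraction(v) for v in values]
    mean = sum(exact) / len(exact)
    return [x - mean for x in exact]

data = [3, 7, 8, 12]                  # mean is 15/2
devs = deviations_from_mean(data)     # -9/2, -1/2, 1/2, 9/2
total = sum(devs)                     # positives and negatives cancel
squared_total = sum(d ** 2 for d in devs)
```

The plain sum is zero, while the sum of the squared deviations is positive, which is the motivation for squaring.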


Population Variance
So, to keep it from being zero, the deviation from the mean is squared and called the "squared deviation
from the mean"
...


Unbiased Estimate of the Population Variance
One would expect the sample variance to simply be the population variance with the population mean
replaced by the sample mean
...
This formula has the problem that the estimated value isn't the same as the parameter
...


Standard Deviation
There is a problem with variances
...
That means that the units were
also squared
...


The sample standard deviation is not an unbiased estimator of the population standard deviation
...
It does have a standard deviation key
...

Sum of Squares (shortcuts)
The sum of the squares of the deviations from the means is given a shortcut notation and several alternative
formulas
...
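
One common shortcut for the sum of squares is SS = (sum of x squared) - (sum of x) squared / n. The notes' own alternative formulas are not shown in these extracts, so treat this particular form as an assumption. The sketch below checks that it agrees with the definition, using exact fractions and made-up integer data.

```python
from fractions import Fraction

def ss_definition(values):
    """Sum of squared deviations from the mean (the definition)."""
    mean = Fraction(sum(values), len(values))
    return sum((x - mean) ** 2 for x in values)

def ss_shortcut(values):
    """Shortcut form: sum of x^2 minus (sum of x)^2 / n."""
    n = len(values)
    return sum(x * x for x in values) - Fraction(sum(values) ** 2, n)

data = [2, 4, 4, 4, 5, 5, 7, 9]       # integer data, made up
ss_a = ss_definition(data)            # 32
ss_b = ss_shortcut(data)              # 232 - 1600/8 = 32
```

The shortcut avoids computing each deviation separately, which is why it is convenient for hand calculation.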


"Within k standard deviations" means the interval from (mean - k * standard deviation) to (mean + k * standard deviation)
...
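
A brief sketch of the "within k standard deviations" interval together with Chebyshev's bound. The bound used is the standard statement of the theorem, at least 1 - 1/k^2 of the values for k > 1; it is filled in as an assumption because the formula is not shown in these extracts. Function names and numbers are made up.

```python
def k_sigma_interval(mean, sd, k):
    """Interval from mean - k*sd to mean + k*sd."""
    return (mean - k * sd, mean + k * sd)

def chebyshev_minimum_fraction(k):
    """Minimum fraction of values within k standard deviations (any distribution)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

interval = k_sigma_interval(mean=100, sd=15, k=2)   # (70, 130)
at_least = chebyshev_minimum_fraction(2)            # 0.75
```

So for any distribution with mean 100 and standard deviation 15, at least 75% of the values lie between 70 and 130.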

Empirical Rule
The empirical rule is only valid for bell-shaped (normal) distributions
...

 Approximately 68% of the data values fall within one standard deviation of the mean
...

 Approximately 99
...

The empirical rule will be revisited later in the chapter on normal probabilities
...
The symbol is z, which is why it's also called a z-score
...
This is the nice feature of the
standard score -- no matter what the original scale was, when the data is converted to its standard score, the
mean is zero and the standard deviation is 1
...
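
This property can be checked with a few lines of Python. The sketch treats the data as a whole population (dividing by N), and the function name and data are made up.

```python
import math

def z_scores(values):
    """Standard scores: subtract the mean, divide by the (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return [(x - mean) / sd for x in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]       # mean 5, standard deviation 2
z = z_scores(data)

z_mean = sum(z) / len(z)
z_sd = math.sqrt(sum((v - z_mean) ** 2 for v in z) / len(z))
```

The converted values have mean 0 and standard deviation 1, whatever units the original data were in.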
The data must be ranked
...

2
...

4
...

If this is an integer, add 0
...
If it isn't an integer round up
...
If your depth ends in 0
...
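
The steps for locating the kth percentile could be sketched as follows. Parts of the steps are cut off in these extracts, so the details are assumptions: the depth is k/100 times n, an integer depth gets 0.5 added, a non-integer depth is rounded up, and a depth ending in .5 means taking the midpoint of the two neighbouring values. The `regions` parameter is a made-up convenience for reusing the same steps with other divisions.

```python
def percentile_value(values, k, regions=100):
    """Value at the kth percentile (or kth of `regions` equal divisions)."""
    if not 0 < k < regions:
        raise ValueError("k must be strictly between 0 and regions")
    ranked = sorted(values)                 # the data must be ranked first
    n = len(ranked)
    product = k * n                         # integer arithmetic avoids rounding issues
    if product % regions == 0:
        lower = product // regions          # depth is lower + 0.5
        return (ranked[lower - 1] + ranked[lower]) / 2
    depth = product // regions + 1          # not an integer: round up
    return ranked[depth - 1]

data = list(range(5, 105, 5))               # 5, 10, ..., 100 (n = 20), made up
p25 = percentile_value(data, 25)            # depth 5.5 -> 27.5
p80 = percentile_value(data, 80)            # depth 16.5 -> 82.5
```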

It is sometimes easier to count from the high end rather than counting from the low end
...
Rather than counting 80%
from the bottom, count 20% from the top
...

If you wish to find the percentile for a number (rather than locating the kth percentile), then

1
...
Add 0
...
Divide by the total number of values
Convert it to a percent
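
Going the other way, from a number to its percentile, might look like this. The first step is cut off in these extracts; the sketch assumes it is counting the values below the number and that the amount added is 0.5. The function name and data are made up.

```python
def percentile_rank(values, x):
    """Percentile of x: (values below x + 0.5) / n, as a percent."""
    below = sum(1 for v in values if v < x)     # assumed step 1: count values below x
    return 100 * (below + 0.5) / len(values)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
rank = percentile_rank(data, 7)                 # 6 below -> 6.5 / 10 -> 65%
```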
Deciles (10 regions)
The percentiles divide the data into 100 equal regions
...

The instructions are the same as for finding a percentile, except that instead of dividing by 100 in step 2,
you divide by 10
...
Instead of dividing by 100 in step 2, divide by 4
...
The 1st quartile is the 25th percentile, the 3rd quartile is the
75th percentile
...
The TI-82 calculator will
find the quartiles for you
...

Hinges
The lower hinge is the median of the lower half of the data up to and including the median
...

The hinges are the same as the quartiles unless the remainder when dividing the sample size by four is three
(like 39 / 4 = 9 R 3)
...
If the median is split between two values (which happens whenever the sample size is even), the
median isn't included in either since the median isn't actually part of the data
...
5
...
The lower hinge is the median of the lower half and would be in position 5
...
The upper hinge is the
median of the upper half and would be in position 5
...
5
...
The lower half is positions 1 - 11 and the upper half is positions 11 - 21
...
The upper hinge is the median of
the upper half and would be in position 6 when starting at position 11 -- this is original position 16
...
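
A sketch of the hinge calculation, assuming the rule described above: with an odd sample size the median belongs to both halves, and with an even sample size it belongs to neither. The function names are made up, and the data are simply the positions 1 to 21 and 1 to 20 so the results can be read as positions.

```python
def median_of_ranked(ranked):
    """Median of data that is already sorted."""
    n = len(ranked)
    mid = n // 2
    if n % 2:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2

def hinges(values):
    """Return (lower hinge, upper hinge)."""
    ranked = sorted(values)
    n = len(ranked)
    half = (n + 1) // 2                # odd n: both halves include the median
    lower_half = ranked[:half]
    upper_half = ranked[n - half:]
    return median_of_ranked(lower_half), median_of_ranked(upper_half)

odd_hinges = hinges(list(range(1, 22)))    # n = 21: halves are 1-11 and 11-21
even_hinges = hinges(list(range(1, 21)))   # n = 20: halves are 1-10 and 11-20
```

For 21 values the hinges fall at positions 6 and 16, and for 20 values at positions 5.5 and 15.5.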
Some textbooks use the quartiles instead of the hinges
...
A box is drawn between the lower and upper
hinges with a line at the median
...

Interquartile Range (IQR)
The interquartile range is the difference between the third and first quartiles
...
There are mild outliers and extreme outliers
...

Extreme Outliers

