Chapter 2 - Gathering Data
POPULATION AND SAMPLES, PARAMETERS AND STATISTICS
A parameter is a numerical characteristic of a population, whereas a statistic is a numerical characteristic of a sample.
Random sample whenever possible, as it prevents systematic bias.
Simple random sampling ensures every member of the population has an equal chance of being selected.
Stratified and cluster sampling are other types of random sampling.
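The three sampling schemes named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary layout for strata and clusters, and the `seed` parameter are all hypothetical choices, not part of the notes.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct members; every member is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def stratified_sample(strata, k_per_stratum, seed=None):
    """strata maps a label to a list of members (assumed layout).
    A simple random sample is drawn inside every stratum."""
    rng = random.Random(seed)
    return {label: rng.sample(members, k_per_stratum)
            for label, members in strata.items()}

def cluster_sample(clusters, n_clusters, seed=None):
    """clusters maps a label to a list of members (assumed layout).
    Whole clusters are chosen at random and every member of a chosen
    cluster is included."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for label in chosen for member in clusters[label]]

population = list(range(100))
groups = {"A": list(range(0, 50)), "B": list(range(50, 100))}

srs = simple_random_sample(population, 10, seed=1)
strat = stratified_sample(groups, 5, seed=1)
clus = cluster_sample(groups, 1, seed=1)
```

The difference to notice is where the randomness sits: stratified sampling samples within every group, while cluster sampling samples the groups themselves.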
Experiments involve imposing treatments on the experimental units, whereas observational studies do not.
Lurking variables are more likely to be found in observational studies.
Well-designed randomized experiments can give strong evidence of a cause-and-effect relationship.
Chapter 3 - Descriptive Statistics
MEASURES OF CENTRAL TENDENCY
Mean, median, and mode are measures of central tendency.
A modal class is the class on a histogram with the most observations.
Mean > median for right-skewed distributions.
Mean < median for left-skewed distributions.
Mean = median for perfectly symmetric distributions.
The median may be a better measure when outliers are present.
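A small numeric check of the mean/median relationships above, using Python's standard `statistics` module. The data sets are made up for illustration only.

```python
from statistics import mean, median

# Illustrative data (invented for this sketch).
right_skewed = [1, 2, 2, 3, 3, 4, 20]      # long tail to the right
left_skewed = [1, 17, 18, 18, 19, 19, 20]  # long tail to the left
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right-skewed", right_skewed),
                   ("left-skewed", left_skewed),
                   ("symmetric", symmetric)]:
    print(name, "mean =", mean(data), "median =", median(data))

# Effect of one outlier: the mean moves a lot, the median barely moves.
clean = [10, 11, 12, 12, 13]
with_outlier = clean + [100]
mean_shift = mean(with_outlier) - mean(clean)
median_shift = median(with_outlier) - median(clean)
```

In the last example the single outlier pulls the mean much further than the median, which is why the median can be the better summary when outliers are present.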
MEASURES OF VARIABILITY
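A deviation is the signed distance of an observation from the mean, $x_i - \bar{x}$.
The sum of the deviations is always zero.
Mean absolute deviation is the average distance from the mean:
$\mathrm{MAD} = \frac{\sum |x_i - \bar{x}|}{n}$
Sample variance is the average squared distance from the mean:
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Standard deviation is $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
The larger $s$ and $s^2$, the larger the variability.
The standard deviation can be less than the mean and the 3rd quartile.
$s$ cannot be less than MAD.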
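The formulas above translate directly into code. The following is a minimal sketch with hypothetical function names and an invented data set; it divides MAD by n and the variance by n - 1, as in the notes.

```python
from math import sqrt

def deviations(xs):
    """Signed distance of each observation from the sample mean."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def mean_absolute_deviation(xs):
    """Average absolute distance from the mean (divides by n)."""
    return sum(abs(d) for d in deviations(xs)) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations divided by n - 1."""
    return sum(d * d for d in deviations(xs)) / (len(xs) - 1)

def sample_std(xs):
    """Square root of the sample variance."""
    return sqrt(sample_variance(xs))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, mean is 5

dev_sum = sum(deviations(data))
mad = mean_absolute_deviation(data)
s2 = sample_variance(data)
s = sample_std(data)
```

For this data set the deviations are -3, -1, -1, -1, 0, 0, 2, 4, so they sum to zero, MAD is 12/8 = 1.5, and the sample variance is 32/7.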
WHY DIVIDE BY n - 1?
We lose one degree of freedom when we estimate the population mean with the sample mean.
Dividing by only n tends to underestimate the variance and standard deviation.
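The underestimation can be seen in a small simulation. This sketch assumes a standard normal population (true variance 1), a sample size of 5, and a fixed seed; all of these are illustrative choices, and `variance` with its `ddof` argument is a hypothetical helper.

```python
import random

def variance(xs, ddof):
    """Sum of squared deviations divided by (n - ddof)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - ddof)

rng = random.Random(0)      # fixed seed so the run is repeatable
n, trials = 5, 10000
total_div_n = 0.0
total_div_n_minus_1 = 0.0

for _ in range(trials):
    # Draw a sample from a population whose true variance is 1.
    sample = [rng.gauss(0, 1) for _ in range(n)]
    total_div_n += variance(sample, ddof=0)
    total_div_n_minus_1 += variance(sample, ddof=1)

avg_div_n = total_div_n / trials
avg_div_n_minus_1 = total_div_n_minus_1 / trials
print("divide by n:    ", avg_div_n)
print("divide by n - 1:", avg_div_n_minus_1)
```

On average the divide-by-n estimate should land near (n - 1)/n = 0.8 of the true variance, while the divide-by-(n - 1) estimate should land near 1.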
Z-SCORES
An observation's z-score is a measure of the size of that observation relative to the other observations in the set.
z-scores are unitless and have a mean of zero and a standard deviation of one.
Empirical rule:
About 68% of z-scores lie between -1 and 1.
About 95% of z-scores lie between -2 and 2.
All or almost all z-scores lie between -3 and 3.
Positive z-score: observation > mean.
Negative z-score: observation < mean.
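The notes do not write out the formula, but the standard definition for a sample is z = (x - mean) / s. A minimal sketch using that definition and Python's `statistics` module, with invented data:

```python
from statistics import mean, stdev

def z_scores(xs):
    """Standardize each observation: subtract the sample mean,
    then divide by the sample standard deviation."""
    m = mean(xs)
    s = stdev(xs)
    return [(x - m) / s for x in xs]

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, mean is 5
zs = z_scores(data)

for x, z in zip(data, zs):
    side = "above" if z > 0 else "below" if z < 0 else "at"
    print(f"x = {x}: z = {z:+.2f} ({side} the mean)")
```

Whatever the units of the original data, the resulting z-scores have mean zero and standard deviation one, and their sign tells you which side of the mean the observation falls on.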
INTERPRETING THE STANDARD DEVIATION
Chebyshev's inequality gives a lower bound on the proportion of observations that lie within a certain distance of the mean.
Use Chebyshev's inequality when the distribution is not mound-shaped.
For mound-shaped distributions, the empirical rule states:
About 68% of observations lie within 1 standard deviation of the mean.
About 95% of observations lie within 2 standard deviations of the mean.
All or almost all observations lie within 3 standard deviations of the mean.
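The notes do not state the bound itself. In its usual form, Chebyshev's inequality says that at least 1 - 1/k^2 of the observations lie within k standard deviations of the mean, for any k > 1. The sketch below compares that bound with the actual proportion in a deliberately skewed, made-up data set; the function names are hypothetical.

```python
from statistics import mean, stdev

def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

def proportion_within(xs, k):
    """Observed proportion of xs within k sample standard deviations of the mean."""
    m = mean(xs)
    s = stdev(xs)
    inside = sum(1 for x in xs if abs(x - m) <= k * s)
    return inside / len(xs)

skewed = [1, 1, 1, 1, 2, 2, 3, 5, 9, 30]   # illustrative, not mound-shaped

for k in (2, 3):
    print(f"k = {k}: bound = {chebyshev_lower_bound(k):.3f}, "
          f"observed = {proportion_within(skewed, k):.3f}")
```

The bound is conservative: for k = 2 it guarantees only 75%, compared with the roughly 95% the empirical rule gives for mound-shaped data.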
PERCENTILES
The pth percentile is the value of the variable such that p% of the ordered data values are at or below the value.
How to calculate percentiles:
1. Order the observations from smallest to largest.
2. Calculate $n \times p/100$.
3. If $n \times p/100$ is not an integer, round up to the next whole number to get the corresponding position in the ordered data set.
4. If $n \times p/100$ is an integer, the percentile is the average of the values at positions $n \times p/100$ and $n \times p/100 + 1$.
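The four steps above can be sketched as a short function. The name `percentile` and the restriction to 0 < p < 100 are assumptions made for this sketch; note that software packages often use other interpolation rules and may return slightly different values.

```python
def percentile(xs, p):
    """pth percentile following the round-up / average-two-positions rule.
    Assumes 0 < p < 100 and a non-empty data set."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(xs)                      # step 1: order the data
    n = len(ordered)
    whole, remainder = divmod(n * p, 100)     # step 2: n * p / 100
    whole = int(whole)
    if remainder != 0:
        # Step 3: not an integer, so round up. Position whole + 1
        # (1-based) is index whole (0-based).
        return ordered[whole]
    # Step 4: an integer, so average the values at positions
    # whole and whole + 1 (1-based).
    return (ordered[whole - 1] + ordered[whole]) / 2

data = [50, 15, 40, 20, 35]   # illustrative data, n = 5

p25 = percentile(data, 25)    # 5 * 25 / 100 = 1.25, round up to position 2
p40 = percentile(data, 40)    # 5 * 40 / 100 = 2, average positions 2 and 3
p50 = percentile(data, 50)    # 5 * 50 / 100 = 2.5, round up to position 3
```

With the ordered data 15, 20, 35, 40, 50, the 25th percentile is the 2nd value, the 40th percentile is the average of the 2nd and 3rd values, and the 50th percentile is the 3rd value, which matches the median.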