Inferential Statistics – Test 1
Contents
501: Confidence interval for a proportion
Inferential statistics
Sampling distribution of the proportion
Confidence interval
502 – confidence interval for a mean
Sampling distribution of the mean
From normal distribution to the t-distribution
Calculating the confidence interval
503 – testing what the mean is (two sided)
Two-sided test of the mean
3 experiments and the study of differences in means
Sampling the distribution of the mean and the proportion compared
520 – Describing and testing relationships between non-scale variables: chi-square
Measurement of association
Pearson’s correlation coefficient
Spearman’s rho correlation coefficient
Kendall’s tau (b and c)
Cramér’s V
Describing and assessing associations
The sampling distribution of a relation between two dichotomous variables
The sampling distribution of ‘a table’
Using a chi-square distribution
521 – Testing relationship between scale variables: Pearson and Spearman correlations
Pearson’s r
Spearman’s r
522 – Theorizing about relationship between variables using linear equations
Linear equations bivariate ratio
Dichotomous and a ratio variable
Effect of a nominal independent variable on a ratio variable
Effect of a ratio variable and a dummy on a ratio variable
530 - Describing the linear relationship in a sample
Finding the best line describing the relationship between two ratio variables
Describing a relationship between two ratio variables: standardization
Why checking the data is crucial
531 – Assessing a bivariate relationship using data
Sampling distribution of an effect
540 - describing and testing the effect of a dummy variable on a ratio variable
Assessing the difference between two groups when variances are equal or different
541 – describing and testing the effect of a nominal variable on a ratio variable
Assessing the differences between several group, more than 2
Two related but different questions
551 – assessing the overall quality of a model: R-squared and the F-test
501: Confidence interval for a proportion
Inferential statistics
(To what extent) can we say something about the large population on the basis of a
single sample?
When there are no problems during this process and random sampling is used →
‘inference’
Simple random sampling: everyone has the same chance of being included in the sample
Sample statistics (you selected the sample), random sampling, population
parameters: if you know two, you can make statements about the third
Sampling distribution of the proportion
Percentage: between 0 and 100%
Fraction: between 0 and 1
Proportion: between 0 and 1
Parameter: number describing a whole population (e.g., population mean)
Statistic: number describing a sample (e.g., sample mean)
Population proportion: π ← Greek letter for population parameters
Sample proportion: p ← Latin letter for sample statistics
→ p is a (simple) sample statistic
Sampling distribution of a proportion:
In a large number of samples of size n from a population with proportion π, many will
have (slightly) different p’s
The expected p = π
→ Similar to a normal distribution
However,
- Not all proportions are ‘possible’; with n=100, 50.5% is impossible
- Can NEVER be < 0 or > 1
So the exact distribution is the binomial distribution
However, the normal distribution is a reasonable approximation
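The idea above can be checked with a small simulation. This is a minimal sketch, not part of the course material: the function name `simulate_sample_proportions` and the values π = 0.4, n = 100 and 5000 samples are assumptions chosen only for illustration. It draws many samples, computes p for each, and looks at the mean and spread of those p's.

```python
import random
import statistics


def simulate_sample_proportions(pi, n, num_samples, seed=0):
    """Draw num_samples random samples of size n from a population with
    proportion pi and return the sample proportion p of each sample.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        # Each case is a 'success' with probability pi.
        successes = sum(1 for _ in range(n) if rng.random() < pi)
        proportions.append(successes / n)
    return proportions


# Assumed example values: pi = 0.4, n = 100, 5000 samples.
props = simulate_sample_proportions(pi=0.4, n=100, num_samples=5000)
mean_p = statistics.mean(props)    # should lie close to pi
sd_p = statistics.pstdev(props)    # SD of the sampling distribution

print(f"mean of all p's: {mean_p:.3f}")
print(f"SD of all p's:   {sd_p:.3f}")
```

With n = 100 every simulated p is a multiple of 0.01 and lies between 0 and 1, which is why the exact distribution is binomial rather than normal, while the mean of the p's still ends up close to π.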
Summary of sampling distribution of a proportion:
- Shape: normal distribution
- Mean of all proportions = π (which is unknown)
- SD of the sampling distribution (standard error) depends on the sample size n and
on π
→ The bigger n is, the smaller the standard error: the sample proportions lie closer to π
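How the standard error depends on n and π can be made concrete with the standard formula for a proportion, SE = sqrt(π(1 − π) / n). The sketch below assumes that formula (the notes above do not write it out), and the function name `standard_error_proportion` and the example values are illustrative assumptions.

```python
import math


def standard_error_proportion(pi, n):
    """Standard error of a sample proportion: sqrt(pi * (1 - pi) / n).

    Hypothetical helper; assumes simple random sampling.
    """
    return math.sqrt(pi * (1 - pi) / n)


# Assumed example: pi = 0.5 and increasing sample sizes.
for n in (25, 100, 400, 1600):
    se = standard_error_proportion(0.5, n)
    print(f"n = {n:4d}  standard error = {se:.4f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error. For a fixed n the standard error is largest at π = 0.5 and shrinks as π moves towards 0 or 1.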
Standard deviation of the sampling distribution is called: standard error
Depends on:
- Population standard deviation
- Sample size (n)
- Association between X and Y in the sample