HYPOTHESIS TESTING
Each test is listed with: Inference on, Test procedure, Test statistic, Assumptions

1-sample standard normal test
Inference on: a proportion
H0: p = p0
H1: p ≠ p0
Where p is the proportion of x
Test statistic: z = (p̂ - p0) / √(p0(1 - p0) / n)
Assumptions:
- random, very large samples
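
As an illustration of the one-sample proportion test, here is a minimal Python sketch using only the standard library. The function name and the example figures (58 successes out of 100, p0 = 0.5) are invented for this example and do not come from the sheet.

```python
import math
from statistics import NormalDist

def one_sample_proportion_z(successes, n, p0):
    """Return (z, two-tailed p-value) for H0: p = p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 58 successes in 100 trials, tested against p0 = 0.5
z, p_value = one_sample_proportion_z(successes=58, n=100, p0=0.5)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

The standard error uses p0 rather than p̂ because the statistic is computed assuming H0 is true.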

1-sample t-test
Inference on: a mean
H0: μ = μ0
H1: μ ≠ μ0
Where μ is the mean of x
Test statistic: t = (X̅ - μ0) / (s / √n), on n-1 df
Assumptions:
- random samples
- normally distributed population
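
The one-sample t statistic can be sketched in Python as follows. This is an illustration under assumptions: the function name and the sample values are made up, and the p-value is left to a t table or statistical software because the standard library has no t distribution.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, df) for H0: mu = mu0."""
    n = len(sample)
    se = stdev(sample) / math.sqrt(n)  # s / sqrt(n), s = sample std. deviation
    t = (mean(sample) - mu0) / se
    return t, n - 1

# Hypothetical sample, tested against mu0 = 4
t, df = one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], mu0=4)
print(f"t = {t:.3f} on {df} df")
```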

2-sample standard normal test
Inference on: 2 proportions
H0: p1 = p2
H1: p1 ≠ p2
Where p1, p2 are the proportions of x, y
Test statistic: z = (p̂1 - p̂2) / √(p̂(1 - p̂)(1/n1 + 1/n2)), where p̂ is the pooled sample proportion
Assumptions:
- independent, random, very large samples
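
A possible Python sketch of the two-proportion test, assuming the pooled proportion is computed as total successes over total sample size. The function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two-tailed p-value) for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 60/100 successes in group 1, 40/100 in group 2
z, p_value = two_proportion_z(60, 100, 40, 100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```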
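
Paired sample t-test
Inference on: 2 means (paired)
H0: μ = 0
H1: μ ≠ 0
Where μ is the population mean increase/decrease in x (μ1 - μ2)
Test statistic: t = (X̅ - 0) / (s / √n), on n-1 df, where X̅ and s are the mean and standard deviation of the paired differences
Assumptions:
- “ ” (same as the entry above)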
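
The paired test reduces to a one-sample t-test on the differences, which the following Python sketch tries to show. The function name and the before/after values are invented for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for H0: mean difference = 0."""
    diffs = [a - b for a, b in zip(after, before)]  # one difference per pair
    n = len(diffs)
    t = (mean(diffs) - 0) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements on the same 5 subjects
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 14]
t, df = paired_t(before, after)
print(f"t = {t:.3f} on {df} df")
```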

Independent samples t-test
Inference on: 2 means (independent)
H0: μ1 = μ2
H1: μ1 ≠ μ2
Where μ1, μ2 are the means of x, y
Test statistic: t = (X̅1 - X̅2) / √(s1²/n1 + s2²/n2), df found in R output
Assumptions:
- independent, random samples
- normally distributed populations
- samples of equal size provide robustness against departures from the assumed distribution
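
A minimal Python sketch of the independent samples statistic follows. The sheet only says the df comes from R output. The Welch-Satterthwaite formula used here is an assumption about how that df is obtained (it is what R's t.test reports when equal variances are not assumed). The function name and data are hypothetical.

```python
import math
from statistics import mean, variance

def independent_samples_t(sample1, sample2):
    """Return (t, Welch df) for H0: mu1 = mu2."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # s1^2 / n1
    v2 = variance(sample2) / n2  # s2^2 / n2
    t = (mean(sample1) - mean(sample2)) / math.sqrt(v1 + v2)
    # Assumed df: Welch-Satterthwaite approximation
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical samples
t, df = independent_samples_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"t = {t:.3f} on {df:.2f} df")
```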

χ² goodness of fit test
Inference on: multiple proportions
H0: x is distributed in a ratio of _:_:_:_
H1: x follows another distribution
Test statistic: χ² = Σ (observed - expected)² / expected, on 'no. of cells - 1' df (usually in R output)
Assumptions:
- independent observations
- random samples
- all expected values > 5
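
To make the goodness of fit calculation concrete, here is a small Python sketch. The function name, the observed counts and the 2:1:1 ratio are made up for the example.

```python
def chi_square_gof(observed, ratio):
    """Return (expected counts, chi-square statistic, df) for a hypothesised ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, stat, len(observed) - 1  # df = no. of cells - 1

# Hypothetical counts tested against a 2:1:1 ratio
expected, stat, df = chi_square_gof([50, 30, 20], ratio=[2, 1, 1])
print(expected, stat, df)
```

Checking that every entry of `expected` exceeds 5 before trusting the statistic mirrors the last assumption of the test.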

χ² test of association
Inference on: 2 categorical variables
H0: there is no association between x and y
H1: there is an association
Test statistic: usually in R output, with expected value = (row total × column total) / grand total
Assumptions:
- “ ” (same as the entry above)
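
The expected-value rule can be sketched in Python as below. The statistic is assumed to be the same Σ (observed - expected)² / expected sum used for goodness of fit, with df = (rows - 1) × (columns - 1). The function name and the 2×2 table are hypothetical.

```python
def chi_square_association(table):
    """Return (expected counts, chi-square statistic, df) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # expected value = (row total x column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, stat, df

# Hypothetical 2x2 table of counts
expected, stat, df = chi_square_association([[30, 10], [20, 40]])
print(expected, round(stat, 3), df)
```

For a 2×2 table, R's chisq.test applies a continuity correction by default, so its reported statistic may be smaller than this uncorrected value.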

Regression test
Inference on: 2 numerical variables
H0: b = 0
H1: b ≠ 0
Where b is the slope of the line relating x to y
Test statistic (from R output):
- p-value of b is evidence against H0 (2-tailed)
- R² shows strength of correlation, determines reliability of estimation
- on n-2 df
Assumptions:
- relationship is assumed to be linear
- independent, random observations
- normally distributed
- uniform variance
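
The quantities that R reports for the slope can be reproduced by hand. The following Python sketch assumes ordinary least squares with one predictor. The function name and data points are invented, and the p-value is omitted because the standard library has no t distribution.

```python
import math
from statistics import mean

def slope_test(x, y):
    """Return (b, t, r_squared, df) for H0: b = 0 in simple linear regression."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                # slope
    a = y_bar - b * x_bar        # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    se_b = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return b, b / se_b, 1 - sse / sst, n - 2

# Hypothetical data
b, t, r_squared, df = slope_test([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"b = {b:.2f}, t = {t:.3f}, R^2 = {r_squared:.2f}, df = {df}")
```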

ANOVA
Inference on: multiple means
H0: the mean of x is the same for all groups
H1: the mean of x varies between the groups
Test statistic: F = between-samples variance / within-samples variance
Assumptions:
- independent, random observations between samples
- normally distributed populations
- equal variance
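
The ratio of between-samples to within-samples variance can be sketched in Python as follows, assuming a one-way layout. The function name and the three groups are hypothetical.

```python
from statistics import mean

def anova_f(groups):
    """Return (F, df_between, df_within) for one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)       # between-samples variance
    ms_within = ss_within / (n_total - k)   # within-samples variance
    return ms_between / ms_within, k - 1, n_total - k

# Hypothetical groups
f_stat, df_between, df_within = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) df")
```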

Confidence level   Cumulative probability   z-value
90%                0.95                     1.645
95%                0.975                    1.96
99%                0.995                    2.58
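
These z-values are the standard normal quantiles at the cumulative probabilities shown in parentheses. A short Python sketch to recompute them, with a helper name chosen just for this example:

```python
from statistics import NormalDist

def z_value(confidence):
    """Return the two-sided critical z-value for a confidence level such as 0.95."""
    upper = 1 - (1 - confidence) / 2  # e.g. 0.95 -> cumulative probability 0.975
    return NormalDist().inv_cdf(upper)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_value(level):.3f}")
```

To three decimals the values are 1.645, 1.960 and 2.576.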