Summaries

Univariate
1 Categorical variable
  Graphical summary: bar and pie chart (graph bar (count), over(var1))
  Summary statistics: frequency table (tab1 var1, percent)
1 Numerical variable
  Graphical summary: histogram (histogram var1)
  Summary statistics: summarize var1, detail

Bivariate
2 Categorical variables
  Graphical summary: clustered bar chart (graph bar (percent), over(var1) over(var2))
  Summary statistics: contingency table, counts/% (tabulate var1 var2, row)
2 Numerical variables
  Graphical summary: scatter plot (scatter var1 var2)
  Summary statistics: correlation (correlate var1 var2)
1 Categorical, 1 Numerical variable
  Graphical summary: box plot (graph hbox var1, over(var2))
  Summary statistics: by cat_var, sort: summarize num_var, detail
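A minimal sketch of these commands using Stata's built-in auto dataset; the dataset and its variables (price, mpg, rep78, foreign) are illustrative choices, not from the original sheet:

    sysuse auto, clear
    tab1 foreign                               // frequency table: 1 categorical
    graph bar (count), over(foreign)           // bar chart: 1 categorical
    summarize price, detail                    // summary statistics: 1 numerical
    histogram price                            // histogram: 1 numerical
    tabulate rep78 foreign, row                // contingency table: 2 categorical
    scatter price mpg                          // scatter plot: 2 numerical
    correlate price mpg                        // correlation: 2 numerical
    graph hbox price, over(foreign)            // box plot: 1 categorical + 1 numerical
    by foreign, sort: summarize price, detail  // numeric summary split by category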
ONE-SAMPLE Z TEST
Summary: population SD is known; one numeric variable; the RQ compares our average (mean) score with another average.
Statistical hypotheses: H0: μvar = μ; H1: μvar ≠ μ
Assumptions: variable is normally distributed. Check with a histogram (histogram var, freq) and Shapiro-Wilk (swilk var; p-value must be > .05).
Variables required: 1 numerical
Stata command: ztest var == mean, sd(SD)
Degrees of freedom: no df (the z distribution is standardised, a perfect bell curve)
Effect size: N/A (z = ±1.96 is the value that cuts off 5% across the two tails)
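A minimal sketch, assuming a hypothetical numeric variable score, a hypothesised population mean of 100, and a known population SD of 15:

    histogram score, freq        // check the shape is roughly normal
    swilk score                  // Shapiro-Wilk: want p > .05
    ztest score == 100, sd(15)   // one-sample z test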
ONE-SAMPLE T TEST
Summary: population SD is unknown; the RQ compares our average (mean) score with another average.
Statistical hypotheses: H0: μvar = μ; H1: μvar ≠ μ
Assumptions: variable is normally distributed. Check with a histogram (histogram var, freq) and Shapiro-Wilk (swilk var; p-value must be > .05).
Variables required: 1 numerical
Stata command: ttest var == mean
Degrees of freedom: df = n - 1 (n = sample size)
Effect size: Cohen's d: d = (M - μ) / S (M = sample mean, μ = hypothesised population mean, S = sample SD)
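A minimal sketch, again with a hypothetical variable score and a hypothesised mean of 100:

    histogram score, freq    // normality check
    swilk score              // want p > .05
    ttest score == 100       // one-sample t test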
Effect sizes
            Cohen's d     Pearson's r          Cohen's W
Used for:   t-test (x3)   correlational test   chi-square test
Direction:  + or -        + or -               + only
Negligible: < 0.2         < 0.1                < 0.1
Small:      0.2 - 0.5     0.1 - 0.3            0.1 - 0.3
Medium:     0.5 - 0.8     0.3 - 0.5            0.3 - 0.5
Large:      ≥ 0.8         0.5 - 1              ≥ 0.5
Effect size tells you how meaningful the relationship between variables is; a large effect means the findings have practical significance.
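A worked example against the thresholds above, with hypothetical numbers (sample mean M = 105, hypothesised mean μ = 100, sample SD S = 10):

    display (105 - 100) / 10    // Cohen's d = 0.5, a medium effect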
CHI-SQUARE GOODNESS OF FIT TEST
Statistical hypotheses: H0: proportions are as stated; H1: proportions are not as stated
Assumptions: observations are independent; variable is categorical (nominal or ordinal); expected frequency in each category is ≥ 5
Variables required: 1 categorical
Stata command: with data: csgof cat_var, expperc(perc1, perc2, ...); without data: display chi2tail(df, test_statistic)
Degrees of freedom: df = #categories - 1
Effect size: Cohen's W: W = √(χ² / N) (χ² = chi-square value, N = sample size)
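Note that csgof is a user-written command (locate and install it with findit csgof). A minimal sketch, assuming a hypothetical 3-category variable colour expected to split 25/25/50 percent:

    csgof colour, expperc(25, 25, 50)   // GOF test against the stated percentages
    display chi2tail(2, 7.8)            // p-value for a hypothetical χ² of 7.8 on df = 2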
Measurement levels
Level     Definition               Example
Nominal   No order, categorical    Gender (M/F/other)
Ordinal   Ordered, categorical     Level of education
Interval  No absolute 0, numeric   Temperature
Ratio     Absolute 0, numeric      Depression scores, age, cm

INDEPENDENT SAMPLES T TEST (TWO-SAMPLE T-TEST)
Statistical hypotheses: H0: μ1 = μ2; H1: μ1 ≠ μ2
Assumptions: 2 independent groups; equality of variance (Levene's test: robvar var, by(cat_var); failing to reject the null, p-value > .05, is what we want); DV is normally distributed; observations are independent (within and between groups)
Variables required: categorical IV (2 levels) and numeric DV
Stata command: ttest DV_var, by(IV_cat_var)
Degrees of freedom: df = n1 + n2 - 2 (the two sample sizes)
Effect size: Cohen's d: d = (mean1 - mean2) / pooled SD
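A minimal sketch using Stata's built-in auto dataset; foreign (2 levels) stands in for the IV and price for the DV:

    sysuse auto, clear
    robvar price, by(foreign)    // Levene's test: want p > .05
    ttest price, by(foreign)     // independent samples t test
    // if variances are unequal, add the unequal option:
    // ttest price, by(foreign) unequal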
Design
Both designs use two variables: a categorical (two-group) IV and a numeric DV.

Between-subjects design
  Definition: subjects are assigned to different conditions, each experiencing only one condition. Does Group A cause a different score on Y than Group B?
  Pros: no risk of order effects; shorter experimental time
  Cons: less statistical power; larger sample size required

Within-subjects design
  Definition: the same group of people experiences all conditions. Does condition A cause a different score on Y than condition B?
  Pros: more statistical power; fewer participants; less variation between participants
  Cons: longer experimental time; potential for order effects

PEARSON'S CORRELATION
Statistical hypotheses: H0: there is no association; H1: there is an association
Assumptions: numeric variables; linear relationship (check with a scatterplot); observations are independent
Variables required: 2 numeric
Stata command: pwcorr var1 var2, obs sig
Degrees of freedom: df = n - 2 (n = sample size)
Effect size: r; the correlation coefficient is both a test statistic and a measure of effect size
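A minimal sketch using the built-in auto dataset, with mpg and weight as illustrative variables:

    sysuse auto, clear
    scatter mpg weight           // check the relationship looks linear
    pwcorr mpg weight, obs sig   // r, number of observations, and p-value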