ANOVA
AN(C)OVA: Step-by-Step
Step 1: Defining the objectives: Why do we do ANOVA?
*We are looking for a causal relationship (dependence):
test whether treatments (categorical variables) lead to different levels of a
(set of) metric outcome variable(s).
*Typical use: measuring marketing-mix effectiveness
***IVs (X) = non-metric, categorical (ordinal or nominal); DV = metric
***Why do we do ANOVA instead of t tests?
With t tests we would need a separate test for every pair of populations
(levels of the different factors). The probability of erroneously finding an
effect (Type I error) increases with the number of tests.
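The growth of the error probability can be sketched with a few lines of Python. This is only an illustration: it assumes the pairwise t tests are independent and all use the same cutoff α = 0.05, and the function name is made up for this note.

```python
from math import comb


def familywise_error_rate(n_groups, alpha=0.05):
    """Return (# of pairwise t tests, P(at least one false positive)).

    Assumes independent tests that each use the same cutoff alpha.
    """
    n_tests = comb(n_groups, 2)          # one t test per pair of populations
    fwer = 1 - (1 - alpha) ** n_tests    # chance that at least one test errs
    return n_tests, fwer


if __name__ == "__main__":
    for k in (2, 3, 6):
        n_tests, fwer = familywise_error_rate(k)
        print(f"{k} populations -> {n_tests} t tests, error probability {fwer:.3f}")
```

With 2 populations there is a single test, so the error probability stays at α; it climbs quickly as populations are added, which is why one overall ANOVA F test is preferred.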
Q1: Do store sales increase with use of a coupon? → t test (independent samples)
# of factors = 1 (coupon usage)
# of populations = 2 (2 coupon levels)
Q2: Do store sales increase with promo intensity (# price cuts = low, medium,
high)? → 1-way ANOVA
# of factors = 1 (promo intensity, i.e. price cuts)
# of populations ≥ 2 (here 3: low, medium, high)
Q3a: Do store sales increase with couponing and promo intensity? → 2-way ANOVA
# of factors = 2 (promo intensity AND couponing)
Q3b: What if we control for region wealth? → 2-way ANCOVA
Step 2: Designing the AN(C)OVA
• Sample size: 20-30 respondents per population (check with the table
produced by the count function)
• Treatments (which variables?) and interactions: do you want the treatment
variables to interact? Also choose your DV.
• Use covariates? They help us measure the impact of the treatment variables
by taking out variation in the outcome variable that has nothing to do with
the treatments. Including a covariate can reduce the error in the model,
which increases the power of the factor tests, and may significantly affect
the final analysis results.
Step 3: Checking assumptions: all of them must hold!
1) Observations must be independent: each observation belongs to one treatment combination
2) Variances (standard deviations) of the outcome must be equal across treatment
groups (populations): homoscedasticity
Levene’s Test of Equality of Error Variances
H0 = equal variances. We do not want to reject it, so we want p > 0.05.
*Reminder: Pr(>F) < α (0.05) = reject the null; here we want p > cutoff
What happens if we have heteroscedasticity?
If sample size is similar across treatment groups → the test is robust to
deviations from the assumption
Transform the dependent variable (logarithm) → redo the test
Adjust the cutoff for significance
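To make the Levene idea concrete, here is a minimal sketch of the test statistic in plain Python. It is an illustration only: the function name and the example numbers are hypothetical, and it returns just the statistic W. The p-value comes from an F distribution with (k - 1, N - k) degrees of freedom, which statistical software reports directly (for example scipy.stats.levene in Python).

```python
from statistics import mean


def levene_statistic(groups, center=mean):
    """Levene's W: a one-way ANOVA F ratio computed on absolute deviations.

    groups: list of lists with the outcome values per treatment group.
    center: function giving each group's center (mean here; the median
            gives the Brown-Forsythe variant).
    """
    # Absolute deviation of every observation from its own group's center
    z = [[abs(y - center(g)) for y in g] for g in groups]
    k = len(z)
    n_total = sum(len(zi) for zi in z)
    grand = mean([v for zi in z for v in zi])
    between = sum(len(zi) * (mean(zi) - grand) ** 2 for zi in z)
    within = sum((v - mean(zi)) ** 2 for zi in z for v in zi)
    # Note: within == 0 (all deviations identical inside each group) is undefined
    return (n_total - k) / (k - 1) * between / within


if __name__ == "__main__":
    # Hypothetical store sales for two groups with very different spread
    print(levene_statistic([[1, 2, 3], [0, 10, 20]]))
```

A large W (small p-value) means the spreads differ, i.e. H0 of equal variances is rejected; a W near 0 is what we hope to see.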
3) DV must be normally distributed: in the normality test we want p > cutoff
What if there is no normality?
Large sample → robust
Small sample → transform the dependent variable
Step 4: Estimating the model
ANOVA calculation:
Measure variation between groups
Measure variation within groups
Assess whether variation between groups is larger than variation within groups:
F ratio = MSS_between / MSS_within
**If between variation is sufficiently larger than within variation, the
conclusion is that the # of price cuts (the treatment variable we are looking
at) made the difference.
*We can check significance by comparing the p-value of the F test with α, but
the F test alone does not tell us the direction of the effect!
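The F ratio can be worked through by hand for a one-way design. The sketch below uses made-up store sales for three promo-intensity levels; the function name and the data are hypothetical and only meant to show where MSS_between and MSS_within come from.

```python
from statistics import mean


def one_way_f_ratio(groups):
    """Return the one-way ANOVA F ratio = MSS_between / MSS_within."""
    k = len(groups)                           # number of populations
    n_total = sum(len(g) for g in groups)     # total number of observations
    grand_mean = mean([y for g in groups for y in g])
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


if __name__ == "__main__":
    # Hypothetical sales per promo intensity: low, medium, high
    sales = [[20, 22, 24], [25, 27, 29], [30, 32, 34]]
    print(one_way_f_ratio(sales))
```

In this toy data the group means (22, 27, 32) are far apart relative to the spread inside each group, so the ratio is large. The value still says nothing about which level sells more; that is read from the group means and the comparisons in Step 5.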
Step 5: Interpreting the results (the lm function is used to fit the ANOVA)
1) Check the significance and type of interactions
Check the row of the interaction: if p < 0.05, the interaction is significant.
A significant interaction means the effect of one treatment variable depends
on the other.
INTERACTION: Does the impact of a CHANGE in one treatment variable on the
dependent variable depend on the LEVEL of the other treatment variable?
e.g. Does the impact of changing the promotion intensity (# price cuts) on
store sales depend on whether a coupon is offered?
Ordinal interaction: same direction, magnitude depends on treatment
Disordinal interaction: both direction and magnitude depend on treatment
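The ordinal/disordinal distinction can be illustrated on cell means. The sketch below is a descriptive simplification, not a significance test: it only looks at the effect of one factor (coupon vs. no coupon) at each level of the other (promo intensity), and all names and numbers are hypothetical.

```python
def classify_interaction(means_without, means_with, tol=1e-9):
    """Classify the interaction pattern from cell means of the DV.

    means_without / means_with: mean outcome per level of the second factor
    (e.g. low, medium, high promo intensity) without and with the first
    treatment (e.g. coupon).
    """
    # Effect of the first treatment at each level of the second factor
    diffs = [w - wo for wo, w in zip(means_without, means_with)]
    if max(diffs) - min(diffs) <= tol:
        return "no interaction"      # same effect everywhere: parallel lines
    if all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs):
        return "ordinal"             # same direction, different magnitude
    return "disordinal"              # direction flips across levels


if __name__ == "__main__":
    no_coupon = [20, 25, 30]         # hypothetical mean sales: low, medium, high
    print(classify_interaction(no_coupon, [22, 27, 32]))
    print(classify_interaction(no_coupon, [22, 30, 40]))
    print(classify_interaction(no_coupon, [25, 25, 22]))
```

Whether a pattern seen in the means is a real interaction is still decided by the p-value in the interaction row.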
2) Effect of covariates (they must be continuous)
If there is no interaction, a covariate may be added and the significance checked again
3) Significance of main effects
Check the significance of each treatment variable (main effects)
4) Effect size
Partial eta squared: indication of the % of variance in the outcome variable
explained by a specific variable ("% variance explained"), i.e. how much of the
variation of the outcome is explained by that specific treatment variable once
the variation explained by the other effects is set aside.
.01 = small, .06 = moderate, .14 = large
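As a small worked example, partial eta squared can be computed from the sums of squares in the ANOVA table as SS_effect / (SS_effect + SS_error). The function names and the input numbers below are hypothetical; the thresholds are the ones listed above.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Share of variance explained by one effect, other effects set aside."""
    return ss_effect / (ss_effect + ss_error)


def effect_size_label(eta_sq):
    """Map partial eta squared to the rule-of-thumb labels from the notes."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "moderate"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"


if __name__ == "__main__":
    # Hypothetical sums of squares read from an ANOVA table
    eta_sq = partial_eta_squared(ss_effect=150, ss_error=24)
    print(round(eta_sq, 3), effect_size_label(eta_sq))
```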
5) Direction and significance of specific population differences
Which jump between populations (levels of the variables) is important?
*Contrast effects or planned comparisons (a priori, based on theoretical expectations):
check estimate, emmean and p-value to see direction and significance
*Post hoc tests or multiple comparisons: read the output the same way as for contrasts
Step 6: Validating the outcomes
Control through covariates
Replicate study