Berbelotte van de Kamp
Lecture 1:
To see whether the differences between more than two group means are
statistically significant you will need a one-way ANOVA. You can perform several
t-tests instead, but you will probably get inflation of surprise: the more
comparisons you perform on the same data, the higher the probability of finding
a surprising result by chance. In other words, the chance of a Type I error
(rejecting the null hypothesis when it is true) goes up.
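The inflation of surprise can be made concrete with a small calculation. The
sketch below assumes that the comparisons are independent (which pairwise
t-tests on the same groups are not exactly, so treat the numbers as an
approximation) and uses the standard formula 1 - (1 - alpha)^k for the chance of
at least one Type I error in k tests. The variable names are only illustrative.

```r
alpha <- 0.05

# Number of pairwise comparisons for 3, 4, 5 and 6 groups
n.groups <- 3:6
n.comparisons <- choose(n.groups, 2)

# Chance of at least one Type I error across all comparisons,
# assuming the comparisons are independent
familywise <- 1 - (1 - alpha)^n.comparisons

data.frame(groups = n.groups,
           comparisons = n.comparisons,
           familywise.error = round(familywise, 3))
```

With three groups (three comparisons) the chance is already about 0.14 instead
of 0.05.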
The alternative hypothesis (Ha) of one-way ANOVA is that at least two group
means are different from one another. Therefore, it is non-directional.
One-way ANOVA example: NSL.aov <- aov(MannerPath ~ Cohort, data = NSL)
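Since the NSL data set is not included in these notes, here is a minimal
runnable sketch of the same call pattern on PlantGrowth, a data set that ships
with base R (plant weight for three groups). The data set and the object name
plant.aov are stand-ins, not part of the lecture.

```r
# PlantGrowth: 30 plants, response 'weight', factor 'group' (ctrl, trt1, trt2)
str(PlantGrowth)

# Same pattern as aov(MannerPath ~ Cohort, data = NSL):
# response ~ grouping factor
plant.aov <- aov(weight ~ group, data = PlantGrowth)

# The summary table shows Df, Sum Sq, Mean Sq, F value and Pr(>F)
summary(plant.aov)
```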
Assumptions of one-way ANOVA:
The observations in the samples are independent from one another.
The response variable is at least interval-scaled.
Each sample is drawn from a normally distributed population and/or the
sample sizes are equal.
The variance is homogeneous or homoscedastic. (the variances of the
populations represented by the groups should be equal.)
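Normality is acceptable when the Shapiro-Wilk test (shapiro.test()) for each
group comes back with a p-value over the alpha of 0.05. But the outcome of the
Shapiro-Wilk test depends on the sample size: with small samples it has little
power to detect non-normality, and with large samples even trivial deviations
become significant.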
With the Levene test (leveneTest() from the car package) you can test
homogeneity of variance (homoscedasticity). The null hypothesis of the Levene
test is that the groups have equal variances, so you need a p-value higher than
0.05 for the assumption to hold. You can also test this with the Fligner-Killeen
median test (fligner.test()). In this case you will also need the p-value to be
greater than 0.05.
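The checks described above could look roughly like this. The sketch again uses
the built-in PlantGrowth data as a stand-in for the lecture data. leveneTest()
lives in the car package, so that call only runs if car is installed.

```r
# Shapiro-Wilk test per group: one p-value for each level of the factor
shapiro.p <- tapply(PlantGrowth$weight, PlantGrowth$group,
                    function(x) shapiro.test(x)$p.value)
shapiro.p

# Fligner-Killeen test of homogeneity of variances (base R)
fk <- fligner.test(weight ~ group, data = PlantGrowth)
fk

# Levene test (car package); skipped if car is not installed
if (requireNamespace("car", quietly = TRUE)) {
  print(car::leveneTest(weight ~ group, data = PlantGrowth))
}
```

In each case a p-value above 0.05 means there is no evidence against the
assumption.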
Between-group variability measures the variance of the group means. In the
aov() output, the Cohort row of the Mean Sq column represents the mean
between-group variability.
Mean between-group variability = Sum Sq / degrees of freedom (Df).
Within-group variability is variability that can be attributed to chance factors.
It is also called error or residual variability. The greater the average between-
group variability in comparison with the average within-group variability, the
greater the F-ratio.
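To see where the numbers in the aov() table come from, the F-ratio can be
computed by hand. This is a sketch on the built-in PlantGrowth data (a stand-in
for the lecture data). All variable names are made up for the illustration.

```r
y <- PlantGrowth$weight
g <- PlantGrowth$group

grand.mean  <- mean(y)
group.means <- tapply(y, g, mean)
group.n     <- tapply(y, g, length)

# Between-group sum of squares: group means around the grand mean
ss.between <- sum(group.n * (group.means - grand.mean)^2)

# Within-group (residual) sum of squares: observations around their group mean
ss.within <- sum((y - ave(y, g))^2)

df.between <- nlevels(g) - 1
df.within  <- length(y) - nlevels(g)

# Mean variability = Sum Sq / Df
ms.between <- ss.between / df.between
ms.within  <- ss.within / df.within

f.ratio <- ms.between / ms.within
f.ratio

# Compare with the value reported by aov()
aov.table <- summary(aov(weight ~ group, data = PlantGrowth))[[1]]
all.equal(f.ratio, aov.table[["F value"]][1])
```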
There are tests you can do when one or more assumptions of one-way ANOVA are
not met:
oneway.test() --> when variance is not homogeneous
Kruskal-Wallis one-way ANOVA (kruskal.test()) --> when the response
variable is ordinal-scaled or when the samples come from markedly
non-normal distributions. However, at least one of the two distributional
assumptions (normality, homogeneity of variance) should still be met. It
can also be used on interval- or ratio-scaled data when one needs to
reduce the impact of outliers.
Non-parametric ANOVA --> when all the assumptions concerning the
distribution are violated except for the assumption of independence
Repeated-measures and mixed ANOVA should be used when
observations are dependent.
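The first two alternatives in the list are available in base R. A minimal
sketch, again on the built-in PlantGrowth data as a stand-in (the object names
are illustrative):

```r
# Welch-type one-way test: does not assume equal variances
# (var.equal = FALSE is the default of oneway.test())
welch <- oneway.test(weight ~ group, data = PlantGrowth)
welch

# Kruskal-Wallis rank sum test: works on ranks, so it suits ordinal data
# and is less sensitive to outliers
kw <- kruskal.test(weight ~ group, data = PlantGrowth)
kw
```

Both functions take the same formula interface as aov(), so switching between
them only requires changing the function name.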
Post-hoc tests:
The F-ratio test can tell us that there is some significant difference
somewhere, but it doesn’t say where. To find out which groups differ
significantly you can perform a post hoc test.
The post hoc tests:
The Tukey Honest Significant Differences test (TukeyHSD(aov)): the
function returns p-values adjusted for multiple comparisons. There are two
assumptions that should be met: homogeneous variances and independence of
observations. The function gives the differences between the group means
for each pair of groups (three pairs in the example with three groups).
Non-parametric test (nparcomp()), if the assumption of equal variances
is violated. The output is which groups are compared and the p-value.
pairwise.t.test(): it offers a range of corrections for inflation of
surprise, such as the Bonferroni correction.
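A sketch of two of these post hoc tests in base R, using the built-in
PlantGrowth data as a stand-in for the lecture data. nparcomp() is left out
here because it needs the separate nparcomp package.

```r
plant.aov <- aov(weight ~ group, data = PlantGrowth)

# Tukey HSD: difference in means, confidence interval and adjusted p-value
# for every pair of groups
tukey <- TukeyHSD(plant.aov)
tukey

# Pairwise t-tests with Bonferroni correction
# (other options are listed in p.adjust.methods)
bonf <- pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                        p.adjust.method = "bonferroni")
bonf
```

Note that the function name is written in lower case: R is case-sensitive, so
Pairwise.t.test() would not be found.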
Lecture 2:
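You can use an interaction plot (interaction.plot()) to see whether there is an
interaction between the two factors: clearly non-parallel lines suggest an
interaction.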
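As an illustration, an interaction plot can be drawn with base R. The sketch
uses the built-in ToothGrowth data (tooth length by supplement type and dose)
as a stand-in, since the lecture data is not included here.

```r
# Cell means that the plot will display: rows = dose, columns = supplement
cell.means <- with(ToothGrowth, tapply(len, list(dose, supp), mean))
cell.means

# One line per supplement type; the x-axis shows the dose levels
with(ToothGrowth,
     interaction.plot(x.factor = factor(dose),
                      trace.factor = supp,
                      response = len,
                      xlab = "Dose", ylab = "Mean tooth length",
                      trace.label = "Supplement"))
```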
The assumptions of the two-way ANOVA are the same as the one-way ANOVA.
To evaluate the contribution of each variable and of the interaction term(s),
you have to calculate sums of squares (preferably Type III), which are required
for calculating the F-ratio.
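One possible way to get Type III sums of squares is Anova() from the car
package. The sketch below assumes car is installed (the call is skipped
otherwise) and uses the built-in ToothGrowth data as a stand-in. Sum-to-zero
contrasts are set because Type III tests are only meaningful with such
contrasts. ToothGrowth is a balanced design, so here the different types of sums
of squares give the same result. The difference matters for unbalanced data.

```r
# Treat dose as a factor with three levels
tg <- transform(ToothGrowth, dose = factor(dose))

# Two-way model with interaction, using sum-to-zero contrasts
tooth.lm <- lm(len ~ supp * dose, data = tg,
               contrasts = list(supp = "contr.sum", dose = "contr.sum"))

# Type III sums of squares (car package); skipped if car is not installed
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(tooth.lm, type = 3))
}
```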
If some of the assumptions are not met for the two-way ANOVA you can use
these tests:
White’s adjustment for factorial ANOVA when the assumption of
homogeneity of variance is violated: Anova() from the car package with
white.adjust = TRUE.
Linear regression model with a bootstrap validation of the confidence
intervals around the estimated coefficients. This is for when the
assumptions of normality and homogeneity of variance are violated.
Repeated measures or Mixed ANOVA when the observations are not
independent.
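The first two options could be sketched as follows, again on the built-in
ToothGrowth data as a stand-in. The White-adjusted table needs the car package
(skipped if it is not installed). The bootstrap part is a simple percentile
bootstrap written in base R by resampling rows. This is only one way to
bootstrap confidence intervals and not necessarily the procedure used in the
lecture.

```r
set.seed(1)  # make the resampling reproducible

tg <- transform(ToothGrowth, dose = factor(dose))
tooth.lm <- lm(len ~ supp * dose, data = tg)

# White's heteroscedasticity-corrected ANOVA table (car package)
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(tooth.lm, white.adjust = TRUE))
}

# Percentile bootstrap: refit the model on rows sampled with replacement
n.boot <- 1000
boot.coefs <- replicate(n.boot, {
  idx <- sample(nrow(tg), replace = TRUE)
  coef(lm(len ~ supp * dose, data = tg[idx, ]))
})

# 95% interval per coefficient; na.rm guards against the rare resample
# in which a design cell is empty and a coefficient cannot be estimated
boot.ci <- t(apply(boot.coefs, 1, quantile,
                   probs = c(0.025, 0.975), na.rm = TRUE))
boot.ci

# Compare with the usual model-based intervals
confint(tooth.lm)
```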
Post-hoc tests: As in one-way ANOVA, one can use post hoc tests for factors
with more than two levels to find out which groups are different and which are
not.