Abridged Summary MRM
Overview of research methods
1. Experiments = establish causal relationship (X → Y)
2. ANOVA = ensure causality; compare means between groups (1 categorical IV, 1
continuous DV)
→ 5 assumptions:
- homogeneity of variance (Levene’s test, largest/smallest variance <3)
- Normal distribution of residuals
- independence of observations
- Measurement level = IV categorical; DV (and covariate, if any) continuous
- No extreme outliers
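A minimal sketch of the variance-ratio rule of thumb and the one-way ANOVA F statistic, using only the Python standard library. The function names and the example scores are hypothetical, chosen for illustration only.

```python
from statistics import mean, variance


def variance_ratio(groups):
    """Largest sample variance divided by the smallest (rule of thumb: < 3)."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


def one_way_f(groups):
    """F statistic of a one-way ANOVA: between-group MS / within-group MS."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical scores for three experimental conditions (one categorical IV)
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
print(variance_ratio(groups))  # ratio of largest to smallest group variance
print(one_way_f(groups))       # compare against the F distribution (df = k-1, N-k)
```

The sketch only computes the statistic; the p-value still has to come from an F table or statistical software.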
3. N-way ANOVA = test main effects + interaction between 2 or more categorical IVs
4. ANCOVA = ensure causality; compare adjusted means controlling for covariate
(categorical IVs + continuous covariate)
→ assumptions: the 5 ANOVA assumptions, plus:
- homogeneity of regression slopes = relationship DV and covariate the same
across all groups
- Independence of covariate and factors = covariate should not differ
systematically between groups
5. Regression = relationships; predict continuous DV using continuous/dummy IV
→ R², adjusted R² = model fit; % of variance in DV explained by the IVs (adjusted R² corrects for the number of predictors)
→ assumptions:
- Linearity and additivity = linear relationship between IV and DV
- Statistical independence of errors = no autocorrelation
- Homoscedasticity = variance of residuals is constant across values
- Normal distributions of residuals
- No multicollinearity → VIF >4 = moderate, >10 = strong multicollinearity
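A small sketch of the VIF check for the simplest case of two predictors. It assumes the standard definition VIF = 1 / (1 - R²), where R² comes from regressing one predictor on the others; with only two predictors that R² equals their squared Pearson correlation. The function names, the "low" label, and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


def multicollinearity_label(vif):
    """Apply the thresholds from the notes: >4 moderate, >10 strong."""
    if vif > 10:
        return "strong"
    if vif > 4:
        return "moderate"
    return "low"  # label for values under both thresholds (an assumption)


# Hypothetical predictor values
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif = vif_two_predictors(x1, x2)
print(vif, multicollinearity_label(vif))
```

With more than two predictors each VIF needs a full auxiliary regression, which statistical software reports directly.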
6. Binary logistic regression = predict binary outcome (0/1)
7. Moderation = tests when/for whom X affects Y (interaction)
→ mean-centering = subtract mean to make coefficient interpretable
→ log-transformation = handles non-linear relationships and skewed variables by making the variable look more
normally distributed (effects within about 10% can be interpreted approximately as percentage changes)
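The two transformations can be sketched as follows. This assumes the "within 10%" remark refers to coefficients on a log-transformed DV, where a coefficient b corresponds to an exact change of (e^b - 1) × 100%, which stays close to b × 100% while b is small. Function names and data are hypothetical.

```python
from math import exp
from statistics import mean


def mean_center(values):
    """Subtract the mean so that 0 corresponds to the average case."""
    m = mean(values)
    return [v - m for v in values]


def exact_pct_change(b):
    """Exact % change in Y for a one-unit change in X when Y is log-transformed."""
    return (exp(b) - 1) * 100


# Hypothetical predictor X and moderator Z
x = [2.0, 4.0, 6.0, 8.0]
z = [1.0, 3.0, 5.0, 7.0]
x_c, z_c = mean_center(x), mean_center(z)
interaction = [a * b for a, b in zip(x_c, z_c)]  # X*Z term built from centered variables

print(exact_pct_change(0.05))  # close to 5%
print(exact_pct_change(0.50))  # clearly above 50%, so the shortcut breaks down
```

After centering, the coefficient of X describes the effect of X at the average level of Z rather than at Z = 0.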
8. Mediation = tests how/why X affects Y through M
→ Baron & Kenny (multiple regressions) and bootstrapping (CI does not include 0 = significant)
9. Moderated mediation = mediation effect depends on Z
10. PCA = simplify variables; reduce correlated variables to a few uncorrelated
components
→ eigenvalue >1; together explain >70% / each individually explains at least 5% of total variance;
scree plot (elbow rule = stop before the elbow)
→ marker items = high loading (>0.5) means strong connection
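A sketch of how the retention criteria can be applied once the eigenvalues are known. The eigenvalues below are hypothetical and would normally come from statistical software; the function name is also an assumption.

```python
def retention_summary(eigenvalues):
    """Apply the eigenvalue, cumulative-variance and individual-share rules."""
    total = sum(eigenvalues)
    shares = [ev / total for ev in eigenvalues]

    # Rule 1: keep components with eigenvalue > 1
    by_eigenvalue = [i for i, ev in enumerate(eigenvalues, start=1) if ev > 1]

    # Rule 2: number of components needed to explain more than 70% together
    cumulative, n_for_70 = 0.0, None
    for i, share in enumerate(shares, start=1):
        cumulative += share
        if cumulative > 0.70:
            n_for_70 = i
            break

    # Rule 3: components that each explain at least 5% of total variance
    by_share = [i for i, share in enumerate(shares, start=1) if share >= 0.05]
    return by_eigenvalue, n_for_70, by_share


# Hypothetical eigenvalues of four components, sorted from large to small
eigenvalues = [2.5, 1.5, 0.6, 0.4]
print(retention_summary(eigenvalues))
```

The rules can disagree, as in this example, so the scree plot and interpretability usually settle the final number of components.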
11. EFA = simplify and explain variables; identify latent constructs behind observed
variables
→ marker items = high loading (>0.5) means strong connection
12. Reliability (Cronbach's alpha) = check measurement quality (internal consistency)
→ >0.7 acceptable reliability (implemented after PCA/EFA)
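A minimal sketch of Cronbach's alpha, assuming the standard formula alpha = k/(k-1) × (1 - sum of item variances / variance of the total score). The item scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of five respondents to a three-item scale
items = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 3.0, 4.0, 5.0, 6.0],
    [1.0, 3.0, 3.0, 5.0, 5.0],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3), "acceptable" if alpha > 0.7 else "below 0.7")
```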
13. Cluster analysis = identify segments; group similar observations into segments
→ Hierarchical = builds nested clusters (explore number of clusters)
- Single linkage, complete linkage, average linkage, centroid method
(mean-based), Ward's method (minimizes within-cluster variance)
→ K-Means = assign cases to fixed k clusters (optimize segmentation)
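The K-Means step can be sketched as a plain assign-and-update loop. This is a bare-bones illustration with fixed starting centroids and squared Euclidean distance, not the exact procedure of any particular software package; the data and the function name are hypothetical.

```python
def k_means(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then recompute the centroids."""
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # New centroid = mean per dimension; an empty cluster keeps its old centroid
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # assignments are stable, so stop
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customers described by two standardized variables, k = 2
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, [(1.0, 1.0), (9.0, 8.0)])
print(centroids)
print(clusters)
```

Because k must be fixed in advance, a hierarchical run is often used first to explore a plausible number of clusters.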
Key terms
Eigenvalue = amount of variance explained by each factor; should be >1
Communality = how much each variable’s variance is explained by all retained factors (sum
of squared loadings) → not <0.3, preferably >0.5
Factor loadings = correlations between variables and factors, showing which variables
define each factor (>0.5 = strong)
Standardized variables = rescaling variables (mean 0, SD = 1) so that they all have the same
unit of measurement, ensuring they contribute equally to the analysis
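A short sketch of standardization (z-scores). It assumes the sample standard deviation (n - 1 in the denominator); the variable and function names are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Rescale to mean 0 and SD 1: z = (value - mean) / SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical variables on very different scales
income = [20000.0, 35000.0, 50000.0, 65000.0]
rating = [1.0, 3.0, 4.0, 5.0]
z_income, z_rating = standardize(income), standardize(rating)
print(z_income)
print(z_rating)
```

Without this step, a distance-based method such as cluster analysis would be dominated by the variable with the largest numeric range.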
Model hit rate = overall classification accuracy of the model
→ correct predictions / total observations × 100%
Model naive rate = accuracy if you always predict the majority category (baseline)
→ largest group / total cases × 100%
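Both rates can be computed directly from the actual and predicted categories. A minimal sketch with hypothetical outcomes and function names:

```python
from collections import Counter


def hit_rate(actual, predicted):
    """Correct predictions / total observations x 100%."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100 * correct / len(actual)


def naive_rate(actual):
    """Largest group / total cases x 100% (always predict the majority)."""
    largest_group = Counter(actual).most_common(1)[0][1]
    return 100 * largest_group / len(actual)


# Hypothetical binary outcomes (1 = bought, 0 = did not buy)
actual = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
print(hit_rate(actual, predicted))  # model accuracy in %
print(naive_rate(actual))           # baseline accuracy in %
```

A model is only useful as a classifier if its hit rate clearly exceeds the naive rate.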