1. Variables & Operationalisation
Variable: Anything that can take on multiple values (e.g., height, anxiety)
Operational Definition: How a variable is measured (e.g., anxiety via BAI or GSR).
2. Measurement Scales
Nominal: Categories (Gender, Eye Colour)
Ordinal: Magnitude (Ranking)
Interval: Magnitude, Equal Intervals (Temperature, Scale Scores)
Ratio: Magnitude, Equal Intervals, Absolute Zero (# of items, Height)
3. Central Tendency
Mode: Nominal/Any Data (only option for nominal; ignores spread)
Median: Ordinal, Skewed Interval/Ratio (not sensitive to outliers)
Mean: Interval/Ratio, Symmetric Data (most powerful but affected by outliers)
17. The Research Process
Identify problem/topic of interest
Gathering background info
Generating a hypothesis
Testing the hypothesis
Drawing conclusions
18. External Validity
The degree to which results can be generalised to other groups of individuals, situations & times
Sampling impacts external validity
Non-Parametric Tests
- Compare two groups or conditions
Independent Samples t-Test
- Used to compare 2 independent sample (group) means
- Each ppt can only appear in ONE sample
Basic NHST Process:
(Inferential stats can work out exactly how much conditions are likely to differ due to chance alone)
Consider Research Question: What is H1, H0? Appropriate inferential test
Test assumptions: Adjust analysis plan if necessary
Conduct the inferential test & follow up tests if applic.: SPSS or by hand, stat. sig.?, if sig. what direction of effect?
Calculate effect size and confidence intervals: Practical sig. of finding?
Interpret and report results: Statistical sentence, write-up
4. Variability (Dispersion)
Range: Max-Min. Affected by outliers
Interquartile Range: Range of middle 50% of scores. Robust to outliers
Variance (σ²): Average squared deviation from the mean
Standard Deviation (σ): Square root of the variance. Most commonly used
5. Graph Types by Data Type
Data Type: (Nominal) (Ordinal) (Interval/Ratio)
Central Tendency: (Mode) (Median) (Mean/Median)
Dispersion: (Count Categories) (Range, IQR) (SD, Variance, IQR)
Graphs: (Bar, Pie, Freq. Tables) (Bar, Pie, Grpd Freq. Table) (Histo, Polygon, Stem & Leaf)
How To Read Table (independent samples t-test output)
Step 1: Check Levene’s Test for Equality of Variances
- Column: Sig.
- If Sig. > 0.05, use “Equal variances assumed” row
- If Sig. < 0.05, use “Equal variances not assumed” row
Step 2: Check the Sig. (2-tailed) for the t-test
- Column: Sig. (2-tailed) -> this is the p-value
- If p < 0.05, diff between grps is statistically significant
- If p ≥ 0.05, there is no significant difference
Step 3: Look at Mean Difference
- Column: Mean Difference = Grp 1 Mean – Grp 2 Mean
- Negative Number = Grp 2 had a higher score
- Positive Number = Grp 1 had a higher score
1. Mann-Whitney U Test (non-parametric)
- Compare two independent samples of ordinal (ranked) data
- Also used when severe violations of the normality assumption prevent the use of independent samples t-test
2. A Priori Power Analyses for Replication
- Convert r to d, use G*Power
6. Correlation
Visual: Use scatterplots to see direction & strength
Direction: Positive (↗), Negative (↘), None
Strength (r): Weak (.1-.3) Moderate (.3-.7) Strong (.7-1.0)
Perfect Correlation: r = ±1
No Correlation: r = 0
7. Types of Correlation Coefficients
Statistic: (Pearson’s r) (Spearman’s ρ) (Point-Biserial) (Phi (φ))
Variables Used: (2 scale (int/rat) vars) (1/2 ordinal + 1 scale) (1 dichotomous + 1 scale) (2 dichotomous vars)
Example: (TV hrs vs Happiness) (Grade, Mid vs Final) (Gender vs Jump Dist.) (Gender vs Handedness)
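A minimal Python sketch of Pearson's r and Spearman's rho, using only the standard library. The TV hours and happiness scores are made-up illustration data, and the helper names (pearson_r, ranks, spearman_rho, strength_label) are hypothetical. The strength bands follow the Weak/Moderate/Strong ranges listed under 6. Correlation; assigning a boundary value such as .3 to the higher band is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of scale scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


def strength_label(r):
    """Strength bands from the notes; boundary handling is an assumption."""
    size = abs(r)
    if size >= 0.7:
        return "strong"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "weak"
    return "none"


# Hypothetical data: TV hours per day vs happiness rating.
tv_hours = [1, 2, 3, 4, 5, 6]
happiness = [9, 8, 8, 6, 5, 3]

r = pearson_r(tv_hours, happiness)
rho = spearman_rho(tv_hours, happiness)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.2f}, rho = {rho:.2f} ({strength_label(r)}, {direction})")
```

The p-value is not computed here; in practice it would be read from the SPSS output.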
8. Correlation vs Causation
Correlation ≠ Causation
3 Criteria for Causality:
Covariation (variables change together)
Temporal Precedence (cause before effect)
No Confounds (control confounds)
9. Misuse of Mean
Skewed Data: Mean may misrepresent (e.g., income distributions)
Step 4: Confidence Interval
- Column: 95% Confidence Interval of the Difference
- If the CI does not include 0, the result is significant (confirms p-value)
- If it includes 0, the difference is not significant
How to Write It Up
“An independent samples t-test was conducted to compare [..]. There was a/no statistically significant difference between [..] (M = x.00, SD = x.00) and [..] (M = x.00, SD = x.00), t(x) = x.xx, p = .xxx, [..] tailed.”
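A rough sketch of the numbers behind the independent samples t-test output: mean difference, t and df for both the “Equal variances assumed” row (pooled variance) and the “Equal variances not assumed” row (Welch). The scores and the function name independent_t are hypothetical. The Python standard library has no t distribution, so the p-value and CI are left to SPSS and p stays a placeholder in the write-up.

```python
import math
from statistics import mean, stdev, variance


def independent_t(group1, group2, equal_var=True):
    """Return (mean difference, t, df) for two independent groups."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = variance(group1), variance(group2)  # sample variances (n - 1)
    diff = mean(group1) - mean(group2)           # Grp 1 mean - Grp 2 mean
    if equal_var:
        # "Equal variances assumed" row: pooled variance, df = n1 + n2 - 2.
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # "Equal variances not assumed" row: Welch's t with adjusted df.
        se = math.sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return diff, diff / se, df


# Hypothetical scores for two independent groups.
group1 = [5, 6, 7, 8, 9]
group2 = [3, 4, 5, 6, 7]

diff, t, df = independent_t(group1, group2, equal_var=True)
welch_diff, welch_t, welch_df = independent_t(group1, group2, equal_var=False)

# A positive mean difference means Grp 1 had the higher score.
print(
    f"Grp 1 (M = {mean(group1):.2f}, SD = {stdev(group1):.2f}) vs "
    f"Grp 2 (M = {mean(group2):.2f}, SD = {stdev(group2):.2f}), "
    f"t({df}) = {t:.2f}, p = .xxx"
)
```

With equal group sizes and equal variances, as in this made-up data, the pooled and Welch rows give the same t and df; they diverge when variances or group sizes differ.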
Use the median, not the mean, for skewed data
10. Reliability & Validity
Reliability: Consistency of a measure
Validity: Accurately measures what it’s supposed to
Main Assumptions for the Independent Samples t-Test (Can Check All 4)
1. Scale of Measurement: DV (continuous), IV (categorical)
2. Independence of Observations: Each ppt’s score must be independent of others (no pairing)
3. Normality: DV should be approximately normally distributed within each group
- Shapiro-Wilk (preferred for small samples)
- Kolmogorov-Smirnov (less preferred)
If p > 0.05, distribution is not sig. diff. from normal (normality assumption met)
If p < 0.05, distribution is not normal (may need non-parametric test like Mann-Whitney)
Can also check: Histograms (w/ normal curve), Q-Q Plots (points close to line = good)
4. Homogeneity of Variance (Equal Variance): Variances of DV in 2 grps should be roughly equal (Refer to Levene’s Test in Independent Samples t-test Output)
If Levene’s Sig. > 0.05, equal variances assumed (use top row of t-test output)
If Sig. < 0.05, equal variances not assumed (use bottom row of t-test output)
Tests for Nominal Data
1. Chi-Square Test
- To compare two independent samples of nominal (categorical) data
- To assess whether two categorical variables are related
2. Reporting Chi-Square Test
“A chi-square test of contingencies indicated a stat. sig./non-sig., [S/M/L]-sized rsp between [what we’re comparing], χ2 (1, N = x) = xx.xx, p < .xxx, ϕ = .xxx.”
PSEM 2
1. Types of Research
Descriptive: Observe, record & describe behaviour
Relational/Predictive: Identify relationships between variables; predict outcomes
Causal/Experimental: Determine cause-effect relationships through manipulation
2. Qualitative vs Quantitative Research
Feature: (Assumptions) (Data) (Collection Methods) (Analysis)
Qualitative: (Reality is subjective, constructed) (Words, images, observations) (Interviews, Surveys, observation) (Thematic, interpretive)
Quantitative: (Reality is fixed, measurable) (Numbers) (Measurements, experiments) (Statistical, objective)
3. Triangulation
Using multiple methods (qual + quant) to understand complex phenomena.
(e.g., Literacy Centre eval.: show ave. (quant) + personal impact (qual))
4. Research Design Types
Category: (By Purpose) (By Method) (By Design)
Types & Features: (Descriptive, relational, causal) (Quant, qual, causal) (true exp, quasi-exp, non-exp)
5. True Experiments
- Manipulate IV, Measure DV
- Use random assignment
- Control extraneous/confounding variables
Types:
Between-Subjects: each group gets 1 IV level
Within-Subjects: all ppts get all IV levels
Practical Significance
Statistical sig. (p-value) tells us if a relationship exists
Practical significance (effect size) tells us how meaningful the rsp is in the real world
1. Effect Size: Correlation Coefficient (r)
Measures the strength and direction of a linear rsp between 2 variables
Ranges from -1 to +1:
r = +1 -> Perfect positive linear relationship
r = -1 -> Perfect negative linear relationship
r = 0 -> No linear relationship
2. Strength of the Relationship
(r value: strength)
± 0.00 – 0.09: None/Trivial
± 0.10 – 0.39: Weak
± 0.40 – 0.69: Moderate
± 0.70 – 0.89: Strong
± 0.90 – 1.00: Very Strong
3. McNemar Test of Change
Tests whether category membership on a binary variable changes between 2 conditions in time
4. Are non-parametric tests assumption-free?
- No, their assumptions are less restrictive than those of parametric tests
- Ordinal DV for Mann-Whitney & Wilcoxon. Nominal DV for Chi-Square & McNemar
- Shape & spread of distributions roughly equivalent for Mann-Whitney. Distribution of diff scores roughly symmetrical for Wilcoxon.
ANOVA (Analysis of Variance)
Used to test for stat. sig. differences between 3 or more independent sample means
- More flexible than t-test, but does not test direction
1. Purpose of ANOVA
To tease out the variability in the data due to the IV from the variability due to random/chance factors:
- Individual diffs, errors in measurements or errors in control (all error variance). IV is also a factor
6. Threats to Internal Validity
(Threat: Explanation)
Selection: Groups differ at baseline
Maturation: Natural changes over time
Attrition: Dropout bias
Test Reactivity: Practice or fatigue from repeated testing
Instrumentation: Tool changes over time
History: External events affect outcomes
Statistical Regression: Extreme scores tend to return to average
Multiple Treatment Interference: Other treatments occur during the study
7. Controlling Threats
Time-related: Include control group
Group non-equivalence: Random assignment
8. Design Examples
One-group pretest-posttest: Weak; no control for rival explanations
Posttest-only control group: True exp; uses random assignment
Pretest-posttest control group: Strong; controls time & grp equivalence threats
3. Direction of the Relationship
Positive: As one variable increases, the other increases
Negative: As one variable increases, the other decreases
4. Interpreting Correlation in SPSS
Pearson Correlation (r), Sig. (2-tailed) p-value & N (no. of cases)
Interpretation: r(x) = .xx, p = .xxx ([strength], [direction], stat. sig./non-sig. rsp)
5. Important
Correlation does not imply causation; outliers can dramatically affect r.
Use Spearman’s rho (rs) if:
Data is ordinal or not normally distributed
Relationship is non-linear
Outliers are present
6. Reporting
“r(x) = .xx, p = .xxx”
Report 2 dec. places for r and p. Include degrees of freedom (df = N – 2)
F = Between groups variance / Within groups variance
Between groups variance = Individual Diffs. + Error + Effect of IV (IV effect only applies under H1, where F > 1; under H0, F ≈ 1)
Within groups variance = Individual Diffs. + Error
2. Reject H0 from ANOVA output
In F column, a value well above 1 suggests an effect of the IV; reject H0 only if the p-value is also below 0.05
3. P-value in ANOVA output
If p < 0.05: Reject H0; If p > 0.05: Do not reject H0
4. Interpreting a Statistically Significant ANOVA
- The IV has had an effect on the DV: F(x, xx) = x.xx, p < or > x.xx
“An analysis of variance indicates that [IV/DV] is highly/not significant (p < or > x.xx)”
5. Issue of Ambiguity (Confounding Variable)
Significance tells us only that at least two of the means are different – not which ones
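As a sketch of the F ratio described in the ANOVA notes (between groups variance / within groups variance), the following computes F and its two degrees of freedom for a one-way design. The three groups are invented illustration data and one_way_anova is a hypothetical helper name. The p-value needs the F distribution, which the standard library does not provide, so it would still come from SPSS.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: spread of scores around their own group mean (error).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Hypothetical scores for 3 independent groups (3 levels of the IV).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_b, df_w = one_way_anova(groups)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")  # prints F(2, 6) = 13.00
```

An F this far above 1 suggests an effect of the IV, but as noted under 5. Issue of Ambiguity it does not say which group means differ.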
Solomon 4-grp design (design example): Controls for pretest effects
9. Ethics Principles
Autonomy: Respect ppts’ rights, dignity, informed consent
Beneficence: Maximise benefit, minimise harm
Justice: Fair distribution of burdens/benefits
Trust: Maintain ppt/community trust
Fidelity & Integrity: Conduct sound, honest, well-reported research
10. Ethics Examples
Milgram (1963): Obedience to authority -> deception, distress
Little Albert: Conditioned fear in a baby -> distress
Tudor Study: Labelling children “stutterers” -> lasting effects
11. Demand Characteristics & Experimenter Effects
Demand Characteristics: Ppts guess purpose and change behaviour
Experimenter Expectancy: Researchers’ beliefs affect ppt performance
6. Assumptions of ANOVA
- Scale data, independence, normality (by group), homogeneity of variances
If equality of variances cannot be assumed, report Welch’s test rather than the regular ANOVA (robust test)
Welch’s F (df1, df2) = [statistic a], p = [sig.]
7. Example Result Report for Omnibus ANOVA
“Following the significant omnibus ANOVA, two planned comparisons were conducted, each with αpc = .05. The first, which compared the mean of the zero bystanders groups was statistically significant, t(49) = 3.41, p = .001, two tailed, and large, d = 0.97. The second, which compared the means of the one of four bystanders groups was also significant, t(49) = 2.68, p = .010, two tailed, and large, d = 0.77”
7. Measures of effect size
d Family: Measures of group differences
r Family: Measures of association
8. Cohen’s d
The diff. between 2 means, expressed in SDs
d = 0.2 (Small)
d = 0.5 (Medium)
d = 0.8 (Large)
*The practical sig. of an effect depends on context, not just size
9. Eta Squared (η2)
The proportion of variance in the data that can be accounted for by the IV
η2 = .01 (Small)
η2 = .06 (Medium)
η2 = .14 (Large)
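The three effect sizes in these notes (d, r and η2) can be sketched in a few lines of Python. Assumptions to flag: Cohen's d uses the pooled SD, r is recovered from t with r = sqrt(t² / (t² + df)), η2 is SS between / SS total, and the two groups are made-up data. The helper names are hypothetical; the Small/Medium/Large cut-offs are the ones listed for d and η2.

```python
import math
from statistics import mean, variance


def cohens_d(group1, group2):
    """Difference between 2 means expressed in (pooled) SD units."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled_var)


def r_from_t(t, df):
    """Effect size r for a t-test, read like a correlation coefficient."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


def eta_squared(groups):
    """Proportion of total variance accounted for by the IV."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    return ss_between / ss_total


def size_label(value, small, medium, large):
    """Map an effect size onto the Small/Medium/Large cut-offs."""
    size = abs(value)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "below small"


# Hypothetical scores; t = 2.00 with df = 8 for these two groups.
group1 = [5, 6, 7, 8, 9]
group2 = [3, 4, 5, 6, 7]

d = cohens_d(group1, group2)
r_effect = r_from_t(2.0, 8)
eta2 = eta_squared([group1, group2])

print(f"d = {d:.2f} ({size_label(d, 0.2, 0.5, 0.8)})")
print(f"r = {r_effect:.2f}")
print(f"eta squared = {eta2:.2f} ({size_label(eta2, 0.01, 0.06, 0.14)})")
```

With only two groups, η2 equals r squared, which is one reason to interpret one effect size at a time rather than all three.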
* r is a measure of effect size for t, interpreted as a correlation coefficient
10. Interpretation for the 3 effect sizes (interpret 1 at a time, not all 3)
effect size [r, d, η2] = .xx, .xxx, x.xx
11. Making Decisions Using Confidence Intervals
(95% CI of Diff: NHST Decision: Practical Decision)
0.3 to 7: Reject Null: Results not definitive enough to make a decision
8 to 16: Reject Null: Surely reject null
0.2 to 0.6: Reject Null: Decide negligible for all practical purposes
-1.4 to 0.8: Don’t reject Null: Satisfied to say pop. diff. is probably trivial
-0.1 to 9: Don’t reject Null: Results not definitive enough to make a decision
12. Quasi-Experiments
No random assignment or manipulation
Often use natural groups (gender, clinical vs non-clinical)
Lower internal validity, but still useful
13. Sampling
Probability Sampling: Every individual has equal chance of being chosen
Simple Random: Random draw from list
Stratified: Random within subgroups
Cluster: Randomly select groups
Non-probability Sampling: No equal chance for all
Convenience: Based on availability
Quota: Non-random but meets group criteria
Snowball: Ppts refer others
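The three probability sampling methods under 13. Sampling can be illustrated with Python's random module. This is only a sketch: the population of 20 numbered students in two year levels and four classes is invented, and the function names are hypothetical. Strata and clusters are kept equal in size so that every individual has the same chance of selection.

```python
import random


def simple_random(population, n, rng):
    """Simple random: draw n individuals from the full list."""
    return rng.sample(population, n)


def stratified(population, stratum_of, n_per_stratum, rng):
    """Stratified: random draw of equal size within each subgroup."""
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


def cluster(clusters, n_clusters, rng):
    """Cluster: randomly select whole groups and keep everyone in them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]


rng = random.Random(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: (student id, year level).
population = [(i, "Y1" if i < 10 else "Y2") for i in range(20)]
classes = {
    "class_a": population[0:5],
    "class_b": population[5:10],
    "class_c": population[10:15],
    "class_d": population[15:20],
}

srs = simple_random(population, 5, rng)
strat = stratified(population, lambda person: person[1], 3, rng)
clus = cluster(classes, 2, rng)
print(len(srs), len(strat), len(clus))  # prints 5 6 10
```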
14. Hypothesis Writing
Directional: Predicts the direction of difference/relationship
Non-directional: Just predicts a difference/relationship
Causal: Predicts IV causes DV
Relational: Predicts variables are associated
15. Null Hypothesis Significance Testing (NHST)
Assume null hypothesis = no effect
Compute probability (p) of getting results at least this extreme by chance if the null hypothesis is true
If p < .05, reject null hypothesis -> result is statistically significant
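The NHST logic above can be made concrete with a permutation test. This is a swapped-in technique chosen for illustration, not the t-test used elsewhere in these notes. It assumes the null (group labels do not matter), shuffles the labels many times, and counts how often chance alone produces a mean difference at least as large as the observed one. The scores, the seed and the name permutation_p_value are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group1, group2, n_shuffles=2000, seed=0):
    """Two-tailed p: share of label shuffles with a difference in means
    at least as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(mean(group1) - mean(group2))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # under the null, group labels are arbitrary
        diff = abs(mean(pooled[:n1]) - mean(pooled[n1:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles


# Hypothetical scores for a treatment and a control group.
treatment = [12, 14, 15, 16, 18, 19]
control = [8, 9, 10, 11, 12, 13]

p = permutation_p_value(treatment, control)
decision = "reject null" if p < 0.05 else "do not reject null"
print(f"p = {p:.3f} -> {decision}")
```

A fixed seed keeps the estimate repeatable; more shuffles give a more stable p at the cost of run time.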
16. Type I & Type II Errors
(Decision: Truth Null True: Truth Null False)
Reject null: Type I Error (False+): Correct Decision