
Summary lectures 2nd part of Statistics (GEO2-2217)

Pages: 27
Uploaded on: 16-04-2021
Written in: 2020/2021

This is a broad summary of all the lectures of the second part of the course Statistics.


Lecture 9: Association interval and ordinal variables
When changes in one variable correspond to similar changes in another variable = positive correlation → represented by the correlation coefficient (r), which has a positive value up to a maximum of +1.
→ correlation doesn’t imply causation (it does not show that one variable causes change in the other) → it only measures changes in variables that co-occur.
Correlation is zero when changes in one variable bear no relation to changes in the other.
When changes in one variable correspond to opposite changes in another = negative correlation → represented by the correlation coefficient (r), which has a negative value down to a minimum of -1 → e.g. fast vs slow, heavy vs light and reflected movements.
→ the size of the correlation coefficient indicates the strength of the relationship between the two variables.
Association measures for interval and ordinal variables: see picture.
→ the smallest possible correlation is -1 and the largest is +1.

Covariance: example → 5 friends give a movie a score → the second variable is their age → what do we observe when we look at these 2 variables? (see pictures left).
- If one variable (score) goes up, the other (age) goes down → in the first graph the score of the first friend is above average, but the age is below average → this holds for all friends (score and age deviate from their averages in opposite directions) → can also put them in the same graph (see picture right), with score on the y-axis and age on the x-axis → age and score covary → this says something about the direction of the association → here it is a negative association → covariation tells something about the direction, but not yet about the strength of an association.
Example 2: 3 lecturers (A, B and C) who all graded the same assignments.
- Lecturer A: 2, 7, 8, 8, 10 → average grade = 7, standard deviation = 3.
- Lecturer B: 1, 6, 7, 7, 9 → average grade = 6, standard deviation = 3.
- Lecturer C: 3, 7, 5, 7, 8 → average grade = 6, standard deviation = 2.
→ compare lecturers A and B → B grades every assignment one point lower than A → their scores vary identically (the SDs are equal) → they covary fully and the grades correlate maximally (r = 1).
→ compare C and A → C grades with less variance than A → the positions are not identical (sometimes C grades higher, sometimes A) → they do covary, but less than B covaries with A → the grades from C and A correlate, but not maximally.
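The averages and standard deviations of the three lecturers can be checked with a few lines of Python. This is only a sketch: the variable names are chosen for illustration, and it assumes the sample standard deviation (n - 1 in the denominator), which is what the variance calculation in these notes uses.

```python
from statistics import mean, stdev

# Grades given by the three lecturers to the same five assignments.
grades = {
    "A": [2, 7, 8, 8, 10],
    "B": [1, 6, 7, 7, 9],
    "C": [3, 7, 5, 7, 8],
}

# Mean and sample standard deviation (n - 1 in the denominator) per lecturer.
summary = {name: (mean(g), stdev(g)) for name, g in grades.items()}
for name, (m, s) in summary.items():
    print(f"Lecturer {name}: mean = {m}, SD = {s}")

# B is always exactly one point below A, so both sets of grades have
# identical deviations from their own mean.
differences = [a - b for a, b in zip(grades["A"], grades["B"])]
print("A minus B per assignment:", differences)
```

Because the difference between A and B is constant, shifting all grades by one point changes the mean but not the spread.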
Covariance for A and C: lecturer A on the x-axis and C on the y-axis → coordinate system through the centre of gravity: $(\bar{x}, \bar{y}) = (7, 6)$ → can calculate the x-deviations compared to the average of x (7) → gives $d_x$ = -5, 0, 1, 1, 3 → $s_x^2 = \sum(d_x \cdot d_x)/(n-1)$ = (25+0+1+1+9)/4 = 9 (variance of x).
→ covariance is similar, but with x- and y-deviations: $\sum(d_x \cdot d_y)/(n-1)$ → so we also need the y-deviations: $d_y$ = -3, 1, -1, 1, 2 → $\sum(d_x \cdot d_y)$ = (-5 × -3) + (0 × 1) + (1 × -1) + (1 × 1) + (3 × 2) = 21 → cov = 21/4 = 5.25
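The hand calculation of the variance and covariance can be reproduced as follows. This is a minimal sketch; the function names `deviations` and `covariance` are made up for this example.

```python
def deviations(values):
    """Deviations of each value from the mean of the list."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def covariance(x, y):
    """Sample covariance: sum of dx * dy divided by n - 1."""
    dx, dy = deviations(x), deviations(y)
    return sum(a * b for a, b in zip(dx, dy)) / (len(x) - 1)


lecturer_a = [2, 7, 8, 8, 10]  # x
lecturer_c = [3, 7, 5, 7, 8]   # y

# The variance of x is the covariance of x with itself.
var_a = covariance(lecturer_a, lecturer_a)
cov_ac = covariance(lecturer_a, lecturer_c)

print("dx:", deviations(lecturer_a))
print("dy:", deviations(lecturer_c))
print("variance of A:", var_a)
print("covariance of A and C:", cov_ac)
```

Writing the variance as the covariance of a variable with itself shows why covariance can be read as "combined variance".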
Covariance is combined variance: $\mathrm{cov}(x, y) = \sum (x_i - \bar{x})(y_i - \bar{y})/(n-1)$
→ can be positive or negative and is an indication of the correlation → covariance: left graph = +5.25, right graph = -2.0 → scale-sensitive → the value of the covariance depends on the scale of the variables → covariance only gives the direction (negative or positive).
→ $r = \mathrm{cov}/(s_x s_y)$ = 5.25/(3 × 2) = 0.875 → r² = 0.77 (77% linearly explained) → r is not scale-sensitive → this means you can compare the correlation coefficient r between different studies → r also indicates whether a correlation is large or small → summary r:
- r = coefficient of linear association (standardized covariance) → -1 ≤ r ≤ +1 → the sign of r shows whether the correlation is negative or positive.
- r = standardized regression coefficient b in case of simple regression (when you have only 1 independent variable).
- r² = proportion of the variation in Y linearly explained by X (covariance is not an association measure, because it is scale-sensitive).
- Example: if r = -0.5 → clearly negative correlation → an increase of 1.0 sx in x is associated with a decrease of 0.5 sy in y → r² = 0.25, so 25% of the Y-variation is linearly explained by X.
- r² < 0.09: weak linear association
- 0.09 ≤ r² < 0.25: medium linear association
- r² ≥ 0.25: strong linear association
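The relation between covariance, r and r² can be checked numerically for lecturers A and C. The sketch below is illustrative only: the helper names (`covariance`, `pearson_r`, `strength`) and the rescaling of A's grades by a factor of 10 are assumptions made for this example, not part of the lecture.

```python
from math import sqrt


def covariance(x, y):
    """Sample covariance with n - 1 in the denominator."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def pearson_r(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))


def strength(r):
    """Label the linear association using the r-squared thresholds from the notes."""
    r2 = r ** 2
    if r2 < 0.09:
        return "weak"
    if r2 < 0.25:
        return "medium"
    return "strong"


lecturer_a = [2, 7, 8, 8, 10]
lecturer_c = [3, 7, 5, 7, 8]

r = pearson_r(lecturer_a, lecturer_c)
print("r =", r, "r^2 =", r ** 2, "->", strength(r))

# Hypothetical rescaling: express A's grades on a 0-100 scale.
a_scaled = [10 * v for v in lecturer_a]
cov_scaled = covariance(a_scaled, lecturer_c)
r_scaled = pearson_r(a_scaled, lecturer_c)
print("covariance after rescaling:", cov_scaled)  # changes with the scale
print("r after rescaling:", r_scaled)             # stays the same
```

The covariance grows by the same factor as the scale, while r is unchanged, which is why r can be compared across studies and covariance cannot.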
Eta vs r: eta = more general measure for the dependency of Y on X → eta² = proportion of the variation in Y explained by X (see lecture 3) → eta² ≥ r² (because r² is only the proportion that is linearly explained) → advantages of eta: (1) variable X can have any measurement level and (2) it is a more general association measure → disadvantages of eta: (1) it is less specific than r (because it has no direction) and (2) eta of Y on X ≠ eta of X on Y, so eta is not a symmetrical measure (r is a symmetrical measure).
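The two properties eta² ≥ r² and "eta is not symmetrical" can be illustrated with the grades of lecturers A and C. This is a rough sketch under assumptions: it takes eta² to be the between-group share of the total sum of squares, and it treats every distinct grade of the grouping variable as its own category. With only five observations most categories contain a single grade, so the eta² values come out very high and should not be over-interpreted.

```python
from collections import defaultdict


def eta_squared(x, y):
    """Share of the variation in y explained by grouping on the distinct values of x."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    grand_mean = sum(y) / len(y)
    ss_total = sum((yi - grand_mean) ** 2 for yi in y)
    # Variation left over inside the groups (around each group's own mean).
    ss_within = sum(
        sum((yi - sum(g) / len(g)) ** 2 for yi in g) for g in groups.values()
    )
    return (ss_total - ss_within) / ss_total


lecturer_a = [2, 7, 8, 8, 10]
lecturer_c = [3, 7, 5, 7, 8]
r_squared = 0.875 ** 2  # from the Pearson correlation in the notes

eta2_c_on_a = eta_squared(lecturer_a, lecturer_c)  # Y = C, X = A
eta2_a_on_c = eta_squared(lecturer_c, lecturer_a)  # Y = A, X = C

print("eta^2 of C on A:", eta2_c_on_a)
print("eta^2 of A on C:", eta2_a_on_c)
print("r^2:", r_squared)
```

Swapping the roles of the two variables gives a different eta², whereas r is the same in both directions.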

Picture left shows the covariation of 2 variables (consumption of cheese vs number of people who died by becoming tangled in their bedsheets) → high correlation: r = 0.95 → however, this correlation doesn’t make any sense.
→ you can find a high correlation or association between variables that has no meaningful interpretation → so you also have to base the selection of variables on existing research and theories.
Rank correlation: use a rank correlation measure if (a) one or both variables are of ordinal measurement level or (b) with scale variables, the trend is not linearly increasing or decreasing, but curved (see picture right).
→ advantage of rank correlation: it can be used more generally → disadvantage: it is less specific → there are 2 rank correlation measures: (1) Spearman’s rho (rs) and (2) Kendall's tau.
- Spearman's rank correlation coefficient rho (rs): rs is similar to r, but now we apply it to rank scores → we have the scores of lecturer A and lecturer C (see graph left) with r = 0.875 → to use rank scores, we rank the scores → ranking positions 1, 2, 3, 4 and 5 → because the scores at positions 3 and 4 are equal, we take the average (3.5 twice) → have to rank both on the y-axis and on the x-axis → these rank scores are used in the calculation.
→ covariance: the deviation of x (lecturer A) is multiplied by the deviation of y (lecturer C) → this needs to be divided by the 2 standard deviations → rs appears to be a little lower than the Pearson correlation that was calculated before → the calculation is similar, but instead of the original scores, we use the rank scores.
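The ranking step and the resulting rs can be sketched in Python. The helper names `average_ranks` and `pearson_r` are chosen for this example; the approach simply applies the Pearson formula to the rank scores, with tied grades sharing the average of their positions.

```python
from math import sqrt


def average_ranks(values):
    """Rank from low to high; tied values share the average of their positions."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # first position of this value (1-based)
        count = ordered.count(v)       # how many times the value occurs
        ranks.append(first + (count - 1) / 2)
    return ranks


def pearson_r(x, y):
    """Pearson correlation computed from deviations around the means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


lecturer_a = [2, 7, 8, 8, 10]
lecturer_c = [3, 7, 5, 7, 8]

ranks_a = average_ranks(lecturer_a)
ranks_c = average_ranks(lecturer_c)
r_s = pearson_r(ranks_a, ranks_c)

print("ranks A:", ranks_a)
print("ranks C:", ranks_c)
print("Spearman's rs:", round(r_s, 3))
```

The factor n - 1 is left out of `pearson_r` because it cancels between the numerator and the denominator.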

Kendall's tau (τ): considers all pairs of points → a pair of points is called concordant if 1 point in the pair has both a higher x-value and a higher y-value than the other → concordant pairs are the arrows in upward direction (see picture right).
- Number of concordant pairs k+ = 7 (number of arrows in upward direction).
- Number of discordant pairs k- = 1 (number of arrows in downward direction) → the x-value is larger for the point to the right, but the y-value is larger for the point to the left (one point has the higher x-value and the other point has the higher y-value).
- Number of neutral pairs = 2 (one pair with the same y-value and one with the same x-value).
→ when both the x- and the y-values are equal, the two points lie on exactly the same spot.

tau-a = proportion of concordant minus discordant pairs → $\tau_a = (k_+ - k_-)/(\text{total number of pairs})$ = (7 - 1)/10 = 0.6
→ if we include the neutral pairs you get tau-b and tau-c → don’t calculate these by hand, but through SPSS → gives tau-b = 0.67 and tau-c = 0.64.
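The pair counting can be sketched in Python for lecturers A and C. The formulas for tau-b and tau-c are not given in these notes; the sketch assumes the common textbook definitions (tau-b corrects the denominator for ties in x and in y, Stuart's tau-c uses the smaller number of distinct values m), so treat it as an illustration rather than as the exact SPSS procedure.

```python
from itertools import combinations
from math import sqrt

lecturer_a = [2, 7, 8, 8, 10]  # x
lecturer_c = [3, 7, 5, 7, 8]   # y
points = list(zip(lecturer_a, lecturer_c))

concordant = discordant = ties_x = ties_y = 0
for (x1, y1), (x2, y2) in combinations(points, 2):
    product = (x2 - x1) * (y2 - y1)
    if product > 0:
        concordant += 1      # both x and y are higher for the same point
    elif product < 0:
        discordant += 1      # x is higher for one point, y for the other
    else:
        if x1 == x2:
            ties_x += 1      # neutral pair: same x-value
        if y1 == y2:
            ties_y += 1      # neutral pair: same y-value

n = len(points)
total_pairs = n * (n - 1) // 2
m = min(len(set(lecturer_a)), len(set(lecturer_c)))

tau_a = (concordant - discordant) / total_pairs
tau_b = (concordant - discordant) / sqrt(
    (total_pairs - ties_x) * (total_pairs - ties_y)
)
tau_c = 2 * m * (concordant - discordant) / (n ** 2 * (m - 1))

print(concordant, discordant, ties_x + ties_y)
print(round(tau_a, 2), round(tau_b, 2), round(tau_c, 2))
```

With these definitions the sketch reproduces the counts from the notes (7 concordant, 1 discordant, 2 neutral) and the reported tau-b and tau-c.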
Picture left shows 4 examples of correlations → top left: r = 0.9, so a positive correlation → top right: r = -0.3, so a negative correlation, and the association is less strong because the points are more spread out and the slope is less steep → bottom left: rs is higher than r, because the relationship is a bit curved → bottom right: also a curved pattern, so eta is more suitable.
Correlation in SPSS: Menu <Analyze> <Correlate> <Bivariate...>;
- Tick: Pearson, Kendall’s tau-b, Spearman;
- Select the two variables;
- ‘Test of Significance’: ‘One-tailed’ or
‘Two-tailed’ (dependent on hypothesis);
<OK>
Output: pictures right → 1-tailed, so suitable for a directional hypothesis, which is the case for the grades in the example, because a positive correlation is expected → the correlation is symmetrical, because for C and A the Pearson correlation is 0.875 and for A and C it is also 0.875 (it is a symmetrical matrix) → for Kendall’s tau, the p-values are higher than for Pearson's r, which means that this measure has less power.
- So for the lecturer example: Pearson correlation (r) = 0.875 with p1 = 0.026 → Spearman’s rho (rs) = 0.763 with p1 = 0.067 → Kendall’s tau-b = 0.667 with p1 = 0.059.
- r is the most extreme (because of the outlier) → the values of rs and tau are smaller and just not significant → the p-values for rs and tau are almost the same.
3 correlation tests: to find out the statistical significance.

1. Student distribution: testing H0: ρ = 0 → test statistic: $t = r\sqrt{n-2}/\sqrt{1-r^2}$ → Student distributed with n-2 degrees of freedom → SPSS calculates t with the exceedance probability p.
2. Spearman's rho: testing H0: ρs = 0 → identical formulation, but instead of r we use rs in the formula for t.
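The t-test for a correlation can be sketched for the lecturer example (n = 5). The sketch assumes the standard test statistic t = r·√(n-2)/√(1-r²). To stay within the Python standard library it uses a closed-form expression for the Student distribution that is only valid for exactly 3 degrees of freedom (which is n - 2 here); for other sample sizes a statistics package would be needed. The function names are hypothetical.

```python
from math import atan, pi, sqrt


def t_statistic(r, n):
    """Test statistic for H0: rho = 0, Student distributed with n - 2 df."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


def upper_tail_p_df3(t):
    """One-tailed P(T >= t) for a Student distribution with exactly 3 df."""
    x = t / sqrt(3)
    cdf = 0.5 + (x / (1 + x ** 2) + atan(x)) / pi
    return 1 - cdf


n = 5
r = 0.875          # Pearson correlation for lecturers A and C
r_s = 7.25 / 9.5   # Spearman's rs computed from the rank scores (about 0.763)

t_r = t_statistic(r, n)
t_rs = t_statistic(r_s, n)
p_r = upper_tail_p_df3(t_r)
p_rs = upper_tail_p_df3(t_rs)

print("Pearson:  t =", round(t_r, 3), "one-tailed p =", round(p_r, 3))
print("Spearman: t =", round(t_rs, 3), "one-tailed p =", round(p_rs, 3))
```

Under these assumptions the one-tailed p-values agree with the SPSS output quoted in the notes (0.026 for r and 0.067 for rs).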

Seller: yaralangeveld (Vrije Universiteit Amsterdam)