
Final Exam summary, Introduction to Statistical Analysis, Week 1-8 (CM1005 @EUR)

Pages: 34
Uploaded on: 21-01-2021
Written in: 2020/2021

Summary all lectures and tutorials for Erasmus University CM1005: Introduction to Statistical Analysis.


Week 1

Univariate => one variable, e.g. average grade (grade)

Bivariate => two variables, e.g. male and female differ in grades (gender → grade)

Multivariate => several variables, e.g. grade dependent on X, Y, Z (X, Y, Z → grade)


Statistics => “the study of how we describe and make inferences from data” (Sirkin)

Inference => a conclusion reached on the basis of evidence and reasoning

Descriptive statistics => describe (sample) data

Inferential statistics => make statements about population based on sample

Population (N) and sample (n)

Units of analysis => what or who is being studied (rows in SPSS)

Variable => measured property of each of the units of analysis (columns in SPSS)

Measurement level (NOIR)

Qualitative variables:
Nominal => can't be ranked. E.g. hair colour.
Ordinal => can be rank-ordered (but no equal distances). E.g. Likert scale.

Quantitative variables:
Interval => ranked with equal distances. E.g. IQ.
Ratio => with meaningful zero. E.g. age.


Continuous variable => measured along a continuum, can have decimals. E.g. height of
students in class.


CM1005 Introduction to Statistical Analysis

Discrete variable => measured in whole units or categories. E.g. number of students in class.

Measures of central tendency => to (univariately) describe the distribution of variables on
different levels of measurement.


Mean => $M = \frac{\sum x}{n}$, also written $\bar{x}$ (used with interval/ratio) = the average (of the sample)
• Changing any score will change the mean
• Adding/removing a score will change the mean (unless the score is already equal to the mean)
• Sum of differences from the mean is zero: $\sum (x - M) = 0$
• Sum of squares (SS) => the sum of squared differences from the mean, $\sum (x - M)^2$, is minimal (the lowest possible). When using anything other than the mean to calculate SS, the outcome would be higher.

$\sum x$ => sum of all $x$'s

Population mean => $\mu = \frac{\sum x}{N}$
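
The properties of the mean listed above can be checked on a small data set. The sketch below is only an illustration: the scores are made up, and `sum_of_squares` is a helper name introduced here, not something from the course.

```python
# Hypothetical sample of five grades (assumed data, not from the lectures).
scores = [4, 6, 7, 8, 10]

n = len(scores)
mean = sum(scores) / n  # M = (sum of x) / n

# Property: the differences from the mean add up to zero.
deviation_sum = sum(x - mean for x in scores)


def sum_of_squares(data, centre):
    """Sum of squared differences between each score and `centre`."""
    return sum((x - centre) ** 2 for x in data)


ss_around_mean = sum_of_squares(scores, mean)
# Using any value other than the mean as the centre gives a larger SS.
ss_around_other = sum_of_squares(scores, mean + 1)

print(mean, deviation_sum, ss_around_mean, ss_around_other)
```

Trying other centres in place of `mean + 1` should always give a value above `ss_around_mean`.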

Median => (used with ordinal and interval/ratio) = 50th percentile = “middle case” when
written down in order.

Median in SPSS frequency table => first category that exceeds 50% in the ‘cumulative
percent’ column.
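
As a rough sketch of the frequency-table rule, the code below builds the cumulative percentages by hand and compares the result with Python's `statistics.median`. The Likert answers are invented, and the sample size is odd so that no cumulative percentage lands exactly on 50%.

```python
from statistics import median

# Hypothetical answers on a 1-5 Likert scale (assumed data).
answers = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5]


def median_from_frequency_table(data):
    """Return the first category whose cumulative percent exceeds 50%."""
    n = len(data)
    cumulative_count = 0
    for category in sorted(set(data)):
        cumulative_count += data.count(category)
        if cumulative_count / n > 0.5:
            return category
    return None  # only reached for empty input


table_median = median_from_frequency_table(answers)
direct_median = median(answers)  # the "middle case" of the ordered scores

print(table_median, direct_median)
```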

Outlier => a value that sticks out from the rest (much lower or higher).





Mode => (used with nominal, ordinal and interval/ratio) the category with the largest
number of cases.

Mode in SPSS frequency table => category with the highest percentage.
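
A minimal sketch of finding the mode of a nominal variable, assuming made-up hair-colour data. It mirrors reading the highest percentage from a frequency table.

```python
from collections import Counter

# Hypothetical nominal data (assumed, not from the lectures).
hair_colours = ["brown", "blond", "brown", "black", "red", "brown", "blond"]

counts = Counter(hair_colours)

# most_common(1) returns a list with the single (category, count) pair
# that has the highest count.
mode_category, mode_count = counts.most_common(1)[0]
mode_percent = 100 * mode_count / len(hair_colours)

print(mode_category, mode_count, round(mode_percent, 1))
```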

Normal distributions => symmetric. Mean, median and mode are equal.




Week 2

Dispersion/variability => the spread of the scores. Distributions with the same mean can differ in spread. E.g., a first group (10 × 20 + 10 × 60, i.e. ten scores of 20 and ten of 60)
has the same mean (40) as a second group (10 × 39 + 10 × 41).

Range => (ordinal, interval/ratio) distance between the highest and lowest score. Always
report with the maximum and minimum scores. Sensitive to outliers.
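
The example of two groups with equal means (ten scores of 20 plus ten of 60, versus ten of 39 plus ten of 41) can be reproduced in a few lines. This is just a sketch to show that the range separates the groups where the mean does not.

```python
# The two groups from the dispersion example.
group_1 = [20] * 10 + [60] * 10
group_2 = [39] * 10 + [41] * 10

mean_1 = sum(group_1) / len(group_1)
mean_2 = sum(group_2) / len(group_2)

# Range = highest score minus lowest score.
range_1 = max(group_1) - min(group_1)
range_2 = max(group_2) - min(group_2)

print(mean_1, range_1, (min(group_1), max(group_1)))
print(mean_2, range_2, (min(group_2), max(group_2)))
```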

Interquartile range (IQR) => (ordinal, interval/ratio) based on "quartiles" that split our data
into four equal groups of cases: Q1 (lower quartile), Q2 (the median) and Q3 (upper
quartile).
$IQR = Q_3 - Q_1$
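
A short sketch of the IQR calculation using Python's `statistics.quantiles`. The scores are invented, and the default interpolation method of `quantiles` is an assumption of this example; other software (SPSS included) may place the quartiles slightly differently on small samples.

```python
from statistics import quantiles

# Hypothetical scores (assumed data).
scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 15]

# n=4 asks for the three cut points that split the data into four groups.
q1, q2, q3 = quantiles(scores, n=4)

iqr = q3 - q1

print(q1, q2, q3, iqr)
```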

Variance => (interval/ratio) based on Sum of Squares. Different for sample and population
data (sample is more common).
$s^2 = \frac{\sum (x - M)^2}{n - 1}$ (sample)
$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$ (population)

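Higher variance => scores are spread further from the mean.

n−1 => dividing SS by n−1 instead of n makes the sample variance an unbiased estimator of the population variance.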

Definitional variance => $s^2 = \frac{SS}{n - 1}$, where $SS = \sum (x - M)^2$


Computational variance => $s^2 = \frac{SS}{n - 1}$, where $SS = \sum x^2 - \frac{(\sum x)^2}{n}$. No need to calculate
individual distances from the mean.
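
The definitional and computational formulas for SS should give the same number. The sketch below checks this on made-up scores and compares the resulting sample variance with `statistics.variance`, which also divides by n−1.

```python
from statistics import variance

# Hypothetical sample (assumed data).
scores = [4, 6, 7, 8, 10]
n = len(scores)
mean = sum(scores) / n

# Definitional: sum the squared distances from the mean.
ss_definitional = sum((x - mean) ** 2 for x in scores)

# Computational: only needs the sum of x and the sum of x squared.
sum_x = sum(scores)
sum_x_squared = sum(x ** 2 for x in scores)
ss_computational = sum_x_squared - sum_x ** 2 / n

sample_variance = ss_computational / (n - 1)

print(ss_definitional, ss_computational, sample_variance, variance(scores))
```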
Standard deviation (SD) => (interval/ratio) approximate measure of the average distance to
the mean. It is the square root of the variance.
$s = \sqrt{\frac{\sum (x - M)^2}{n - 1}}$ (sample)
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ (population)
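
To make the difference between the sample and population formulas concrete, the sketch below computes both by hand for the same made-up scores and compares them with `statistics.stdev` (divides by n−1) and `statistics.pstdev` (divides by N).

```python
from math import sqrt
from statistics import pstdev, stdev

# Hypothetical scores (assumed data).
scores = [4, 6, 7, 8, 10]
n = len(scores)
mean = sum(scores) / n
ss = sum((x - mean) ** 2 for x in scores)

sd_sample = sqrt(ss / (n - 1))   # treats the scores as a sample
sd_population = sqrt(ss / n)     # treats the scores as the whole population

print(sd_sample, stdev(scores))
print(sd_population, pstdev(scores))
```

For the same data the sample SD is always the larger of the two, because it divides by the smaller number.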

Independent variable ( x ) => variable with values that are taken as simply given.

Dependent variable ( y ) => variable assumed to depend on, or be caused by, another (the
independent) variable.

Normally distributed variables. E.g. mean = 12 and SD = 4.




3 preconditions for making causal claims
1. Empirical evidence → for a relationship between the variables.
2. Temporal sequence → x occurs before the change or effect of y occurs.
3. Causality claim should be supported by reason and theory.

Confounding variable => an unanticipated variable, not accounted for in a study, that could
be causing or be associated with observed changes in one or more measured variables. E.g., in
a relation between foot size and reading skills, the confounding variable is age.
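
The foot size and reading skills example can be imitated with simulated data. Everything in this sketch is an assumption made for illustration: the age range, the coefficients, the noise levels and the helper `pearson_r` are invented. Age drives both variables, and foot size has no direct effect on reading.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the simulation is repeatable


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# Age (the confounder) influences both foot size and reading skill.
ages = [random.uniform(5, 12) for _ in range(2000)]
foot_size = [10 + 1.5 * age + random.gauss(0, 1) for age in ages]
reading = [5 * age + random.gauss(0, 5) for age in ages]

# Across all children the two variables look strongly related.
r_all = pearson_r(foot_size, reading)

# Holding age roughly constant (only 8-year-olds) weakens the relation.
same_age = [(f, r) for a, f, r in zip(ages, foot_size, reading) if 8 <= a < 9]
r_same_age = pearson_r([f for f, _ in same_age], [r for _, r in same_age])

print(round(r_all, 2), round(r_same_age, 2))
```

In a simulation like this the overall correlation is high, while the correlation within one age band is much closer to zero, which is the pattern a confounding variable produces.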

Reverse causality => a problem that arises when the direction of causality between two
factors could run either way.

Scatterplot => a graphical representation of the relationship between two
(interval/ratio) variables. The x-axis usually shows the independent variable.

