Terms
Tutorial 1 (7/2/2023)
Types of variables:
Categorical (nominal and ordinal): Variables such as favourite colour. You cannot really do arithmetic with them.
Nominal: categories that only have a name and no natural order, like someone's hair colour.
Ordinal: categories with a natural order, like a rating on a scale of 1-5. Because the order is fixed, a 3 on that scale will never be an extreme value.
Quantitative (discrete and continuous): Numbers we can do arithmetic with.
Continuous: can take any value on a continuous scale, like temperature. The values are connected, with no gaps between them.
Discrete: separate, countable values that are not connected, like the number of siblings someone has.
CALCULATING PERCENTAGES
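Part / whole x 100
For a change: (new - old) / old x 100. Note that new / old x 100 gives the new value as a percentage of the old value, not the change itself.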
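A minimal sketch of both calculations in Python. The function names and the numbers are made up for illustration, and the percentage change is read here as the difference relative to the old value, which is an assumption about what the note means by "change".

```python
def percentage(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return part / whole * 100


def percentage_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100


# Hypothetical numbers: 12 out of 48 students, and a price going from 80 to 100.
share = percentage(12, 48)             # 12 / 48 x 100
change = percentage_change(80, 100)    # (100 - 80) / 80 x 100
ratio = percentage(100, 80)            # new as a percentage of old

print(share)   # 25.0
print(change)  # 25.0
print(ratio)   # 125.0
```

The last line shows why new / old x 100 is not the change: 125% of the old value corresponds to an increase of 25%.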
Distribution of the data:
● Unimodal / bimodal
● Skewness
● Normal distribution ("bell shape")
Graphs, graphical representation:
Categorical
● Pie chart; bar chart
Quantitative
● Histogram; stem-and-leaf plot; dot plot; box plot
Measures of centre:
Mode
● Value that occurs most frequently.
● Often used if a variable is measured on a nominal or ordinal level.
● You can have more than one mode.
Median
● Middle value of your observations when arranged from the smallest to the largest.
● When you have an even number of cases, take the average of the two middle values.
Mean
● Sum of all the values divided by the number of observations.
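The three measures of centre can be sketched with Python's standard `statistics` module. The data set below is hypothetical and has an even number of cases, so the median is the average of the two middle values.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical observations, already sorted

# multimode returns a list, because there can be more than one mode.
modes = statistics.multimode(scores)

# Even number of cases: the median is the average of the two middle values, (3 + 5) / 2.
median = statistics.median(scores)

# Sum of all values divided by the number of observations: 30 / 6.
mean = statistics.mean(scores)

print(modes)   # [3]
print(median)  # 4.0
print(mean)
```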
In a left-skewed graph, the order from left to right (smallest to largest) is:
1: Mean.
2: Median.
3: Mode.
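A small check of this ordering on a hypothetical left-skewed data set (one low value pulling the tail to the left). The ordering is a rule of thumb for typical skewed data, not a guarantee for every data set.

```python
import statistics

left_skewed = [1, 6, 7, 8, 8]  # hypothetical data with a long tail on the left

mean = statistics.mean(left_skewed)      # 30 / 5 = 6, pulled down by the low value
median = statistics.median(left_skewed)  # middle value: 7
mode = statistics.mode(left_skewed)      # most frequent value: 8

print(mean < median < mode)  # True
```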
Spread:
● Standard deviation
● Variance
● Range
● Interquartile range (IQR)
Position of a particular observation:
● Deviation
● Percentile
● Outlier
● z-score
Range
● Difference between the highest and lowest value.
● To find it, subtract the lowest value from the highest value.
● Advantages: Easy to understand, simple to compute.
● Disadvantages: Doesn't give a good impression of the variability, because it only takes the two extreme values into account.
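A short sketch of the range and of its main disadvantage, using two hypothetical data sets that differ in only one extreme value.

```python
typical = [4, 5, 5, 6, 7]        # hypothetical data without an extreme value
with_extreme = [4, 5, 5, 6, 40]  # same data, but the highest value is extreme

# Range = highest value - lowest value.
range_typical = max(typical) - min(typical)
range_with_extreme = max(with_extreme) - min(with_extreme)

print(range_typical)       # 3
print(range_with_extreme)  # 36
```

Four of the five observations are identical in both sets, yet the range jumps from 3 to 36, because only the two extreme values are used.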
Interquartile range
Leaves out the extreme values. The quartiles divide the ordered data into four equal parts. Q2 is the median. The interquartile range is the distance between the third and the first quartile.
Q2 = median
Q1 = middle value of the observations to the left of the median
Q3 = middle value of the observations to the right of the median
IQR = Q3 - Q1
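The quartile definitions above can be sketched as follows. The helper `quartiles` is hypothetical and follows the "median of each half" reading of these notes, leaving the median itself out of both halves when the number of observations is odd. That is an assumption: other conventions exist, and statistical software often interpolates instead, so results can differ slightly.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumes at least two observations.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd number of observations, skip the median itself.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )


data = [1, 3, 4, 6, 7, 9, 12]  # hypothetical observations
q1, q2, q3 = quartiles(data)
iqr = q3 - q1

print(q1, q2, q3)  # 3 6 9
print(iqr)         # 6
```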
Variance
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
s²: the variance.
(x - x̄): from every observation (x), subtract the mean (x̄). Square all these deviations and add them up. The result is the sum of squares.
n - 1: divide the sum of squares by the sample size (n) minus 1.
Disadvantage of using the variance: it is expressed in the SQUARED units of the variable under analysis.
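The steps above can be followed by hand in a few lines of Python. The data set is hypothetical, and the result is compared against `statistics.variance`, which also divides by n - 1.

```python
import statistics

data = [2, 4, 4, 6, 9]  # hypothetical sample
n = len(data)

mean = sum(data) / n  # 25 / 5 = 5.0

# Subtract the mean from every observation, square, and add up: the sum of squares.
sum_of_squares = sum((x - mean) ** 2 for x in data)  # 9 + 1 + 1 + 1 + 16 = 28.0

# Divide the sum of squares by n - 1.
variance = sum_of_squares / (n - 1)

print(variance)                               # 7.0
print(variance == statistics.variance(data))  # True
```

If the data were measured in centimetres, this 7.0 would be in square centimetres, which is the disadvantage mentioned above.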
Standard deviation
The standard deviation indicates how much the data deviate from the mean on average.
To calculate it:
● First look at how much each observation deviates from the mean.
x - x̄
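● Then square these deviations, add them up, divide by n - 1 (this gives the variance) and take the square root.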
● The standard deviation therefore says something about the spread, which in turn says something about certainty: the smaller the spread, the closer the observations lie to the mean.
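A minimal sketch of the calculation, assuming the standard deviation meant here is the sample standard deviation, i.e. the square root of the variance with n - 1 in the denominator. The data set is hypothetical.

```python
import math
import statistics

data = [3, 3, 5, 7, 7]  # hypothetical sample
n = len(data)

mean = sum(data) / n  # 25 / 5 = 5.0

# Step 1: how much each observation deviates from the mean (x - mean).
deviations = [x - mean for x in data]  # [-2.0, -2.0, 0.0, 2.0, 2.0]

# Step 2: square, add up, divide by n - 1 (the variance), then take the square root.
variance = sum(d ** 2 for d in deviations) / (n - 1)  # 16 / 4 = 4.0
std_dev = math.sqrt(variance)

print(std_dev)  # 2.0
print(math.isclose(std_dev, statistics.stdev(data)))  # True
```

Unlike the variance, the result is in the same units as the original data.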