Week 1/2
Variables and measurement levels:
Nominal = just categorization, no ordering (e.g. gender)
Ordinal = categories have an order, but no known distances between them (e.g. education level)
Interval = numbered categories with known distances between them (e.g. temperature in degrees Celsius)
Ratio = numbered categories with a meaningful 0 (e.g. count variables, age)
Nominal and ordinal = categorical: you cannot compute a mean score
Interval and ratio = numerical: you can compute a mean score
Article by B. Winter on taste and smell words:
Hypothesis: taste and smell are different from seeing, hearing and
feeling
Gustatory (tasting) and olfactory (smelling) are chemical senses, and words
associated with taste and smell are on average more emotionally
valenced (evoke more emotions) than words associated with other senses
Week 3/4
Chapter 2 pages 34-44
Geom = geometric object, how the data is visualized. The geom indicates
the primary shape which is used to visually represent the data
Iconicity = the resemblance between a sign’s form and its meaning
Lecture:
What are distributions?
Statistical distributions illustrate which values are common and uncommon
Numerical discrete distribution (can also describe categorical data!):
Rolling a die 30 times and making a bar plot of the outcomes
Empirical distribution = a data distribution based on collected data
Discrete uniform distribution = what you expect the outcomes to look
like: if you roll a die 18 times, you expect every number to come up
three times
Theoretical distribution = a distribution based on our expectations
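The difference between the empirical and the theoretical distribution can be made concrete with a small simulation. This is only a sketch using Python's standard library; the seed and the variable names are arbitrary choices, not part of the lecture.

```python
import random
from collections import Counter

random.seed(1)  # arbitrary seed, only so the example is reproducible

n_rolls = 18
rolls = [random.randint(1, 6) for _ in range(n_rolls)]

# Empirical distribution: counts observed in the collected data
empirical = Counter(rolls)

# Theoretical (discrete uniform) distribution: expected count per face
expected = {face: n_rolls / 6 for face in range(1, 7)}

for face in range(1, 7):
    print(face, "observed:", empirical[face], "expected:", expected[face])
```

The expected count is 3 for every face, while the observed counts will usually deviate from that in a sample this small.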
How to describe categorical data?
Nominal data: we can count how many times each level of one categorical
variable occurs
Absolute frequencies = counts: how many times does each level occur?
Relative frequencies = percentages and probabilities
Ordinal = e.g. Likert-scale data, education level: the levels are ordered
You look at frequencies, but you can also calculate the median (the value
in the middle) and the mode (the value that was most frequently chosen)
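As an illustration, the frequencies, median and mode of ordinal data could be computed as follows. The Likert responses are made up for this example. `statistics.median_low` is used here as one possible choice, because it always returns a value that actually occurs in the data instead of averaging two categories.

```python
from collections import Counter
import statistics

# Hypothetical Likert responses (1 = strongly disagree ... 5 = strongly agree)
responses = [4, 2, 5, 4, 3, 4, 1, 5, 4, 3]

# Absolute frequencies: how often each level occurs
absolute = Counter(responses)

# Relative frequencies: proportion of all responses per level
n = len(responses)
relative = {level: count / n for level, count in absolute.items()}

# Median (value in the middle) and mode (most frequently chosen value)
median = statistics.median_low(responses)
mode = statistics.mode(responses)

print(absolute[4], relative[4], median, mode)  # 4 0.4 4 4
```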
Univariate statistics = one variable
Bivariate statistics = the relationship between two variables
Measures of central tendency: e.g. the median
How to describe continuous numerical data:
Every number in a range is possible, example: on a scale of 0 to 10, 1.2 is
possible
Continuous uniform distribution:
What kind of data follows a continuous uniform distribution:
Arrival times between two known points in time
Random number generation in specific range
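Random number generation in a specific range can be sketched like this. The range of 0 to 10, the sample size and the seed are assumptions for illustration. Under a continuous uniform distribution, each equal-width bin should receive roughly the same number of values.

```python
import random

random.seed(42)  # arbitrary seed, only so the example is reproducible

low, high = 0.0, 10.0
samples = [random.uniform(low, high) for _ in range(10_000)]

# Count how many samples fall into each of 10 bins of width 1
bins = [0] * 10
for x in samples:
    index = min(int(x), 9)  # a value of exactly 10.0 goes into the last bin
    bins[index] += 1

print(samples[:3])  # values are not restricted to whole numbers
print(bins)         # each bin holds roughly 1000 values
```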
The normal/gaussian distribution:
μ (mu) = the mean
σ (sigma) = the standard deviation
The lower the standard deviation, the narrower and more peaked the distribution is
What kind of data can follow a normal distribution?
Heights and weights
Standardised test scores
Psychological measures (the extent to which people are extrovert)
Often: reaction times
About 68% of your data lies between mean - 1 SD and mean + 1 SD
Lots of data falls within 1 standard deviation of the mean
About 95% of your data lies between mean - 2 SD and mean + 2 SD
Most of the data falls within 2 standard deviations of the mean
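These percentages can be checked with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The mean of 100 and the SD of 15 are assumed values for illustration; the proportions are the same for any normal distribution.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, standard deviation 15
dist = NormalDist(mu=100, sigma=15)

def share_within(k):
    """Proportion of the distribution within k SDs of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)

within_1sd = share_within(1)  # about 0.683
within_2sd = share_within(2)  # about 0.954

print(round(within_1sd, 3), round(within_2sd, 3))
```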
Real data is never that smooth and normally distributed, but we can make
it smooth
Things you can do with continuous data:
Mean, median, mode, standard deviation, range, interquartile range
Measures of central tendency = mean, median, mode
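A minimal sketch of these summaries using Python's `statistics` module. The reaction times are invented for the example. Note that quartiles can be computed with several methods; `statistics.quantiles` uses its default ("exclusive") method here, so other software may give slightly different quartile values.

```python
import statistics

# Hypothetical reaction times in milliseconds
times = [420, 510, 480, 450, 600, 480, 530, 470]

# Measures of central tendency
mean = statistics.mean(times)
median = statistics.median(times)
mode = statistics.mode(times)

# Measures of spread
sd = statistics.stdev(times)            # sample standard deviation
value_range = max(times) - min(times)   # largest minus smallest value
q1, q2, q3 = statistics.quantiles(times, n=4)  # the three quartiles
iqr = q3 - q1                           # interquartile range

print(mean, median, mode, value_range)  # 492.5 480.0 480 180
```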
2 ways of plotting continuous data:
- histogram
- density distribution
Quartiles:
Boxplots
The second quartile is the median:
the data point in the middle, which splits your data in half