Exam (elaborations)

Statistics Videos Summary EPH1026

Rating: -
Sold: 1
Pages: 53
Grade: B
Uploaded on: 26-06-2024
Written in: 2022/2023

This document is an extensive summary of all the videos that students have to watch in order to understand the statistical concepts. It is divided into Week 1, Week 2, Week 3 and Week 4. The content of every video is explained thoroughly, with some examples.


Preview of the content

Week 1: Video Notes:
1. Types of Variables:
➔ Categorical (qualitative) variables: place people into groups or categories. They are
summarized using proportions or percentages.
- Nominal: categories with no ordering based on magnitude or size.
- Ex: having a disease or not (the answer is either yes or no), hair color.
- Ordinal: the categories have an order, but the spacing between the categories does
not have a meaning.
- Ex: the size of coffee you order, the place in which someone finishes a race.

➔ Numeric (quantitative) variables: recorded numerical quantities.
- Discrete: countable values, usually whole-number counts (0, 1, 2, 3, ...).
- Ex: number of people in the ER.
- Continuous: measured on a continuous scale, so any value within a range is possible.
- Ex: age, weight, income.
- Discrete or continuous variables can be further subdivided by scale of
measurement: the ratio scale or the interval scale.
- Ratio scale: age, weight and income have a meaningful zero (zero means none of
the quantity, e.g. an income of 0 means no income), so ratios are meaningful:
20 kg is twice as heavy as 10 kg.
- Interval scale: temperature in °C or °F has a non-meaningful (arbitrary) zero.
0 °C does not mean "no temperature", so differences are meaningful but ratios are not.
- Exceptions: categorical variables are sometimes recorded using numbers, but they are
not numeric (such as females being coded as 1 and males as 2).
Another case is the Likert scale: the numbers only label ordered categories.

➔ Extra Notes:
- Identifiers are used to identify an individual, so their numbers have no numerical meaning.
- Numerical values can be converted into categorical variables (e.g. grouping ages into age categories).
- When categorical variables are recorded using numbers, the numbers are only labels;
they do not have a numerical meaning.
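
As a small illustration of converting a numerical value into a categorical one, the sketch below groups ages into ordered categories. The cut-offs (18 and 65) and the category names are assumptions chosen for the example; they do not come from the videos.

```python
def age_group(age):
    """Map a numeric age in years to an ordinal category (hypothetical cut-offs)."""
    if age < 18:
        return "child"
    if age < 65:
        return "adult"
    return "senior"

ages = [4, 17, 18, 42, 64, 65, 80]          # numeric (quantitative) values
groups = [age_group(a) for a in ages]       # categorical (ordinal) values
print(groups)
```

After the conversion the variable should be summarized with counts or percentages per group, not with a mean.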





2. Summarizing + Displaying a Categorical variable:
➔ The best way to summarize a categorical variable is to count how many people fall into each
category and then report the frequency or relative frequency.
➔ Frequency Table: might contain:
- Frequency: the count in each category.
- Proportion: the frequency divided by the total.
- Percentage: the proportion multiplied by 100 (e.g. a proportion of 0.25 is 25%).
The percentages show us the distribution of the variable.
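
The three columns of a frequency table can be sketched in a few lines of Python. The hair-color data below are made up for illustration, and the variable names are arbitrary.

```python
from collections import Counter

# Hypothetical sample of a nominal variable
hair = ["brown", "black", "brown", "blond", "brown", "black", "red", "brown"]

counts = Counter(hair)      # frequency per category
total = len(hair)

table = {}
for category, freq in counts.items():
    proportion = freq / total                    # frequency divided by the total
    table[category] = (freq, proportion, proportion * 100)

for category, (freq, proportion, percent) in table.items():
    print(f"{category:<6} {freq:>3} {proportion:.3f} {percent:.1f}%")
```

The proportions always add up to 1 and the percentages to 100, which is a quick check that no category was missed.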

➔ Bar Chart: the categories of the variable go along the x-axis, and the y-axis shows either the
frequency, the proportion or the percentage.

➔ Pie Chart: each category gets a slice of the pie whose size matches its proportion.

➔ Histograms:
Histogram Properties:
1. Quantitative data
2. No gaps between the bars
3. Bar width is constant (every bar covers a class of the same size)
4. The y-axis corresponds to the frequency

Steps of building a histogram:
- Find the lowest and highest values, then break the range of values into intervals called
“classes”.
- Decide the bin size (bar width). The notes give 10 as a good bin size; the best choice depends on the range of the data.
- In interval notation, a bracket means the endpoint is included and a parenthesis means it is
not: [10, 20) includes 10 but not 20.




3. Measures of Central Tendency (Mean, mode, median)
➔ Sample mean: the average of the numbers. (Add all the numbers and divide by the number of
values.)
x̄ = (x1 + x2 + ... + xn) / n

- The sample mean is sensitive to outliers. When we have a huge value, the
mean is pulled towards this huge value. The sample mean is a parametric
measure.
- The mean is the balance point of all the observations we have.
- Population mean: the mean of the entire population rather than the sample. We
abbreviate it with μ (mu).
- Trimmed mean: the mean calculated after removing the lowest α% and the highest α% of
the ordered data. Ex: a 5% trimmed mean cuts the lowest 5% and the highest 5% of
the data, which makes it less sensitive to outliers.
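
A minimal sketch of the sample mean and a trimmed mean, using invented data with one outlier. The helper `trimmed_mean` is hypothetical, and the example trims 10% from each end (rather than 5%) only because the sample has just 10 values.

```python
import statistics

def trimmed_mean(values, proportion):
    """Mean after dropping the given proportion of values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)      # number of values cut per side
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.mean(kept)

data = [2, 3, 3, 4, 4, 5, 5, 6, 6, 100]     # 100 is an outlier

plain = statistics.mean(data)               # pulled up by the outlier
trimmed = trimmed_mean(data, 0.10)          # drops the 2 and the 100

print(plain, trimmed)
```

The plain mean lands far above most of the observations, while the trimmed mean stays close to the bulk of the data.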

➔ Median: the middle value; it cuts the data in half. Order the data from small to big and find the
middle value. With an even number of values, add the 2 middle values and divide by 2.
- Not sensitive to outliers.
- The median is a nonparametric measure.

➔ Comparing mean and Median:
- When the distribution is symmetric, the mean equals the median.
- When the distribution is skewed, the mean is pulled towards the long tail: in a right-skewed distribution the mean is larger than the median, in a left-skewed distribution it is smaller.
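
The comparison can be checked on two tiny made-up data sets that differ only in their largest value:

```python
import statistics

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 3, 4, 50]     # same data, but the largest value is extreme

sym_mean, sym_median = statistics.mean(symmetric), statistics.median(symmetric)
skew_mean, skew_median = statistics.mean(right_skewed), statistics.median(right_skewed)

# With an even number of values the median is the average of the two middle values
even_median = statistics.median([1, 2, 3, 4])

print(sym_mean, sym_median)      # equal for the symmetric data
print(skew_mean, skew_median)    # mean is pulled up, median is unchanged
print(even_median)
```

Only the mean reacts to the extreme value; the median stays at the middle observation.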

➔ Mode: the most frequently occurring value.
- Less commonly used.





4. Measures of Variability (Variance, SD, IQR)
➔ Measures of dispersion: how far the data are spread out from the center (the mean).
➔ Range: the distance between the largest and the smallest number
(biggest value - smallest value). A larger range means a more dispersed data set.
(Not used a lot.)
➔ Variance: symbol σ². The average of the squared differences between each data point and
the mean. A small variance means a less dispersed data set: all numbers are close to each other.

To calculate the variance: subtract the mean from EACH value and square the answer.
Add up the squared answers. Then divide by the number of values N for a population
variance, or by n - 1 for a sample variance.
- Sample variance: (approximately) the average squared deviation
- Sample variance is sensitive to outliers
- s² is the symbol for the sample variance
- σ² (sigma squared) is the symbol for the population variance
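
The calculation can be sketched step by step in Python on invented data. The last lines compare the hand calculation with the standard library's `statistics.pvariance` (divides by N) and `statistics.variance` (divides by n - 1).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical values
n = len(data)
mean = sum(data) / n

# Subtract the mean from each value and square the result
squared_deviations = [(x - mean) ** 2 for x in data]
sum_sq = sum(squared_deviations)

population_variance = sum_sq / n        # divide by N
sample_variance = sum_sq / (n - 1)      # divide by n - 1

print(population_variance, statistics.pvariance(data))
print(sample_variance, statistics.variance(data))
```

Dividing by n - 1 gives a slightly larger value than dividing by N; the difference shrinks as the sample gets larger.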

➔ Standard Deviation: σ is the square root of the variance.
Difference between the SD and the variance:
- SD: measures how far apart the numbers in a data set are, in the same units as the data.
The higher the SD, the more spread out the data are.
- SD is roughly the average deviation (on average, how far a data point is from the
mean).
- Population standard deviation: σ (sigma alone); sample standard deviation: s.
- Variance: measures the same spread in squared units, so it is harder to
interpret directly than the SD.
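
To illustrate, the sketch below compares two made-up data sets with the same mean but very different spread, using the sample versions `statistics.stdev` and `statistics.variance` from the standard library.

```python
import math
import statistics

tight = [48, 49, 50, 51, 52]    # values close to the mean
wide = [10, 30, 50, 70, 90]     # same mean, much more spread

for name, data in (("tight", tight), ("wide", wide)):
    var = statistics.variance(data)     # sample variance, squared units
    sd = statistics.stdev(data)         # sample SD, same units as the data
    print(name, statistics.mean(data), var, round(sd, 2))

tight_sd = statistics.stdev(tight)
wide_sd = statistics.stdev(wide)
```

Both data sets have a mean of 50, so only the SD (or variance) reveals that the second one is far more dispersed.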


➔ IQR: Interquartile range: the range of the middle 50% of the ordered data. It is found from the
median of the first half and the median of the second half.
IQR is NOT sensitive to outliers.
- IQR = Q3 - Q1
- Q2 is the median
- To find the IQR of a data set:
1. Order the data from small to big
2. Find the median of all data = Q2
3. Find the median of the first half = Q1
4. Find the median of the second half = Q3
5. Subtract: IQR = Q3 - Q1
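
The procedure can be sketched as below. The function name `iqr_by_halves` is hypothetical, and leaving the overall median out of both halves when the number of values is odd is an assumption: textbooks and software use several slightly different quartile conventions.

```python
import statistics

def iqr_by_halves(values):
    """Return (Q1, Q2, Q3, IQR) using the median-of-halves method."""
    ordered = sorted(values)                 # step 1: order the data
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                   # first half
    # For odd n, skip the middle value so it belongs to neither half
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    q2 = statistics.median(ordered)          # median of all data
    q1 = statistics.median(lower)            # median of the first half
    q3 = statistics.median(upper)            # median of the second half
    return q1, q2, q3, q3 - q1

even_result = iqr_by_halves([7, 1, 3, 9, 5, 11, 13, 15])
odd_result = iqr_by_halves([1, 3, 5, 7, 9, 11, 13])
print(even_result)
print(odd_result)
```

Library routines such as quantile functions may return slightly different quartiles for small data sets because they interpolate between values.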



Seller: sajaalsaket, Maastricht University