
Summary: Statistics II for IB (International Business)

Pages: 54
Uploaded on: 21-11-2022
Written in: 2022/2023


Statistics II - IB
Lecture 1 – Introduction and hypothesis testing
Objectives of the course:
- Knowledge about multivariate statistics: hypothesis testing,
multivariate regression analysis, analysis of variance, time-series
analysis
- Skills for performing multivariate statistical analysis: use of SPSS

Introduction to multivariate statistical analysis
Types of data
Nonmetric or qualitative data
- Indicate the presence of a feature or category, e.g. male/female,
vegetarian yes/no.

Metric or quantitative data
- Quantify an attribute, e.g. how tall or how satisfied is the individual?




Measurement scales
- Nominal scale: numbers used in place of labels, e.g. male/female
- Ordinal scale: ranking
- Interval scale: no true ‘zero’ reference point, e.g. temperature in Celsius
- Ratio scale: true ‘zero’ reference point, e.g. height
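
The difference between the interval and the ratio scale can be illustrated with a small Python sketch. The temperatures and heights are made-up values: a ratio of interval-scale values changes with the unit of measurement, while a ratio of ratio-scale values does not.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
warm_c, cool_c = 20.0, 10.0
ratio_celsius = warm_c / cool_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)

# Ratio scale: zero means "no height", so ratios survive a change of unit.
tall_cm, short_cm = 180.0, 90.0
ratio_cm = tall_cm / short_cm
ratio_inch = (tall_cm / 2.54) / (short_cm / 2.54)

print(ratio_celsius, ratio_fahrenheit)
print(ratio_cm, ratio_inch)
```

This is why "20 degrees Celsius is twice as warm as 10 degrees Celsius" is not a meaningful statement, whereas "180 cm is twice as tall as 90 cm" is.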

Missing value analysis
What are missing data? For an individual we have only partial information
(we know the values of only some of its characteristics).
The goal of the analysis is to identify the true patterns and relationships
among variables even when some data are missing. Impact of missing data:
- Reduces the sample size
- Can distort results: is it a systematic or a random data deficiency?

Types of missing data:
Missing completely at random (MCAR): for any respondent, the probability
that the value of a variable is missing does not depend on any variable.
Unsystematic missingness.

Missing at random (MAR): for any respondent, the probability that the value
of a variable is missing depends on other, observed variables. E.g., the
probability of missing data is related to age.

https://iriseekhout.shinyapps.io/MissingMechanisms/
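
The two mechanisms can be compared in a small simulation. This is only a sketch: the ages are randomly generated and the probabilities (20% under MCAR, 10% versus 40% under MAR) and the age threshold of 50 are assumptions chosen for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

n = 10_000
ages = [random.randint(18, 80) for _ in range(n)]

def missing_rate(flags, ages, old):
    """Share of missing values among respondents aged 50+ (old=True) or under 50."""
    group = [f for f, a in zip(flags, ages) if (a >= 50) == old]
    return sum(group) / len(group)

# MCAR: every value has the same 20% chance of being missing.
mcar = [random.random() < 0.20 for _ in ages]

# MAR: the chance of a missing value depends on age (an observed variable).
mar = [random.random() < (0.40 if a >= 50 else 0.10) for a in ages]

mcar_young, mcar_old = missing_rate(mcar, ages, False), missing_rate(mcar, ages, True)
mar_young, mar_old = missing_rate(mar, ages, False), missing_rate(mar, ages, True)

print(f"MCAR: under 50 = {mcar_young:.2f}, 50+ = {mcar_old:.2f}")
print(f"MAR:  under 50 = {mar_young:.2f}, 50+ = {mar_old:.2f}")
```

Under MCAR the missing rate should be roughly the same in both age groups, while under MAR it differs clearly between them.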

How to analyse the missing values?
- Check in each variable the percentage of missing values and the
number of extremes and outliers.
- Check in each observation the percentage of missing values and
how often it is an extreme or outlier (also, to what extent)
- Check how often the missing patterns occur: frequent patterns might
indicate causality. Which cases present these missing patterns?
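
The course uses SPSS for this, but the three checks can also be sketched in plain Python. The survey data below are hypothetical, `None` marks a missing value, and the detection of extremes and outliers is left out.

```python
from collections import Counter

# Hypothetical survey data; None marks a missing value.
rows = [
    {"age": 25,   "income": 3000, "satisfaction": 4},
    {"age": 41,   "income": None, "satisfaction": 5},
    {"age": None, "income": None, "satisfaction": 3},
    {"age": 33,   "income": 2800, "satisfaction": None},
    {"age": 58,   "income": None, "satisfaction": 2},
]
variables = ["age", "income", "satisfaction"]

# Percentage of missing values per variable.
pct_missing_per_variable = {
    v: 100 * sum(r[v] is None for r in rows) / len(rows) for v in variables
}

# Percentage of missing values per observation.
pct_missing_per_row = [
    100 * sum(r[v] is None for v in variables) / len(variables) for r in rows
]

# Frequency of each missing pattern (the variables that are missing together).
patterns = Counter(tuple(v for v in variables if r[v] is None) for r in rows)

print(pct_missing_per_variable)
print(pct_missing_per_row)
print(patterns.most_common())
```

In this made-up example the pattern "only income missing" occurs most often, which would be a reason to look at those cases more closely.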

How to handle the missing values?
- Ignore: if it is less than 10% of cases/variables
- Deletion: pairwise or listwise
- Imputation: mean, hot deck imputation, cold deck imputation

Deletion
Listwise: delete the entire observation. The advantage is that the remaining
dataset is complete. The disadvantages are the reduced sample size due to
the loss of the incomplete cases, and a biased dataset if the data are not
MCAR.
Pairwise: delete incomplete cases on an analysis-by-analysis basis (exclude
them from the calculation only). The sample size remains the same for some
analyses and is reduced for others. The disadvantage is the inconsistent
sample size across analyses.
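
A minimal sketch of the difference, using hypothetical data with three variables (`pairwise_cases` is a helper name invented for this example):

```python
# Hypothetical data; None marks a missing value.
rows = [
    {"x": 1.0, "y": 2.0,  "z": 5.0},
    {"x": 2.0, "y": None, "z": 6.0},
    {"x": 3.0, "y": 6.0,  "z": None},
    {"x": 4.0, "y": 8.0,  "z": 8.0},
]

# Listwise deletion: keep only the fully complete observations.
listwise = [r for r in rows if all(v is not None for v in r.values())]

def pairwise_cases(rows, a, b):
    """Cases usable for an analysis that involves only variables a and b."""
    return [r for r in rows if r[a] is not None and r[b] is not None]

n_listwise = len(listwise)                  # same n for every analysis
n_xy = len(pairwise_cases(rows, "x", "y"))  # n for an analysis of x and y
n_xz = len(pairwise_cases(rows, "x", "z"))  # n for an analysis of x and z
n_yz = len(pairwise_cases(rows, "y", "z"))  # n for an analysis of y and z

print(n_listwise, n_xy, n_xz, n_yz)
```

Listwise deletion leaves one fixed sample size, while pairwise deletion keeps more cases for some variable pairs than for others.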

Imputation
Mean imputation: replace the missing value with the mean of the entire
dataset or of a group. This reduces the variability of the data.
Hot deck imputation: use an observation from the sample that is
considered similar.
Cold deck imputation: use an observation from an external data source
that is considered similar.
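
The sketch below illustrates mean imputation and a simple form of hot deck imputation on hypothetical data. Taking the respondent closest in age as the "similar" observation is an assumption made for this example; other similarity criteria are possible.

```python
from statistics import mean, variance

# Hypothetical data: income is missing (None) for two respondents.
ages = [22, 25, 31, 38, 47, 52]
incomes = [2000, None, 2600, 3000, None, 4000]

observed = [v for v in incomes if v is not None]

# Mean imputation: replace every missing value with the observed mean.
mean_imputed = [v if v is not None else mean(observed) for v in incomes]

def hot_deck(ages, incomes):
    """Fill each gap with the income of the donor closest in age.

    Using age as the similarity criterion is an assumption for this sketch.
    """
    donors = [(a, v) for a, v in zip(ages, incomes) if v is not None]
    filled = []
    for a, v in zip(ages, incomes):
        if v is None:
            _, v = min(donors, key=lambda d: abs(d[0] - a))
        filled.append(v)
    return filled

hot_deck_imputed = hot_deck(ages, incomes)

print(variance(observed), variance(mean_imputed))
print(hot_deck_imputed)
```

The variance after mean imputation is smaller than the variance of the observed values, because the imputed values sit exactly on the mean.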

Rules of thumb to handle missing data:
< 10%: ignore or use any imputation method
10-20%: hot deck imputation (assuming MCAR)
>20%: delete
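
These rules of thumb can be written as a small lookup function. The function name and the treatment of exactly 10% and 20% (both counted in the middle band) are assumptions for this sketch.

```python
def missing_data_strategy(pct_missing):
    """Map a percentage of missing values to the rule of thumb above."""
    if not 0 <= pct_missing <= 100:
        raise ValueError("pct_missing must be between 0 and 100")
    if pct_missing < 10:
        return "ignore or use any imputation method"
    if pct_missing <= 20:
        return "hot deck imputation (assuming MCAR)"
    return "delete"

for pct in (5, 15, 35):
    print(pct, "->", missing_data_strategy(pct))
```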

Examining data
Why should we examine data carefully? To avoid jumping to wrong
conclusions. Understanding the type of data helps to answer the following
questions:
- What are the characteristics of the data?
- Is there a common behaviour to all the data?
- Are there any missing data?
- Are there any outliers?
- Which analysis methods can we use?

We should detect the major features of the probability distribution of the
variables. But first of all: identify the type of data.

Examining qualitative data
What could make sense to calculate: Frequency table, minimum,
maximum, range, mode.
What graphical techniques can be applied: bar chart, pie chart.
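
As a sketch, a frequency table and the mode of a qualitative variable can be computed as follows (the answers are hypothetical):

```python
from collections import Counter

# Hypothetical answers to a nominal question.
diet = ["vegetarian", "omnivore", "omnivore", "vegan", "omnivore", "vegetarian"]

counts = Counter(diet)

# Frequency table: absolute count and percentage per category.
frequency_table = {
    category: (n, 100 * n / len(diet)) for category, n in counts.most_common()
}
mode = counts.most_common(1)[0][0]

for category, (n, pct) in frequency_table.items():
    print(f"{category:<12}{n:>3}{pct:>8.1f}%")
print("mode:", mode)
```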

Examining quantitative data
What could make sense to calculate: mean, mode, median, range,
interquartile range, standard deviation, variance, skewness, kurtosis.
What graphical techniques can be applied: scatterplot, histogram, boxplot.
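
The measures listed above can be computed with Python's `statistics` module, as sketched below on a hypothetical sample. Skewness and kurtosis are calculated with the simple moment-based formulas; statistical packages such as SPSS apply small-sample corrections, so their values can differ slightly.

```python
from statistics import mean, median, mode, pstdev, quantiles, stdev, variance

# Hypothetical sample of a metric variable.
data = [2, 4, 4, 5, 7, 9, 11]

n = len(data)
m = mean(data)
sd_pop = pstdev(data)  # population SD, used in the moment formulas below

summary = {
    "mean": m,
    "median": median(data),
    "mode": mode(data),
    "range": max(data) - min(data),
    "variance": variance(data),  # sample variance (divides by n - 1)
    "std_dev": stdev(data),      # sample standard deviation
}

# Interquartile range: third quartile minus first quartile.
q1, _, q3 = quantiles(data, n=4)
summary["iqr"] = q3 - q1

# Moment-based skewness and excess kurtosis (one of several conventions).
summary["skewness"] = sum((x - m) ** 3 for x in data) / (n * sd_pop ** 3)
summary["excess_kurtosis"] = sum((x - m) ** 4 for x in data) / (n * sd_pop ** 4) - 3

for name, value in summary.items():
    print(f"{name}: {value}")
```

A positive skewness, as in this sample, points to a longer tail on the right; for a normal distribution both skewness and excess kurtosis are zero.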




The normal distribution is always the reference for comparison. We should
detect the major features of the probability distribution of the variables.
The shape of the probability distribution is important for the measures of
centrality and dispersion of the data.

What can we do with the characteristics of the data?
Design a correct model reproducing the features of the data. Choose an
adequate technique for the analysis:
- Is the sample size large enough?
- Are the assumptions required by the chosen analysis technique
satisfied by the data?
- Do we have all the data necessary to apply the chosen analysis
technique correctly?
Transform the data before studying them, if necessary.

Types of samples
Independent samples: the groups in the data do not correspond to each
other. The number of observations in each group can be different.
Matched pairs: the groups in the data correspond to each other. The
number of observations in each group is always the same.
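
The distinction also shows up in how the data are laid out, as in this sketch with hypothetical scores. For independent samples only group summaries can be compared; for matched pairs a difference can be computed per pair.

```python
from statistics import mean

# Independent samples: two separate groups, sizes may differ.
group_a = [6.1, 7.4, 5.8, 6.9]
group_b = [7.0, 6.5, 7.8, 6.2, 7.1, 6.6]
mean_difference_independent = mean(group_a) - mean(group_b)

# Matched pairs: each "before" value belongs to exactly one "after" value.
before = [6.1, 7.4, 5.8, 6.9]
after = [6.6, 7.9, 6.0, 7.5]
if len(before) != len(after):
    raise ValueError("matched pairs need the same number of observations")
pair_differences = [a - b for b, a in zip(before, after)]
mean_difference_paired = mean(pair_differences)

print(mean_difference_independent)
print(pair_differences, mean_difference_paired)
```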




Lecture 2 – Hypothesis Testing
Statistical inference and testing
Statistical inference means drawing conclusions about a population based on
a sample. When we analyse statistical data, we try to infer some
characteristics of the process that has generated the data.

Observing a sample and applying statistical inference does not provide
‘definitive’ conclusions; it just sizes up the different ‘maybes’.
Using a sample, we can:
- Construct a confidence interval
- Test a hypothesis
- Estimate model parameters

Expected results come from probability theory. Observed results come
from experiments. Statistics links these two.

We can test if the unknown value of a parameter is equal to a chosen
value (or set of values): this is a hypothesis. Example:
We roll a die 10 times, write down the result and see that the sample
mean is 4.6. The standard deviation of the sample is s=1.35. Can we
infer that the die is a fair die?

A statistical test is a function of the observed data which gives just two
answers: reject / do not reject the null hypothesis. Often the hypothesis is
that the population mean equals (or does not equal) a theoretical mean.
Example:
H0: the population mean = 3.5
H1: the population mean does not equal 3.5
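
One way to carry out this test on the die example is a two-sided one-sample t-test. The choice of test and the 5% significance level are assumptions for this sketch, not something the notes specify at this point; the critical value 2.262 is the tabulated value of the t distribution with 9 degrees of freedom.

```python
import math

# Values taken from the die example above.
n = 10       # number of rolls
x_bar = 4.6  # sample mean
s = 1.35     # sample standard deviation
mu_0 = 3.5   # mean of a fair die under H0

# Assumption: two-sided one-sample t-test at the 5% level.
# 2.262 is the tabulated critical value of the t distribution with 9 df.
t_critical = 2.262

standard_error = s / math.sqrt(n)
t_statistic = (x_bar - mu_0) / standard_error
reject_h0 = abs(t_statistic) > t_critical

# Matching 95% confidence interval for the population mean.
ci_low = x_bar - t_critical * standard_error
ci_high = x_bar + t_critical * standard_error

print(f"t = {t_statistic:.2f}, reject H0: {reject_h0}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

Under these assumptions the test statistic is about 2.58, which is larger than 2.262, so H0 would be rejected at the 5% level. The confidence interval gives the same answer: it does not contain 3.5.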

Seller: minjonhoogvliets (Rijksuniversiteit Groningen)