Lecture notes ARMS

Pages: 28
Uploaded on: 09-01-2024
Written in: 2022/2023

These are the complete lecture notes for the course Advanced Research Methods and Statistics (ARMS) at the UU. Written in English. I passed the course with these notes.

Lecture 1: Multiple Linear Regression

Two different frameworks:
- Frequentist framework: still mainstream (but Bayes is catching up), based on
null hypothesis testing, p-values, confidence intervals, effect sizes, power analysis.
- Bayesian framework: increased attention since the replication crisis -> a reaction against
incorrect interpretations of test results, p-hacking, underpowered studies, publication bias.

Both frameworks learn from data collected in empirical research. The information in these data
is captured in a likelihood function.
- On x-axis: values for mu.
- On y-axis: likelihood of each value of mu given the observed data.

In the frequentist approach: all relevant information for inference is contained in the
likelihood function.
In the Bayesian approach: in addition to the likelihood function, which captures the information
in the data, we may also have prior information about mu.
- Central idea: prior knowledge is updated with the information in the data, and together they
provide the posterior distribution for mu.
o Advantage: accumulating knowledge.
o Disadvantage: results depend on choice of prior.
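
The updating idea above can be sketched for one simple case. This is a minimal illustration, assuming a normal prior for mu and normally distributed data with a known standard deviation sigma (a conjugate setup chosen for convenience, not something the lecture prescribes). The function name and all numbers are hypothetical.

```python
import math

def update_normal_prior(prior_mean, prior_sd, data, sigma):
    """Posterior mean and SD for mu: normal prior, normal data, known sigma (assumption)."""
    n = len(data)
    sample_mean = sum(data) / n
    prior_precision = 1.0 / prior_sd ** 2   # information carried by the prior
    data_precision = n / sigma ** 2         # information carried by the likelihood
    post_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = (prior_precision * prior_mean + data_precision * sample_mean) / post_precision
    post_sd = math.sqrt(1.0 / post_precision)
    return post_mean, post_sd

# Hypothetical grades, for illustration only.
scores = [6.8, 7.4, 7.1, 6.5, 7.7]
post_mean, post_sd = update_normal_prior(prior_mean=6.0, prior_sd=1.0, data=scores, sigma=1.0)
print(round(post_mean, 3), round(post_sd, 3))
```

The posterior mean lands between the prior mean and the sample mean, and the posterior SD is smaller than the prior SD. Changing `prior_mean` or `prior_sd` changes the result, which is the dependence on the prior listed as a disadvantage.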

Bayesian estimates and probability
The posterior distribution of the parameter(s) of interest provides all desired estimates:
- Posterior mean or mode: the mean or mode of the posterior distribution.
- Posterior SD: SD of posterior distribution (comparable to frequentist standard error).
- Posterior 95% credible interval: providing the bounds of the part of the posterior in
which 95% of the posterior mass is.
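
As a rough sketch of how these summaries are obtained in practice, the snippet below computes them from draws of a posterior distribution. The draws are simulated from a normal distribution purely as a stand-in for real posterior samples; the function name, seed, and numbers are assumptions for illustration.

```python
import random
import statistics

def summarize_posterior(draws, mass=0.95):
    """Posterior mean, posterior SD and an equal-tailed credible interval from draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - mass) / 2.0
    lower = ordered[int(tail * n)]                 # cuts off 2.5% of the mass on the left
    upper = ordered[int((1.0 - tail) * n) - 1]     # cuts off 2.5% of the mass on the right
    return {
        "mean": statistics.mean(ordered),
        "sd": statistics.stdev(ordered),
        "interval": (lower, upper),
    }

# Stand-in posterior draws (hypothetical): normal with mean 6.9 and SD 0.4.
rng = random.Random(1)
draws = [rng.gauss(6.9, 0.4) for _ in range(20000)]
summary = summarize_posterior(draws)
print(summary)
```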

Bayes conditions on the observed data (probability that hypothesis Hj is supported by the data),
whereas frequentist testing conditions on the null hypothesis (p-value = probability of
observing the same or more extreme data given that the null is true).

Researchers with hypotheses may prefer to get info on the probability that their hypotheses
are true.
- To what extent does the data support their hypotheses?
o PMP= Posterior Model Probability (the Bayesian probability of the hypothesis
after observing the data).
Bayesian probability of a hypothesis being true depends on 2 criteria:
- How sensible it is, based on current knowledge (the prior).
- How well it fits new evidence (the data).

Furthermore, Bayesian testing is comparative: hypotheses are tested against one another,
not in isolation.
This is also seen in the Bayes factor: BF10 = P(data | H1) / P(data | H0)
- BF10=10 Support for H1 is 10 times stronger than for H0
- BF10=1 Support for H1 is as strong as support for H0
Posterior probabilities of hypotheses (PMP) are also relative probabilities.
PMPs are an update of prior probabilities (for hypotheses) with the BF.
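
The update from prior probabilities to PMPs can be written out for two hypotheses. This is a small sketch assuming only H0 and H1 are compared, so that their probabilities sum to 1; the function name and the default of equal prior probabilities are illustrative choices.

```python
def posterior_model_probs(bf10, prior_h1=0.5):
    """PMPs for H1 and H0: posterior odds = BF10 * prior odds."""
    prior_h0 = 1.0 - prior_h1
    posterior_odds = bf10 * prior_h1 / prior_h0
    pmp_h1 = posterior_odds / (1.0 + posterior_odds)
    return pmp_h1, 1.0 - pmp_h1

pmp_h1, pmp_h0 = posterior_model_probs(bf10=10.0)   # equal prior probabilities
print(round(pmp_h1, 3), round(pmp_h0, 3))
```

With BF10 = 10 and equal priors, the PMP of H1 is 10/11; with BF10 = 1 the PMPs stay equal to the prior probabilities.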

Definition of probability
Both frameworks use probability theory, but:
- Frequentists: probability is relative frequency (more formal?)
- Bayesians: probability is degree of belief (more intuitive?)
This leads to debate (same word used for different things) and to differences in the correct
interpretation of statistical results. E.g., p-value vs PMP; also:
- Frequentist 95% confidence interval (CI): If we were to repeat this experiment many
times and calculate a CI each time, 95% of the intervals would include the true
parameter value (and 5% would not).
- Bayesian 95% credible interval: There is a 95% probability that the true value is in the
credible interval.
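
The repeated-experiment reading of the frequentist CI can be checked with a small simulation. The sketch below assumes normally distributed data with a known sigma, so a z-interval with 1.96 is used; the true mean, sample size, and seed are arbitrary illustrative values.

```python
import math
import random

def ci_coverage(true_mu=50.0, sigma=10.0, n=30, repetitions=2000, seed=42):
    """Fraction of simulated 95% CIs that contain the true mean (sigma assumed known)."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        if mean - half_width <= true_mu <= mean + half_width:
            hits += 1
    return hits / repetitions

coverage = ci_coverage()
print(coverage)   # expected to be close to 0.95
```

Each single interval either contains the true value or not; the 95% refers to the long-run proportion over repetitions.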

Multiple linear regression




Least squares principle: the vertical distance between each observation and the line represents
the error. The blue line is drawn so that the sum of the squared residuals is as small as possible.
- Blue line: the linear regression model; it gives the predicted outcome.
- Intercept (b0): where the blue line hits the y-axis. Value of y when x is 0.
- Slope (b1): how steep the line is. Change in y when x increases by 1.
- Residual: the distance between an observation (the black dots) and the blue line.
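
For a single predictor, the least squares line has a closed-form solution. The sketch below uses hypothetical data and a hypothetical helper `fit_line` to compute b0, b1, and the residuals.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx                 # change in y per unit increase in x
    b0 = mean_y - b1 * mean_x      # predicted y when x is 0
    return b0, b1

def sum_squared_residuals(x, y, b0, b1):
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(round(b0, 2), round(b1, 2))
```

Any other line, for example one with a slightly different intercept or slope, gives a larger sum of squared residuals on the same data.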

The variables must be continuous.

Model assumptions
- All results are only reliable if the assumptions made by the model and approach roughly
hold.
o Serious violations lead to incorrect results.
o Sometimes there are easy solutions (e.g. deleting a severe outlier, or adding a
quadratic term) and sometimes not.
o Per model, know what the assumptions are and always check them carefully.
- Multiple linear regression assumes interval/ratio variables (outcome and predictors).
Dummy variables are the exception.

Example RQ: Are gender and age predictors of grade?
- Grade on scale 0-10; numbers have numerical meaning. OK!
- Age in years; numbers have numerical meaning. OK!
- Gender coded as: 1 = male; 2 = female. Categorical; numbers do not have numerical
meaning. Not OK!
Multiple linear regression can handle dummy variables as predictors.
- Dummy variable has values 0 and 1 (e.g., D_male,i = 1 for males; 0 for females)
Grade_i = B0 + B1*Age_i + B2*D_male,i
- Interpretation of B2: difference in mean grade between males and females with the
same age.
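
The recoding and the interpretation of B2 can be illustrated with a small fit. This is a sketch only: the data are hypothetical and generated without noise from known coefficients, and the model is fitted by solving the normal equations in plain Python (the helper names are made up for this example; statistical software would normally do this step).

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]              # leading 1.0 is the intercept column
    k = len(X[0])
    xtx = [[sum(xr[i] * xr[j] for xr in X) for j in range(k)] for i in range(k)]
    xty = [sum(xr[i] * yi for xr, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data; gender coded 1 = male, 2 = female as in the example above.
ages = [19, 20, 21, 22, 23, 24]
gender = [1, 2, 1, 2, 2, 1]
d_male = [1 if g == 1 else 0 for g in gender]        # recode to a 0/1 dummy
# Grades generated without noise from Grade = 5.0 + 0.1*Age - 0.5*D_male
grades = [5.0 + 0.1 * a - 0.5 * d for a, d in zip(ages, d_male)]

b0, b1, b2 = fit_ols(list(zip(ages, d_male)), grades)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```

Because the data were generated from the model, the fit recovers the coefficients; b2 is the difference in predicted grade between a male and a female of the same age.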

Evaluating the model
- With frequentist (classical) statistics:
o Estimate parameters of model
o Test with NHST if parameters are significantly non-zero, e.g.
▪ H0: R2 = 0 versus HA: R2 > 0 (R = multiple correlation coefficient =
correlation between Y and the predicted Y; R2 = the proportion of variance in Y
explained by the model).
▪ H0: 𝛽 = 0 versus HA: 𝛽 ≠ 0
- With Bayesian statistics:
o Estimate parameters of model
o Compare support in data for different models/hypotheses using Bayes factors

Example:
Can Life Satisfaction be predicted from Age and Years of education?
- y = Life Satisfaction, x1=Age; x2=Years of education.




Look at the adjusted R2, because it estimates the proportion of explained variance in the
population; R2 itself only describes the sample.
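
The relation between R2 and adjusted R2 can be made concrete with the commonly used correction for sample size n and number of predictors k. The sketch below assumes that standard formula; the function names and the numbers in the example are hypothetical, not output from the lecture.

```python
def r_squared(y, y_pred):
    """Proportion of variance in y explained by the predictions (sample R2)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical values: R2 = 0.50 with n = 21 cases and k = 2 predictors.
adj = adjusted_r_squared(0.50, n=21, k=2)
print(round(adj, 3))
```

The adjusted value is always at most the sample R2, and the gap grows with more predictors or a smaller sample.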

Uploaded by: clairenooijen, Universiteit Utrecht