Summary AMDA Spring Chapter 2: Predictive Regression

Pages
10
Uploaded on
21-03-2023
Written in
2020/2021

2.1 Predictive Regression
Explanation vs prediction
The goal of scientific psychology is to understand human behaviour. Historically, this has meant both
explaining behaviour - that is, accurately describing its causal underpinnings - and predicting
behaviour - that is, accurately forecasting behaviours that have not yet been observed.
In practice these two goals are rarely distinguished!

● It might seem that the best explanatory model is equal to the best predictive model
● But from a statistical point of view this is simply not true (see this lecture)
→ what makes the best explanation differs from what makes the best prediction

Regression
The regression model $Y = f(X_1, X_2) = a + b_1 X_1 + b_2 X_2$ can be used for explanation or prediction.
● Explanation: how are the X's related to Y?
So we test the regression weights (betas) for significance, to see which predictors significantly
explain variance in the outcome variable
● Prediction: if we have new X's, what will be the predicted value of Y, and how accurate is the
prediction? → We try to predict as accurately as possible and are less interested in
which variables are important
● In explanation you usually use the whole sample to build the explanatory model, while in
prediction you usually split the data set, using one part to train the model and the
other part to see how well it predicts the values

Explanatory Regression
● Explanatory regression starts with a theory about the data. The regression model is a
translation of the theory into mathematical form.
○ For example: gender and neuroticism have an effect on depression.
○ $\text{Depression}_i = 2 + 0.5 \cdot \text{gender}_i + 1.5 \cdot \text{neuroticism}_i$
● The hypotheses generated from the theory can be examined in terms of statistical tests on
the regression weights
● In explanatory regression it is important that the regression weights are estimated
accurately, i.e. they should be unbiased.
Given the data that you have, you try to explain the outcome variable as well as possible.
● The regression model itself is the object of interest.
● Explanatory regression depends heavily on assumptions,
e.g. normality, independence, etc. (for prediction these are usually not very important)
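
To illustrate what "unbiased regression weights" means, here is a minimal sketch using only the Python standard library. It assumes the depression model from the example above is the true data-generating model, with standard normal noise and a standardised neuroticism score (both are assumptions for illustration, not part of the lecture). The helper names `solve_linear_system`, `fit_ols` and `simulate_sample` are hypothetical. Averaged over many samples, the estimated weights should land close to 2, 0.5 and 1.5.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_ols(rows, y):
    """Least-squares weights via the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def simulate_sample(n, rng):
    """Draw one sample from the assumed true model (illustrative assumption)."""
    rows, y = [], []
    for _ in range(n):
        gender = float(rng.randint(0, 1))      # coded 0/1
        neuroticism = rng.gauss(0.0, 1.0)      # standardised score
        rows.append((gender, neuroticism))
        y.append(2 + 0.5 * gender + 1.5 * neuroticism + rng.gauss(0.0, 1.0))
    return rows, y

rng = random.Random(1)
estimates = [fit_ols(*simulate_sample(200, rng)) for _ in range(300)]
# Average each weight over the 300 repeated samples.
mean_weights = [sum(e[j] for e in estimates) / len(estimates) for j in range(3)]
print([round(w, 2) for w in mean_weights])
```

Any single sample gives weights that deviate somewhat from the true values; unbiasedness is a statement about the average over repeated samples, which is what the last lines compute.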

Funny use of “prediction” in psychology
● In psychology we often see papers with titles like
1. Impulsivity predicts problem gambling ...
2. Trait rumination predicts onset of Post-traumatic stress disorder ...
3. Predicting reading and mathematics from neural activity …
● Often the words explanation and prediction are used interchangeably.
● In real prediction (as the weatherman does) we try to predict certain variables as well
as possible, without particularly caring about which variables actually explain those
predictions. In explanation, by contrast, we look at what explains a certain score, i.e.
which variables have a significant beta value for the outcome variable. Psychology papers
that say "predicts" usually mean the latter.

Predictive Regression
● Usually we split a data set into two parts: one to train the model (i.e. build the
model by seeing which variables are good predictors) and the other to test the
model (does it predict the scores well enough?).
● Suppose we have data and obtain estimates. This is the training phase.
$\hat{y}_i = 2 + 0.5 X_{1i} + 1.5 X_{2i}$
● Further suppose we have a new observation with $X_1 = 2$ and $X_2 = 3$:
$\hat{y} = 2 + 0.5 \cdot 2 + 1.5 \cdot 3 = 7.5$
(so we focus on how close the predicted 7.5 is to the observed value)
● Prediction focusses on the accuracy of the prediction. Therefore, we compare the predicted
value ($\hat{y}$) against the observed value ($y$). This is the testing phase.
● It is important that training and testing are performed on two different data sets. This provides
out-of-sample prediction accuracy.
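
The training and testing phases can be sketched in a few lines of standard-library Python. The first part reproduces the worked prediction from the notes. The second part is an illustration under assumptions: the simulated data, the single predictor, the 50/50 split and the helper names `fit_simple` and `mean_squared_error` are all hypothetical choices, not the course's own procedure.

```python
import random

def predict(weights, x1, x2):
    """Prediction from the fitted model y_hat = a + b1*x1 + b2*x2."""
    a, b1, b2 = weights
    return a + b1 * x1 + b2 * x2

# Worked example from the notes: weights (2, 0.5, 1.5), new case X1 = 2, X2 = 3.
y_hat = predict((2.0, 0.5, 1.5), 2.0, 3.0)
print(y_hat)  # 7.5

def fit_simple(xs, ys):
    """Closed-form least squares for one predictor: returns (intercept, slope)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mean_squared_error(model, xs, ys):
    """Average squared difference between observed y and predicted y_hat."""
    intercept, slope = model
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Simulated data (assumption for illustration): y = 2 + 0.5*x + noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2 + 0.5 * x + rng.gauss(0, 1) for x in xs]

# Random 50/50 split into a training part and a test part.
indices = list(range(200))
rng.shuffle(indices)
train_idx, test_idx = indices[:100], indices[100:]
train_x, train_y = [xs[i] for i in train_idx], [ys[i] for i in train_idx]
test_x, test_y = [xs[i] for i in test_idx], [ys[i] for i in test_idx]

model = fit_simple(train_x, train_y)                    # training phase
train_mse = mean_squared_error(model, train_x, train_y)  # in-sample error
test_mse = mean_squared_error(model, test_x, test_y)     # out-of-sample error
print(round(train_mse, 3), round(test_mse, 3))
```

Only `test_mse` is computed on cases the model never saw, so it is the number that speaks to out-of-sample prediction accuracy.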

● When we fit only one explanatory regression and use it to “predict” values, the
$R^2$ value usually overestimates how well the model actually predicts, because the
predictions are evaluated on the same data the model was built on (overfitting). So you would
need to use an adjusted $R^2$ instead.
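
The notes do not state which adjustment is meant. A common textbook version, assumed here, is adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), with n cases and p predictors. The function name `adjusted_r2` and the example numbers are hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted on n cases.

    Uses the common textbook formula 1 - (1 - R^2) * (n - 1) / (n - p - 1),
    which is an assumption here: the notes do not spell out the formula.
    """
    if n - p - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical example: R^2 = 0.30 from 5 predictors.
small_sample = adjusted_r2(0.30, n=50, p=5)
large_sample = adjusted_r2(0.30, n=5000, p=5)
print(round(small_sample, 3), round(large_sample, 3))
```

The correction is largest when the sample is small relative to the number of predictors, which is exactly when in-sample R² is most optimistic.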

● More generally, we have a population where the means of Y are given by a function of the
predictor variable(s) (X): $Y = f(X) + \varepsilon$
● Often we collect data for a sample of n persons. These data, $(x_1, y_1), \ldots, (x_n, y_n)$,
are used to train a model: $y_i = \hat{f}(x_i) + \varepsilon_i$




● Suppose we have new observations $(x_0, y_0)$ from the population.
● Based on the model that we estimated on the training data, we can make predictions
$\hat{f}(x_0)$ for the newly observed data.
● We can compare the predictions against the observations using the mean squared
prediction error (PE): $\mathrm{PE}(\hat{f}(x_0)) = E[(y_0 - \hat{f}(x_0))^2]$

Prediction error
● The prediction error decomposes into (important!)
○ bias: the difference between the (average) estimated $\hat{f}$ and the true $f$
○ variance: the variability of the estimated $\hat{f}$
(you can't measure this from one fitted model. But if you repeatedly sample data and
fit the model each time, the outcomes will differ, and they differ more for more
complex models. So more complex models have larger variance)
○ irreducible term: variance of Y at a specific value of X (that you cannot reduce)
○ So the prediction error can be decomposed into those three components:
$\mathrm{PE}(\hat{f}(x_0)) = [\mathrm{Bias}(\hat{f}(x_0))]^2 + \mathrm{Var}(\hat{f}(x_0)) + \sigma^2$
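
The decomposition can be checked by simulation, which also shows what "repeatedly sample data and fit the model each time" looks like. This is a sketch under assumptions: the true function f(x) = x², the noise standard deviation of 1, the point x0 = 2 and the deliberately too simple straight-line model are all illustrative choices, and `fit_line` is a hypothetical helper.

```python
import random
import statistics

SIGMA = 1.0   # standard deviation of the irreducible noise (assumed)
X0 = 2.0      # point at which we evaluate the prediction error

def true_f(x):
    """Assumed true population function f(x) = x^2."""
    return x ** 2

def fit_line(xs, ys):
    """Closed-form least squares for a straight line: (intercept, slope)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(42)
predictions, squared_errors = [], []
for _ in range(5000):
    # New training sample of 30 cases, then refit the model.
    xs = [rng.uniform(0, 3) for _ in range(30)]
    ys = [true_f(x) + rng.gauss(0, SIGMA) for x in xs]
    intercept, slope = fit_line(xs, ys)
    f_hat_x0 = intercept + slope * X0
    # New observation y0 at x0, independent of the training sample.
    y0 = true_f(X0) + rng.gauss(0, SIGMA)
    predictions.append(f_hat_x0)
    squared_errors.append((y0 - f_hat_x0) ** 2)

bias = statistics.fmean(predictions) - true_f(X0)   # average f_hat minus true f
variance = statistics.pvariance(predictions)        # spread of f_hat over samples
pe_direct = statistics.fmean(squared_errors)        # E[(y0 - f_hat(x0))^2]
pe_decomposed = bias ** 2 + variance + SIGMA ** 2

print(round(bias ** 2, 3), round(variance, 3), SIGMA ** 2)
print(round(pe_direct, 3), round(pe_decomposed, 3))
```

The two numbers on the last line should agree up to simulation noise. Because a straight line cannot follow x², the bias term stays non-zero however many training samples are drawn.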

fionabrosig Universiteit Leiden