
Summary Applied Multivariate Data Analysis

Pages: 25
Uploaded on: 23-04-2024
Written in: 2023/2024

This is a summary/overview of the most important topics that were discussed during the course 'Applied Multivariate Data Analysis'. It is relevant for all Psychology Masters, since the information/literature is the same. It is 26 pages.



Important Analyses
Linear Regression – a linear regression is a way of predicting values of one variable from
another based on a model that describes a straight line. This line summarizes the pattern of
the data best.
- R² – explained variance of the model: the proportion of variance in the outcome variable
that is shared with the predictor variable
- F – ratio of how much variability the model can explain relative to how much it cannot
explain
- b-value – the gradient of the line, indicating the direction and strength of the
relationship between a predictor and the outcome variable
  - b0 = intercept: the value of the outcome variable we would predict if the
predictor value were 0
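
The quantities above can be computed by hand for a single predictor. The following is a minimal sketch, assuming an ordinary least-squares fit; the function name and the data (`hours`, `grade`) are made up for illustration and do not come from the course.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns b0, b1, R^2 and F."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx      # gradient of the line
    b0 = my - b1 * mx   # intercept: predicted outcome when the predictor is 0
    predicted = [b0 + b1 * xi for xi in x]
    ss_total = sum((yi - my) ** 2 for yi in y)
    ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_model = ss_total - ss_residual
    r_squared = ss_model / ss_total  # proportion of explained variance
    # F = model mean square / residual mean square
    # (1 predictor, so 1 model df and n - 2 residual df)
    f_value = (ss_model / 1) / (ss_residual / (n - 2))
    return b0, b1, r_squared, f_value

# Made-up illustration data
hours = [1, 2, 3, 4, 5]
grade = [2, 4, 5, 4, 5]
b0, b1, r2, f = fit_simple_regression(hours, grade)
print(b0, b1, r2, f)
```

For this toy data the line is roughly grade = 2.2 + 0.6 × hours, so each extra hour is associated with 0.6 extra grade points.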

b-coefficients vs. beta-coefficients
- b = the change in the outcome associated with a one-unit change in the predictor
- beta = the same as the b-value, but expressed in standard deviations. Because
these values are standardized, we can compare them across studies or across the
predictors in a multiple regression
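
The link between b and beta can be sketched as beta = b × (SD of predictor / SD of outcome). This is a minimal illustration with made-up data and a hypothetical helper name; in a simple regression the result equals the Pearson correlation between predictor and outcome.

```python
from statistics import mean, stdev

def slope(x, y):
    """Unstandardized b for a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def standardized_beta(x, y):
    """beta = b rescaled to standard-deviation units."""
    return slope(x, y) * stdev(x) / stdev(y)

# Made-up illustration data
hours = [1, 2, 3, 4, 5]
grade = [2, 4, 5, 4, 5]
b = slope(hours, grade)
beta = standardized_beta(hours, grade)
print(b, beta)
```

Here b is 0.6 grade points per hour, while beta says that one standard deviation more study time goes with about 0.77 standard deviations higher grade.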

How good is the model?
- If the regression model can predict something, its line will be steeper than the flat line
at the mean of the dependent variable
- If the F-value is greater than 1, the model explains more variance than it leaves
unexplained (comparing mean squares)
  - F = 100: there is 100 times more explained variance than unexplained variance
  - F = 1: explained and unexplained variance are the same
- To check how well the model fits the data, we check multiple things:
  - Standardized residuals/residual distance – for cases with a large prediction error
    - Distance from the individual points to the regression line (the model)
    - Influential cases that might bias the regression model do not necessarily have
large residuals, which is why we also check other distances
  - Mahalanobis distance – for outlying cases on the predictor
    - Distance between an individual point and the other points in the
space of the independent variables (thus, on the x-axis)
  - Cook’s distance – for influential cases; measures the influence of a single case on
the model as a whole
    - How much does the regression slope shift due to the inclusion of this outlier?
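
To illustrate why an influential case does not need a large residual, here is a minimal sketch of Cook's distance for a simple regression. It assumes the standard textbook formula based on the residual and the leverage of each case; the function name and data are made up for illustration.

```python
from statistics import mean

def cooks_distances(x, y):
    """Return residuals and Cook's distances for a simple regression of y on x."""
    n, p = len(x), 2  # p = number of estimated parameters (b0 and b1)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        # Leverage: how far this case lies from the mean of the predictor
        leverage = 1 / n + (xi - mx) ** 2 / sxx
        distances.append(e ** 2 / (p * mse) * leverage / (1 - leverage) ** 2)
    return residuals, distances

# Made-up data: the last case lies far out on the predictor
x = [1, 2, 3, 4, 5, 10]
y = [1, 2, 3, 4, 5, 2]
residuals, cooks = cooks_distances(x, y)
for e, d in zip(residuals, cooks):
    print(round(e, 2), round(d, 2))
```

In this toy data the last case pulls the line towards itself, so its residual is smaller than that of the fifth case, yet its Cook's distance is by far the largest.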

General rules to see if there is an outlier based on standardized residuals:
1. A standardized residual with an absolute value greater than 3.29 (approximately 3) is
cause for concern
2. If more than 1% of the sample cases have an absolute standardized residual above 2.58
(approximately 2.5), it is cause for concern
3. If more than 5% of the sample cases have an absolute standardized residual above 1.96
(approximately 2), it is cause for concern
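
The three rules above can be sketched as a small check on a list of standardized residuals. The function name, dictionary keys and the example residuals are hypothetical and only serve as an illustration.

```python
def residual_outlier_flags(z_residuals):
    """Apply the three rules of thumb to a list of standardized residuals."""
    n = len(z_residuals)
    abs_z = [abs(z) for z in z_residuals]
    share_above_258 = sum(z > 2.58 for z in abs_z) / n
    share_above_196 = sum(z > 1.96 for z in abs_z) / n
    return {
        "any_above_3.29": any(z > 3.29 for z in abs_z),
        "more_than_1pct_above_2.58": share_above_258 > 0.01,
        "more_than_5pct_above_1.96": share_above_196 > 0.05,
    }

# Made-up samples of 100 standardized residuals each
clean = [0.5] * 50 + [-0.5] * 50
suspect = [0.5] * 94 + [2.0] * 4 + [-3.5] * 2
clean_flags = residual_outlier_flags(clean)
suspect_flags = residual_outlier_flags(suspect)
print(clean_flags)
print(suspect_flags)
```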

General rules to see if there is an outlier based on the Mahalanobis distance:
1. Influential cases have values above 25 in large samples (500 or more)
2. Influential cases have values above 15 in smaller samples (around 100)
3. Influential cases have values above 11 in small samples (30 or less)
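
As a rough sketch of what the Mahalanobis distance measures, the code below computes it for two predictors by writing out the inverse of the 2 × 2 covariance matrix. It returns the squared distance, on the assumption that this is the scale the cut-offs above refer to; the function name and data are made up for illustration.

```python
from statistics import mean

def mahalanobis_squared(x1, x2):
    """Squared Mahalanobis distance of each case from the centroid of two predictors."""
    n = len(x1)
    m1, m2 = mean(x1), mean(x2)
    # Sample variances and covariance (n - 1 in the denominator)
    s11 = sum((a - m1) ** 2 for a in x1) / (n - 1)
    s22 = sum((b - m2) ** 2 for b in x2) / (n - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / (n - 1)
    det = s11 * s22 - s12 ** 2
    distances = []
    for a, b in zip(x1, x2):
        d1, d2 = a - m1, b - m2
        # d' S^-1 d written out for a 2 x 2 covariance matrix
        distances.append((s22 * d1 ** 2 - 2 * s12 * d1 * d2 + s11 * d2 ** 2) / det)
    return distances

# Made-up data: the last case is far from the others on both predictors
x1 = [1, 2, 3, 4, 5, 12]
x2 = [2, 1, 4, 3, 5, 0]
d2_values = mahalanobis_squared(x1, x2)
print([round(d, 2) for d in d2_values])
```

Only the predictors enter the calculation, not the outcome, which is why this distance flags cases that are unusual on the independent variables.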

Multiple regression – this is the same as a simple linear regression, but with multiple
predictors.
- Ideally, all predictors have a high correlation with the outcome variable but the
correlations among the predictors are low. The higher the correlation among
predictors, the less unique information each predictor adds
- When the correlation among predictors is high, it causes multicollinearity: the
variables explain largely the same variance. SPSS estimates each regression
coefficient while controlling for the other predictors, which can cause differences
between the regression coefficients and the correlations (e.g. there is a positive
correlation yet the regression coefficient is negative). This is called bouncing betas
- Ways to detect multicollinearity:
1. Correlation between predictors higher than .80
2. VIF of a predictor > 10
3. Tolerance of a predictor < .10
- Apart from bouncing betas, multicollinearity also causes other problems: a
limited size of R given the number of predictors (a predictor with little unique
contribution adds little to R) and difficulty determining the importance of
individual predictors (related to bouncing betas)
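
Tolerance and VIF follow directly from how well one predictor can be predicted from the others: tolerance = 1 − R² of that predictor regressed on the other predictors, and VIF = 1 / tolerance. The sketch below covers only the two-predictor case, where that R² is simply the squared correlation between the two predictors; the function names and data are made up for illustration.

```python
from math import sqrt
from statistics import mean

def pearson_r(a, b):
    """Pearson correlation between two lists of equal length."""
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

def collinearity_check(x1, x2):
    """Correlation, tolerance and VIF for a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2  # share of variance not shared with the other predictor
    vif = 1 / tolerance
    return r, tolerance, vif

# Made-up predictor scores
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]
r, tolerance, vif = collinearity_check(x1, x2)
problem = abs(r) > 0.80 or vif > 10 or tolerance < 0.10
print(r, tolerance, vif, problem)
```

With more than two predictors the same idea applies, but each predictor has to be regressed on all the others to obtain its R².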

Assumptions Regression Analysis
1. Linearity – the relationship between the predictor and the outcome variable must be
linear
  - Check 1) a residual plot with Zpred (X-axis) vs. Zresid (Y-axis), or 2) a scatterplot
with the predictor (X-axis) vs. the dependent variable (Y-axis)
  - If the residuals show a curved pattern, the regression model is not optimal and the
assumption is not met
2. Homoscedasticity / homogeneity of variance – for each value of the predictors, the
variance of the residuals should be equal (or: the spread of outcome scores is roughly
equal at different points of the predictor variable)
  - Check the residual plot with Zpred (X-axis) vs. Zresid (Y-axis)
  - The residuals should all be evenly scattered around 0, with roughly equal
numbers of residuals on all sides (left, right, below and above). If this is not the
case, we call it heteroscedasticity
  - If the residuals increase with the predicted values, the heteroscedasticity may be
explained by another predictor
3. Normally distributed errors – if the errors are not normally distributed, we cannot
trust the p-values of the significance tests (with small N)
  - Check 1) a histogram of the residuals for multiple peaks or outliers, 2) a scatterplot
with Zpred (X-axis) and Zresid (Y-axis) for the normal curve, or 3) Q-Q plots
4. Independence of errors – each value of the outcome variable should come from a
different person
  - Error terms of observations should be uncorrelated
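
The Zpred and Zresid values used in the residual plots can be sketched as follows for a simple regression. This assumes that Zpred is the predicted value standardized with its own mean and standard deviation, and that Zresid is the residual divided by the standard error of the estimate; the function name and data are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def zpred_zresid(x, y):
    """Return standardized predicted values and standardized residuals."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    predicted = [b0 + b1 * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    # Assumption: residuals are scaled by the standard error of the estimate
    s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    mean_pred, sd_pred = mean(predicted), stdev(predicted)
    zpred = [(pi - mean_pred) / sd_pred for pi in predicted]
    zresid = [e / s for e in residuals]
    return zpred, zresid

# Made-up illustration data
hours = [1, 2, 3, 4, 5]
grade = [2, 4, 5, 4, 5]
zpred, zresid = zpred_zresid(hours, grade)
for zp, zr in zip(zpred, zresid):
    print(round(zp, 2), round(zr, 2))
```

Plotting the pairs with Zpred on the X-axis and Zresid on the Y-axis gives the residual plot used to judge linearity and homoscedasticity.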

Seller: nienkevermaat (Erasmus Universiteit Rotterdam)