Summary

Summary lectures Applied Multivariate Data Analysis (Fall) (6462PS009Y)

Pages: 46
Uploaded on: 20-01-2023
Written in: 2022/2023

Summary of all the lectures regarding the four topics for the AMDA fall exam.


Content preview

Exam AMDA

Topic 1 - PCA & CFA

Learning Goals

1) Understand how PCA and CFA are (un)related in terms of both solution structure and
model specification.

PCA is widely applied when working with many variables that measure the same concept.
- You have an idea of how to design a questionnaire for some “concept” —> you design questions that address several subconcepts —> are the items actually adequate in their subdomain?

—> Important for dimensionality reduction and scale construction; for example, when measuring intelligence: is it 1 concept? 2 concepts? 3 concepts?

Two main approaches
1) Principal component analysis (components are ordered; the first one is the principal component).
- Use this technique to get an idea of any underlying structure —> a set of variables on which you can compute a correlation matrix. This set has no distinction between predictors and outcomes; we are interested in how the variables interrelate/correlate.
- Exploring (not testing!) (no significance test).
- No pre-assumed structure: all 20 variables could be either predictors or outcomes.

2) Factor analysis
- Also deals with groups of variables, but starts from an initial idea of what these groups might be.
- Predefined structures, which are tested/evaluated in the factor analysis. The predefinition might come from previous research or your own theory.
- Confirmatory method.
- Specifies a precise model for the relationship between items and scales —> is this model true for your current sample?

So in short:
Both methods deal with groups of variables. PCA tries to identify potential groups of variables based on the correlation structure. In CFA, we start from an idea of how the variables group and test whether this grouping structure works for our data.

Confirmatory Factor Analysis
- Tests a specific factor structure —> a predefined structure grouping the items/variables. There is no particular predictor or outcome in this analysis.
- We use fit measures to tell us how well our predefined structure works for the data that we have.

- How do we test this?

In linear regression —> explained variance (to check the fit) via residuals (observed minus predicted scores). The closer the predicted scores are to the outcomes, the better the model works (significant regression coefficients).

In CFA —> a matrix situation: from the set of variables we construct a covariance matrix (a correlation matrix is its standardized version). This is the observed matrix, because we can compute the covariances between the things we have observed. We then define a factor model and use it to predict a covariance matrix.
So we compare the observed covariance matrix with the predicted covariance matrix —> residuals in matrix shape.
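This comparison can be sketched numerically. A minimal sketch, assuming a hypothetical one-factor model for 3 items with made-up loadings, factor variance, error variances, and an invented "observed" covariance matrix (none of these numbers come from the lecture):

```python
import numpy as np

# Hypothetical one-factor model for 3 items: Sigma = L @ Phi @ L.T + Theta
L = np.array([[1.0], [0.8], [0.6]])      # factor loadings (first fixed to 1)
Phi = np.array([[0.5]])                  # factor variance
Theta = np.diag([0.3, 0.4, 0.5])         # unique (error) variances

Sigma_model = L @ Phi @ L.T + Theta      # model-implied (predicted) covariance matrix

# Made-up "observed" covariance matrix for the same 3 items
S = np.array([[0.82, 0.41, 0.29],
              [0.41, 0.73, 0.25],
              [0.29, 0.25, 0.69]])

residuals = S - Sigma_model              # misfit, in matrix shape
print(np.round(residuals, 2))
```

The closer the residual matrix is to zero, the better the predefined structure reproduces the observed covariances.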

> Interdependence technique —> no predictor vs. outcome; some form of common structure among the variables (CFA).
> Dependence technique —> a set of predictors, assuming that the score on y depends on the predictor variables (linear regression).

Aim: confirm our theoretical construct division.

Technical aim:
—> Reproduce the correlation / predict the covariance matrix.
—> Error —> misfit between observed and predicted matrix. Errors are assumed uncorrelated.
—> Correlations among the observed variables should be explained by the common factors (this sounds like regression).
—> A regression equation with manifest response variables and latent predictors: assume there is some underlying, not directly observable process going on (F1 and F2) —> but it does lead to differences in the scores on the variables that we do observe (variables X1-X6).

So there is a simple linear regression for each item: F1 —> X1, F1 —> X2, etc.
We can predict scores for the items and thereby predict a covariance matrix. We assume something is going on that we cannot see, and that something leads to the scores we observe on the variables themselves.
Factors can be correlated or not. Items can also be explained by more than one factor (crossloadings) —> if there are many crossloadings, you may be ignoring the correlations between the factors.
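The idea that unseen factors drive the observed item scores can be simulated. A minimal sketch, assuming two correlated latent factors F1 and F2, hypothetical loadings (X1-X3 on F1, X4-X6 on F2, no crossloadings), and arbitrary error variance; all numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical loading matrix: X1-X3 load on F1, X4-X6 on F2 (no crossloadings)
Lambda = np.array([[0.9, 0.0],
                   [0.8, 0.0],
                   [0.7, 0.0],
                   [0.0, 0.9],
                   [0.0, 0.8],
                   [0.0, 0.7]])

# Correlated latent factors F1, F2 (correlation 0.3); never observed directly
Phi = np.array([[1.0, 0.3], [0.3, 1.0]])
F = rng.multivariate_normal([0, 0], Phi, size=n)   # latent scores, n x 2
E = rng.normal(scale=0.5, size=(n, 6))             # unique errors per item

X = F @ Lambda.T + E   # observed item scores X1..X6: each a regression on its factor
print(np.round(np.corrcoef(X, rowvar=False), 2))
```

Items loading on the same factor end up correlating more strongly with each other than with items from the other factor, which is exactly the pattern CFA tries to reproduce.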

Compared to component analysis:
For each component, there is an arrow to all of the items. Some items will have loadings close to zero and some will be very high.

CFA is stricter than PCA, because in CFA an item's loading has to be exactly zero unless the model specifies otherwise.

Factors: theoretical constructs that we examine —> our CFA may even be derived from an earlier PCA.
Components: empirically suggested combinations of variables —> these may or may not have meaning. In CFA you already assume that the structure has meaning.

PCA —> you have a set of variables and no idea what will happen in terms of structure.
EFA —> instruments that have never been tested before. There are some ideas about how items may be correlated, so there is a theory, but you are not testing this theory.
CFA —> one single strong conceptual idea of the factor structure.

Example:

4 correlated factors and 15 items. Assume that each item corresponds to one factor only. —> Number of (co)variances: 0.5 x number of items x (number of items + 1) = 0.5 x 15 x 16 = 120 covariances in this case. —> These are the units of observation (for fit evaluation).
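The count of distinct (co)variances can be checked directly, using the p items from the example above:

```python
# Number of distinct (co)variances among p items: 0.5 * p * (p + 1)
# (p variances on the diagonal plus p*(p-1)/2 covariances off-diagonal)
p = 15
n_moments = p * (p + 1) // 2
print(n_moments)  # 120
```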




What elements are estimated in the model?
- 15 unique variances.
- 4 factor variances.
- Correlations between the factors —> 6 covariances between the factors; formula: 0.5 x number of factors x (number of factors - 1).
- 11 loadings (freely estimated factor loadings) —> the difference between 15 items and 4 factors. For each factor, one of the arrows needs a fixed factor loading (to set the scale); all other factor loadings are estimated relative to that number. Thus 4 loadings are fixed and 11 remain to be estimated.

Counting everything together —> 11 + 15 + 4 + 6 = 36 model parts to be estimated (the number of parameters estimated in the model). We have 120 covariances, so 120 - 36 = 84 degrees of freedom.
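The parameter counting above can be written out as a small calculation, using the same p = 15 items and k = 4 factors:

```python
p, k = 15, 4   # items, factors

loadings = p - k                 # one loading per factor is fixed to set its scale
uniquenesses = p                 # unique (error) variances, one per item
factor_vars = k                  # one variance per factor
factor_covs = k * (k - 1) // 2   # covariances among the factors

n_params = loadings + uniquenesses + factor_vars + factor_covs
n_moments = p * (p + 1) // 2     # observed (co)variances
df = n_moments - n_params        # degrees of freedom for fit evaluation
print(n_params, df)  # 36 84
```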

Check in the output:
- Warnings and errors
- Standardized residuals
- Residual distribution
- Model fit statistics
- Estimated parameters
- Suggestions for improvement

Assumptions
- CFA is performed on metric/numerical variables (scale/interval variables).
- The sample needs more observations than variables.
- For stable covariances we need 100 observations —> but CFA wants more, at least 200.
- Minimum 5, but preferably 10 observations per variable.
- A strong conceptual idea of what you are going to test —> a hypothesized model!

Rules of thumb

Look at the X2 statistic.
CFI —> comparative fit index > .95 (how well does your model fit).
SRMR —> < .08.
RMSEA —> < .06 (with a 90% confidence interval).
—> Also apply equivalence tests.
No need for rotation: we don't need to identify the best possible view of subgroups, since there is one specific structure defined by ourselves.
You can have variables that have a high coefficient on one factor only. Not per se a problem, as long as you can defend it.

Model specification:
- 4 factors = 4 latent variables.
—> With 13 variables —> 0.5 x 13 x 14 = 91 covariances, so 91 numbers in this dataset.

Residual distribution
—> Symmetric distribution, on average 0: equal amounts of over- and underestimation.

Interpreting model fit statistics

Chisq + df + p-value = fit statistics.
Baseline Chisq —> the difference between the model you have estimated and a model without any factor structure. The larger the X2, the larger the difference —> the difference between the covariance matrix based on your model and one based on a model without any factor structure. You want your specified model to be better than no model specification —> this test should be significant!

The other chi-square —> the difference between the observed and predicted covariance matrices, using your data and your model —> this one you do not want to be significant! They should be alike. A large difference means that your predictions do not resemble the observed matrix —> significantly deviating from our observations.

- CFI should be > .95.

We want a small standardized root mean square residual —> SRMR < .08.
RMSEA —> < .06; if 0 —> a perfect match between prediction and observation.

You can have a very reasonable model that does not fit strongly yet, based on the fit statistics and X2.
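The rule-of-thumb cutoffs above can be bundled into one check. A minimal sketch; the function name and the example fit values are invented, and real software reports these indices alongside the chi-square rather than as a single verdict:

```python
def fit_ok(cfi, srmr, rmsea):
    """Apply the common rule-of-thumb cutoffs: CFI > .95, SRMR < .08, RMSEA < .06."""
    return cfi > 0.95 and srmr < 0.08 and rmsea < 0.06

# Hypothetical fit values for two models
print(fit_ok(cfi=0.97, srmr=0.05, rmsea=0.04))  # True: all three cutoffs met
print(fit_ok(cfi=0.90, srmr=0.05, rmsea=0.04))  # False: CFI below .95
```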

Suggestions for improvement

Maybe you are being too strict by not allowing some factors to predict certain items; for example, should factor 2 also be allowed to load on items from factor 1? —> this may add explained variance, leading to a sufficient fit.

Request modification indices, which suggest where you might want to add things to your model to improve the fit. For example: also letting factor 3 load on the vocabulary item would lead to a 2.494 decrease in misfit.

2) Understand how PCA goes from data to component structure.

PCA relies on the interrelationships between variables. The technique searches for a component structure by finding groups of variables that show high correlations within the group, but lower correlations between groups.

Interpretation comes afterwards, as it is an exploratory technique. It is possible that the structures do not make sense —> probably due to weak correlations.

So —> going from data (a correlation matrix) to potentially suggested models that support a theory.
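The data-to-structure step can be sketched with the standard eigendecomposition of the correlation matrix (one common way to compute PCA; the data here are simulated, with two built-in variable pairs):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
X[:, 1] += X[:, 0]   # make variables 0 and 1 correlate (a "group")
X[:, 3] += X[:, 2]   # make variables 2 and 3 correlate (another "group")

R = np.corrcoef(X, rowvar=False)       # correlation matrix of the data
eigvals, eigvecs = np.linalg.eigh(R)   # PCA = eigendecomposition of R
order = np.argsort(eigvals)[::-1]      # sort components by variance explained
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()    # proportion of variance per component
print(np.round(explained, 2))
```

The eigenvectors (loadings) of the first components then show which variables cluster together; interpreting those clusters is the exploratory step that comes afterwards.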
