
Summary (Multivariate Statistics) Principal Component Analysis (Ch1)

Pages
5
Uploaded on
29-12-2025
Written in
2025/2026

This document introduces Principal Component Analysis as a dimensionality reduction technique used to simplify high-dimensional data while preserving most of its variance. It covers the motivation behind PCA, the underlying idea of principal components, the mathematical procedure involved, and the interpretation of results. The notes emphasize both theoretical understanding and practical application.


Principal Component Analysis (PCA)

1. Motivation
In modern data analysis, datasets often consist of a large number of variables, sometimes ranging from
tens to thousands of features. While having more variables can increase the richness of information, it
also introduces several significant challenges.

One major issue is the curse of dimensionality, where the volume of the data space increases
exponentially with the number of dimensions. As dimensionality grows, data points become sparse,
making it difficult to identify meaningful patterns or relationships. This negatively affects statistical
modeling, machine learning performance, and computational efficiency.

Another challenge is redundancy among variables. In many datasets, variables are highly correlated,
meaning they carry overlapping information. For example, in socioeconomic data, income, education
level, and occupation may all reflect similar underlying trends. Analyzing all correlated variables
separately can be inefficient and may distort results.

Additionally, high-dimensional data is difficult to visualize. Humans can easily interpret one-, two-, or
three-dimensional plots, but beyond that, direct visualization becomes impossible. This limits
exploratory data analysis and intuitive understanding.

Principal Component Analysis (PCA) is widely used to address these challenges. Its main goals are:

• To reduce dimensionality while retaining as much relevant information as possible

• To eliminate redundancy by transforming correlated variables into uncorrelated components

• To simplify data structure, making analysis, visualization, and modeling more efficient

In summary, PCA provides a mathematically sound method to compress data without significantly
sacrificing the underlying structure or variability.




2. The Idea
The fundamental idea behind Principal Component Analysis is to re-express the data in a new coordinate
system where the axes represent directions of maximum variance.

Instead of analyzing the original variables directly, PCA constructs new variables known as principal
components. These components are derived as linear combinations of the original variables, meaning
each principal component is formed by multiplying each original variable by a coefficient and summing
the results.
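
In symbols, this construction can be sketched as follows. The notation is an assumption introduced here for illustration and does not come from the notes: X_1, ..., X_p are the original variables, Z_k is the k-th principal component, and a_k1, ..., a_kp are its coefficients, conventionally scaled so that the coefficient vector has unit length.

```latex
Z_k = a_{k1} X_1 + a_{k2} X_2 + \cdots + a_{kp} X_p
    = \mathbf{a}_k^{\top} \mathbf{X},
\qquad \|\mathbf{a}_k\| = 1
```

The unit-length constraint matters: without it, the variance of Z_k could be made arbitrarily large simply by scaling the coefficients.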

Key Properties of Principal Components:

1. Orthogonality

o Each principal component direction is orthogonal (perpendicular) to all others.

o Because these directions are eigenvectors of the covariance matrix, the resulting components are uncorrelated and capture distinct information.

2. Variance Maximization

o The first principal component captures the maximum possible variance in the data.

o Each subsequent component captures the maximum remaining variance subject to
being orthogonal to previous components.

3. Ordered Importance

o Components are ordered by importance based on how much variance they explain.

o Often, the first few components account for most of the total variance, so the
remaining ones can be dropped with little loss of information.
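
The three properties above can be checked numerically. The following is a minimal sketch, assuming NumPy and a small synthetic dataset (the data, the variable names, and the choice of an eigendecomposition of the sample covariance matrix are illustrative assumptions, not the procedure prescribed by the notes).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 observations of 3 variables,
# the first two driven by the same latent factor (so they are correlated).
latent = rng.normal(size=(200, 1))
X = np.hstack([
    latent + 0.1 * rng.normal(size=(200, 1)),
    2.0 * latent + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

# Center the data, then eigendecompose the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order

# Reorder so that the first component explains the most variance.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Principal component scores: the data expressed in the new coordinates.
scores = Xc @ eigvecs

# 1. Orthogonality: the coefficient vectors are orthonormal ...
orthonormal = np.allclose(eigvecs.T @ eigvecs, np.eye(3))
# ... and the scores are uncorrelated (their covariance matrix is diagonal).
uncorrelated = np.allclose(np.cov(scores, rowvar=False), np.diag(eigvals))

# 2. and 3. Each component's variance is its eigenvalue, in decreasing order.
explained_ratio = eigvals / eigvals.sum()
cumulative = np.cumsum(explained_ratio)

print("orthonormal directions:", orthonormal)
print("uncorrelated scores:   ", uncorrelated)
print("explained variance ratio:", explained_ratio.round(3))
print("cumulative ratio:        ", cumulative.round(3))
```

The cumulative ratio is what is usually inspected when deciding how many components to keep, for example by retaining the smallest number of components that reaches a chosen share of the total variance.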

Geometrically, PCA can be understood as rotating the coordinate axes to align them with the directions
of greatest data spread. Instead of measuring variability along arbitrary original axes, PCA identifies the
most informative directions inherent in the data.

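This transformation allows complex datasets to be represented in a more compact and interpretable
form. When all components are kept, the rotation leaves the distances between observations unchanged;
when only the first few are kept, those distances are preserved approximately.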
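
The geometric picture can be illustrated with a short sketch, assuming NumPy and randomly generated data (the data and the helper `pairwise_distances` are hypothetical, introduced only for this example). It obtains the principal axes from the singular value decomposition of the centered data and compares distances between observations before and after the change of coordinates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 50 observations of 4 correlated variables.
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))
Xc = X - X.mean(axis=0)

# Rows of Vt are the principal axes (orthonormal directions).
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

# Coordinates of each observation in the rotated system.
scores = Xc @ Vt.T


def pairwise_distances(A):
    """Euclidean distance between every pair of rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


d_original = pairwise_distances(Xc)
d_rotated = pairwise_distances(scores)          # all 4 components kept
d_reduced = pairwise_distances(scores[:, :2])   # only the first 2 kept

# Keeping every component is an orthogonal change of axes: distances are unchanged.
distances_preserved = np.allclose(d_original, d_rotated)

# Dropping components projects the data, so distances can only shrink.
never_longer = bool(np.all(d_reduced <= d_rotated + 1e-12))

print("distances preserved with all components:", distances_preserved)
print("reduced distances never exceed originals:", never_longer)
```

How much the distances shrink after truncation depends on how much variance the discarded components carried.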




3. The Method
Principal Component Analysis follows a systematic mathematical procedure. Each step is essential to
ensure accurate and meaningful results.

3.1 Data Standardization
Before applying PCA, the data is usually standardized:
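
One common form of standardization is the z-score, which rescales each variable to mean 0 and standard deviation 1. The sketch below assumes that this is the form intended, and assumes NumPy; the function name `standardize` and the example values are hypothetical.

```python
import numpy as np


def standardize(X):
    """Return z-scores: each column gets mean 0 and sample standard deviation 1.

    Assumes no column is constant (a constant column has zero standard
    deviation and would cause a division by zero).
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation
    return (X - mean) / std


# Hypothetical data: two variables on very different scales.
X = np.array([
    [30_000.0, 12.0],
    [45_000.0, 16.0],
    [60_000.0, 18.0],
    [52_000.0, 14.0],
])

Z = standardize(X)

print(Z.mean(axis=0).round(12))       # approximately 0 for each column
print(Z.std(axis=0, ddof=1).round(12))  # 1 for each column
```

After z-scoring, the covariance matrix of the standardized data equals the correlation matrix of the original data, so variables measured in large units no longer dominate the components.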

Seller: lucastitodemoraisv2 (ISEG)