Columbia
Links: Index, Third Year, Fall 2025 Term 1,
STAT 306 - Finding Relationships in Data
#classnotes
Subject: MATH307 MATH340 #STAT306 PHYS119
Topic::
Slide 18 - Principal Component Analysis (PCA)
We want to transform a data frame with many columns into one with fewer columns.
In high-dimensional spaces, visualization, model fitting, and distance metrics (SD, Var)
all become problematic.
To get around this, there are strategies such as regularization and dimensionality reduction.
PCA
In this example, we drop magnesium, which projects the points onto a horizontal line. We can
instead project them onto other lines, planes, or hyperplanes.
-> We project onto the line along which the variability is highest, so that we retain as much
information as possible when we move to lower dimensions.
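A minimal sketch of the comparison above, assuming NumPy and made-up two-column data (the columns, sample size, and the helper name first_principal_direction are hypothetical, not from the slides). It compares the variance kept by dropping a column (projecting onto the horizontal axis) with the variance kept by projecting onto the direction of highest variability, found here as the top eigenvector of the sample covariance matrix.

```python
import numpy as np

def first_principal_direction(X):
    """Unit vector along which the centered data has the largest sample variance."""
    Xc = X - X.mean(axis=0)                  # center each column
    cov = np.cov(Xc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, -1]                    # eigenvector of the largest eigenvalue

# Hypothetical data: the second column is correlated with the first.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.8 * x + 0.3 * rng.normal(size=200)
X = np.column_stack([x, y])

Xc = X - X.mean(axis=0)
w = first_principal_direction(X)

# Option 1: drop the second column, i.e. project onto the horizontal axis.
var_drop_column = Xc[:, 0].var(ddof=1)

# Option 2: project onto the highest-variance line.
var_first_pc = (Xc @ w).var(ddof=1)

# Total variance across both columns, for reference.
total_var = Xc.var(axis=0, ddof=1).sum()

print("variance kept by dropping a column:", var_drop_column)
print("variance kept by first principal direction:", var_first_pc)
print("total variance:", total_var)
```

The variance along the first principal direction is never smaller than the variance along any single axis, which is why projecting onto it loses less information than simply dropping a column.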
Idea: I want to “compress” my data into a lower dimension in such a way that if I were to try to recover
my original data from the compressed data, the reconstructed points would be as close as