PCA is like rotating a 3D object until you find the angle where a 2D photograph of it shows the most detail.
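- Standardisation: Rescale every variable to a common scale (typically mean 0 and standard deviation 1) so that variables measured in large units do not dominate.
- Covariance: Measures how the variables move together. If two variables are strongly related, they carry largely redundant information.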
- Identifying the axes: PCA finds new axes (principal components) for your data:
PC1: The direction along which the data is most spread out (highest variance)
PC2: The direction perpendicular to PC1 that captures the most of the remaining variance
- Dimensionality reduction: You keep the top few components and discard the rest, which removes much of the noise.
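
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production implementation: the function name `pca`, the toy data, and the choice of `np.linalg.eigh` on the covariance matrix are assumptions made for this example (libraries often use an SVD instead).

```python
import numpy as np

def pca(X, n_components):
    """Minimal PCA sketch: standardise, covariance, find axes, reduce."""
    # Standardisation: give every column mean 0 and standard deviation 1
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance: how each pair of (standardised) variables moves together
    cov = np.cov(Z, rowvar=False)
    # Identifying the axes: eigenvectors of the covariance matrix are the
    # principal directions, eigenvalues are the variance along each one.
    # eigh returns eigenvalues in ascending order, so sort them descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Dimensionality reduction: keep only the top components
    components = eigvecs[:, :n_components]
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    return Z @ components, components, explained_ratio

# Hypothetical toy data: column 1 is almost a copy of column 0,
# column 2 is unrelated noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([
    x,
    2 * x + rng.normal(scale=0.1, size=200),
    rng.normal(size=200),
])

scores, components, explained_ratio = pca(X, n_components=2)
print(scores.shape)       # 200 rows, now described by 2 columns instead of 3
print(explained_ratio)    # share of total variance captured by PC1 and PC2
```

Because two of the three columns are nearly identical here, two components should be enough to capture almost all of the variance.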
PCA does not delete the original variables directly. Directions with very low variance are dropped, and variables that are highly correlated with each other end up merged into the same component rather than kept separately.
This is where the covariance matrix comes in: it compares every variable with every other variable.
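
To make the "every variable with every other variable" idea concrete, here is a small sketch using `np.cov` and `np.corrcoef`. The variable names and the data are made up for illustration.

```python
import numpy as np

# Hypothetical data: two columns that measure almost the same thing,
# plus one unrelated column.
rng = np.random.default_rng(1)
height_cm = rng.normal(170, 10, size=500)
height_in = height_cm / 2.54 + rng.normal(0, 0.2, size=500)
unrelated = rng.normal(0, 1, size=500)

X = np.column_stack([height_cm, height_in, unrelated])

# Entry [i, j] compares variable i with variable j, so 3 variables give a
# 3x3 matrix. The diagonal holds each variable's own variance.
cov = np.cov(X, rowvar=False)

# The correlation matrix is the same comparison rescaled to the range [-1, 1],
# which makes it easier to read off which pairs are redundant.
corr = np.corrcoef(X, rowvar=False)

print(np.round(cov, 2))
print(np.round(corr, 2))
```

In the correlation matrix, the entry for the two height columns should sit close to 1, while the entries involving the unrelated column should sit close to 0. The first pair is the kind of redundancy PCA folds into a single component.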