Remember that Naive Bayes assumes the features are independent of each other given the class, so it works best when the features are uncorrelated. We can use PCA to derive a new set of variables that are:
- Linear combinations of the original variables
- Uncorrelated with each other
- Ordered by decreasing importance (the amount of variance each one explains)
Setting
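Let $X \in \mathbb{R}^{n \times d}$ be the data matrix with $n$ samples and $d$ features, centered so that every column has zero mean.
We want to find the orthogonal transformation $W$ such that the new variables $Z = XW$ are uncorrelated and ordered by decreasing variance.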
Solution
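A sketch of the standard argument, included here to make the next two points concrete. The symbols are notation assumed for illustration: $\Sigma$ is the covariance matrix of the centered data, $x$ is a single centered sample, $w$ is a candidate direction of unit length, and $\lambda$ is a Lagrange multiplier. We look for the direction with the largest projected variance:

```latex
\begin{gather*}
\max_{w} \; w^\top \Sigma w \quad \text{subject to} \quad w^\top w = 1 \\
\mathcal{L}(w, \lambda) = w^\top \Sigma w - \lambda \left( w^\top w - 1 \right) \\
\nabla_w \mathcal{L} = 2 \Sigma w - 2 \lambda w = 0
\quad \Longrightarrow \quad \Sigma w = \lambda w \\
\operatorname{Var}(w^\top x) = w^\top \Sigma w = \lambda \, w^\top w = \lambda
\end{gather*}
```

The stationarity condition says $w$ must be an eigenvector of $\Sigma$, and the last line says the variance along $w$ equals its eigenvalue, so the maximum is reached at the eigenvector with the largest eigenvalue.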
There are two interesting points in the derivation of PCA:
- The direction in which the variance is largest is an eigenvector of the data's covariance matrix.
- The eigenvalue of that eigenvector is the variance along that direction. As such, we can find all the eigenvectors and sort them by eigenvalue to decide which directions to keep as the axes of our new space.
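The two points above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `pca_eig`, its parameter `k`, and the synthetic data are all assumptions made up for this example.

```python
import numpy as np

def pca_eig(X, k):
    """Hypothetical helper: project X (n_samples x n_features) onto its
    top-k principal directions using an eigendecomposition."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # sample covariance matrix (d x d)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh suits symmetric matrices; eigenvalues come back ascending
    order = np.argsort(eigvals)[::-1]        # re-order from largest to smallest eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :k]                       # columns are the k directions we keep
    return Xc @ W, eigvals[:k], W

# Synthetic data (an assumption for illustration): two strongly
# correlated features plus one independent feature.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
X = np.column_stack([a, 2 * a + 0.1 * rng.normal(size=500), rng.normal(size=500)])

Z, variances, W = pca_eig(X, k=3)
print("variance along each new axis:", variances)
print("covariance of the new variables:\n", np.round(np.cov(Z, rowvar=False), 6))
```

The covariance of `Z` should come out diagonal (the new variables are uncorrelated), with the diagonal entries equal to the sorted eigenvalues. Passing a smaller `k` keeps only the leading directions.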