Remember that Naive Bayes assumes the features are conditionally independent given the class, so it tends to work poorly when the features are strongly correlated with each other. We can use PCA to derive a new set of variables that are:
- Linear combinations of the original variables
- Uncorrelated with each other
- Ordered by decreasing importance (variance)
Setting
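Let $x_1, \dots, x_p$ be the original variables and $z_1, \dots, z_p$ be the linear combinations of these variables.
We want to find the orthogonal transformation that yields new variables whose variances are stationary (maximal or minimal) when the weight vectors are constrained to unit length. The point is to project the data onto the direction along which the variance is largest, and then to repeat this for each further direction orthogonal to the ones already found.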
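One way to write this setting down is sketched below. The notation is an assumption made for illustration and is not taken from the notes: $x$ is the vector of original variables, $\Sigma$ is its covariance matrix, and $a_k$ is the weight vector that defines the $k$-th new variable $z_k$.

```latex
\begin{align*}
  z_k &= a_k^{\top} x, &
  \operatorname{Var}(z_k) &= a_k^{\top} \Sigma \, a_k, \\
  a_k &= \arg\max_{a} \; a^{\top} \Sigma \, a
  & \text{subject to } a^{\top} a &= 1, \quad a^{\top} a_j = 0 \ \text{for } j < k.
\end{align*}
```

The unit-length constraint is what makes the problem well posed: without it, the variance could be made arbitrarily large simply by scaling $a$.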
Solution
There are two interesting points in the derivation of PCA:
- The direction where the variance is the largest is an eigenvector of the covariance matrix of the data.
- The eigenvalue of that eigenvector is the variance of the data projected onto it. As such, we can find all the eigenvectors and then sort them by eigenvalue, from largest to smallest, to see which directions to keep as the axes of our new space.
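The two points above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `pca_eig` and the synthetic data are hypothetical and only serve as illustration. It diagonalizes the sample covariance matrix, sorts the eigenvectors by eigenvalue, and projects the centered data onto them.

```python
import numpy as np

def pca_eig(X):
    """PCA via eigendecomposition of the sample covariance matrix.

    X has one row per observation and one column per variable.
    Returns eigenvalues (largest first), eigenvectors (as columns,
    in the same order), and the projected data.
    """
    Xc = X - X.mean(axis=0)                  # center each variable
    cov = np.cov(Xc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]        # indices for decreasing eigenvalue
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    Z = Xc @ eigvecs                         # new variables: linear combinations
    return eigvals, eigvecs, Z

# Synthetic data (illustrative assumption): two strongly correlated
# columns plus one independent column.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 1))
X = np.hstack([
    base + 0.1 * rng.normal(size=(500, 1)),
    2 * base + 0.1 * rng.normal(size=(500, 1)),
    rng.normal(size=(500, 1)),
])

eigvals, eigvecs, Z = pca_eig(X)

# Covariance of the new variables: off-diagonal entries are zero up to
# rounding, and the diagonal holds the eigenvalues in decreasing order.
cov_Z = np.cov(Z, rowvar=False)
print(np.round(cov_Z, 6))
print(eigvals)
```

Keeping only the first few columns of `Z` gives the reduced representation. How many to keep is a modelling choice that this sketch leaves open.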