Following LDA, we can then formalize Fisher’s Criterion for best separation
where
So the optimal w,
Note that the derivation comes from substituting
Some things to note with the value of
We define a Langragian function:
Let
which means that
Substituting
Remember that
Note that this is the binary linear discriminant. The multiclass version is slighty different.