Unlike standard logistic regression, which is binary, we want to classify each sample into one of multiple classes.
Binary classification:
- Weighted sum → Logistic function → Compare with threshold → Classification.

Multiclass classification:
- Weighted sum → Compare among peers (Max function) → Classification.
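
The two pipelines above can be sketched side by side. This is a minimal illustration, not a reference implementation: the function names, the `threshold=0.5` default, and the one-weight-row-per-class layout are all assumptions made for the example.

```python
import math

def weighted_sum(weights, bias, x):
    # Linear score: w . x + b
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict_binary(weights, bias, x, threshold=0.5):
    # Weighted sum -> logistic function -> compare with threshold
    z = weighted_sum(weights, bias, x)
    p = 1.0 / (1.0 + math.exp(-z))
    return 1 if p >= threshold else 0

def predict_multiclass(weight_rows, biases, x):
    # One weighted sum per class (hypothetical layout: one weight row per class),
    # then pick the class whose score is largest among its peers.
    scores = [weighted_sum(w, b, x) for w, b in zip(weight_rows, biases)]
    return max(range(len(scores)), key=lambda j: scores[j])

binary_label = predict_binary([1.0, -1.0], 0.0, [2.0, 1.0])
multi_label = predict_multiclass(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0], [2.0, 1.0]
)
```

Note that the multiclass version returns only a class index; it says nothing about how confident the model is.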
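But the scores compared by the max function don't sum to 1, so they can't be read as probabilities. We need softmax, which maps the score $z_j$ of each class to $e^{z_j} / \sum_k e^{z_k}$, giving positive values that sum to 1.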
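
A minimal softmax sketch in plain Python follows. The function name and the example scores are illustrative assumptions. Subtracting the maximum score before exponentiating is a common numerical-stability trick and does not change the result.

```python
import math

def softmax(scores):
    # Shift by the max score so math.exp never overflows; the shift cancels
    # out in the ratio, so the probabilities are unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical per-class scores (weighted sums) for one sample.
probs = softmax([2.0, 1.0, 0.1])
```

The largest score still gets the largest probability, so taking the most probable class gives the same prediction as the max function.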
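And the loss function is the cross-entropy loss:
$$L = -\sum_{i} \sum_{j} y_{ij} \log p_{ij}$$
Note that $p_{ij}$ is the output probability of the $i$th sample being in the $j$th class, and $y_{ij}$ is the indicator of the correct class label (1 if sample $i$ belongs to class $j$, 0 otherwise).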
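
As a rough sketch, the cross-entropy loss can be computed from a list of predicted probability rows and matching one-hot label rows. The function name, the `eps` clipping value, and the choice to sum over samples (rather than average) are assumptions for this example.

```python
import math

def cross_entropy(probs, labels, eps=1e-12):
    # probs[i][j]: predicted probability that sample i is in class j
    # labels[i][j]: 1 if sample i truly belongs to class j, else 0
    total = 0.0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            # Clip p away from 0 so log() is always defined (assumed safeguard).
            total -= y * math.log(max(p, eps))
    return total

# One sample, three classes, true class is the first one.
loss = cross_entropy([[0.7, 0.2, 0.1]], [[1, 0, 0]])
```

Because the labels are one-hot, only the probability assigned to the correct class contributes: here the loss is just `-log(0.7)`. Dividing by the number of samples would give the mean loss instead of the sum.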