For Multiclass Classification
Decision function:
$$\hat{y} = \operatorname*{argmax}_{k \in \{1, \dots, K\}} f_{k}(x)$$
i.e. predict the class whose score function is largest. The standard SVM only separates 2 classes, so we have two strategies to deal with multiclass classification:
- One-vs-All (OvA)
- Train one binary classifier per class: each sample is either in that class or out of it.
- Pick the classifier that gives the highest score.
- But this may lead to overlapping (ambiguous) regions, and each binary problem has a very unbalanced negative ("rest") class.
- Multicategory SVM
- Actually quite confusing, but basically instead of binary labels $y_{i} \in \{-1, +1\}$, we have class codes: $y_{i}$ is a $K$-vector with $1$ in the position of the true class and $-\frac{1}{K-1}$ everywhere else. $K$ refers to the number of classes.
- Then we set the objective function as
$$\min_{\mathbf{f}}\; \sum_{i=1}^{n} \sum_{j \neq c_{i}} \big(f_{j}(x_{i}) - y_{ij}\big)_{+} + \lambda \sum_{j=1}^{K} \Vert h_{j}\Vert^2_{\mathcal{H}_{K}}, \qquad \text{subject to } \sum_{j=1}^{K} f_{j}(x) = 0,$$
where $\mathbf{f}(x_{i}) = (f_{1}(x_{i}), \dots, f_{K}(x_{i}))$ is the predicted score vector for sample $i$, $c_{i}$ is the true class of sample $i$, and $y_{i}$ is its target class code. Each $(f_{j}(x_{i}) - y_{ij})_{+}$ is a hinge penalty – how far the prediction is from the target. Note that $(a)_{+} = \max{(0, a)}$, so basically you only pay a penalty when a wrong class scores above its code value $-\frac{1}{K-1}$.
Next, you have the regularization term $\Vert h_{j}\Vert^2_{\mathcal{H}_{K}}$ for each component $f_{j} = b_{j} + h_{j}$, weighted by the hyperparameter $\lambda$.
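To make the One-vs-All recipe above concrete, here is a rough sketch (not from the notes) in plain NumPy: each binary SVM is trained by subgradient descent on the hinge loss, and prediction picks the classifier with the highest score. The function names, hyperparameters, and the toy data are made up for illustration.

```python
import numpy as np

def train_binary_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM (hinge loss + L2 penalty) via subgradient descent.
    Labels y must be in {-1, +1}; the bias is folded into the weights."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    n, d = Xb.shape
    w = np.zeros(d)
    for _ in range(epochs):
        margins = y * (Xb @ w)
        mask = margins < 1                    # samples violating the margin
        grad = lam * w - (y[mask] @ Xb[mask]) / n
        w -= lr * grad
    return w

def ova_fit(X, y, K):
    """One-vs-All: one binary SVM per class ('in class k' vs. 'rest')."""
    return [train_binary_svm(X, np.where(y == k, 1.0, -1.0)) for k in range(K)]

def ova_predict(models, X):
    """Pick the class whose classifier gives the highest score."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    scores = np.column_stack([Xb @ w for w in models])
    return scores.argmax(axis=1)

# Toy demo: three well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(30, 2))
               for c in ([0, 0], [3, 0], [0, 3])])
y = np.repeat([0, 1, 2], 30)
models = ova_fit(X, y, K=3)
acc = (ova_predict(models, X) == y).mean()
```

Note that each binary subproblem here has twice as many negatives as positives, which is exactly the imbalance issue mentioned above.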
For Regression
Here is the function you want to minimize:
$$\min_{\beta_{0}, \beta}\; \sum_{i=1}^{n} V_{\epsilon}\big(y_{i} - f(x_{i})\big) + \frac{\lambda}{2}\Vert \beta \Vert^{2}, \qquad f(x) = \beta_{0} + x^{\top}\beta$$
- There is an additional parameter $\epsilon$ in the $\epsilon$-insensitive loss function
$$V_{\epsilon}(r) = \max{(0, \vert r \vert - \epsilon)},$$
which is there basically to tolerate some error: residuals smaller than $\epsilon$ in absolute value are not penalized at all.
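As a small sketch of the $\epsilon$-insensitive idea (the value `eps=0.5` and the sample residuals are arbitrary illustrations, not from the notes):

```python
import numpy as np

def eps_insensitive(r, eps=0.5):
    """epsilon-insensitive loss V_eps(r) = max(0, |r| - eps).
    Residuals inside the [-eps, eps] tube cost nothing."""
    return np.maximum(0.0, np.abs(r) - eps)

inside = eps_insensitive(np.array([0.1, -0.4]))   # within the tube -> 0
outside = eps_insensitive(np.array([1.5, -2.0]))  # beyond the tube -> |r| - eps
```

Residuals inside the tube incur zero loss; outside it the loss grows linearly, analogous to the hinge loss in classification.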