For Multiclass Classification

Decision function: $\hat{y}(x) = \arg\max_{j} f_{j}(x)$, where $f_{j}(x)$ is the score for class $j$.

A standard SVM separates only two classes, so there are two strategies for multiclass classification:

  • One-vs-All (OvA)
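    • Train one binary classifier per class: each sample is either in that class or out of it.
    • Pick the class whose classifier gives the highest score.
    • But this can leave ambiguous regions where several classifiers (or none) claim a point, and the negative ("all the rest") class is usually much larger than the positive one, so each binary problem is imbalanced.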
  • Multicategory SVM
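    • Actually quite confusing, but basically instead of labels $y_{i} \in \{-1, +1\}$ we have class codes: each $y_{i}$ is a $K$-vector with $1$ in the position of the true class and $-\frac{1}{K-1}$ everywhere else. $K$ refers to the number of classes.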
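- Then we set the objective function as

	$$\min_{\mathbf{f}} \; \frac{1}{n}\sum_{i=1}^{n} L(y_{i}) \cdot (\mathbf{f}(x_{i}) - y_{i})_{+} + \frac{\lambda}{2}\sum_{j=1}^{K} \Vert h_{j}\Vert^2_{\mathcal{H}_{K}}$$

	where $\mathbf{f}(x_{i}) = (f_{1}(x_{i}), \dots, f_{K}(x_{i}))$ is the predicted score vector for sample $i$, $y_{i}$ is the target class code for sample $i$, $L(y_{i})$ is the cost vector for sample $i$, $\lambda$ is the regularization weight, and $(\mathbf{f}(x_{i}) - y_{i})_{+}$ is the hinge penalty: how far the prediction overshoots the target. Note that $(a)_{+} = \max{(0, a)}$ is applied to each component. Basically, this means a component is penalized only when its score exceeds its target code; otherwise it contributes zero.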

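	Next, you have the regularization term $\Vert h_{j}\Vert^2_{\mathcal{H}_{K}}$, the squared RKHS norm of the function $h_{j}$ behind the score $f_{j}$ (the subscript in $\mathcal{H}_{K}$ names the kernel, not the number of classes), and the cost $L$, a quantity you choose that weights the hinge penalties.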
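To make the class codes and the hinge penalty concrete, here is a minimal sketch in plain Python. It assumes the coding where the true class gets $1$ and every other class gets $-\frac{1}{K-1}$, and a 0/1 cost (zero for the true class, one for every other class). The function names are made up for illustration, and the regularization term is left out.

```python
def class_code(label, num_classes):
    """K-vector: 1 at the true class, -1/(K-1) elsewhere (entries sum to zero)."""
    off = -1.0 / (num_classes - 1)
    return [1.0 if j == label else off for j in range(num_classes)]


def positive_part(a):
    """(a)_+ = max(0, a)."""
    return max(0.0, a)


def msvm_hinge(scores, label):
    """Hinge penalty for one sample; only wrong-class components count."""
    code = class_code(label, len(scores))
    return sum(
        positive_part(f_j - y_j)
        for j, (f_j, y_j) in enumerate(zip(scores, code))
        if j != label  # assumed cost: 0 for the true class, 1 otherwise
    )


def predict(scores):
    """Decision rule: index of the largest component of the score vector."""
    return max(range(len(scores)), key=lambda j: scores[j])


K = 3
print(class_code(0, K))               # [1.0, -0.5, -0.5]
print(msvm_hinge(class_code(0, K), 0))  # 0.0, scores match the code exactly
wrong = [-0.5, 1.0, -0.5]             # scores that favour class 1
print(msvm_hinge(wrong, 0))           # 1.5
print(predict(wrong))                 # 1
```

The second sample is penalized only through the class-1 component, whose score of 1.0 overshoots its target of -0.5 by 1.5.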

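Going back to the One-vs-All strategy, the relabeling and the "highest score wins" step can be sketched as follows. The classifiers here are hypothetical hand-picked linear scorers standing in for trained binary SVMs, and all names are illustrative only.

```python
def one_vs_all_labels(labels, positive_class):
    """Relabel a multiclass problem as binary: +1 for the class, -1 for the rest."""
    return [1 if y == positive_class else -1 for y in labels]


def linear_score(weights, bias, x):
    """Score of one binary classifier: dot(weights, x) + bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict_ova(classifiers, x):
    """Pick the class whose binary classifier gives the highest score."""
    scores = {k: linear_score(w, b, x) for k, (w, b) in classifiers.items()}
    return max(scores, key=scores.get)


# The imbalance problem: class 1 has a single positive against seven negatives.
labels = [0, 0, 1, 2, 2, 2, 3, 3]
binary = one_vs_all_labels(labels, positive_class=1)
print(binary.count(1), binary.count(-1))  # 1 7

# Hypothetical (weights, bias) pairs, one per class, not trained on anything.
classifiers = {
    "a": ([1.0, 0.0], 0.0),
    "b": ([0.0, 1.0], 0.0),
    "c": ([-1.0, -1.0], 0.5),
}
print(predict_ova(classifiers, [2.0, 0.5]))  # a
```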
For Regression

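Here is the function you want to minimize:

$$\frac{1}{2}\Vert w\Vert^2 + C\sum_{i=1}^{n} \max\bigl(0,\; |y_{i} - f(x_{i})| - \epsilon\bigr)$$

where $f(x) = w^{\top}x + b$ is the prediction and $C$ is the cost hyperparameter.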

  • There is an added $\epsilon$, which is part of the $\epsilon$-insensitive loss function: basically, errors smaller than $\epsilon$ are not penalized, so some error is tolerated.
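A small sketch of how the $\epsilon$-insensitive loss behaves, in plain Python. The objective below is the usual linear support vector regression form (squared weight norm plus cost-weighted loss), assumed here for illustration. The function and parameter names are made up.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the eps-tube, linear in the excess error outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def svr_objective(w, b, X, y, C, eps):
    """0.5 * ||w||^2 + C * sum of eps-insensitive losses (assumed linear model)."""
    reg = 0.5 * sum(wj * wj for wj in w)
    preds = [sum(wj * xj for wj, xj in zip(w, x)) + b for x in X]
    loss = sum(eps_insensitive_loss(yi, pi, eps) for yi, pi in zip(y, preds))
    return reg + C * loss


print(eps_insensitive_loss(3.0, 3.125, eps=0.25))  # 0.0, error is inside the tube
print(eps_insensitive_loss(3.0, 3.5, eps=0.25))    # 0.25, only the excess counts

X = [[1.0], [2.0]]
y = [1.125, 3.0]
print(svr_objective([1.0], 0.0, X, y, C=2.0, eps=0.25))  # 2.0
```

In the last call the first point is off by 0.125 and costs nothing, while the second is off by 1.0 and contributes 0.75 before being scaled by C.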