Generalized Optimal Hyperplane

Following Optimal Hyperplane, the generalized optimal hyperplane handles the non-separable case.

Actually, we just need to add a slack / error variable $ξ_{i} \geq 0$ to relax the constraint:

$$y_{i}((\mathbf{w} \cdot \mathbf{x}_{i}) + b) \geq 1 - \xi_{i}, \quad i = 1, 2, \dots, l$$

We then define a function $F_{\sigma}(\boldsymbol{\xi}) = \sum_{i=1}^{l} \xi_{i}^{\sigma}$, with $\sigma > 0$, which measures how much the original constraints are violated. The objective used in the rest of this note takes $\sigma = 1$.
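To make the slack variables concrete, here is a minimal NumPy sketch. It assumes a fixed candidate hyperplane `(w, b)` and picks the smallest $\xi_{i}$ that satisfies each relaxed constraint; the toy data and the helper names `slacks` and `F` are made up for illustration.

```python
import numpy as np

def slacks(w, b, X, y):
    """Smallest xi_i >= 0 satisfying y_i((w . x_i) + b) >= 1 - xi_i."""
    margins = y * (X @ w + b)
    return np.maximum(0.0, 1.0 - margins)

def F(xi, sigma=1.0):
    """F_sigma(xi) = sum_i xi_i ** sigma, for sigma > 0."""
    return float(np.sum(xi ** sigma))

# Hypothetical 1-D data: the last point lies on the wrong side of the hyperplane.
X = np.array([[2.0], [1.0], [-2.0], [0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = np.array([1.0]), 0.0

xi = slacks(w, b, X, y)   # only the last point needs a non-zero slack
total = F(xi)             # sigma = 1
```

Points that already satisfy the original constraint get $\xi_{i} = 0$, so only the violating points contribute to $F_{\sigma}$.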

Generalized Optimal Hyperplane

$$\begin{align} &\min \Phi(\mathbf{w}, \boldsymbol{\xi}) = \frac{1}{2}(\mathbf{w}\cdot \mathbf{w}) + C\left( \sum_{i=1}^l \xi_{i} \right) \quad \text{w.r.t. } \mathbf{w} \\ &\text{s.t. } y_{i}((\mathbf{w} \cdot \mathbf{x}_{i}) + b) \geq 1-\xi_{i}, \quad i=1,2,\dots,l \end{align}$$

where the parameter $C$ controls the penalty on errors.
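A small sketch of how $C$ trades margin width against constraint violations. It assumes each $\xi_{i}$ takes its smallest feasible value for a given hyperplane, and compares two hand-picked candidates on made-up data; the function name `objective` and the candidates `wide` and `narrow` are hypothetical.

```python
import numpy as np

def objective(w, b, X, y, C):
    """Phi(w, xi) = 0.5 (w . w) + C * sum_i xi_i, with xi_i at its minimal feasible value."""
    xi = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * float(w @ w) + C * float(xi.sum())

# Hypothetical 1-D data with two points close to the boundary.
X = np.array([[2.0], [0.5], [-2.0], [-0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])

wide = (np.array([0.5]), 0.0)    # small |w|: wide margin, two violations
narrow = (np.array([2.0]), 0.0)  # large |w|: narrow margin, no violations

for C in (0.1, 10.0):
    print(C, objective(*wide, X, y, C), objective(*narrow, X, y, C))
```

With a small $C$ the wide-margin candidate has the lower objective despite its violations; with a large $C$ the violation-free candidate wins.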

I am kinda too tired to derive everything like we did in Optimal Hyperplane, but the derivation again uses the Kuhn-Tucker theorem, now with the error term $\xi_{i}$ included.

Here is the revised theorem:

Primal Problem

$$\begin{align} &\min \Phi(\mathbf{w}, \boldsymbol{\xi}) = \frac{1}{2}(\mathbf{w}\cdot \mathbf{w}) + C\left( \sum_{i=1}^l \xi_{i} \right) \\ &\text{s.t. } y_{i}[(\mathbf{w}\cdot \mathbf{x}_{i})+b] -1 + \xi_{i} \geq 0, \quad \xi_{i} \geq 0, \quad i=1,\dots,l \end{align}$$

Dual Problem

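$$\begin{align} &\max_{\boldsymbol{\alpha}} W(\boldsymbol{\alpha}) = \sum_{i=1}^{l} \alpha_{i} - \frac{1}{2} \sum_{i,j=1}^{l} \alpha_{i} \alpha_{j} y_{i} y_{j} (\mathbf{x}_{i} \cdot \mathbf{x}_{j}) \\ &\text{s.t. } \sum_{i=1}^{l} y_{i} \alpha_{i} = 0, \quad 0 \leq \alpha_{i} \leq C, \quad i = 1, \dots, l \end{align}$$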
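A short sketch of where the dual comes from, in particular the upper bound $\alpha_{i} \leq C$. The multipliers $r_{i} \geq 0$ for the constraints $\xi_{i} \geq 0$ are notation introduced only for this sketch.

$$\begin{align} L(\mathbf{w}, b, \boldsymbol{\xi}, \boldsymbol{\alpha}, \mathbf{r}) &= \frac{1}{2}(\mathbf{w}\cdot \mathbf{w}) + C\sum_{i=1}^{l} \xi_{i} - \sum_{i=1}^{l} \alpha_{i}\left( y_{i}[(\mathbf{w}\cdot \mathbf{x}_{i})+b] - 1 + \xi_{i} \right) - \sum_{i=1}^{l} r_{i}\xi_{i} \\ \frac{\partial L}{\partial \mathbf{w}} = 0 &\;\Rightarrow\; \mathbf{w} = \sum_{i=1}^{l} \alpha_{i} y_{i} \mathbf{x}_{i} \\ \frac{\partial L}{\partial b} = 0 &\;\Rightarrow\; \sum_{i=1}^{l} \alpha_{i} y_{i} = 0 \\ \frac{\partial L}{\partial \xi_{i}} = 0 &\;\Rightarrow\; C - \alpha_{i} - r_{i} = 0 \end{align}$$

Since $\alpha_{i} \geq 0$ and $r_{i} \geq 0$, the last condition gives $0 \leq \alpha_{i} \leq C$. Substituting the three conditions back into $L$ cancels every term containing $\xi_{i}$ and $b$ and leaves $W(\boldsymbol{\alpha})$.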

The decision function solution:

$$f(\mathbf{x}) = \operatorname{sgn}\left( \sum_{i=1}^{l} \alpha_{i}^{*} y_{i} (\mathbf{x}_{i} \cdot \mathbf{x}) + b^{*} \right)$$
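A minimal NumPy sketch of evaluating the decision function from a dual solution. It assumes $b^{*}$ is recovered from support vectors with $0 < \alpha_{i} < C$, for which the Kuhn-Tucker conditions give $\xi_{i} = 0$ and an active constraint, and that at least one such vector exists. The data, the hand-picked `alpha`, and the helper names `fit_bias` and `decide` are hypothetical.

```python
import numpy as np

def fit_bias(alpha, X, y, C, tol=1e-9):
    """b* averaged over support vectors with 0 < alpha_i < C (assumes at least one exists)."""
    on_margin = (alpha > tol) & (alpha < C - tol)
    K = X @ X.T                   # inner products (x_i . x_j)
    scores = K @ (alpha * y)      # sum_j alpha_j y_j (x_j . x_i), one value per i
    return float(np.mean(y[on_margin] - scores[on_margin]))

def decide(x, alpha, X, y, b):
    """f(x) = sgn(sum_i alpha_i y_i (x_i . x) + b)."""
    return int(np.sign((alpha * y) @ (X @ x) + b))

# Hypothetical 1-D training set and a hand-picked alpha that satisfies
# sum_i y_i alpha_i = 0 and 0 <= alpha_i <= C.
X = np.array([[1.0], [-1.0], [3.0]])
y = np.array([1.0, -1.0, 1.0])
alpha = np.array([0.5, 0.5, 0.0])
C = 1.0

b = fit_bias(alpha, X, y, C)
label_pos = decide(np.array([2.0]), alpha, X, y, b)
label_neg = decide(np.array([-0.5]), alpha, X, y, b)
```

Training points with $\alpha_{i} = 0$ (the third point here) drop out of the sum, so only the support vectors affect the prediction.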
