Let’s say we have a sample set
We want to measure the distance (let’s say, using Euclidean distance):
And so, the nearest neighbor decision is
Note here that
Nearest-Neighbor
We find the nearest value (
) to our target value . Then, we copy whatever is the label of to .
Some questions that you may have:
- What distance to use?
- How many samples should we base it on, data could be too noisy.