IML L7.3 k-Neighbours
k-Neighbours
The k-neighbours method is an instance-based learning algorithm. It stores the training set and, when a new data point is presented, finds the k closest samples in the training set and returns
- for regression, the average of the target values of these k samples
- for classification, the majority class among these k samples (using some procedure to break ties)
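The procedure above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the Euclidean distance, and the tie-breaking rule (prefer the tied class that has the nearest sample) are all assumptions made for the example.

```python
import math
from collections import Counter

def k_nearest(train_X, query, k):
    """Indices of the k training points closest to query (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    return order[:k]

def knn_regress(train_X, train_y, query, k):
    """Average target value of the k nearest training samples."""
    idx = k_nearest(train_X, query, k)
    return sum(train_y[i] for i in idx) / len(idx)

def knn_classify(train_X, train_y, query, k):
    """Majority class among the k nearest training samples."""
    idx = k_nearest(train_X, query, k)
    votes = Counter(train_y[i] for i in idx)
    top = max(votes.values())
    tied = {label for label, count in votes.items() if count == top}
    # Assumed tie-break: idx is sorted by distance, so return the first
    # (nearest) sample whose class is among the tied classes.
    for i in idx:
        if train_y[i] in tied:
            return train_y[i]

# Hypothetical toy data: two clusters in the plane.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "a", "b", "b"]
targets = [1.0, 2.0, 3.0, 10.0, 12.0]

print(knn_classify(X, labels, (0.5, 0.5), k=3))
print(knn_regress(X, targets, (0.5, 0.5), k=3))
```

Sorting all distances costs O(n log n) per query, which is fine for a sketch; libraries typically use tree structures or partial sorts instead.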
Regularisation
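The parameter k can be used to control overfitting.
- With k=1 the algorithm is likely to overfit, because each prediction follows a single training sample, noise included.
- Large values of k can lead to underfitting, because the prediction is averaged over samples that may be far from the query point.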
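A small one-dimensional example makes both effects visible. The data below is hypothetical: class 0 on the left, class 1 on the right, with one deliberately mislabelled point at x = 2.0. The helper `predict` is an illustrative name, not part of any library.

```python
from collections import Counter

def predict(train_x, train_y, x, k):
    """Majority vote among the k nearest 1-D training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy data with one noisy label (the 1 at x = 2.0).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
train_y = [0, 0, 1, 0, 0, 0, 1, 1, 1]

for k in (1, 3, len(train_x)):
    preds = [predict(train_x, train_y, x, k) for x in train_x]
    print(f"k={k}: {preds}")
```

With k=1 every training point is its own nearest neighbour, so the noisy label is reproduced exactly. With k=3 the noisy point is outvoted by its neighbours. With k equal to the size of the training set every query receives the overall majority class, so the boundary between the two classes is lost.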
Example
We can use the iris dataset:
[Figures: k=1, k=3, k=10, k=20]
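One way to run this comparison is with scikit-learn's `KNeighborsClassifier`. The sketch below is an assumption about the setup, not the code behind the figures: it uses all four iris features, a 70/30 stratified split, and a fixed random seed.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Assumed evaluation setup: hold out 30% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

scores = {}
for k in (1, 3, 10, 20):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    # Store (training accuracy, test accuracy) for each k.
    scores[k] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"k={k:2d}  train={scores[k][0]:.3f}  test={scores[k][1]:.3f}")
```

Comparing the training and test accuracy across values of k gives a rough picture of where the model moves from overfitting towards underfitting.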
Digits example
We can use the 8x8 digit images dataset after applying PCA to reduce it to 2 dimensions:
[Figures: k=1, k=3, k=5]
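A possible version of this experiment with scikit-learn is sketched below. The train/test split, the random seed, and the choice to fit PCA inside a pipeline (so the projection is learned from the training data only) are assumptions for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Each 8x8 image is flattened to 64 pixel features.
X_digits, y_digits = load_digits(return_X_y=True)

# Assumed evaluation setup: hold out 30% of the images for testing.
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_digits, y_digits, test_size=0.3, random_state=0, stratify=y_digits
)

digit_scores = {}
for k in (1, 3, 5):
    # Project onto the first 2 principal components, then classify with k-neighbours.
    model = make_pipeline(PCA(n_components=2), KNeighborsClassifier(n_neighbors=k))
    model.fit(Xd_train, yd_train)
    digit_scores[k] = model.score(Xd_test, yd_test)
    print(f"k={k}  test accuracy={digit_scores[k]:.3f}")
```

Two components discard most of the pixel information, so the accuracy here should be read as a property of the 2-dimensional projection rather than of k-neighbours on the full images.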
This post is licensed under CC BY 4.0 by the author.