k-Nearest Neighbour
*******************

.. contents::

.. module:: ailib.fitting

The idea of k-nearest neighbour classification is very simple. First, some
training data has to be provided. Any new data point is then assigned the
same label as the majority of its k closest training points. In the simplest
case, the algorithm works as follows:

1. Store all training points.
2. When a new data point :math:`\nu` is presented, compute its distance to
   all training points. By default, the squared Euclidean distance is used,
   i.e.

   .. math::
       d_i(\nu) = \| \nu - x_i \|^2

3. Select the k training samples with the smallest distances.
4. Return the label that appears most often among the selected training
   samples.

A minimal sketch of this procedure is given in the Examples section below.

If sample weights are given, the distance in the second step is modified:

.. math::
    d_i(\nu) = \frac{1}{N w_i} \| \nu - x_i \|^2

This adjustment makes it more probable for a sample with a large weight to
be in the set of nearest neighbours.

Instead of a majority vote on the labels, it is also possible to pick the
label of one neighbour at random, weighted by the distance to that point.

Only the base algorithm is implemented here. If a more advanced
implementation is required, have a look at the
`scikit-learn <https://scikit-learn.org/>`_ package.

Interfaces
==========

.. autoclass:: kNN
    :members:
    :show-inheritance:

Examples
========
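The following is a minimal, self-contained sketch of the base algorithm
described above. It does not use the :class:`kNN` class (whose exact
interface is documented under Interfaces); the function name
``knn_classify`` and its signature are illustrative only.

.. code-block:: python

    import numpy as np
    from collections import Counter

    def knn_classify(query, points, labels, k=3, weights=None):
        """Classify `query` by majority vote among its k nearest neighbours.

        If `weights` is given, the squared Euclidean distance to sample i
        is divided by N * w_i, so heavily weighted samples are more likely
        to end up among the nearest neighbours (cf. the weighted distance
        above).
        """
        points = np.asarray(points, dtype=float)
        labels = np.asarray(labels)
        # Step 2: squared Euclidean distance to every training point.
        dist = np.sum((points - np.asarray(query, dtype=float)) ** 2, axis=1)
        if weights is not None:
            dist /= len(points) * np.asarray(weights, dtype=float)
        # Step 3: indices of the k smallest distances.
        nearest = np.argsort(dist)[:k]
        # Step 4: majority vote over the labels of the selected samples.
        return Counter(labels[nearest]).most_common(1)[0][0]

    # Example: two training points near the origin labelled 0, two near
    # (1, 1) labelled 1; the query (0.2, 0.1) should be assigned label 0.
    X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
    y = [0, 0, 1, 1]
    print(knn_classify([0.2, 0.1], X, y, k=3))  # -> 0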
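The text above also mentions picking a label at random instead of taking a
majority vote. The exact weighting scheme is not specified here; the sketch
below assumes a selection probability inversely proportional to the squared
distance, which is one common choice.

.. code-block:: python

    import numpy as np

    def knn_classify_random(query, points, labels, k=3, rng=None):
        """Return the label of one of the k nearest neighbours, drawn at
        random with probability inversely proportional to its squared
        distance (an assumed weighting, not prescribed by this module)."""
        rng = np.random.default_rng() if rng is None else rng
        points = np.asarray(points, dtype=float)
        labels = np.asarray(labels)
        dist = np.sum((points - np.asarray(query, dtype=float)) ** 2, axis=1)
        nearest = np.argsort(dist)[:k]
        # Guard against division by zero when the query coincides with a
        # training point.
        inv = 1.0 / np.maximum(dist[nearest], 1e-12)
        probs = inv / inv.sum()
        return labels[rng.choice(nearest, p=probs)]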