4. fitting — Model Fitting

4.1. Introduction

The term model is used here in a very general sense. A model is usually a mathematical (often a statistical) tool that can be adjusted to fit given data points. Once set up (trained), it can be evaluated at arbitrary input values. There are roughly two types of models:

  1. Classification
  2. Regression

In classification problems an input point has to be assigned to a cluster. If the clusters are labeled, the collection of all clusters defines a codebook, so the problem can also be viewed as finding the optimal code for each input value. Loosely speaking, regression means curve fitting, i.e. adjusting the parameters of a mathematical function such that it optimally represents the training data. Of course, it has to be defined what optimal means. There is no general answer to this question, as it depends on the fitting target (the model). Often, the residual sum of squares is used, i.e. the sum of squared deviations between the measurements and the model's predictions.
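As a small illustration (not part of the library), the residual sum of squares for an already-fitted toy model can be computed as follows; all names here are made up for the example:

```python
# Residual sum of squares (RSS) for a toy, already-fitted model
# y = 2x + 1. Purely illustrative; nothing here is part of ailib.
measurements = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2)]  # (input, target) pairs

def predict(x):
    """Prediction of the fitted model at input x."""
    return 2.0 * x + 1.0

# Sum of squared deviations between measurement and prediction.
rss = sum((y - predict(x)) ** 2 for x, y in measurements)
print(rss)  # approximately 0.06
```

A smaller RSS means the curve passes closer to the measured points; a fitting algorithm would adjust the model parameters to minimize this quantity.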

To be clear, the model only specifies how the training data is explained. For learning problems, the model is incomplete (some information, e.g. the parameters, is missing), so a learning algorithm is needed to determine this information. Learning algorithms strongly depend on the input they are given. Basically, there are two situations:

  1. Supervised learning
  2. Unsupervised learning

For supervised learning, the input values are measured together with the target output values. Both are presented to the fitting algorithm so that it can train the model to fit the targets optimally. However, targets are not always available. If there are only input values but no output values, the problem is one of unsupervised learning. This implies that no error measurement can be used for training or evaluation. Instead, unsupervised learning algorithms try to find some (hidden) structure in the input data. A typical example is finding the location where the density of input points is maximal.
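The density example can be sketched in a few lines (illustrative only; no ailib code is involved): given unlabeled one-dimensional inputs, a coarse histogram locates the region where the points are densest.

```python
# Unsupervised toy example: estimate where 1-D input points are
# densest using a coarse histogram. Illustrative only.
points = [0.1, 0.2, 0.25, 0.9, 1.0, 1.05, 1.1, 1.15]  # inputs, no labels

bin_width = 0.5
counts = {}
for p in points:
    b = int(p // bin_width)           # histogram bin index
    counts[b] = counts.get(b, 0) + 1

densest_bin = max(counts, key=counts.get)
mode_estimate = (densest_bin + 0.5) * bin_width  # centre of densest bin
```

Note that no target values or error measurement appear anywhere; the structure (the densest region) is inferred from the inputs alone.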

4.2. Data model

There are three kinds of information:

  1. Features
  2. Labels
  3. Weights

A feature is the same as one input value. It may be a single value; however, the inputs are often multidimensional, so in general the features are vectors (lists). The labels are the target values for the respective inputs. In most cases, the labels are one-dimensional. However, the concrete data format (of input and output values) strongly depends on the model and is thus specified by the learning algorithm. Some algorithms also allow weighted samples. The weights are usually real-valued and are passed as a list to the learning algorithm.

Usually, the training data is passed to the fit method separately through the respective parameters. For unsupervised learning, the labels argument is ignored. Internally, there are several ways to represent training data. For details, see Data conversion.

In this library, the model and learning algorithm are combined in a single class. The class Model defines the interface for all learning algorithms (supervised or unsupervised).

class ailib.fitting.Model

Interface for fitting algorithms.

err((x, y))
Return the distance between the target y and the model prediction at the point x.

Evaluate the model at data point x.

fit(features, labels=None, weights=None)

Fit the model to the training data.

  • features – Training samples.
  • labels – Target values.
  • weights ([float]) – Sample weights.
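To make the interface concrete, here is a minimal sketch of a conforming model: a constant predictor that returns the (weighted) mean of the training labels. MeanModel and the evaluation method name eval are assumptions made for this example; they are not taken from the library.

```python
# Minimal sketch of the Model interface: a constant predictor.
# MeanModel (and the method name eval) are illustrative assumptions,
# not actual ailib classes.
class MeanModel:
    def fit(self, features, labels=None, weights=None):
        """Fit the model: store the weighted mean of the labels."""
        if weights is None:
            weights = [1.0] * len(labels)
        self.mean = sum(w * y for w, y in zip(weights, labels)) / sum(weights)
        return self

    def eval(self, x):
        """Evaluate the model at data point x (a constant here)."""
        return self.mean

    def err(self, sample):
        """Squared residual between target y and prediction at x."""
        x, y = sample
        return (y - self.eval(x)) ** 2

model = MeanModel().fit([[0.0], [1.0], [2.0]], labels=[1.0, 2.0, 3.0])
```

Note that err takes a single (x, y) tuple, matching the signature shown above, and that weights defaults to uniform when omitted.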


Many learning algorithms use several models to make the prediction more robust. For this purpose, a very general interface is defined. The class Committee allows models to be collected and provides different mixins for evaluation and error measurement.

class ailib.fitting.Committee

Bases: ailib.fitting.model.Model

Interface for fitting algorithms that are based on an ensemble of several models.

Typically, the committee consists of several, possibly different, submodels. All submodels are trained first; the manner in which this happens is not known here (thus the fit method is not overridden). For prediction, each trained submodel is evaluated. The final result is then based on the individual predictions. How the predictions are assembled depends on the problem type.

The mixin classes provide common evaluation and error functions. Use like so:

>>> class foo(Committee.Sampling, Committee): pass

class Committee.Classification

Mixin for classification models.

err((x, y))
0-1 loss function for classification.

Return the most frequently predicted class of x.
class Committee.Regression

Mixin for regression models.

err((x, y))
Squared residual.

Return the average model prediction at x.
class Committee.Sampling

Mixin for classification models. The class label is sampled from all predicted labels.

err((x, y))
0-1 loss function for classification.

Return the class of x, sampled from the predictions of the individual models.

Add a model m to the committee.
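The regression case can be sketched as follows. This is illustrative only: AveragingCommittee is not the actual ailib.fitting.Committee class, and the submodels here are plain callables rather than trained Model instances.

```python
# Sketch of a regression committee: the final prediction is the
# average of the individual submodel predictions, and the error is
# the squared residual. Not the actual ailib implementation.
class AveragingCommittee:
    def __init__(self):
        self.models = []

    def add(self, m):
        """Add a model m to the committee."""
        self.models.append(m)
        return self

    def eval(self, x):
        """Average the predictions of all submodels at x."""
        preds = [m(x) for m in self.models]
        return sum(preds) / len(preds)

    def err(self, sample):
        """Squared residual between target and committee prediction."""
        x, y = sample
        return (y - self.eval(x)) ** 2

committee = AveragingCommittee()
committee.add(lambda x: 2.0 * x)         # submodels as plain callables
committee.add(lambda x: 2.0 * x + 1.0)
```

A classification committee would differ only in how the predictions are assembled: a majority vote (Classification) or a random draw from the predicted labels (Sampling) instead of an average.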
