
Bias-variance tradeoff

This expression summarizes the fact that introducing a certain amount of bias into an otherwise unbiased estimator may improve its performance.

The bias-variance tradeoff for an estimator

The performance of an estimator q* of a parameter q is measured by its Mean Square Error (MSE), which can be shown to be:

$$\mathrm{MSE}(q^*) = \mathrm{Var}(q^*) + \mathrm{Bias}(q^*)^2$$
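
For readers who want to see where this decomposition comes from, here is a short derivation sketch. It assumes the usual definitions MSE(q*) = E[(q* - q)²] and Bias(q*) = E[q*] - q, which the text above does not spell out.

```latex
\begin{aligned}
\mathrm{MSE}(q^*) &= E\big[(q^* - q)^2\big] \\
  &= E\big[\big((q^* - E[q^*]) + (E[q^*] - q)\big)^2\big] \\
  &= E\big[(q^* - E[q^*])^2\big]
     + 2\,(E[q^*] - q)\,\underbrace{E\big[q^* - E[q^*]\big]}_{=\,0}
     + (E[q^*] - q)^2 \\
  &= \mathrm{Var}(q^*) + \mathrm{Bias}(q^*)^2
\end{aligned}
```

The cross term vanishes because E[q*] - q is a constant and q* - E[q*] has zero expectation.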

 

Although the lack of bias is an attractive feature of an estimator, it does not guarantee the lowest possible value of the MSE. This minimum value is attained when a proper tradeoff is found between :

    * The bias of the estimator,  and

    * Its variance

so as to make the value of the above expression smallest.

 

As a matter of fact, it is commonly observed that introducing a certain amount of bias into an otherwise unbiased estimator can significantly reduce its variance, so much so that the MSE is reduced and the performance of the estimator is therefore improved.

-----

In the Tutorial below, we show that of the two classic estimators of the variance :

    * The sample variance (biased), where $\bar{x}$ denotes the sample mean:

$$s^2 = \frac{1}{n} \sum_i (x_i - \bar{x})^2$$

    * And the "corrected" sample variance (unbiased):

$$s'^2 = \frac{1}{n - 1} \sum_i (x_i - \bar{x})^2$$

the first one has the lower MSE of the two (despite its bias) when considering normal distributions.

We'll then identify a third estimator that is even better (lower MSE) than s² although its bias is the largest of the three.
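
The comparison announced above can be checked numerically. The sketch below is only an illustration: it simulates many normal samples and estimates the bias, variance and MSE of the sum of squared deviations divided by n - 1, n and n + 1. The divisor n + 1 for the third estimator is an assumption here (it is the standard result for normal samples; the text above does not name it), and the function name, sample size, true variance and number of trials are all hypothetical choices.

```python
import numpy as np

def mse_of_variance_estimators(n=10, sigma2=4.0, trials=200_000, seed=0):
    """Simulate the bias, variance and MSE of sum((x - mean)^2) / k for three divisors k."""
    rng = np.random.default_rng(seed)
    # Each row is one normal sample of size n with true variance sigma2.
    x = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(trials, n))
    # Sum of squared deviations from the sample mean, one value per sample.
    ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    results = {}
    for k in (n - 1, n, n + 1):  # n + 1 is an assumed choice for the third estimator
        estimates = ss / k
        bias = estimates.mean() - sigma2
        variance = estimates.var()
        results[k] = {"bias": bias, "variance": variance, "mse": bias**2 + variance}
    return results

if __name__ == "__main__":
    for k, r in mse_of_variance_estimators().items():
        print(f"divisor {k}: bias={r['bias']:+.3f}  "
              f"variance={r['variance']:.3f}  mse={r['mse']:.3f}")
```

With these settings the simulated MSE should decrease as the divisor goes from n - 1 to n to n + 1, while the magnitude of the bias increases.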

The bias-variance tradeoff for models

The bias-variance tradeoff (or "bias-variance dilemma") is a very important issue in data modeling. Ignoring it is a frequent cause of model failure, and although it has deep theoretical roots, it can be explained in simple terms.
-----

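A model consists of:

    * An architecture, and

    * A set of parameters.

For example, in polynomial regression, the architecture is the degree of the polynomial and the parameters are its coefficients.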

Once the architecture (the degree) is decided upon, fitting the model consists in finding the appropriate values of the parameters (in this case, using the Least Squares approach).
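
As a minimal sketch of this fitting step, the snippet below fits a polynomial of a chosen degree by least squares using NumPy's `polyfit`. The data are hypothetical (a noisy quadratic generated only for illustration), and the degree is simply set by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a noisy quadratic, used only for illustration.
x = np.linspace(-1.0, 1.0, 30)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(scale=0.1, size=x.size)

degree = 2  # the architecture, decided by the analyst beforehand

# Least-squares estimate of the degree + 1 coefficients (highest power first).
coeffs = np.polyfit(x, y, deg=degree)
fitted = np.polyval(coeffs, x)

print("coefficients:", coeffs)
print("residual sum of squares:", float(((y - fitted) ** 2).sum()))
```

Changing `degree` changes the architecture; the least-squares step itself stays the same.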

-----

But the analyst first has to decide on the appropriate degree of the polynomial.


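This is the essence of the bias-variance dilemma. In the example of polynomial regression, it says that:

    * A polynomial of too low a degree is inaccurate because of a large bias (it is not flexible enough to follow the data),

    * A polynomial of too high a degree is inaccurate because of a large variance (it is too sensitive to the particular sample).

The degree of the "best" polynomial must therefore be somewhere "in-between".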

_________________________________

 

This phenomenon is not specific to polynomial regression. In fact, it shows up under various guises in any kind of model. So, quite generally, the bias-variance tradeoff principle can be stated as follows:
 

  • Models with too few parameters are inaccurate because of a large bias (not enough flexibility).
  • Models with too many parameters are inaccurate because of a large variance (too much sensitivity to the sample).
  • Identifying the best model requires identifying the proper "model complexity" (number of parameters).
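
The three statements above can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not a general proof: the "true" function sin(πx), the noise level, the sample size and the degrees 1, 3 and 9 are all hypothetical choices. For each degree it refits a polynomial on many independent noisy samples, then measures the squared bias and the variance of the predictions.

```python
import numpy as np

def bias_variance_by_degree(degrees=(1, 3, 9), n=20, noise_sd=0.3,
                            trials=500, seed=0):
    """Estimate squared bias and variance of polynomial fits of several degrees."""
    rng = np.random.default_rng(seed)

    def true_f(t):
        # Hypothetical "true" relationship, chosen for illustration only.
        return np.sin(np.pi * t)

    x_train = np.linspace(-1.0, 1.0, n)
    x_test = np.linspace(-1.0, 1.0, 200)
    out = {}
    for d in degrees:
        preds = np.empty((trials, x_test.size))
        for t in range(trials):
            # A fresh noisy sample for every trial.
            y = true_f(x_train) + rng.normal(scale=noise_sd, size=n)
            preds[t] = np.polyval(np.polyfit(x_train, y, deg=d), x_test)
        # Squared bias: gap between the average prediction and the truth.
        bias2 = np.mean((preds.mean(axis=0) - true_f(x_test)) ** 2)
        # Variance: spread of the predictions from one sample to the next.
        variance = np.mean(preds.var(axis=0))
        out[d] = {"bias2": bias2, "variance": variance, "total": bias2 + variance}
    return out

if __name__ == "__main__":
    for d, r in bias_variance_by_degree().items():
        print(f"degree {d}: bias^2={r['bias2']:.4f}  "
              f"variance={r['variance']:.4f}  total={r['total']:.4f}")
```

In this setup the squared bias shrinks as the degree grows while the variance increases, and the intermediate degree gives the smallest total.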

 

 

This important issue is illustrated by an interactive animation.

 

We address some aspects of the bias-variance tradeoff in the Tutorial below.

__________________________________________________________________

 

 

Tutorial

 

In this Tutorial, we compare the performance (MSE) of the two natural estimators of the variance of the normal distribution.

We show that the uncorrected (biased) estimator performs better than its corrected (unbiased) counterpart.
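
The comparison can be sketched in closed form. The block below assumes a normal sample of size n with variance σ², writes S for the sum of squared deviations from the sample mean (a symbol introduced here for convenience), and uses the standard fact that S/σ² follows a chi-square distribution with n - 1 degrees of freedom.

```latex
S = \sum_i (x_i - \bar{x})^2, \qquad
E[S] = (n-1)\,\sigma^2, \qquad
\mathrm{Var}(S) = 2\,(n-1)\,\sigma^4

% Corrected (unbiased) estimator S/(n-1)
\mathrm{Bias} = 0, \qquad
\mathrm{Var} = \frac{2\,\sigma^4}{n-1}, \qquad
\mathrm{MSE} = \frac{2\,\sigma^4}{n-1}

% Uncorrected (biased) estimator S/n
\mathrm{Bias} = -\frac{\sigma^2}{n}, \qquad
\mathrm{Var} = \frac{2\,(n-1)\,\sigma^4}{n^2}, \qquad
\mathrm{MSE} = \frac{(2n-1)\,\sigma^4}{n^2}

% Comparison
\frac{2n-1}{n^2} < \frac{2}{n-1}
\iff (2n-1)(n-1) < 2n^2
\iff 1 < 3n
```

The last inequality holds for every sample size, so under these assumptions the uncorrected estimator always has the lower MSE.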

-----

We then recognize that these two estimators belong to a class of estimators, and identify the best (lowest MSE) estimator in the class. Its bias will turn out to be even larger than that of the uncorrected sample variance.
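
One natural reading of this "class" is the family of estimators cS, where S is the sum of squared deviations from the sample mean and c is a positive constant; this choice of class is an assumption made here for illustration. Under the normal model, with E[S] = (n - 1)σ² and Var(S) = 2(n - 1)σ⁴, the best constant can be found as follows.

```latex
\mathrm{MSE}(cS)
  = \underbrace{2\,c^2 (n-1)\,\sigma^4}_{\mathrm{Var}}
  + \underbrace{\big(c\,(n-1) - 1\big)^2 \sigma^4}_{\mathrm{Bias}^2}
  = \sigma^4 \big[\, c^2 (n-1)(n+1) - 2\,c\,(n-1) + 1 \,\big]

\frac{d}{dc}\,\mathrm{MSE}(cS) = 0
\;\Longrightarrow\;
c = \frac{1}{n+1}

\mathrm{MSE}_{\min} = \frac{2\,\sigma^4}{n+1}, \qquad
\mathrm{Bias} = -\frac{2\,\sigma^2}{n+1}
```

The corrected and uncorrected sample variances correspond to c = 1/(n - 1) and c = 1/n. For n > 1, the bias at c = 1/(n + 1) is larger in magnitude than the bias σ²/n of the uncorrected sample variance, yet the MSE is smaller.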

 

 

 

 

BIAS-VARIANCE TRADEOFF

    * Comparing two estimators of the variance of the normal distribution
        * MSE of the corrected (unbiased) sample variance
        * MSE of the uncorrected (biased) sample variance
            * Bias
            * Variance
            * MSE
    * An even better estimator of the variance
        * A class of estimators
        * Identifying the best estimator in the class
        * Properties of the best estimator
        * Comparing the properties of the three estimators

 

____________________________________________________

 

Related readings :

Estimation

Mean Square Error

Ridge Regression
