This expression summarizes the fact that introducing a certain amount of bias into an otherwise unbiased estimator may improve its performance.
The performance of an estimator θ* of a parameter θ is measured by its Mean Square Error (MSE), defined as the expected squared deviation of the estimator from the true value of the parameter. It can be shown that:

MSE(θ*) = E[(θ* − θ)²] = Var(θ*) + Bias(θ*)²
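This decomposition follows from splitting the deviation θ* − θ into (θ* − E[θ*]) + (E[θ*] − θ) and expanding the square; the cross term vanishes because E[θ* − E[θ*]] = 0. A short derivation in LaTeX notation:

```latex
\operatorname{MSE}(\theta^*)
  = E\!\left[\big((\theta^* - E[\theta^*]) + (E[\theta^*] - \theta)\big)^2\right]
  = \underbrace{E\!\left[(\theta^* - E[\theta^*])^2\right]}_{\operatorname{Var}(\theta^*)}
  + \underbrace{\big(E[\theta^*] - \theta\big)^2}_{\operatorname{Bias}(\theta^*)^2}
```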
Although the lack of bias is an attractive feature of an estimator, it does not guarantee the lowest possible value of the MSE. This minimum value is attained when a proper tradeoff is found between:
* The bias of the estimator, and
* Its variance
so as to make the value of the above expression as small as possible.
As a matter of fact, it is commonly observed that introducing a certain amount of bias into an otherwise unbiased estimator can lead to a reduction of its variance large enough that the MSE decreases, and the performance of the estimator is therefore improved.
In the Tutorial below, we show that, of the two classic estimators of the variance:
* The sample variance (biased):
s² = (1/n) Σi (xi − x̄)²
* And the "corrected" sample variance (unbiased):
S² = (1/(n − 1)) Σi (xi − x̄)²
(where x̄ denotes the sample mean), the first one has the lower MSE of the two, despite its bias, when the observations are drawn from a normal distribution.
We'll then identify a third estimator with an even lower MSE than s², although its bias is the largest of the three.
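As a preview, here is a minimal Monte-Carlo sketch (in Python, using NumPy; the sample size, number of replications and true variance are arbitrary illustration choices) comparing the empirical bias and MSE obtained with the divisors n − 1, n and n + 1, the last of these being the estimator identified as best later in the Tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, sigma2 = 10, 200_000, 1.0   # sample size, replications, true variance

# Draw all samples at once: one row per replication
x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, n))
q = np.sum((x - x.mean(axis=1, keepdims=True)) ** 2, axis=1)  # Q = Σi(xi - x̄)²

for divisor, label in [(n - 1, "S² = Q/(n-1) (unbiased)"),
                       (n,     "s² = Q/n     (biased)"),
                       (n + 1, "Q/(n+1)      (most biased)")]:
    est = q / divisor
    print(f"{label:28s} bias = {est.mean() - sigma2:+.4f}   "
          f"MSE = {np.mean((est - sigma2) ** 2):.4f}")
```

The empirical MSEs decrease as the bias grows, in line with the theoretical values 2σ⁴/(n − 1), (2n − 1)σ⁴/n² and 2σ⁴/(n + 1) (about 0.222, 0.190 and 0.182 for n = 10 and σ² = 1).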
A similar phenomenon is observed, for example, when estimating the parameter θ of the uniform distribution U[0, θ], as the sketch below illustrates.
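Here is a minimal Monte-Carlo sketch of that phenomenon (in Python, using NumPy; the specific estimators are standard textbook choices used for illustration). Within the class of estimators of the form c · max(xi), the unbiased choice c = (n + 1)/n is beaten, in MSE terms, by the more biased choice c = (n + 2)/(n + 1), which is known to minimize the MSE in that class:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, theta = 10, 1_000_000, 1.0   # sample size, replications, true θ

m = rng.uniform(0.0, theta, size=(trials, n)).max(axis=1)  # max of each sample

for c, label in [(1.0,               "max(xi)           (MLE, biased)"),
                 ((n + 1) / n,       "(n+1)/n · max     (unbiased)"),
                 ((n + 2) / (n + 1), "(n+2)/(n+1) · max (lowest MSE)")]:
    est = c * m
    print(f"{label:33s} bias = {est.mean() - theta:+.5f}   "
          f"MSE = {np.mean((est - theta) ** 2):.5f}")
```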
The bias-variance tradeoff (or "bias-variance dilemma") is a very important issue in data modeling. Ignoring it is a frequent cause of model failure, and although it has deep theoretical roots, it can be explained in simple terms.
A model consists of:
* An architecture (the overall structure of the model), and
* A set of adjustable parameters.
For example, in polynomial regression:
* The architecture is the degree of the polynomial, and
* The parameters are its coefficients.
Once the architecture (the degree) is decided upon, fitting the model consists in finding the appropriate values of the parameters (in this case, using the Least Squares approach).
But the analyst first has to decide on the appropriate degree of the polynomial.
This is the essence of the bias-variance dilemma. In the example of polynomial regression, it says that:
* A polynomial whose degree is too low is too rigid to capture the structure of the data: its bias is large (even though its variance is small).
* A polynomial whose degree is too high is flexible enough to also fit the noise in the data: its variance is large (even though its bias is small).
The degree of the "best" polynomial must therefore be somewhere "in-between", as the simulation sketch below suggests.
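A minimal sketch of this in-between behavior (in Python, using NumPy; the target function, noise level, sample size and degrees are arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n=30):
    """Noisy observations of a smooth target function."""
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

x_test = np.linspace(-1.0, 1.0, 200)
f_test = np.sin(np.pi * x_test)          # true function on a test grid

for degree in (1, 3, 10):
    errs = []
    for _ in range(500):                  # average over many training sets
        x, y = make_data()
        coef = np.polyfit(x, y, degree)   # least-squares polynomial fit
        errs.append(np.mean((np.polyval(coef, x_test) - f_test) ** 2))
    print(f"degree {degree:2d}: average test MSE = {np.mean(errs):.3f}")
```

In typical runs, degree 1 underfits (large bias) and degree 10 overfits (large variance), so the intermediate degree achieves the lowest average test error.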
This phenomenon is not specific to polynomial regression. In fact, it shows up under various guises in any kind of model. So, quite generally, the bias-variance tradeoff principle can be stated as follows:
* A model that is too simple cannot capture the structure of the data: its bias is large (and its variance small).
* A model that is too complex also fits the noise in the data: its variance is large (and its bias small).
* The best model realizes a compromise between these two extremes.
This important issue is illustrated by an interactive animation.
We address some aspects of the bias-variance tradeoff in the next section.
In this Tutorial, we compare the performance (MSE) of the two natural estimators of the variance of the normal distribution.
We show that the uncorrected (biased) estimator performs better than its corrected (unbiased) counterpart.
We then recognize that these two estimators belong to a natural class of estimators, and identify the best (lowest-MSE) estimator in this class. Its bias will turn out to be even larger than that of the uncorrected sample variance.
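To preview the computation carried out in that section (assuming, as seems natural, that the class considered is the set of multiples c·Q of Q = Σi (xi − x̄)², which contains both s² and S²): under normality, Q/σ² follows a χ² distribution with n − 1 degrees of freedom, so in LaTeX notation:

```latex
E[Q] = (n-1)\sigma^2, \qquad \operatorname{Var}(Q) = 2(n-1)\sigma^4

\operatorname{MSE}(cQ) = c^2\,\operatorname{Var}(Q) + \bigl(c\,E[Q] - \sigma^2\bigr)^2
  = \sigma^4\bigl[(n-1)(n+1)\,c^2 - 2(n-1)\,c + 1\bigr]
```

Setting the derivative with respect to c to zero gives c = 1/(n + 1), i.e. the estimator (1/(n + 1)) Σi (xi − x̄)², whose MSE, 2σ⁴/(n + 1), is indeed below that of both s² and S².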
* Comparing two estimators of the variance of the normal distribution
  * MSE of the corrected (unbiased) sample variance
  * MSE of the uncorrected (biased) sample variance
* An even better estimator of the variance
  * A class of estimators
  * Identifying the best estimator in the class
  * Properties of the best estimator
* Comparing the properties of the three estimators