Bias
Quite generally, the term "bias" refers to a systematic (i.e. not random) difference between a quantity and a prediction of this quantity.
Let q* be an estimator of a parameter q of a probability distribution. q* is said to be unbiased if its expectation is equal to the value of the parameter :
E[q*] = q
In other words, the value of an unbiased estimator is, on the average, equal to the value to be estimated.
If the expectation of the estimator is different from the value of the parameter, the estimator is said to be "biased", and its bias is then defined as the difference between the expectation of the estimator and the true value of the parameter :
Bias = E[q*] - q
The value of the bias depends, in general, on the value of the parameter.
The lack of bias is clearly an attractive feature of an estimator, but it is not a fundamental one. The quality of an estimator, for a given sample size, is rather measured by its Mean Square Error (MSE), the expected value of the squared difference between the value of the estimator and the value to be estimated :
MSE = E[(q* - q)²]
It can be shown that :
MSE = Var(q*) + Bias(q*)²
and it is then quite possible that a certain amount of bias be more than compensated for by a low variance (see for instance Ridge Regression, and the animation about the bias-variance tradeoff).
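The decomposition of the MSE can be obtained by adding and subtracting the expectation of the estimator inside the square. The following is a short sketch of that derivation, in which m = E[q*] is a shorthand introduced here for convenience:

```latex
\begin{aligned}
\mathrm{MSE} &= E\big[(q^* - q)^2\big] \\
             &= E\big[\big((q^* - m) + (m - q)\big)^2\big] \\
             &= E\big[(q^* - m)^2\big] + 2\,(m - q)\,E[q^* - m] + (m - q)^2 \\
             &= \mathrm{Var}(q^*) + \mathrm{Bias}(q^*)^2
\end{aligned}
```

The cross term vanishes because m - q is a constant and E[q* - m] = 0.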
Besides, it is sometimes possible to remove the bias of a biased estimator. The most classical example is that of the estimator of a variance :
* The estimator obtained by the method of moments :
s² = (1/n) Σi (xi - x̄)²
with x̄ = (1/n) Σi xi
is biased ;
* Whereas the "corrected" estimator :
s'² = (1/(n - 1)) Σi (xi - x̄)²
is unbiased (lower image of the above illustration).
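The difference between the two variance estimators can be checked by simulation. The sketch below is an illustration only: the normal distribution, the sample size, the number of trials and the function name are assumptions chosen for the example. It relies on the standard library, where `statistics.pvariance` divides by n and `statistics.variance` divides by n - 1.

```python
import random
import statistics


def simulate_variance_estimators(n=5, sigma=2.0, trials=20000, seed=0):
    """Average the 1/n and 1/(n - 1) variance estimators over many samples."""
    rng = random.Random(seed)
    biased, corrected = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        biased.append(statistics.pvariance(sample))    # divides by n
        corrected.append(statistics.variance(sample))  # divides by n - 1
    return statistics.fmean(biased), statistics.fmean(corrected)


mean_biased, mean_corrected = simulate_variance_estimators()
print("true variance              :", 2.0 ** 2)
print("average of 1/n estimator   :", round(mean_biased, 3))
print("average of 1/(n-1) estimator:", round(mean_corrected, 3))
```

With n = 5, the 1/n estimator should average close to (n - 1)/n = 0.8 times the true variance, whereas the corrected estimator should average close to the true variance itself.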
Unbiased estimators are easier to analyze than convergent but biased ones because the bias may depend on the value of the parameter in a complex manner. Unbiased estimators have therefore been studied quite extensively.
In particular, it is common for a parameter to have several, or even an infinity of unbiased estimators. For example :
* For the normal distribution N(µ, σ²) :
- The sample mean, and
- The sample median,
are both unbiased estimators of the mean µ of the distribution.
* For the uniform distribution in [0, q] :
- Twice the sample median, and
- [(n + 1)/n].x(n) , where x(n) is the largest observation of the n-sample
are both unbiased estimators of q.
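The uniform example can be explored numerically. The following sketch is a simulation under assumed values (q = 10, samples of size 9, and a hypothetical function name): it estimates the mean and the variance of the two estimators, so that one can check that both are centered on q while their variances differ.

```python
import random
import statistics


def simulate_uniform_estimators(q=10.0, n=9, trials=20000, seed=1):
    """Compare two unbiased estimators of q for the uniform distribution on [0, q]."""
    rng = random.Random(seed)
    from_median, from_max = [], []
    for _ in range(trials):
        sample = [rng.uniform(0.0, q) for _ in range(n)]
        from_median.append(2.0 * statistics.median(sample))  # twice the sample median
        from_max.append((n + 1) / n * max(sample))           # rescaled largest observation
    return {
        "median": (statistics.fmean(from_median), statistics.pvariance(from_median)),
        "max": (statistics.fmean(from_max), statistics.pvariance(from_max)),
    }


results = simulate_uniform_estimators()
for name, (mean, var) in results.items():
    print(f"{name:>6}: mean = {mean:.3f}, variance = {var:.3f}")
```

In this setting the estimator based on the largest observation should show a markedly smaller variance than the one based on the median, which illustrates why the choice among unbiased estimators matters.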
It is of course desirable to identify and use the (unbiased) estimator with the smallest variance. This estimator is called the "Minimum Variance Unbiased Estimator" (MVUE) of q, and we'll denote it by q*M.
-----
In the Tutorial below, we establish two important properties of a Minimum Variance Unbiased Estimator :
1) It is unique : any other unbiased estimator of q has a variance that is strictly larger than that of q*M.
2) An "unbiased estimator of 0" is, as the name suggests, a function f(x) of the sample (a "statistic") whose expectation is equal to 0 for all values of q. If you add an unbiased estimator of 0 to an unbiased estimator of q, the result is clearly another unbiased estimator of q. Therefore, it can be expected that unbiased estimators of 0 will play an important role in characterizing unbiased estimators. As a matter of fact, we'll show that an unbiased estimator of q is a Minimum Variance Unbiased Estimator if and only if it is uncorrelated with any unbiased estimator of 0.
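To see why the second property is plausible, here is a sketch of the "necessary" direction only. U denotes an arbitrary unbiased estimator of 0 with finite, non-zero variance and λ a real constant; both symbols are introduced here for the argument. Since q*M + λU is unbiased for every λ, its variance cannot fall below that of q*M:

```latex
\begin{aligned}
\mathrm{Var}(q^*_M + \lambda U)
  &= \mathrm{Var}(q^*_M) + 2\lambda\,\mathrm{Cov}(q^*_M, U) + \lambda^2\,\mathrm{Var}(U), \\
\text{and with } \lambda = -\frac{\mathrm{Cov}(q^*_M, U)}{\mathrm{Var}(U)} :\quad
\mathrm{Var}(q^*_M + \lambda U)
  &= \mathrm{Var}(q^*_M) - \frac{\mathrm{Cov}(q^*_M, U)^2}{\mathrm{Var}(U)} .
\end{aligned}
```

If the covariance were not 0, the right-hand side would be strictly smaller than Var(q*M), which would contradict the minimality of q*M. The covariance must therefore be 0.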
What is the smallest variance that can be achieved by an unbiased estimator ?
In many cases, it is possible to identify a lower bound to the variance of an unbiased estimator. This lower bound is then given by the Cramér-Rao inequality, which is described in some detail in its own entry.
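As a reminder, and only as a sketch, the usual form of the bound for n independent observations with density f(x; q) is given below. It holds under regularity conditions that are not detailed here; I(q), the Fisher information of a single observation, is notation introduced for this statement:

```latex
\mathrm{Var}(q^*) \;\ge\; \frac{1}{n\,I(q)},
\qquad
I(q) = E\!\left[\left(\frac{\partial}{\partial q}\,\ln f(X; q)\right)^{2}\right]
```

For instance, for a normal distribution with known variance σ², the information about the mean is 1/σ², so the bound is σ²/n, which is exactly the variance of the sample mean. The uniform distribution on [0, q] does not satisfy the regularity conditions, because its support depends on q, so the bound does not apply to it.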
If an unbiased estimator of q is not a MVUE, it may be improved (i.e. its variance may be reduced) provided that a sufficient statistic for q is available. This is the object of the Rao-Blackwell theorem.
The new, improved estimator is not guaranteed, though, to be a MVUE.
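A classical illustration of this improvement, offered here as an assumed example rather than one taken from this entry, uses a Bernoulli sample with parameter p. The first observation X1 is an unbiased but very noisy estimator of p, the sum of the observations is a sufficient statistic, and the conditional expectation of X1 given the sum is the sample mean. The simulation below compares the two estimators; the function name and the numerical values are hypothetical.

```python
import random
import statistics


def simulate_rao_blackwell(p=0.3, n=10, trials=20000, seed=2):
    """Compare a crude unbiased estimator of p with its conditioned version."""
    rng = random.Random(seed)
    crude, improved = [], []
    for _ in range(trials):
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        crude.append(sample[0])           # X1 alone: unbiased, large variance
        improved.append(sum(sample) / n)  # E[X1 | sum] = sum / n
    return {
        "crude": (statistics.fmean(crude), statistics.pvariance(crude)),
        "improved": (statistics.fmean(improved), statistics.pvariance(improved)),
    }


rb = simulate_rao_blackwell()
for name, (mean, var) in rb.items():
    print(f"{name:>8}: mean = {mean:.3f}, variance = {var:.4f}")
```

Both averages should be close to p, while the variance of the conditioned estimator should be roughly n times smaller than that of the crude one.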
At each and every point of the space of the input variables, the role of a model is to make a prediction of the value of a certain quantity (e.g. the value of the response variable in predictive models). As the fitted model depends on the sample, this prediction is a random variable, that is used as an estimator of the quantity whose value is to be predicted.
This estimator may be :
* Unbiased (e.g. Simple Linear Regression under the standard conditions).
* But it may also be biased (see Ridge Regression, and the first part of the animation on the bias-variance tradeoff).
One may also consider the average bias over the range of the input variables, weighted by the joint probability density of these variables. This quantity measures the global, or average, bias of the model, that is, its ability to accommodate the shape of the deterministic part of the process that generated the data (lower image of this illustration).
The reader may refer to the second part of the animation on the bias-variance tradeoff.
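The pointwise and average bias of a model can be approximated by simulation. The sketch below makes several assumptions for the sake of illustration: a single input variable on a fixed grid in [-1, 1], equal weights on the grid points standing in for the probability density, normal noise, and a straight-line model fitted by least squares. The function names are hypothetical. The line is fitted once to data generated by a quadratic function (which it cannot accommodate) and once to data generated by a linear function.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pointwise_bias(true_f, xs, noise_sd=0.5, trials=2000, seed=3):
    """Average prediction minus true value at each x, over many simulated samples."""
    rng = random.Random(seed)
    totals = [0.0] * len(xs)
    for _ in range(trials):
        ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
        a, b = fit_line(xs, ys)
        for i, x in enumerate(xs):
            totals[i] += a + b * x
    return [totals[i] / trials - true_f(x) for i, x in enumerate(xs)]


xs = [i / 10 for i in range(-10, 11)]  # 21 equally weighted points in [-1, 1]
bias_quadratic = pointwise_bias(lambda x: x * x, xs)
bias_linear = pointwise_bias(lambda x: 1.0 + 2.0 * x, xs)

# Average squared bias over the grid (equal weights stand in for the density).
avg_sq_bias_quadratic = statistics.fmean([b * b for b in bias_quadratic])
avg_sq_bias_linear = statistics.fmean([b * b for b in bias_linear])
print("average squared bias, quadratic truth:", round(avg_sq_bias_quadratic, 4))
print("average squared bias, linear truth   :", round(avg_sq_bias_linear, 6))
```

When the true function is linear, the average squared bias should be close to 0 (up to simulation noise); when it is quadratic, the straight line is systematically off and the average squared bias stays well away from 0 however many samples are drawn.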
________________________________________________________________
Tutorial
In this Tutorial, we establish two important properties of a Minimum Variance Unbiased Estimator :
* It is unique : the variance of any other unbiased estimator of the parameter is strictly larger than that of the MVUE.
* An unbiased estimator is a MVUE if and only if it is uncorrelated with any unbiased estimator of 0.
MINIMUM VARIANCE UNBIASED ESTIMATORS
* A Minimum Variance Unbiased Estimator is unique
* A MVUE is uncorrelated with any unbiased estimator of 0
- The condition is necessary
- The condition is sufficient