Estimators (Combination of independent..)
If you are not familiar with the notion of estimator, we suggest that you first read an introduction to estimators.
Suppose that a quantity Q (for example, the mean m of a distribution) has to be estimated. For some reason, the estimation is conducted by two independent teams, each team using its own pet estimator :
* Team 1 uses estimator Q1, which produces the number Q*1 as an estimate of Q. Estimator Q1 is known to be unbiased.
* Team 2 uses estimator Q2, which produces the number Q*2 as another estimate of Q. Estimator Q2 is also known to be unbiased.
Is there a way to merge these two independent estimates into a single, improved estimate Q*c of Q ? By "improved", we mean that :
* The expectation of Q*c , considered as a random variable, must still be Q, that is, Q*c must be one realization of a new unbiased estimator of Q (just as Q1 and Q2 are) that we'll denote Qc,
* And the variance σ² of this estimator must be less than both the variance σ1² of Q1 and the variance σ2² of Q2. In fact, we'll obtain even more, as we are going to identify the one and only Qc with minimal variance.
A simple idea is to use the average of Q*1 and Q*2 as a combined estimate of Q :
Q*c = (Q*1 + Q*2)/2
This means that we use :
Qc = 1/2.(Q1 + Q2)
as the new synthetic estimator of Q.
This idea, natural as it is, is not very good. Qc is indeed unbiased, but its variance, although certainly smaller than the larger of σ1² and σ2², may be larger than the smaller of σ1² and σ2², which means that Qc may be worse than the best of Q1 and Q2.
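To see when the simple average falls short, note that for independent Q1 and Q2 its variance is (σ1² + σ2²)/4, which is below the smaller of the two variances only when the larger variance is less than three times the smaller. The short Python sketch below checks this on a few variance pairs; the numbers are arbitrary illustrations, not values taken from the text.

```python
def variance_of_average(var1, var2):
    """Variance of (Q1 + Q2) / 2 when Q1 and Q2 are independent."""
    return (var1 + var2) / 4.0


# Hypothetical variance pairs, chosen only for illustration.
for var1, var2 in [(1.0, 2.0), (1.0, 3.0), (1.0, 9.0)]:
    var_avg = variance_of_average(var1, var2)
    beats_best = var_avg < min(var1, var2)
    print(var1, var2, var_avg, beats_best)
```

With variances 1 and 9, the average has variance 2.5, which is worse than using the better estimator alone.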
A better idea is to seek a more general linear combination of Q1 and Q2 :
Qc = λ1.Q1 + λ2.Q2
such that :
* Qc is unbiased,
* and with minimal variance.
We show here that the solution is :
λ1 = σ2² / (σ1² + σ2²)
λ2 = σ1² / (σ1² + σ2²)
and that the variance of Qc is then :
σ² = σ1².σ2² / (σ1² + σ2²)
which is clearly smaller than both σ1² and σ2².
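For readers who want the intermediate steps, here is a sketch of one way to obtain this result, using only the unbiasedness of Q1 and Q2 and their independence. The notation E[.] and Var(.) for expectation and variance is introduced here for convenience.

```latex
\begin{align*}
E[Q_c] &= (\lambda_1 + \lambda_2)\,Q = Q
  \;\Rightarrow\; \lambda_2 = 1 - \lambda_1,\\
\operatorname{Var}(Q_c) &= \lambda_1^2\,\sigma_1^2 + (1-\lambda_1)^2\,\sigma_2^2,\\
\frac{d}{d\lambda_1}\operatorname{Var}(Q_c)
  &= 2\lambda_1\sigma_1^2 - 2(1-\lambda_1)\sigma_2^2 = 0
  \;\Rightarrow\;
  \lambda_1 = \frac{\sigma_2^2}{\sigma_1^2+\sigma_2^2},\quad
  \lambda_2 = \frac{\sigma_1^2}{\sigma_1^2+\sigma_2^2},\\
\sigma^2 &= \frac{\sigma_2^4\,\sigma_1^2 + \sigma_1^4\,\sigma_2^2}{(\sigma_1^2+\sigma_2^2)^2}
  = \frac{\sigma_1^2\,\sigma_2^2}{\sigma_1^2+\sigma_2^2}.
\end{align*}
```

Since σ2²/(σ1² + σ2²) is less than 1, the last line gives σ² < σ1², and by symmetry σ² < σ2².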
So if the variances σ1² and σ2² of Q1 and Q2 are known, the best possible estimate of Q is :
Q*c = λ1.Q*1 + λ2 .Q*2
with the values of λ1 and λ2 as above.
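As a minimal sketch of this recipe, assuming the two variances are known exactly, the combination can be computed as follows. The function name `combine_estimates` and the numerical inputs are hypothetical, introduced only for illustration.

```python
def combine_estimates(q1, var1, q2, var2):
    """Combine two independent unbiased estimates of the same quantity.

    Returns the minimum-variance linear combination and the variance
    of the corresponding combined estimator.
    """
    if var1 <= 0 or var2 <= 0:
        raise ValueError("variances must be positive")
    lam1 = var2 / (var1 + var2)  # weight of the first estimate
    lam2 = var1 / (var1 + var2)  # weight of the second estimate
    q_c = lam1 * q1 + lam2 * q2
    var_c = var1 * var2 / (var1 + var2)
    return q_c, var_c


# Hypothetical estimates: 10.0 (variance 1.0) and 12.0 (variance 3.0).
q_c, var_c = combine_estimates(10.0, 1.0, 12.0, 3.0)
print(q_c, var_c)  # 10.5 0.75
```

The more precise estimate receives the larger weight (0.75 here), so the combined value sits closer to it.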
Warning : this does not mean that Q*c is closer to the true value of Q than Q*1 and Q*2 (although, being a weighted average of the two with positive weights, it is never farther from Q than the worse of Q*1 and Q*2). It only means that Qc has a narrower distribution around Q than Q1 and Q2 (or any other unbiased linear combination of Q1 and Q2). In other words, it is still quite possible that the error |Q*c - Q| be larger than either |Q*1 - Q| or |Q*2 - Q| (but certainly not both).
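The warning can be checked with a small simulation. The sketch below assumes normally distributed estimators with arbitrary standard deviations 1 and 2 (an assumption made here for convenience only) and counts how often the combined estimate is farther from Q than the better of the two estimates, and how often it is farther than both.

```python
import random


def simulate(true_q=0.0, sd1=1.0, sd2=2.0, trials=10_000, seed=0):
    """Count how often the combined estimate loses to one or both inputs."""
    rng = random.Random(seed)
    var1, var2 = sd1 ** 2, sd2 ** 2
    lam1 = var2 / (var1 + var2)  # optimal weight of the first estimate
    worse_than_best = 0
    worse_than_both = 0
    for _ in range(trials):
        q1 = rng.gauss(true_q, sd1)
        q2 = rng.gauss(true_q, sd2)
        q_c = lam1 * q1 + (1.0 - lam1) * q2
        err1, err2, err_c = abs(q1 - true_q), abs(q2 - true_q), abs(q_c - true_q)
        if err_c > min(err1, err2):
            worse_than_best += 1
        # Small tolerance guards against floating-point rounding.
        if err_c > max(err1, err2) + 1e-12:
            worse_than_both += 1
    return worse_than_best, worse_than_both


worse_than_best, worse_than_both = simulate()
print(worse_than_best, worse_than_both)
```

The first count is far from zero, while the second is always zero, because a weighted average with positive weights lies between the two values being averaged.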
We show here that if :
* Q (the quantity to be estimated) is the mean of a distribution,
* Q1 and Q2 are the averages of two samples of different sizes,
then Qc is not just the best linear combination of Q1 and Q2 : it is the average of the pooled sample, that is, the best linear unbiased estimator of the mean that can be built from all the observations.
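A quick numerical check of this special case, assuming both samples are drawn independently from the same distribution with finite variance s² : the variances of the two averages are then s²/n1 and s²/n2, so the unknown s² cancels and the optimal weights reduce to n1/(n1 + n2) and n2/(n1 + n2). The data below are made up for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def combined_mean(sample1, sample2):
    """Optimal combination of two sample averages from the same distribution."""
    n1, n2 = len(sample1), len(sample2)
    # Var(average_i) = s^2 / n_i, so s^2 cancels out of the weights.
    lam1 = n1 / (n1 + n2)
    lam2 = n2 / (n1 + n2)
    return lam1 * mean(sample1) + lam2 * mean(sample2)


# Hypothetical samples of different sizes.
sample1 = [2.0, 4.0, 6.0]
sample2 = [1.0, 3.0, 5.0, 7.0, 9.0]

q_c = combined_mean(sample1, sample2)
pooled = mean(sample1 + sample2)
print(q_c, pooled, math.isclose(q_c, pooled))
```

The combined value coincides with the average of the eight observations taken together.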
Combining independent estimators is somewhat related to Weighted Least Squares in Simple Linear Regression. The expression for Qc may be re-written :
Qc = (σ1².σ2² / (σ1² + σ2²)).[ Q1/ σ1² + Q2 / σ2²]
which shows that the best estimator for Q is a linear combination of Q1 and Q2 with coefficients inversely proportional to the estimators' variances.
In Weighted Linear Regression, the Weighted Least Squares line is the best estimator of y (the response variable), and it is obtained by minimizing the weighted sum of squared residuals, each squared residual being weighted by the inverse of the local variance of y.
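The parallel can be made concrete in the simplest possible case, where the "regression line" is reduced to a single constant q. The sketch below (function names and data are hypothetical) treats the estimates as observations of q, minimizes the sum of squared residuals weighted by inverse variances, and recovers the combined estimate.

```python
import math


def weighted_sum_of_squares(q, estimates, variances):
    """Sum of squared residuals, each divided by its variance."""
    return sum((q_i - q) ** 2 / v for q_i, v in zip(estimates, variances))


def inverse_variance_mean(estimates, variances):
    """Closed-form minimizer of weighted_sum_of_squares over q."""
    weights = [1.0 / v for v in variances]
    return sum(w * q_i for w, q_i in zip(weights, estimates)) / sum(weights)


# Hypothetical estimates and their variances.
estimates = [10.0, 12.0]
variances = [1.0, 3.0]

q_hat = inverse_variance_mean(estimates, variances)
print(q_hat)

# Moving away from q_hat in either direction increases the criterion.
base = weighted_sum_of_squares(q_hat, estimates, variances)
for shift in (-0.1, 0.1):
    print(weighted_sum_of_squares(q_hat + shift, estimates, variances) > base)
```

For these inputs the minimizer agrees (up to rounding) with λ1.Q*1 + λ2.Q*2 computed from the weights given earlier.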
The following animation illustrates the concept of "best linear combination of independent estimators".
The animation proposes :
* Two (gaussian) curves, that are our Q1 and Q2 estimators (Note : it is not required that Q1 and Q2 be normally distributed. Normal distributions have been arbitrarily chosen for convenience only).
* These curves are centered on thick black ticks, the quantity Q to be estimated.
* Under each curve is also a thin tick, the Q*1 and Q*2 estimates.
* The lower frame shows :
1) A red (gaussian) curve that is a linear combination of the two upper curves. This is our Qc, a linear combination of Q1 and Q2. The coefficients are positive and add up to "1". They are initialized at "0.5", but may be manually adjusted with the green slider to the right.
Below the red curve is a thin yellow tick which is Q*c, the linear combination of Q*1 and Q*2 .
2) A black (gaussian) curve that is the theoretical best linear combination of Q1 and Q2.
* Observe that in the "best" position, the red curve's standard deviation is less than both Q1's and Q2's standard deviations.
* When λ (the coefficient of Q1, the coefficient of Q2 being 1 - λ) is set to "1", the red curve is identical to Q1. It is identical to Q2 when λ is set to "0".
* Change the width of Q1 and Q2 (Slide the right hand slopes of the curves with your mouse). Observe that both the red and black gaussians vary accordingly.
* Click on "Go", and observe the build up of the distribution of Q*c. It matches the red gaussian curve.
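For readers without access to the animation, the same experiment can be approximated offline. The sketch below assumes, as the animation does, normally distributed Q1 and Q2, with arbitrary standard deviations 1 and 2; the function names are hypothetical. It draws many values of Q*c for several settings of λ and compares their empirical standard deviation with the theoretical one.

```python
import math
import random
import statistics


def theoretical_sd(lam, sd1, sd2):
    """Standard deviation of lam.Q1 + (1 - lam).Q2 for independent Q1, Q2."""
    return math.sqrt(lam ** 2 * sd1 ** 2 + (1.0 - lam) ** 2 * sd2 ** 2)


def best_lambda(sd1, sd2):
    """Weight of Q1 that minimizes the variance of the combination."""
    return sd2 ** 2 / (sd1 ** 2 + sd2 ** 2)


def simulate_combined(lam, true_q=0.0, sd1=1.0, sd2=2.0, trials=20_000, seed=1):
    """Draw repeated realizations of the combined estimate."""
    rng = random.Random(seed)
    return [
        lam * rng.gauss(true_q, sd1) + (1.0 - lam) * rng.gauss(true_q, sd2)
        for _ in range(trials)
    ]


for lam in (0.0, 0.5, best_lambda(1.0, 2.0), 1.0):
    draws = simulate_combined(lam)
    print(
        round(lam, 2),
        round(statistics.pstdev(draws), 3),
        round(theoretical_sd(lam, 1.0, 2.0), 3),
    )
```

The empirical and theoretical columns should agree closely, and the smallest spread is obtained at the "best" value of λ (0.8 for these standard deviations).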