Complete sufficient statistic
Let p(x; θ) be a probability distribution known up to the value of the parameter θ, and X a sample drawn from this distribution.
* If T(X) is sufficient for θ, this statistic contains all the information that will ever be known about the value of θ. In informal terms, T(X) contains all the useful information for estimating θ, plus a certain amount of information that is useless for this task.
* In general, a statistic T ' obtained as the image f(T) of a sufficient statistic T by a function f(.) that is not one-to-one is not sufficient. But it may happen that T ' is sufficient : it then still contains all the useful information about θ, and is somewhat "lighter", as f(.) has removed a certain amount of useless information from T. If a sufficient statistic cannot be made lighter by this process, it is said to be minimal sufficient.
* A minimal sufficient statistic may still contain some useless information which cannot be eradicated without spoiling the "sufficient" nature of the statistic.
But it may happen that a minimal sufficient statistic contains no useless information at all, and contains only all of the useful information for the estimation of θ. When it exists, such a statistic is said to be complete (lower image of the above illustration).
This informal idea is not easy to formalize. One approach is to remember that the Rao-Blackwell theorem is used for reducing the variance of an unbiased estimator by conditioning this estimator on a sufficient statistic. The resulting estimator is still unbiased, but there is no guarantee that its variance is the smallest attainable variance. As a complete statistic is the "best" of all sufficient statistics, it can be hoped that blackwellizing an unbiased estimator with a complete statistic will produce the "best" of all unbiased estimators, that is, a Uniformly Minimum Variance Unbiased Estimator (UMVUE).
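The effect of blackwellizing can be checked exactly on a small case. The sketch below is an illustration only, under assumptions not taken from the text: a Bernoulli(p) sample of size n = 4 with p = 3/10, the crude unbiased estimator X1, and the statistic T = Σi xi. All function names are hypothetical. It enumerates every possible sample, computes E[X1 | T = t], and compares the variance before and after conditioning.

```python
from fractions import Fraction
from itertools import product


def sample_probability(sample, p):
    """Probability of one 0/1 sample under independent Bernoulli(p) draws."""
    t = sum(sample)
    return p**t * (1 - p)**(len(sample) - t)


def conditional_mean_given_sum(estimator, n, p):
    """Return {t: E[estimator(X) | sum(X) = t]} by exact enumeration."""
    num, den = {}, {}
    for sample in product((0, 1), repeat=n):
        t = sum(sample)
        prob = sample_probability(sample, p)
        num[t] = num.get(t, 0) + prob * estimator(sample)
        den[t] = den.get(t, 0) + prob
    return {t: num[t] / den[t] for t in num}


def mean_and_variance(estimator, n, p):
    """Exact mean and variance of estimator(X) over all 2**n samples."""
    samples = list(product((0, 1), repeat=n))
    mean = sum(sample_probability(s, p) * estimator(s) for s in samples)
    second = sum(sample_probability(s, p) * estimator(s) ** 2 for s in samples)
    return mean, second - mean**2


# Assumed illustration values (not from the text).
n, p = 4, Fraction(3, 10)


def crude(sample):
    # Unbiased but wasteful estimator of p: uses the first observation only.
    return sample[0]


cond = conditional_mean_given_sum(crude, n, p)


def blackwellized(sample):
    # Rao-Blackwell step: replace the estimator by its conditional mean given T.
    return cond[sum(sample)]


crude_mean, crude_var = mean_and_variance(crude, n, p)
rb_mean, rb_var = mean_and_variance(blackwellized, n, p)
print("E[X1 | T = t]:", cond)
print("crude:        ", crude_mean, crude_var)
print("blackwellized:", rb_mean, rb_var)
```

Both estimators have expectation p, and the conditioned estimator turns out to be T/n, with variance p(1 - p)/n instead of p(1 - p). The enumeration only shows that the variance has decreased; it does not by itself show that no unbiased estimator does better.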
This line of thinking can lead to the formal definition of a complete statistic. The Lehmann-Scheffé theorem states that blackwellizing any unbiased estimator of θ with a complete statistic will produce a UMVUE if only the definition of a complete statistic is chosen as follows :
A statistic T is said to be complete if no non-trivial function of T is an unbiased estimator of 0, this being true for all values of θ.
Recall that an unbiased estimator of 0 is a statistic whose expectation is 0 for all values of θ.
So a statistic is said to be "complete" if :
E[φ(T)] = 0 for all values of θ implies that φ(.) is identical to the null function (equal to 0 almost everywhere).
This somewhat non intuitive definition is sort of dictated by the desire to obtain the Lehmann-Scheffé theorem.
It can also be interpreted as follows : if two functions φ1(T) and φ2(T) of a complete statistic have the same expectation for all values of θ, then these functions are identical (almost everywhere). For if this were not the case, then [φ1(T) - φ2(T)] would be a non-trivial function with 0 expectation, that is, a non-trivial unbiased estimator of 0. In other words, if T is a complete statistic, then the value of the expectation of φ(T), as a function of θ, fully characterizes the function φ(.).
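The definition can be made concrete on a finite case. The sketch below is an illustration under assumptions not taken from the text: T = Σi xi for a Bernoulli(p) sample of size n = 5, so that T is Binomial(n, p) and takes the values 0, ..., n. The condition E[φ(T)] = 0 at a given p is one linear equation in the n + 1 unknowns φ(0), ..., φ(n). If the equations written for n + 1 distinct values of p already have full rank, the only solution is φ = 0, and requiring the condition for all p can only be more restrictive. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_pmf_matrix(n, ps):
    """Row i holds P(T = t; p_i) for t = 0..n, with T ~ Binomial(n, p_i)."""
    return [[comb(n, t) * p**t * (1 - p)**(n - t) for t in range(n + 1)]
            for p in ps]


def rank(matrix):
    """Rank of a matrix of Fractions, by exact Gaussian elimination."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        # Find a row at or below position rk with a non-zero entry in this column.
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


# Assumed illustration values (not from the text).
n = 5
ps = [Fraction(k, n + 2) for k in range(1, n + 2)]  # n + 1 distinct values in (0, 1)

M = binomial_pmf_matrix(n, ps)
full_rank = rank(M) == n + 1
print("rank:", rank(M), "out of", n + 1, "unknowns")
```

A full-rank matrix means that E[φ(T)] = 0 at these values of p forces φ(0) = ... = φ(n) = 0, which is the completeness property for this particular family. The check is specific to a statistic with finitely many values; it is not a general method.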
-----
It is important to understand that the concept of "completeness" of a statistic does not bear :
* On the statistic itself (e.g. a particular analytic form of the function defining the statistic from the observations),
* Or on the completeness of a statistic for a particular distribution, a meaningless concept,
but rather on the completeness of the statistic for the family of distributions p(x; θ).
We'll show that if the family p(x; θ) admits a complete statistic for θ, then this statistic is unique within a one-to-one transformation.
We'll show that if T is a complete statistic for θ, and if f(.) is any one-to-one function, then T ' = f(T) is also complete for θ.
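The last claim follows almost directly from the definition. A short sketch of the argument, using the same notation φ for an arbitrary function as above :

```latex
\begin{aligned}
&\text{Let } T' = f(T) \text{ and suppose } E_\theta[\varphi(T')] = 0 \text{ for all } \theta. \\
&\text{Then } E_\theta[(\varphi \circ f)(T)] = 0 \text{ for all } \theta, \\
&\text{so } (\varphi \circ f)(T) = 0 \text{ almost everywhere, because } T \text{ is complete,} \\
&\text{that is, } \varphi(T') = 0 \text{ almost everywhere, and } T' \text{ is complete.}
\end{aligned}
```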
Our intuitive approach to the concept of completeness started from the concept of a minimal sufficient statistic whose useless part has been totally eliminated. Yet the definition retained for a complete statistic does not require that it be minimal sufficient, or even sufficient.
We'll show, though, that under mild conditions, a complete statistic that is sufficient is also minimal sufficient.
We mentioned that a minimal sufficient statistic may not be complete. We'll show that, in that case, the family of distributions p(x; θ) has no complete sufficient statistic for θ.
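A standard textbook illustration of a minimal sufficient statistic that is not complete is the uniform distribution on [θ, θ + 1]. This choice is an assumption made here for illustration, and the example treated in the Tutorial may differ. For a sample of size n ≥ 2, the pair of extreme order statistics is minimal sufficient, yet a non-trivial function of it has expectation 0 for all values of θ :

```latex
\begin{aligned}
X_1,\dots,X_n &\sim U[\theta,\ \theta+1], \qquad T = (x_{(1)},\, x_{(n)}) \\
E_\theta[x_{(1)}] &= \theta + \frac{1}{n+1}, \qquad E_\theta[x_{(n)}] = \theta + \frac{n}{n+1} \\
\varphi(T) &= x_{(n)} - x_{(1)} - \frac{n-1}{n+1} \\
E_\theta[\varphi(T)] &= \frac{n}{n+1} - \frac{1}{n+1} - \frac{n-1}{n+1} = 0 \quad \text{for all } \theta
\end{aligned}
```

The range x(n) - x(1) has a distribution that does not depend on θ : it is precisely the kind of "useless information" that cannot be removed from T without losing sufficiency.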
Direct identification of complete statistics may be rather laborious. Fortunately, many of the common probability distributions belong to the exponential family :
p(x; θ) = exp[A(x)B(θ) + C(x) + D(θ)]
for which we already showed that the statistic
T = Σi A(xi)
is sufficient.
It can be shown, although the proof is difficult, that this statistic is in fact complete for θ, provided the range of B(θ) contains an open interval.
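As an illustration of this form, the Poisson distribution with parameter θ can be written as follows (a sketch for one family only, with the functions A, B, C and D read off from the exponent) :

```latex
\begin{aligned}
p(x;\theta) &= \frac{\theta^{x} e^{-\theta}}{x!} = \exp\left[\, x \log\theta - \log x! - \theta \,\right] \\
A(x) &= x, \quad B(\theta) = \log\theta, \quad C(x) = -\log x!, \quad D(\theta) = -\theta \\
T &= \sum_i A(x_i) = \sum_i x_i
\end{aligned}
```

So for a Poisson sample, the sum of the observations is the natural statistic of the family.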
____________________________________________
Tutorial
In this Tutorial, we first demonstrate some general results about complete statistics.
We then give some examples of statistics that we already showed to be minimal sufficient, and that we now show to be complete by using only the definition of a complete statistic. The most common way of showing that a statistic is complete, though, is to show that it is the natural statistic of a distribution belonging to the exponential family.
Some difficult mathematical results will be stated without proof.
COMPLETE STATISTICS
* A complete statistic is minimal sufficient
* An example of a minimal sufficient statistic which is not complete
  * Expectation of x(1)
  * Expectation of x(n)
  * The statistic is not complete
* A complete statistic is unique within a one-to-one function
* A one-to-one transform of a complete statistic is complete
* If there exists a not complete minimal sufficient statistic, then there is
* Examples of complete statistics
  * Bernoulli distribution
  * Exponential distribution
  * Poisson distribution
  * Uniform distribution U[0, θ]
_____________________________________________________
Related readings :