Minimal sufficient statistic
We noticed that the transform T' = f(T) of a sufficient statistic T by a function f(.) is usually not sufficient : only when f(.) is one-to-one is f(T) guaranteed to remain sufficient. But we also noticed that f(T) may occasionally be sufficient even if f(.) is not one-to-one. Within the paradigm of the "useful" and "useless" information carried by a statistic T, f(.) then sheds some of the useless information, but retains all of the useful information.

T' may be thought of as a "leaner" sufficient statistic than T.
So the question arises naturally whether it is possible to find a sufficient statistic that contains no useless information at all, but still contains all of the useful information : in the above illustration, the blue useless part would be reduced to 0, leaving only the red useful part. The image of such a statistic by any function that is not one-to-one would then not be sufficient.
This is indeed sometimes possible. In that case, the resulting sufficient statistic is called a minimal sufficient statistic.
-----
We may therefore visualize the set of all sufficient statistics of a parameter θ as an "ocean" in which :
* Transforming a sufficient statistic by a one-to-one function produces another sufficient statistic at the same "depth".
* Transforming a sufficient statistic by a function that is not one-to-one, and yet obtaining another sufficient statistic, takes us to a greater depth.
* A minimal sufficient statistic lies on the floor of the ocean : no function of this statistic that is not one-to-one can produce another sufficient statistic.

The above considerations could lead to a definition of a minimal sufficient statistic (MSS). Unfortunately, this definition would not be operational, contrary to the following universally accepted definition :
A minimal sufficient statistic is defined as a sufficient statistic that is a function of any other sufficient statistic.
It is not clear from the definition alone whether a minimal sufficient statistic is unique or not. It is not : clearly, any one-to-one function of a minimal sufficient statistic is also minimal sufficient. But it is easily shown that, given two minimal sufficient statistics M1 and M2, there always exists a one-to-one correspondence between M1 and M2.
So, up to a one-to-one correspondence, a minimal sufficient statistic is unique.
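The uniqueness claim follows directly from the definition. A brief sketch of the argument, where g and h are names introduced here for the two functions (they are not part of the original text) :

```latex
% M_1 is minimal and M_2 is sufficient, so M_1 is a function of M_2;
% M_2 is minimal and M_1 is sufficient, so M_2 is a function of M_1.
M_1 = g(M_2), \qquad M_2 = h(M_1)
% Substituting one relation into the other:
M_1 = g\bigl(h(M_1)\bigr), \qquad M_2 = h\bigl(g(M_2)\bigr)
% Hence g and h are inverse to each other on the ranges of M_2 and M_1,
% i.e. they define a one-to-one correspondence between M_1 and M_2.
```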
The definition of an MSS can be given the following geometric interpretation.

A statistic T partitions the sample space into a set {A_t} of "sheets", each sheet A_t being the set of samples for which T takes the value t.
Let f(.) be a function that is not one-to-one. Under the action of f(.), {A_t} is transformed into another set of sheets {B_f(t)}. For a given t :
* All the points in A_t are in the same B_f(t),
* But because f(.) is not one-to-one, there may be some other value t' such that f(t') = f(t). Then A_t and A_t' are both contained in the same sheet B_f(t) = B_f(t').
Let now T' be another statistic defined by T' = f(T). It partitions the sample space into {B_f(t)}, and several sheets A_t may be contained in the same B_f(t).
Consequently :
The transform T' = f(T) of a statistic T by a function f(.) that is not one-to-one generates a coarser partition of the sample space than T does.
By definition, a minimal sufficient statistic then generates the coarsest partition among all partitions generated by all sufficient statistics.
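The coarsening of the partition can be observed on a small discrete sample space. The following is a minimal sketch, assuming three Bernoulli trials, T = number of successes, and an arbitrary function f that merges the values 2 and 3; the helper names `partition` and `is_coarser` are illustrative. It only illustrates how f(T) merges sheets, and makes no claim that this particular T' is sufficient.

```python
from collections import defaultdict
from itertools import product


def partition(statistic, sample_space):
    """Group sample points by the value of the statistic (one block per sheet)."""
    blocks = defaultdict(list)
    for x in sample_space:
        blocks[statistic(x)].append(x)
    return {frozenset(block) for block in blocks.values()}


def is_coarser(coarse, fine):
    """True if every block of `fine` is contained in some block of `coarse`."""
    return all(any(b <= c for c in coarse) for b in fine)


# All samples of three 0/1 trials (assumed toy sample space).
sample_space = list(product([0, 1], repeat=3))


def T(x):
    """Number of successes in the sample."""
    return sum(x)


def f(t):
    """A function that is not one-to-one: it maps both 2 and 3 to 2."""
    return min(t, 2)


def T_prime(x):
    """The transformed statistic T' = f(T)."""
    return f(T(x))


sheets_T = partition(T, sample_space)
sheets_T_prime = partition(T_prime, sample_space)

print(len(sheets_T), len(sheets_T_prime))
print(is_coarser(sheets_T_prime, sheets_T))
```

Here the sheets of T for t = 2 and t = 3 end up in a single sheet of T', so every sheet of T lies inside a sheet of T', but not the other way round.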
Just as the definition of a sufficient statistic proved inconvenient for identifying sufficient statistics and was advantageously replaced by the Factorization Theorem for this purpose, the definition of a minimal sufficient statistic alone is of little help in finding minimal sufficient statistics. Fortunately, we'll identify a condition for a statistic to be minimal sufficient, which makes the identification of an MSS much easier.
-----
Let p(x, θ) be a distribution known up to the value of the parameter θ, and T(X) a statistic (that need not be assumed sufficient). We denote by f_θ(X) the distribution of the sample X.
Consider the following two conditions :
1) For any two samples X and Y such that T(X) = T(Y) (and that therefore belong to the same sheet as in the above illustration), the ratio
f_θ(X) / f_θ(Y)
considered as a function of θ does not, in fact, depend on θ.
2) Conversely, for any two samples X and Y such that f_θ(X) / f_θ(Y) does not depend on θ, we have T(X) = T(Y).
We'll show that :
If both 1) and 2) are satisfied, then T is a minimal sufficient statistic for θ.
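The two conditions can be checked numerically on a small example. The following is a sketch under stated assumptions : a sample of four independent Bernoulli(θ) trials, a finite grid of θ values, and exact rational arithmetic; the function names are illustrative. A check on a grid is only numerical evidence, not a proof.

```python
from fractions import Fraction
from itertools import product


def likelihood(x, theta):
    """Distribution f_theta(x) of a sample x of independent Bernoulli(theta) trials."""
    s = sum(x)
    return theta**s * (1 - theta) ** (len(x) - s)


def ratio_is_constant(x, y, thetas):
    """True if f_theta(x) / f_theta(y) takes a single value over the theta grid."""
    ratios = {likelihood(x, th) / likelihood(y, th) for th in thetas}
    return len(ratios) == 1


def satisfies_both_conditions(T, sample_space, thetas):
    """Conditions 1) and 2) together: T(x) == T(y) exactly when the ratio is constant."""
    return all(
        (T(x) == T(y)) == ratio_is_constant(x, y, thetas)
        for x in sample_space
        for y in sample_space
    )


# Grid of theta values strictly between 0 and 1 (assumed, for illustration).
thetas = [Fraction(k, 10) for k in range(1, 10)]
sample_space = list(product([0, 1], repeat=4))


def total(x):
    """Number of successes."""
    return sum(x)


def identity(x):
    """The whole sample: sufficient, but it keeps useless information."""
    return x


def first(x):
    """First observation only: discards useful information."""
    return x[0]


for T in (total, identity, first):
    print(T.__name__, satisfies_both_conditions(T, sample_space, thetas))
```

On this grid, only the number of successes passes both conditions. The whole sample fails condition 2) (two different samples with the same number of successes have a constant ratio), and the first observation fails condition 1).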
____________________________________________________________________________
Tutorial
In this Tutorial, we first establish the classical condition for a statistic to be minimal sufficient.
We then review the sufficient statistics previously identified and show that they are in fact not only sufficient, but also minimal sufficient.
The case of the [θ, θ + 1] uniform distribution will show that the dimension of a minimal sufficient statistic does not necessarily match the dimension of the parameter to be estimated.
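For the [θ, θ + 1] uniform distribution, the dimension mismatch can be seen in a short sketch (the notation x_(1) and x_(n) for the smallest and largest observations is introduced here; the Tutorial itself gives the full argument) :

```latex
% Distribution of a sample x = (x_1, ..., x_n) from the uniform distribution on [\theta, \theta + 1]:
f_\theta(x) = \prod_{i=1}^{n} \mathbf{1}\{\theta \le x_i \le \theta + 1\}
            = \mathbf{1}\{x_{(n)} - 1 \le \theta \le x_{(1)}\}
% As a function of \theta, this is the indicator of the interval [x_{(n)} - 1, x_{(1)}].
% Two samples x and y have proportional likelihoods for all \theta exactly when these
% intervals coincide, that is, when
\bigl(x_{(1)}, x_{(n)}\bigr) = \bigl(y_{(1)}, y_{(n)}\bigr)
% so T(X) = (X_{(1)}, X_{(n)}) is two-dimensional although \theta is a scalar.
```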
MINIMAL SUFFICIENT STATISTIC
* A condition for a statistic to be minimal sufficient
  * The statistic is sufficient
  * The statistic is minimal sufficient
* Examples
  * Uniform distribution [0, θ]
  * Uniform distribution [θ, θ + 1]
  * Normal distribution
    * Mean
    * Variance
    * Mean and variance
  * Poisson distribution
  * Exponential distribution
  * Gamma distribution
    * Shape parameter
    * Dispersion parameter
    * Shape and dispersion parameters
  * Beta distribution