Exponential family
An important family of probability distributions.
A probability distribution p(x; θ) is said to belong to the exponential family if it can be written as :
p(x; θ) = exp[A(x)B(θ) + C(x) + D(θ)]
* The distribution can be continuous or discrete.
* The range of x over which p(x; θ) is not equal to 0 must not depend on θ. In particular, the uniform distribution on [0, θ] is excluded from the family, because its support depends on θ.
* The parameter θ may not appear natively in the elementary form of p(x). Rather, it is usually a function of the various parameters appearing in this elementary form.
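As an illustration of the definition above, here is a minimal sketch that casts the Poisson distribution into the general form. The choice of the Poisson example and the function names (`A`, `B`, `C`, `D`, `poisson_exp_form`) are ours, not the original text's; the decomposition used is A(x) = x, B(λ) = log λ, C(x) = -log(x!), D(λ) = -λ.

```python
import math

# Illustrative decomposition of the Poisson(lam) pmf into the general form
# p(x; theta) = exp[A(x)B(theta) + C(x) + D(theta)], with theta = lam.
def A(x):
    return x                      # depends on the observation only

def B(lam):
    return math.log(lam)          # depends on the parameter only

def C(x):
    return -math.lgamma(x + 1)    # -log(x!), depends on the observation only

def D(lam):
    return -lam                   # depends on the parameter only

def poisson_exp_form(x, lam):
    return math.exp(A(x) * B(lam) + C(x) + D(lam))

def poisson_pmf(x, lam):
    # Elementary form: lam^x * exp(-lam) / x!
    return lam ** x * math.exp(-lam) / math.factorial(x)

# Largest discrepancy between the two expressions over x = 0..19
max_gap = max(abs(poisson_exp_form(x, 2.5) - poisson_pmf(x, 2.5)) for x in range(20))
print(max_gap)
```

The two expressions should agree up to floating-point rounding, which is all the definition requires: the same distribution, rewritten.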
Many classic distributions can indeed be written under this general form, but this is not enough to justify the importance of this somewhat artificial expression. In fact, the exponential family has two different but converging origins that we now describe.
We previously established two characterizations of the concept of sufficient statistic (through the sample conditional distribution and the Factorization Theorem). Both relied on the sample likelihood, but not on the parent probability distribution p(x; θ) itself. But it seems clear that the existence of a sufficient statistic for θ should somehow constrain the form of p(x; θ).
This is indeed true, and we'll show that a necessary and sufficient condition for the existence of a sufficient statistic for θ is that p(x; θ) can be written as above, and therefore belongs to the exponential family.
In fact, we'll do more than that, and identify a particular sufficient statistic for θ when p(x; θ) is written under the "exponential family" form. We'll then have a very powerful tool for discovering sufficient statistics.
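A small numerical sketch may make this concrete. For an i.i.d. sample from the general form, the likelihood is exp[B(θ) Σ A(xi) + Σ C(xi) + n D(θ)], so by the Factorization Theorem the statistic Σ A(xi) is sufficient. The code below (a Poisson example of our choosing, with hypothetical sample values) checks a consequence: two samples of the same size sharing the same Σ xi have a log-likelihood difference that does not depend on λ.

```python
import math

def poisson_loglik(sample, lam):
    # Sum over the sample of x*log(lam) - log(x!) - lam
    return sum(x * math.log(lam) - math.lgamma(x + 1) - lam for x in sample)

# Two hypothetical samples of size 4 with the same total (10)
sample_a = [0, 1, 2, 7]
sample_b = [3, 3, 2, 2]

# The log-likelihood difference reduces to sum C(a_i) - sum C(b_i),
# which is free of lam when the sufficient statistic is the same.
gaps = [poisson_loglik(sample_a, lam) - poisson_loglik(sample_b, lam)
        for lam in (0.5, 1.0, 2.5, 6.0)]
spread = max(gaps) - min(gaps)
print(gaps, spread)
```

The spread across the values of λ should be zero up to rounding, which is what sufficiency of Σ xi means here: once the total is known, the rest of the sample carries no further information about λ.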
Recall that the Cramér-Rao inequality establishes a lower bound to the variance of an unbiased estimator of a function g(θ) of the parameter θ, but says nothing about whether or not there exists an estimator whose variance is indeed equal to this bound (efficient estimator).
We'll show that a necessary and sufficient condition for the existence of a function g(θ) that can be efficiently estimated is that p(x; θ) belongs to the exponential family. The function g(θ) is then entirely determined by p(x; θ). In particular, if g(.) is the identity function, θ admits an efficient estimator; but it can be asserted that θ has no efficient estimator if g(.) is some other function.
This result requires regularity conditions on p(x; θ) that are stronger than those needed for establishing the Cramér-Rao lower bound. We won't state these difficult regularity conditions, and only mention that if they are relaxed, there exist "exotic" distributions that do not belong to the exponential family and yet have efficient estimators for some g(θ).
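To illustrate efficient estimation, the sketch below takes the Poisson distribution with θ = λ (our choice of example) and compares the Cramér-Rao bound for unbiased estimators of λ with the variance of the sample mean. Expectations are computed by truncated sums rather than simulation; the truncation point `upper` and the helper names are assumptions of this sketch.

```python
import math

def poisson_pmf(x, lam):
    return math.exp(x * math.log(lam) - math.lgamma(x + 1) - lam)

def expectation(f, lam, upper=200):
    # Truncated sum; the tail beyond `upper` is negligible for moderate lam.
    return sum(f(x) * poisson_pmf(x, lam) for x in range(upper))

lam, n = 3.0, 25

# Score of one observation: d/d(lam) log p(x; lam) = x/lam - 1
fisher_info = expectation(lambda x: (x / lam - 1.0) ** 2, lam)
cramer_rao_bound = 1.0 / (n * fisher_info)   # bound for unbiased estimators of lam

mean = expectation(lambda x: x, lam)
var_x = expectation(lambda x: (x - mean) ** 2, lam)
var_sample_mean = var_x / n                  # variance of the mean of n i.i.d. draws

print(cramer_rao_bound, var_sample_mean)
```

The two printed numbers should coincide up to rounding: for this distribution the sample mean attains the bound, so it is an efficient estimator of λ.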
The above mathematical form is called the general exponential form, and was obtained from very general principles indeed. In fact, it is so general that most usual distributions can be written in a somewhat simpler form obtained by replacing A(x) by x.
We therefore have :
p(x; θ) = exp[xB(θ) + C(x) + D(θ)]
This simplified form is called the canonical form of the distribution.
* The term B(θ) is then called the natural parameter of the distribution.
* Other parameters appearing in p(x; θ) are called nuisance parameters.
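As a hedged example of the canonical form, consider a single Bernoulli trial with success probability p (the binomial distribution with one trial; the example and function names are ours). Writing p^x (1 - p)^(1 - x) = exp[x log(p / (1 - p)) + log(1 - p)] gives B(p) = log(p / (1 - p)), C(x) = 0 and D(p) = log(1 - p), so the natural parameter is the log-odds.

```python
import math

def natural_parameter(p):
    # B(p): the log-odds of success
    return math.log(p / (1.0 - p))

def bernoulli_canonical(x, p):
    # exp[x*B(p) + C(x) + D(p)] with C(x) = 0 and D(p) = log(1 - p)
    return math.exp(x * natural_parameter(p) + 0.0 + math.log(1.0 - p))

def bernoulli_pmf(x, p):
    # Elementary form: p if x == 1, 1 - p if x == 0
    return p if x == 1 else 1.0 - p

for p in (0.1, 0.5, 0.8):
    print(p, natural_parameter(p), bernoulli_canonical(1, p), bernoulli_canonical(0, p))
```

Note that the natural parameter log(p / (1 - p)) is a function of the elementary parameter p, not p itself.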
The canonical form was obtained as a simplified form of the general form. In practice, many classical distributions can be cast into an even more restrictive form that describes a sub-class of the canonical exponential family known as the natural exponential family. This form is :
p(x; θ, Φ) = exp{(xθ - b(θ))/Φ + c(x, Φ)}
This expression is the "natural form" of the distribution, and is a special case of the canonical form.
* We'll see that the parameter θ of the natural form is not the same as the parameter θ of the canonical form.
* The new term Φ is usually known. We'll see that it contributes to the spread of the distribution. For this reason, it is called the dispersion parameter of the distribution. It is a "nuisance parameter" (see above).
* c(x, Φ) is a "catch all" term that does not play any important role. It is only required that it does not depend on θ.
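The normal distribution gives a simple instance of the natural form. In the sketch below (our own example and naming), θ is the mean, b(θ) = θ²/2, the dispersion parameter Φ is the variance σ², and c(x, Φ) = -x²/(2Φ) - (1/2) log(2πΦ) collects everything that does not involve θ.

```python
import math

def b(theta):
    return 0.5 * theta ** 2

def c(x, phi):
    # "Catch-all" term: free of theta
    return -x ** 2 / (2.0 * phi) - 0.5 * math.log(2.0 * math.pi * phi)

def normal_natural_form(x, theta, phi):
    # exp{(x*theta - b(theta))/phi + c(x, phi)}
    return math.exp((x * theta - b(theta)) / phi + c(x, phi))

def normal_pdf(x, mu, sigma2):
    # Elementary form of the normal density
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

for x in (-1.0, 0.3, 2.0):
    print(x, normal_natural_form(x, 1.5, 0.8), normal_pdf(x, 1.5, 0.8))
```

Expanding -(x - θ)²/(2Φ) shows why the two expressions agree: the cross term gives xθ/Φ, the θ² term gives -b(θ)/Φ, and the x² term is absorbed into c(x, Φ).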
Because the natural exponential family is much simpler than the general exponential family, the mean and the variance of any of its distributions are given by remarkably simple expressions in terms of the components of the natural form.
* Mean :
We'll show that :
µ = b'(θ)
where " ' " denotes the differentiation with respect to θ.
The definition of the natural exponential family demands that b'(θ) be one-to-one. The parameter θ is then a function of the mean µ :
θ = (b')⁻¹(µ) = τ(µ)
and the distribution can then be expressed using µ as the parameter : it is then said to be expressed in the mean value parametrisation.
* Variance :
We'll show that :
σ² = Φ.b''(θ)
This last expression justifies the name "dispersion parameter" given to Φ.
Since θ = τ(µ), the above expression shows that, within a natural exponential family, the variance is a function of the mean :
σ² = Φ.b''(τ(µ)) = V(µ)
The function V is called the variance function of the distribution.
It can be shown that under some regularity conditions, the variance function completely characterizes the distribution once this distribution is known to belong to the natural exponential family.
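The relations µ = b'(θ), θ = τ(µ) and V(µ) = Φ.b''(τ(µ)) can be checked numerically. The sketch below uses finite differences on b for four families; the (b, τ) pairs listed are the standard ones as far as we know, but the dictionary `FAMILIES`, the helper names and the step sizes are assumptions of this illustration.

```python
import math

def d1(f, t, h=1e-5):
    # Central finite-difference approximation of f'(t)
    return (f(t + h) - f(t - h)) / (2.0 * h)

def d2(f, t, h=1e-4):
    # Central finite-difference approximation of f''(t)
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

# name -> (b(theta), tau(mu)), where tau is the inverse of b'
FAMILIES = {
    "poisson":   (lambda t: math.exp(t),             lambda m: math.log(m)),
    "normal":    (lambda t: 0.5 * t * t,             lambda m: m),
    "gamma":     (lambda t: -math.log(-t),           lambda m: -1.0 / m),
    "bernoulli": (lambda t: math.log1p(math.exp(t)), lambda m: math.log(m / (1.0 - m))),
}

def mean_from_theta(name, theta):
    b, _ = FAMILIES[name]
    return d1(b, theta)                 # mu = b'(theta)

def variance_function(name, mu, phi=1.0):
    b, tau = FAMILIES[name]
    return phi * d2(b, tau(mu))         # V(mu) = phi * b''(tau(mu))

print(mean_from_theta("poisson", math.log(4.0)))
print(variance_function("poisson", 3.0))
print(variance_function("normal", 3.0, phi=2.0))
print(variance_function("gamma", 3.0, phi=0.5))
print(variance_function("bernoulli", 0.3))
```

In closed form these variance functions are V(µ) = µ for the Poisson, V(µ) = Φ (constant) for the normal, V(µ) = Φµ² for the gamma and V(µ) = µ(1 - µ) for the Bernoulli, which is how one family is told apart from another by its variance function alone.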
__________________________________________________
Tutorial 1
In this Tutorial, we describe the two origins of the exponential family.
* Sufficient statistic
- A necessary and sufficient condition for p(x; θ) to have a sufficient statistic for θ is that it belongs to the exponential family.
- We then identify a particular sufficient statistic for θ.
* Cramér-Rao lower bound
- With some reservations, a necessary and sufficient condition for the existence of a function g(θ) that can be efficiently estimated is that p(x; θ) belongs to the exponential family.
- We'll identify g(θ) as well as its efficient estimator. If g(θ) is not the identity function, then θ has no efficient estimator.
____________
* Natural exponential family
- We then show that the mean and the variance of a distribution belonging to the natural exponential family can be calculated by indirect but simple and elegant methods.
EXPONENTIAL FAMILY AND SUFFICIENT STATISTIC
EXPONENTIAL FAMILY AND CRAMER-RAO LOWER BOUND
NATURAL EXPONENTIAL FAMILY
* Exponential family and sufficient statistic
- The condition is necessary
- The condition is sufficient
- A particular sufficient statistic
* Exponential family and the Cramér-Rao lower bound
- The condition is necessary
- The condition is sufficient
- Score and exponential family
- Score and efficient estimation
- The efficient estimator and the estimated quantity
* The natural exponential family
- Mean
- Variance
- Variance function
________________________________________________________
Tutorial 2
We now review some usual distributions that will turn out to belong to the exponential family (in fact, to the natural exponential family). For each one, we'll identify :
* Its canonical form,
* Its natural form,
* Its variance function,
* A quantity that is efficiently estimated, and the corresponding efficient estimator.
The detailed Table of Contents relative to the normal distribution is reproduced for every distribution in the list.
-----
The Chi-square distribution belongs to the exponential family, but we don't address it explicitly as it is only a special case of the Gamma distribution. We do address the exponential distribution, although it is also a special case of the Gamma distribution, because it is so simple.
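As a preview of the kind of result listed above, here is a minimal sketch for the exponential distribution p(x; λ) = λ exp(-λx), x ≥ 0. Its canonical form has B(λ) = -λ, C(x) = 0 and D(λ) = log λ. Differentiating the normalization condition with respect to the parameter gives E[A(X)] = -D'(λ)/B'(λ), which here equals 1/λ; we assume this is the quantity g(λ) that the sample mean estimates efficiently. The closed-form Fisher information 1/λ² and all names in the code are part of this sketch, not of the original text.

```python
import math

def B(t):
    return -t            # natural parameter of the canonical form

def D(t):
    return math.log(t)

def derivative(f, t, h=1e-6):
    # Central finite-difference approximation of f'(t)
    return (f(t + h) - f(t - h)) / (2.0 * h)

lam, n = 2.0, 10

# Candidate efficiently estimated quantity: g(lam) = -D'(lam)/B'(lam) = 1/lam
g_numeric = -derivative(D, lam) / derivative(B, lam)

# Cramer-Rao bound for unbiased estimators of g(lam) = 1/lam:
# g'(lam)^2 / (n * I(lam)), with g'(lam) = -1/lam^2 and I(lam) = 1/lam^2
fisher_info = 1.0 / lam ** 2
cr_bound = (-1.0 / lam ** 2) ** 2 / (n * fisher_info)

# Variance of the sample mean: Var(X)/n with Var(X) = 1/lam^2
var_sample_mean = (1.0 / lam ** 2) / n

print(g_numeric, cr_bound, var_sample_mean)
```

The bound and the variance of the sample mean coincide, so the sample mean is efficient for 1/λ. Since g is not the identity function here, the general result stated earlier implies that λ itself has no efficient estimator.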
EXAMPLES OF DISTRIBUTIONS
BELONGING TO THE EXPONENTIAL FAMILY
* Normal distribution
- Canonical form
- Natural form
- Variance function
- Efficient estimation
* Exponential distribution
* Gamma distribution
* Binomial distribution
* Geometric distribution
* Negative binomial distribution
* Poisson distribution