Moment generating function
The probability density function (continuous variable) or the probability mass function (discrete variable) of a random variable X contains all the information you'll ever need about this variable. Therefore, it seems that it should always be possible:
* to calculate the mean, variance and higher order moments of X from its p.d.f. (or its discrete p.m.f.).
* to calculate the distribution of, say, the sum of two independent random variables X and Y whose distributions are known.
Yet, in practice, it turns out that calculations from first principles are often intractable. The Moment Generating Function (mgf) may then come to our aid.
Before we describe the mgf, a little digression is in order. The difficulty we just mentioned is not specific to Probability Theory. In fact, just about any discipline in Physics or Mathematics will run into the same problem sooner or later: a certain property of a function f is needed, the equations are there, but they cannot be solved in closed form.
A very powerful and general idea, which exists under many guises, then consists in transforming the original function f into a new and appropriately chosen function g, such that comparatively simple calculations on g will provide the desired results.
The obstacle is therefore circumvented, at the expense of appropriately transforming the original function f. This ploy is depicted in the following illustration :
The lower image makes explicit the principle of "transformation of a function" :
* First calculate g, the "image" of f under the transformation,
* Then conduct the appropriate calculations on g to obtain the desired quantity Q.
The mgf results from a transformation of the p.d.f. (or the p.m.f.). This transformation is defined as follows :
1) A new parameter t is introduced,
2) Then the random variable e^{tX} is created,
3) E[e^{tX}], the expectation of e^{tX}, is calculated. For example, if X is continuous :
M_X(t) = E[e^{tX}] = \int_{-\infty}^{+\infty} e^{tx} p(x) dx
where p(x) is the p.d.f. of X.
The integration is over x, so x does not appear in the final result, which is then a function of t only. This function is, by definition, the Moment Generating Function of the variable X (or of its p.d.f. p(x)), and will be denoted M_X(t), or M(t) when there is no risk of confusion.
So, by definition, the moment generating function of a random variable X is :
M_X(t) = E[e^{tX}]
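As an illustration of this definition, here is a minimal Python sketch that approximates E[e^{tX}] by numerical integration and compares it with a known closed form. It assumes an exponential distribution, whose mgf is rate / (rate - t) for t < rate. The function name mgf_numeric, the rate value and the integration bounds are choices made for this example only.

```python
import math

def mgf_numeric(pdf, t, lower, upper, steps=200_000):
    """Approximate E[exp(t*X)] = integral of exp(t*x) * pdf(x) dx (midpoint rule)."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        total += math.exp(t * x) * pdf(x)
    return total * h

rate = 2.0  # assumed rate parameter of the exponential distribution

def exp_pdf(x):
    """p.d.f. of the exponential distribution on [0, +infinity)."""
    return rate * math.exp(-rate * x)

t = 0.5
# The upper bound 60 stands in for +infinity: the integrand is negligible beyond it.
approx = mgf_numeric(exp_pdf, t, 0.0, 60.0)
exact = rate / (rate - t)  # closed form, valid only for t < rate
print(approx, exact)
```

The integration runs over x, and the result depends on t only, as stated above. For t >= rate the integral diverges, so this sketch is meaningful only for t < rate.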
The moment generating function is not to be confused with the generating function.
We describe below the main properties of the moment generating function. Some of these properties are very difficult to demonstrate, and will be stated without proof. The easier properties will be demonstrated in the Tutorial below.
Not all distributions have a moment generating function. There are two reasons why a probability distribution may not have a moment generating function :
1) First, all the moments of the distribution may not exist. It can be shown that if the order n moment of a distribution does not exist, then no moment of order larger than n exists either. We'll see that a most important property of the mgf is that, when it exists, all the moments of the distribution can be calculated from this mgf. So if the distribution does not have all of its moments, it certainly does not have a mgf.
This is the case, for example, of Student's t_n distribution, which has no moment beyond the moment of order (n - 1). In particular, if n = 1, the t distribution reduces to the Cauchy distribution, which has no moments at all.
2) A distribution may have all of its moments and yet not have a moment generating function, because the expectation defining the mgf is then infinite. This is the case, for instance, of the lognormal distribution.
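The lognormal case can be made explicit with a short calculation. The sketch below assumes the standard lognormal variable X = e^{Y} with Y following the standard normal distribution; this parametrization is chosen for the example only.

```latex
% Assumption: X = e^{Y}, with Y ~ N(0, 1) (standard lognormal)
E[X^n] = E[e^{nY}] = e^{n^2/2} < \infty \quad \text{for every } n,
\qquad
E[e^{tX}] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty}
            \exp\!\left( t e^{y} - \frac{y^2}{2} \right) dy = +\infty
\quad \text{for every } t > 0 .
```

All the moments are finite, but for any t > 0 the term t e^{y} eventually outgrows y^2/2, so the integrand is unbounded as y grows and the integral diverges.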
-----
The complex numbers version of the moment generating function is called the characteristic function of the distribution. Contrary to the mgf, the characteristic function always exists but, owing to the intricacies of Complex Calculus, it is not addressed in this Glossary.
If r.v. X has a mgf, then the nth order moment of its distribution is equal to the value of the nth order derivative of its mgf for t = 0.
E[X^n] = M_X^{(n)}(0) = \left. \frac{d^n M_X(t)}{dt^n} \right|_{t=0}
Of course, the moment generating function gets its name from this property.
Using this property is often more convenient than direct calculation for calculating the moments of a r.v. (except possibly for the uniform and the Beta distributions), as will be seen throughout this site.
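The moment property can be checked numerically with a small Python sketch. It assumes the well-known mgf of the normal distribution, exp(mu*t + sigma^2*t^2/2), and replaces the exact derivatives at t = 0 by central finite differences. The parameter values and the step h are arbitrary choices for this example.

```python
import math

mu, sigma = 1.5, 2.0  # assumed parameters of a normal distribution

def mgf_normal(t):
    """mgf of the normal distribution N(mu, sigma^2)."""
    return math.exp(mu * t + 0.5 * sigma**2 * t**2)

h = 1e-3  # finite-difference step (illustrative choice)

# Central differences approximate the first and second derivatives at t = 0.
first_moment = (mgf_normal(h) - mgf_normal(-h)) / (2 * h)
second_moment = (mgf_normal(h) - 2 * mgf_normal(0.0) + mgf_normal(-h)) / h**2
variance = second_moment - first_moment**2

print(first_moment, second_moment, variance)
```

The first derivative should be close to the mean mu, the second derivative to mu^2 + sigma^2, and their combination to the variance sigma^2. In analytical work one would differentiate the mgf symbolically; finite differences are used here only to keep the sketch short.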
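If the r.v.s X1 and X2 both have a mgf and are independent, then the mgf of their sum is the product of their mgfs :
M_{X1+X2}(t) = M_{X1}(t) M_{X2}(t)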
This result is used for calculating the distributions of many classical r.v. that are defined, or can be interpreted as the sum of iid r.v. :
* Chi-square,
The important question of the distribution of the empirical mean of a r.v. is often successfully addressed by referring to this property (see for example the case of the normal distribution).
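The mgf of a sum of independent r.v.s equals the product of their mgfs. The following Python sketch illustrates this on two independent Poisson variables, chosen here as an assumption for the example. The distribution of the sum is computed directly by convolving the two p.m.f.s, and its mgf is compared with the product of the individual mgfs. The helper names and the truncation at 60 terms are illustrative.

```python
import math

def poisson_pmf(lam, n_max):
    """Poisson probabilities P(X = k) for k = 0..n_max (truncated support)."""
    pmf = [math.exp(-lam)]
    for k in range(1, n_max + 1):
        pmf.append(pmf[-1] * lam / k)
    return pmf

def mgf_from_pmf(pmf, t):
    """E[exp(t*X)] for a r.v. taking the value k with probability pmf[k]."""
    return sum(p * math.exp(t * k) for k, p in enumerate(pmf))

def convolve(p, q):
    """p.m.f. of the sum of two independent integer-valued r.v.s."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out

lam1, lam2, t = 1.0, 2.5, 0.3  # assumed parameters
p1 = poisson_pmf(lam1, 60)
p2 = poisson_pmf(lam2, 60)

mgf_sum = mgf_from_pmf(convolve(p1, p2), t)             # mgf of X1 + X2, computed directly
mgf_product = mgf_from_pmf(p1, t) * mgf_from_pmf(p2, t)  # product of the two mgfs
mgf_poisson = math.exp((lam1 + lam2) * (math.exp(t) - 1.0))  # mgf of Poisson(lam1 + lam2)

print(mgf_sum, mgf_product, mgf_poisson)
```

The three values agree up to rounding and truncation error. The third one is the known mgf of a Poisson distribution with parameter lam1 + lam2, which is consistent with the additivity of the Poisson distribution.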
If the r.v.s X and Y have identical moment generating functions, then they have identical probability distributions.
This seemingly almost obvious result is in fact very difficult to demonstrate, but also very useful. Note that a mgf cannot be "inverted" : if you know the analytical form of a mgf, there is usually no way to derive by calculus only the probability distribution of which it is the mgf. But suppose you have a list of distributions together with their mgfs. Upon calculating a new mgf, you may look up the list in the hope of finding this new mgf. If you're lucky, it will be there, together with its distribution, and you can then safely state that this distribution is the unique distribution of which your new function is the mgf.
We use the uniqueness property many times throughout this site, for example :
* Calculation of the Chi-square distribution.
* Additivity property of the binomial, negative binomial, Gamma and Poisson distributions.
* Linear transform of a normal r.v., linear combination of independent normal r.v.
* Identification of the distribution of the sum of iid exponential variables.
* Identification of the distribution of the sum of a random number of iid exponential variables.
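As a worked example of this kind of identification, consider the sum of n iid exponential variables. The notation below (rate \lambda, sum S_n) is introduced for this sketch only, and the calculation relies on the fact that the mgf of a sum of independent r.v.s is the product of their mgfs.

```latex
% Assumption: X_1, ..., X_n iid exponential with rate \lambda, and S_n = X_1 + ... + X_n
M_{X_i}(t) = \frac{\lambda}{\lambda - t} \quad (t < \lambda),
\qquad
M_{S_n}(t) = \prod_{i=1}^{n} M_{X_i}(t)
           = \left( \frac{\lambda}{\lambda - t} \right)^{n} \quad (t < \lambda).
```

The right-hand side is the mgf of the Gamma distribution with shape n and rate \lambda. By the uniqueness property, S_n therefore follows this Gamma distribution.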
-----
If two distributions have all their moments equal, and if they both have a mgf, then they have identical distributions. But some pathological examples are known of distributions that have all their moments equal, and yet that are not identical. These distributions clearly have no mgf.
Let {X_i} be a sequence of r.v.s with respective moment generating functions M_i(t). If the sequence {M_i(t)} converges to a limit M(t), and if this limit is the mgf of a r.v. X, then the sequence {X_i} converges in distribution to X.
This fundamental result is difficult, and will not be demonstrated. Yet, we'll use it :
* For demonstrating the Central Limit Theorem.
* As a means of calculating the mgf of the Poisson distribution as the limit of the binomial distribution.
* As a means of showing that the binomial distribution tends to a normal distribution for large samples.
* As a means of showing that the Poisson distribution tends to a normal distribution for large values of the parameter λ.
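The convergence of mgfs can be observed numerically. The Python sketch below uses the standard closed forms (1 - p + p e^t)^n for the binomial mgf and exp(λ(e^t - 1)) for the Poisson mgf, with p = λ/n. The values of λ and t and the list of sample sizes are arbitrary choices for this example.

```python
import math

lam, t = 3.0, 0.4  # assumed Poisson parameter and evaluation point

def mgf_binomial(n, p, t):
    """mgf of the binomial distribution B(n, p)."""
    return (1.0 - p + p * math.exp(t)) ** n

# mgf of the Poisson distribution with parameter lam
mgf_poisson = math.exp(lam * (math.exp(t) - 1.0))

errors = []
for n in (10, 100, 1_000, 10_000):
    # Keep the mean n*p = lam fixed while n grows.
    errors.append(abs(mgf_binomial(n, lam / n, t) - mgf_poisson))
    print(n, errors[-1])
```

The gap between the two mgfs shrinks as n grows. A check at a single value of t is of course only an illustration of the convergence, not a proof.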
The moment generating function generalizes to the multivariate case. Its definition is the same as in the univariate case :
M_X(t) = E[e^{t'X}]
where :
* X is a random vector.
* and t is now a vector parameter (but the mgf is scalar).
The properties of the univariate mgf carry over to the multivariate case. In particular, the multivariate mgf allows calculating the cross-moments of a multivariate distribution. Let X be a p-dimensional random vector, X = (X_1, X_2, ..., X_p). A cross-moment of X is a quantity of the form :
E[X_1^{k_1} X_2^{k_2} ... X_p^{k_p}]
with k_1 + ... + k_p = k.
Then it is easily shown that, when all the involved quantities exist :
E[X_1^{k_1} X_2^{k_2} ... X_p^{k_p}] = \left. \frac{\partial^k M_X(t)}{\partial t_1^{k_1} \partial t_2^{k_2} ... \partial t_p^{k_p}} \right|_{t=0}
an expression similar to that of the univariate case.
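A cross-moment can be recovered numerically from a multivariate mgf, as in the following Python sketch. It assumes a bivariate normal vector, whose mgf is exp(mu't + t'Σt/2), and approximates the mixed partial derivative at t = 0 by a central finite difference. The mean vector, the covariance matrix and the step h are made up for this example.

```python
import math

mu = (1.0, -0.5)                  # assumed mean vector
cov = ((2.0, 0.6), (0.6, 1.0))    # assumed covariance matrix

def mgf_bivariate_normal(t1, t2):
    """mgf of the bivariate normal distribution: exp(mu't + t'cov t / 2)."""
    linear = mu[0] * t1 + mu[1] * t2
    quad = cov[0][0] * t1 * t1 + 2 * cov[0][1] * t1 * t2 + cov[1][1] * t2 * t2
    return math.exp(linear + 0.5 * quad)

h = 1e-3  # finite-difference step (illustrative choice)

# Central difference for the mixed partial d^2 M / (dt1 dt2) at t = (0, 0).
cross_moment = (
    mgf_bivariate_normal(h, h)
    - mgf_bivariate_normal(h, -h)
    - mgf_bivariate_normal(-h, h)
    + mgf_bivariate_normal(-h, -h)
) / (4 * h * h)

# For any random vector, E[X1*X2] = Cov(X1, X2) + E[X1]*E[X2].
expected = cov[0][1] + mu[0] * mu[1]
print(cross_moment, expected)
```

Here k_1 = k_2 = 1, so the cross-moment is E[X_1 X_2], and the numerical derivative should agree with the covariance plus the product of the means.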
Because of its multivariate nature, the multivariate mgf has some properties that have no equivalent in the univariate case. The most important of these properties may be that it sometimes allows calculating marginal distributions very easily.
Let X be a random vector partitioned as X = (X_1, X_2). The vector parameter t is similarly partitioned as t = (t_1, t_2). The mgf of X is then :
M_X(t) = M_X(t_1, t_2) = E[exp(t_1'X_1 + t_2'X_2)]
Now set t_2 = 0 in this expression. We obtain :
M_X(t_1, 0) = E[exp(t_1'X_1 + 0)] = E[exp(t_1'X_1)] = M_{X_1}(t_1)
which is the mgf of X_1.
So the moment generating function of a marginal r.v. can very easily be obtained from the mgf of a multivariate distribution by setting to 0 the appropriate components of the vector parameter t.
We use this property as one of the (many) ways of calculating the marginal distributions of the multivariate normal distribution.
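For the bivariate normal distribution, this marginalization can be checked with a few lines of Python. The sketch assumes the standard mgf exp(mu't + t'Σt/2) of the multivariate normal distribution. The parameter values and function names are illustrative only.

```python
import math

mu = (1.0, -0.5)                  # assumed mean vector
cov = ((2.0, 0.6), (0.6, 1.0))    # assumed covariance matrix

def mgf_bivariate_normal(t1, t2):
    """mgf of the bivariate normal distribution: exp(mu't + t'cov t / 2)."""
    linear = mu[0] * t1 + mu[1] * t2
    quad = cov[0][0] * t1 * t1 + 2 * cov[0][1] * t1 * t2 + cov[1][1] * t2 * t2
    return math.exp(linear + 0.5 * quad)

def mgf_normal(t, mean, var):
    """mgf of the univariate normal distribution N(mean, var)."""
    return math.exp(mean * t + 0.5 * var * t * t)

gaps = []
for t1 in (-1.0, 0.3, 2.0):
    joint_at_zero = mgf_bivariate_normal(t1, 0.0)   # joint mgf with the second component set to 0
    marginal = mgf_normal(t1, mu[0], cov[0][0])     # mgf of N(mu_1, sigma_11)
    gaps.append(abs(joint_at_zero - marginal))
    print(t1, joint_at_zero, marginal)
```

Setting the second component of t to 0 leaves exactly the mgf of a normal distribution with mean mu[0] and variance cov[0][0]. By uniqueness of the mgf, this identifies the marginal distribution of the first component.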
It can be shown that two r.v.s X and Y with respective mgfs M_X(s) and M_Y(t) and joint mgf M_{XY}(s, t) are independent if and only if :
M_{XY}(s, t) = M_X(s) M_Y(t)
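The factorization criterion can be illustrated on small discrete distributions. The Python sketch below builds two joint p.m.f.s on {0, 1} x {0, 1}, one independent and one perfectly dependent, and measures the gap between the joint mgf and the product of the marginal mgfs. The two tables, the helper names and the evaluation point (s, t) are made up for this example.

```python
import math

def joint_mgf(joint, s, t):
    """E[exp(s*X + t*Y)] for a joint p.m.f. given as {(x, y): probability}."""
    return sum(p * math.exp(s * x + t * y) for (x, y), p in joint.items())

def marginals(joint):
    """Marginal p.m.f.s of X and Y as dictionaries {value: probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py

def mgf(pmf, t):
    """E[exp(t*V)] for a p.m.f. given as {value: probability}."""
    return sum(p * math.exp(t * v) for v, p in pmf.items())

def factorization_gap(joint, s, t):
    """|M_XY(s, t) - M_X(s) * M_Y(t)| at one point (s, t)."""
    px, py = marginals(joint)
    return abs(joint_mgf(joint, s, t) - mgf(px, s) * mgf(py, t))

# Independent pair: each probability is the product of its marginals (0.7/0.3 and 0.6/0.4).
independent = {(0, 0): 0.42, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.12}
# Dependent pair: Y is always equal to X.
dependent = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}

gap_independent = factorization_gap(independent, 0.7, -0.4)
gap_dependent = factorization_gap(dependent, 0.7, -0.4)
print(gap_independent, gap_dependent)
```

The gap is zero up to rounding for the independent pair and clearly positive for the dependent one. Note that the criterion requires the factorization to hold for all (s, t) where the mgfs exist, so a single evaluation point can disprove independence but cannot prove it.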
__________________________________________________________________________
Tutorial
The above mentioned properties are detailed and some of them are demonstrated in the following Tutorial.
PROPERTIES OF THE MOMENT GENERATING FUNCTION
* The mgf generates moments
  * Mean
  * Second order moment
  * All moments
* Mgf of a linear transform of a r.v.
* Mgf of the sum of independent r.v.
* Uniqueness (without proof)
* Convergence of the mgf (without proof)
* Existence of the mgf, characteristic function