Moment generating function

The probability density function (continuous variable) or the probability mass function (discrete variable) of a random variable X contains all the information you'll ever need about this variable. Therefore, it seems that it should always be possible:

    * to calculate the mean, variance and higher order  moments of X  from its p.d.f. (or its discrete p.m.f.).

    * to calculate the distribution of, say, the sum of two independent random variables X and Y whose distributions are known.


Yet, in practice, it turns out that calculations from first principles are often intractable. The Moment Generating Function (mgf) may then come to our help.

 

Before we describe the mgf, a little digression is in order. The difficulty we just mentioned is not specific to Probability Theory. In fact, just about any discipline in Physics or Mathematics will run into the same problem sooner or later : a certain property of a function f is needed, the equations are there, but they cannot be solved in closed form.

A very powerful and general idea, that exists under many guises, consists then in transforming the original function f into a new and appropriately chosen function g, such that comparatively simple calculations on g will provide the desired results.

The obstacle is therefore circumvented, at the expense of appropriately transforming the original function f. This ploy is depicted in the following illustration :

[Illustration: f is transformed into g, and the desired quantity Q is then calculated from g]

The lower image makes explicit the principle of "transformation of a function" :

    * First calculate g, the "image" of  f under the transformation,

    * Then conduct the appropriate calculations on g to obtain the desired quantity Q.

 

The mgf results from a transformation of the p.d.f. (or the p.m.f.). This transformation is defined as follows :

    1) A new parameter t is introduced,

    2) Then the random variable $e^{tX}$ is created,

    3) $E[e^{tX}]$, the expectation of $e^{tX}$, is calculated. For example, if X is continuous :

$$E[e^{tX}] = \int_{-\infty}^{+\infty} e^{tx}\, p(x)\, dx$$

where p(x) is the p.d.f. of X.

 

The integration is over x, therefore x is not to be found in the final result, which is then a function of t only. This function is, by definition, the Moment Generating Function of the variable X (or of its p.d.f. p(x)), and will be denoted $M_X(t)$, or M(t) when there is no risk of confusion.

So, by definition, the moment generating function of a random variable X is :

 

$$M_X(t) = E[e^{tX}]$$
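The definition can be checked numerically. The sketch below is only an illustration: it assumes an exponential distribution with rate `lam` (a value chosen for the example), approximates the integral defining the mgf with a plain midpoint rule, and compares the result with the known closed form λ/(λ - t), valid for t < λ. The helper name `mgf_numeric` and the truncation of the integral at 60 are choices made for this example.

```python
import math

def mgf_numeric(pdf, t, lower, upper, steps=200_000):
    """Approximate E[exp(t*X)] = integral of exp(t*x) * pdf(x) dx (midpoint rule)."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        total += math.exp(t * x) * pdf(x)
    return total * h

lam = 2.0  # rate of the exponential distribution (illustrative value)

def exp_pdf(x):
    """p.d.f. of the exponential distribution with rate lam, for x >= 0."""
    return lam * math.exp(-lam * x)

t = 0.5
# The upper bound 60 truncates the infinite range; the neglected tail is negligible here.
approx = mgf_numeric(exp_pdf, t, 0.0, 60.0)
exact = lam / (lam - t)  # known closed form of the exponential mgf, t < lam
print(approx, exact)
```

Note that the result depends on t only: x has been integrated out, as stated above.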



The moment generating function is not to be confused with the generating function.

Properties of the moment generating function

We describe below the main properties of the moment generating function. Some of these properties are very difficult to demonstrate, and will be stated without proof. The easier properties will be demonstrated in the Tutorial below.

Existence

Not all distributions have a moment generating function. There are two reasons why a probability distribution may not have a moment generating function :

    1) First, all the moments of the distribution may not exist. It can be shown that if the order n moment of a distribution does not exist, then no moment of order larger than n exists either. We'll see that a most important property of the mgf is that, when it exists, all the moments of the distribution can be calculated from this mgf. So if the distribution does not have all of its moments, it certainly does not have a mgf.

This is the case, for example, of Student's t distribution with n degrees of freedom, which has no moment beyond the moment of order (n - 1). In particular, if n = 1, the t distribution turns into the Cauchy distribution, which has no moments at all.

 

    2) A distribution may have all of its moments and yet not have a moment generating function, because the expectation $E[e^{tX}]$ that defines the mgf is infinite. This is the case, for instance, of the lognormal distribution, for which this expectation is infinite for every t > 0.
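The lognormal case can be sketched as follows. The parametrization is an assumption of this illustration: X is lognormal with parameters μ and σ, meaning that ln X is normal with mean μ and standard deviation σ. All the moments are finite, yet the integral defining the mgf diverges for any t > 0 :

```latex
\begin{align*}
E[X^n] &= \exp\!\left(n\mu + \tfrac{1}{2}\, n^2 \sigma^2\right) < \infty
  \qquad \text{for every } n \\[4pt]
E[e^{tX}] &= \int_0^{+\infty} \frac{1}{x\,\sigma\sqrt{2\pi}}\,
  \exp\!\left(t x - \frac{(\ln x - \mu)^2}{2\sigma^2}\right) dx \\[4pt]
\text{for } t > 0 :\quad
  & t x - \frac{(\ln x - \mu)^2}{2\sigma^2} - \ln x \;\longrightarrow\; +\infty
  \quad \text{as } x \to +\infty
\end{align*}
```

The integrand therefore grows without bound, because tx increases faster than any power of ln x, and the integral cannot converge.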

-----

The complex numbers version of the moment generating function is called the characteristic function of the distribution. Contrary to the mgf, the characteristic function always exists but, owing to the intricacies of Complex Calculus, it is not addressed in this Glossary.

Moments and moment generating function

If the r.v. X has a mgf, then the nth order moment of its distribution is equal to the value of the nth order derivative of its mgf at t = 0 :

$$E[X^n] = M_X^{(n)}(0) = \left.\frac{d^n M_X(t)}{dt^n}\right|_{t=0}$$


Of course, the moment generating function gets its name from this property.
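The reason behind this property can be sketched as follows, assuming that the expectation and the series may be interchanged (which is legitimate when the mgf exists in a neighborhood of 0). As an illustration, the sketch then uses the exponential distribution with rate λ, whose mgf is known to be λ/(λ - t) for t < λ :

```latex
\begin{align*}
M_X(t) &= E\!\left[e^{tX}\right]
        = E\!\left[\sum_{n=0}^{\infty} \frac{t^n X^n}{n!}\right]
        = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, E[X^n]
  \quad\Longrightarrow\quad M_X^{(n)}(0) = E[X^n] \\[6pt]
M_X(t) &= \frac{\lambda}{\lambda - t}, \qquad t < \lambda \\
M_X'(t) &= \frac{\lambda}{(\lambda - t)^2}
  \quad\Longrightarrow\quad E[X] = M_X'(0) = \frac{1}{\lambda} \\
M_X''(t) &= \frac{2\lambda}{(\lambda - t)^3}
  \quad\Longrightarrow\quad E[X^2] = M_X''(0) = \frac{2}{\lambda^2} \\
\operatorname{Var}(X) &= E[X^2] - \left(E[X]\right)^2 = \frac{1}{\lambda^2}
\end{align*}
```

Two differentiations replace the two integrations by parts that a direct calculation from the p.d.f. would require.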

 

Using this property is often more convenient than direct calculation for calculating the moments of a r.v. (except possibly for the uniform and the Beta distributions), as will be seen throughout this site.

Sum of independent r.v.s

 

If the independent r.v.s $X_1$ and $X_2$ both have a mgf,
then
$X = X_1 + X_2$ has a mgf, which is equal to the product of the mgfs of $X_1$ and of $X_2$ :

$$M_X(t) = M_{X_1}(t)\, M_{X_2}(t)$$
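The product property can be verified on a small discrete case. The sketch below is an illustration under assumed inputs (a fair die and a biased coin, both chosen for the example): it builds the distribution of the sum of two independent variables by direct convolution, then checks that the mgf of the sum equals the product of the two mgfs. The helper names `mgf` and `pmf_of_sum` are hypothetical.

```python
import math
from collections import defaultdict

def mgf(pmf, t):
    """E[exp(t*X)] for a discrete distribution given as {value: probability}."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())

def pmf_of_sum(pmf1, pmf2):
    """Distribution of X1 + X2 for independent X1 and X2 (discrete convolution)."""
    out = defaultdict(float)
    for x1, p1 in pmf1.items():
        for x2, p2 in pmf2.items():
            out[x1 + x2] += p1 * p2
    return dict(out)

die = {k: 1 / 6 for k in range(1, 7)}  # fair six-sided die
coin = {0: 0.3, 1: 0.7}                # biased coin (illustrative probabilities)

total = pmf_of_sum(die, coin)
for t in (-1.0, 0.0, 0.5):
    lhs = mgf(total, t)                # mgf of the sum, from its own distribution
    rhs = mgf(die, t) * mgf(coin, t)   # product of the individual mgfs
    print(t, lhs, rhs)
```

The convolution is the "first principles" route; the product of mgfs reaches the same quantity with a single multiplication.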

 

 

This result is used for calculating the distributions of many classical r.v.s that are defined as, or can be interpreted as, the sum of iid r.v.s :

    * Chi-square,

    * Negative binomial.

 

The important question of the distribution of the empirical mean of a r.v. is often successfully addressed by referring to this property (see for example the case of the normal distribution).

Uniqueness of the moment generating function

 

 

If the r.v.s X and Y have identical moment generating functions,
then
their distributions are also identical.

 

 

This seemingly almost obvious result is in fact very difficult to demonstrate, but also very useful. Note that a mgf cannot be "inverted" : if you know the analytical form of a mgf, there is usually no way to derive by calculus only the probability distribution of which it is the mgf. But suppose you have a list of distributions together with their mgfs. Upon calculating a new mgf, you may look up the list in the hope of finding this new mgf. If you're lucky, it will be there, together with its distribution, and you can then safely state that this distribution is the unique distribution of which your new function is the mgf.

 

We use the uniqueness property many times throughout this site, for example :

    * Calculation of the Chi-square distribution.

    * Additivity property of the binomial, negative binomial, Gamma and Poisson distributions.

    * Linear transform of a normal r.v., linear combination of independent normal r.v.s.

    * Identification of the distribution of the sum of iid exponential variables.

    * Identification of the distribution of the sum of a random number of iid exponential variables.
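As an illustration of this "look-up" use of uniqueness, here is a sketch of the sum of n iid exponential variables. It assumes the standard mgfs of the exponential distribution with rate λ and of the Gamma distribution with shape n and rate λ (this parametrization is a choice made for the example) :

```latex
\begin{align*}
M_{X_i}(t) &= \frac{\lambda}{\lambda - t}, \qquad t < \lambda,
  \qquad i = 1, \dots, n \\
S &= X_1 + X_2 + \dots + X_n \qquad (X_i \text{ iid}) \\
M_S(t) &= \prod_{i=1}^{n} M_{X_i}(t)
        = \left(\frac{\lambda}{\lambda - t}\right)^{n} \\
\text{Gamma}(n, \lambda) :\quad M(t) &= \left(\frac{\lambda}{\lambda - t}\right)^{n},
  \qquad t < \lambda
\end{align*}
```

The two mgfs are identical, so by uniqueness S follows the Gamma distribution with shape n and rate λ, without any convolution integral having been calculated.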

-----

If two distributions have all their moments equal, and if they both have a mgf, then they have identical distributions. But some pathological examples are known of distributions that have all their moments equal, and yet that are not identical. These distributions clearly have no mgf.

Convergence property of the moment generating function

Let {$X_i$} be a sequence of r.v.s with respective moment generating functions $M_i(t)$. If the sequence {$M_i(t)$} converges to a limit M(t) for every t in a neighborhood of 0, and if this limit is the mgf of a r.v. X, then the sequence {$X_i$} converges in distribution to X.

 
This fundamental result is difficult, and will not be demonstrated. Yet, we'll use it :
    * For demonstrating the Central Limit Theorem.
    * As a means of calculating the mgf of the Poisson distribution as the limit of the binomial distribution.
    * As a means of showing that the binomial distribution tends to a normal distribution for large samples.
    * As a means of showing that the Poisson distribution tends to a normal distribution for large values of the parameter λ.
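The convergence of the binomial mgf to the Poisson mgf can be observed numerically. The sketch below is an illustration only: it assumes the standard closed forms of the two mgfs, fixes the product np = λ, and prints the gap between the two functions at one value of t as n grows. The values of `lam` and `t` are arbitrary choices for the example.

```python
import math

def binomial_mgf(n, p, t):
    """mgf of the binomial distribution B(n, p): (1 - p + p*e^t)^n."""
    return (1 - p + p * math.exp(t)) ** n

def poisson_mgf(lam, t):
    """mgf of the Poisson distribution with parameter lam: exp(lam*(e^t - 1))."""
    return math.exp(lam * (math.exp(t) - 1))

lam, t = 3.0, 0.4  # illustrative values
target = poisson_mgf(lam, t)

errors = []
for n in (10, 100, 1000, 10000):
    # Keep n*p = lam fixed while n grows.
    err = abs(binomial_mgf(n, lam / n, t) - target)
    errors.append(err)
    print(n, err)
```

The gap shrinks roughly like 1/n, which is consistent with the binomial distribution B(n, λ/n) converging in distribution to the Poisson distribution with parameter λ.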

Multivariate moment generating function

Definition

The moment generating function generalizes to the multivariate case. Its definition is the same as in the univariate case :

$$M_X(t) = E[e^{t'X}]$$

where :

    * X is a random vector.

    * and t is now a vector parameter (but the mgf is scalar).

Cross-moments

The properties of the univariate mgf carry over to the multivariate case. In particular, the multivariate mgf allows calculating the cross-moments of a multivariate distribution. Let X be a p-dimensional random vector, $X = (X_1, X_2, ..., X_p)$. A cross-moment of X is a quantity of the form :

$$E[X_1^{k_1} X_2^{k_2} \cdots X_p^{k_p}]$$

with $k_1 + ... + k_p = k$.

Then it is easily shown that, when all the involved quantities exist :

$$E[X_1^{k_1} X_2^{k_2} \cdots X_p^{k_p}] = \left.\frac{\partial^k M_X(t)}{\partial t_1^{k_1}\, \partial t_2^{k_2} \cdots \partial t_p^{k_p}}\right|_{t=0}$$

an expression similar to that of the univariate case.
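The cross-moment property can be illustrated numerically. The sketch below assumes a bivariate normal distribution, whose mgf has the standard form exp(t'μ + t'Σt/2); the parameter values are arbitrary. It estimates the mixed second derivative of the mgf at t = 0 by a central finite difference and compares it with E[X1 X2] = Cov(X1, X2) + E[X1]E[X2].

```python
import math

# Parameters of an illustrative bivariate normal distribution (assumed values).
MU = (1.0, 2.0)                # means of X1 and X2
S11, S22, S12 = 1.0, 2.0, 0.5  # variances of X1 and X2, and their covariance

def mgf(t1, t2):
    """mgf of the bivariate normal distribution: exp(t'mu + t'Sigma t / 2)."""
    linear = MU[0] * t1 + MU[1] * t2
    quadratic = S11 * t1**2 + 2 * S12 * t1 * t2 + S22 * t2**2
    return math.exp(linear + 0.5 * quadratic)

def mixed_derivative_at_zero(f, h=1e-3):
    """Central finite-difference estimate of d^2 f / (dt1 dt2) at (0, 0)."""
    return (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4 * h * h)

estimate = mixed_derivative_at_zero(mgf)  # cross-moment with k1 = k2 = 1
exact = S12 + MU[0] * MU[1]               # Cov(X1, X2) + E[X1] * E[X2]
print(estimate, exact)
```

The finite difference is only a numerical stand-in for the partial derivative; with a closed-form mgf the derivative would normally be taken analytically.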

Marginal distributions

Because of its multivariate nature, the multivariate mgf has some properties that have no equivalent in the univariate case. The most important of these properties may be that it sometimes allows calculating marginal distributions very easily.

Let X be a random vector partitioned as $X = (X_1, X_2)$. The vector parameter t is similarly partitioned as $t = (t_1, t_2)$. The mgf of X is then :

$$M_X(t) = M_X(t_1, t_2) = E[\exp(t_1'X_1 + t_2'X_2)]$$

Now set $t_2 = 0$ in this expression. We obtain :

$$M_X(t_1, 0) = E[\exp(t_1'X_1 + 0)] = E[\exp(t_1'X_1)] = M_{X_1}(t_1)$$

which is the mgf of $X_1$.

So the moment generating function of a marginal r.v. can very easily be obtained from the mgf of a multivariate distribution by setting to 0 the appropriate components of the vector parameter t.

We use this property as one of the (many) ways of calculating the marginal distributions of the multivariate normal distribution.
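For the multivariate normal distribution, the calculation can be sketched as follows. The sketch assumes the standard form of the multivariate normal mgf, with the mean vector μ and the covariance matrix Σ partitioned conformably with X :

```latex
\begin{align*}
M_X(t) &= \exp\!\left(t'\mu + \tfrac{1}{2}\, t'\Sigma\, t\right),
\qquad
\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},
\quad
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \\[6pt]
t'\mu &= t_1'\mu_1 + t_2'\mu_2 \\
t'\Sigma\, t &= t_1'\Sigma_{11} t_1 + 2\, t_1'\Sigma_{12} t_2 + t_2'\Sigma_{22} t_2 \\[6pt]
M_X(t_1, 0) &= \exp\!\left(t_1'\mu_1 + \tfrac{1}{2}\, t_1'\Sigma_{11}\, t_1\right)
\end{align*}
```

The last expression is the mgf of a normal distribution with mean μ1 and covariance matrix Σ11, so by uniqueness the marginal distribution of X1 is that normal distribution.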

Multivariate mgf and independence

The two r.v.s X and Y with respective mgfs $M_X(s)$ and $M_Y(t)$ and joint mgf $M_{XY}(s, t)$ are independent if and only if, for all (s, t) :

$$M_{XY}(s, t) = M_X(s)\, M_Y(t)$$
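The factorization criterion can be illustrated on two small discrete joint distributions. The sketch below uses hypothetical helper names and made-up probability tables: one joint distribution is built as a product of marginals (independent), the other forces Y = X (dependent). Checking the factorization on a finite grid of (s, t) values is only an illustration and does not prove independence.

```python
import math

def joint_mgf(joint, s, t):
    """E[exp(s*X + t*Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(p * math.exp(s * x + t * y) for (x, y), p in joint.items())

def mgf(pmf, t):
    """E[exp(t*X)] for a pmf given as {value: probability}."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())

def marginals(joint):
    """Marginal pmfs of X and Y obtained by summing the joint pmf."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py

def factorizes(joint, grid, tol=1e-12):
    """True if M_XY(s, t) = M_X(s) * M_Y(t) at every grid point (within tol)."""
    px, py = marginals(joint)
    return all(
        abs(joint_mgf(joint, s, t) - mgf(px, s) * mgf(py, t)) < tol
        for s in grid
        for t in grid
    )

# Independent case: joint probabilities are products of the marginals.
independent = {
    (x, y): p * q
    for x, p in {0: 0.4, 1: 0.6}.items()
    for y, q in {0: 0.5, 1: 0.5}.items()
}
# Dependent case: Y is always equal to X.
dependent = {(0, 0): 0.5, (1, 1): 0.5}

grid = (-1.0, -0.5, 0.5, 1.0)
print(factorizes(independent, grid), factorizes(dependent, grid))
```

In the dependent case the gap between the joint mgf and the product of the marginal mgfs equals (1 - e^s)(1 - e^t)/4, which is nonzero whenever s and t are both nonzero.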

__________________________________________________________________________


 

Tutorial

 

The above mentioned properties are detailed and some of them are demonstrated in the following Tutorial.

 

 

PROPERTIES OF THE MOMENT GENERATING FUNCTION

The mgf generates moments

Mean

Second order moment

All moments

Mgf of a linear transform of a r.v.

Mgf of the sum of independent r.v.

Uniqueness (without proof)

Convergence of the mgf (without proof)

Existence of the mgf, characteristic function


  

______________________________________________________

 

Related readings :

Central Limit Theorem

Generating function
