Likelihood Ratio Test (LRT)

A general and powerful method for building tests.

-----

Likelihood was invented for the purpose of quantifying the fit between a probability distribution and a sample : the larger the likelihood, the better the fit.

On the other hand, tests are meant to discriminate between two nonoverlapping groups of distributions that both claim to contain the distribution that generated the sample.

One may therefore expect that comparing the likelihoods of the sample under the various competing distributions can settle the issue of which group is more likely to contain the distribution that generated the sample.

Vocabulary and notations

Let's first clarify some notations commonly used when dealing with Likelihood Ratio tests.

    * First, all the distributions we'll consider belong to the same family, the various distributions in the family differing only through the value of a parameter θ (which may be a vector parameter, i.e. a set of several scalar parameters). For example, we may consider the family of normal distributions N(µ, σ²), of which each member is fully characterized by the values of µ and σ².

    * The two groups of distributions are then defined respectively by the null hypothesis and the alternative hypothesis. For example, we might want to test :

        - H0 : µ = µ0

against

        - H1 : µ ≠ µ0

The hypotheses we'll consider can be either simple or composite (whereas the Neyman-Pearson lemma applies only to two simple hypotheses).

In what follows, it will be convenient to consider that :

    * H0 does not just denote the null hypothesis, but the set of the values of the parameter θ defined by H0 as well and, by extension, the set of distributions defined by this set of values of the parameter.

    * H1 does not just denote the alternative hypothesis, but the set of the values of the parameter θ defined by H1 as well and, by extension, the set of distributions defined by this set of values of the parameter.

So the notation H0 U H1 will designate the set of the values of the parameter θ defined by either H0 or H1 and, by extension, the set of distributions defined by this set of values of the parameter.

Principle of Likelihood Ratio tests

The Likelihood Ratio Test (LRT) approach reasons as follows.

    1) Suppose that H0 is true : the distribution that generated the sample belongs indeed to H0. We certainly expect the sample to exhibit a large likelihood for the distribution that generated it, and consequently we expect this likelihood to be close to the largest likelihood encountered when scanning through all the distributions in H0.

Considering the distributions in H1 will probably change nothing : none of these distributions generated the sample, so none of these distributions is expected to display a large likelihood for the sample.

Consequently the largest likelihood in H0 is not anticipated to be substantially smaller than the largest likelihood observed over the complete set of distributions H0 U H1.

This can be formalized as follows. Denote :

        - max_{θ ∈ H0} L(x, θ) the largest likelihood observed over H0.

        - max_{θ ∈ H0 U H1} L(x, θ) the largest likelihood observed over H0 U H1.

Then the ratio

λ = max_{θ ∈ H0} L(x, θ) / max_{θ ∈ H0 U H1} L(x, θ)

which can never exceed 1 because H0 is included in H0 U H1, is expected to be close to 1.

 

    2) Conversely, suppose that H0 is false (and therefore that H1 is true). The distribution that generated the sample belongs to H1, and not to H0. None of the distributions in H0 is anticipated to exhibit a large likelihood. The largest likelihood of all is anticipated to be found for a distribution in H1 because the distribution that generated the sample is in H1.

Consequently, the ratio

λ = max_{θ ∈ H0} L(x, θ) / max_{θ ∈ H0 U H1} L(x, θ)

although always positive, is expected to have a small value (close to 0).
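The two situations above can be illustrated numerically. The following is a minimal sketch, assuming the simplest setting (normal distributions with known σ, H0 : µ = µ0 against H1 : µ ≠ µ0). The data, the function names and the values of µ0 and σ are made up for the illustration.

```python
import math

def log_likelihood(sample, mu, sigma):
    """Log-likelihood of an i.i.d. N(mu, sigma^2) sample."""
    n = len(sample)
    ss = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - ss / (2 * sigma ** 2)

def likelihood_ratio(sample, mu0, sigma):
    """Ratio of the largest likelihood over H0 to the largest over H0 U H1."""
    # H0 contains a single distribution, so its maximum is L(x, mu0).
    log_num = log_likelihood(sample, mu0, sigma)
    # Over H0 U H1 (any real mu) the likelihood peaks at the sample mean.
    mu_hat = sum(sample) / len(sample)
    log_den = log_likelihood(sample, mu_hat, sigma)
    # Work with logs to avoid underflow, then come back to the ratio.
    return math.exp(log_num - log_den)

# Hypothetical sample whose mean is close to 5.
sample = [4.8, 5.3, 5.1, 4.6, 5.4, 4.9, 5.2, 5.0]

lam_near = likelihood_ratio(sample, mu0=5.0, sigma=0.3)  # H0 plausible
lam_far = likelihood_ratio(sample, mu0=6.0, sigma=0.3)   # H0 implausible
print(lam_near, lam_far)
```

With these made-up data the ratio is close to 1 when µ0 is near the sample mean and practically 0 when µ0 is far from it, which is the behaviour described in 1) and 2).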

The test

So it appears that the statistic :

Λ = max_{θ ∈ H0} L(x, θ) / max_{θ ∈ H0 U H1} L(x, θ)

 

 

is a plausible choice for testing H0 against H1 : small values (close to 0) of Λ favor H1, while large values (close to 1) of Λ favor H0.

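Suppose that if H0 is true, the probability distribution g(λ) of the random variable Λ is known and that, for any threshold λα between 0 and 1, the integral

∫_0^{λα} g(λ) dλ = P(Λ ≤ λα | H0)

can be calculated, or at least tabulated.

We then have all the ingredients needed for testing H0 against H1 at the significance level α (upper and lower images of this illustration) : choose the critical value λα so that the integral above equals α, and reject H0 whenever the observed value of Λ is smaller than λα.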
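When g(λ) is not available in closed form, the critical value can be approximated by simulation. The sketch below assumes the normal model with known σ and H0 : µ = µ0. It draws many samples under H0, computes Λ for each, and takes the empirical α-quantile as the threshold below which H0 is rejected. All names and numerical values are hypothetical.

```python
import math
import random

def likelihood_ratio(sample, mu0, sigma):
    """Likelihood ratio for H0: mu = mu0, normal model, sigma known."""
    mu_hat = sum(sample) / len(sample)
    ss0 = sum((x - mu0) ** 2 for x in sample)
    ss_hat = sum((x - mu_hat) ** 2 for x in sample)
    return math.exp(-(ss0 - ss_hat) / (2 * sigma ** 2))

def critical_value(mu0, sigma, n, alpha, n_sim=20000, seed=0):
    """Monte Carlo estimate of the alpha-quantile of the ratio under H0."""
    rng = random.Random(seed)
    sims = sorted(
        likelihood_ratio([rng.gauss(mu0, sigma) for _ in range(n)], mu0, sigma)
        for _ in range(n_sim)
    )
    # Value below which a fraction alpha of the simulated ratios fall.
    return sims[int(alpha * n_sim)]

lam_alpha = critical_value(mu0=5.0, sigma=0.3, n=10, alpha=0.05)

# Hypothetical observed sample, clearly shifted away from mu0 = 5.
observed_sample = [5.4, 5.6, 5.1, 5.5, 5.3, 5.7, 5.2, 5.6, 5.4, 5.5]
observed = likelihood_ratio(observed_sample, mu0=5.0, sigma=0.3)

reject_h0 = observed <= lam_alpha
print(lam_alpha, observed, reject_h0)
```

Small values of Λ speak against H0, so the rejection region is the left tail of the distribution of Λ.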

 

 

We considered the ratio of the maximum likelihood over H0 to the maximum likelihood over the complete parameter space H0 U H1. We might just as well (and perhaps more naturally) have considered the ratio of the maximum likelihood over H0 to the maximum likelihood over H1 only, with a similar line of reasoning.

Distribution of the test statistic

Unfortunately, the test statistic Λ and its distribution g(λ) are usually very complicated (see for example Bartlett's test).

Functions of the statistic

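It is sometimes possible to identify a monotonic function f(.) such that f(Λ) is more tractable. It is then possible to carry out a test equivalent to the original test using f(Λ) instead of Λ as a test statistic : because f is monotonic, rejecting H0 for small values of Λ amounts to rejecting it for small (f increasing) or large (f decreasing) values of f(Λ).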
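As an illustration, here is a short worked derivation for the simplest case, assuming a sample x_1, ..., x_n from N(µ, σ²) with σ² known and H0 : µ = µ0. The choice f(Λ) = -2 log Λ is one possible transformation, picked here because it leads to a familiar statistic.

```latex
\begin{align*}
L(x,\mu) &= (2\pi\sigma^2)^{-n/2}
  \exp\!\Big(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2\Big),\\
\sum_{i=1}^{n}(x_i-\mu_0)^2 &= \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu_0)^2,\\
\Lambda &= \frac{L(x,\mu_0)}{L(x,\bar{x})}
  = \exp\!\Big(-\frac{n(\bar{x}-\mu_0)^2}{2\sigma^2}\Big),\\
f(\Lambda) &= -2\log\Lambda
  = \Big(\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}\Big)^2 .
\end{align*}
```

Since -2 log is a decreasing function, small values of Λ correspond to large values of f(Λ), which is the square of a standardized sample mean.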

Examples of such transformations are given in the Tutorial below.

Asymptotic distribution

It can be shown (though the proof is difficult, and requires some regularity conditions) that, when H0 is true, the r.v. -2.log(Λ) is asymptotically (i.e. for large samples) χ² distributed. The number of degrees of freedom of this χ² distribution is equal to the difference between :

        - The number of free parameters in H0 U H1,

        - And the number of free parameters in H0.

This deserves a little explanation. Suppose that an LRT is designed for the purpose of testing :

        - H0 : µ = µ0

against

        - H1 : µ ≠ µ0

for normal distributions where the variance σ² is assumed to be known.

    * In H0, the likelihood has a certain value determined by the sample, and no parameter is available to make this likelihood change. So H0 has no (i.e. 0) free parameter.

    * In H0 U H1 the likelihood will be maximized over the range of µ, which is unconstrained over ]-∞, +∞[. Therefore, H0 U H1 has one free parameter (that is, µ).

The difference between the numbers of free parameters of H0 U H1 and H0 is therefore 1 - 0 = 1.

Similarly, if the same hypotheses are tested in a setting where the variance is not assumed to be known :

        - H0 has one free parameter, namely σ²,

        - While H0 U H1 has two free parameters, namely µ and σ²,

and the difference between the numbers of free parameters of H0 U H1 and H0 is therefore 2 - 1 = 1.
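The asymptotic result can be checked by simulation. The sketch below assumes the second setting (normal model, variance unknown, H0 : µ = µ0), for which maximizing the likelihood over σ² gives -2.log(Λ) = n.log(σ̂0² / σ̂²), where σ̂0² and σ̂² are the maximum likelihood estimates of the variance under H0 and under H0 U H1. It measures how often this statistic exceeds the upper quantile of the χ² distribution with 1 degree of freedom when H0 is true. Names, sample size and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

def minus_two_log_lambda(sample, mu0):
    """-2 log of the likelihood ratio for H0: mu = mu0, variance unknown."""
    n = len(sample)
    mu_hat = sum(sample) / n
    var_h0 = sum((x - mu0) ** 2 for x in sample) / n       # MLE under H0
    var_full = sum((x - mu_hat) ** 2 for x in sample) / n  # MLE under H0 U H1
    return n * math.log(var_h0 / var_full)

def rejection_rate(mu0, sigma, n, alpha, n_sim=5000, seed=1):
    """Fraction of samples drawn under H0 that the asymptotic test rejects."""
    # A chi-square variable with 1 d.f. is the square of a N(0, 1) variable,
    # so its (1 - alpha) quantile is the squared (1 - alpha/2) normal quantile.
    threshold = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if minus_two_log_lambda(sample, mu0) > threshold:
            rejections += 1
    return rejections / n_sim

rate = rejection_rate(mu0=5.0, sigma=0.3, n=200, alpha=0.05)
print(rate)
```

If the χ² approximation with 2 - 1 = 1 degree of freedom is adequate for this sample size, the printed rate should be close to the nominal level α.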

-----

The term "free" makes reference to the fact that the effective number of parameters (or "number of free parameters") may be smaller than the actual number of parameters. For example, suppose that we consider the sub-family of the family of normal distributions N(µ, σ²) defined by µ² = σ². Although both the mean and the variance of the distributions are free to vary across their entire respective ranges, the sub-family is described by a single "free parameter".

The number of free parameters may be regarded as the intrinsic dimension of the manifold bearing all the allowed values of the parameters linked by constraining relationships.

________________________________________________________________________

 

 

Tutorial

 

In this Tutorial, we use the Likelihood Ratio paradigm to :

    * First build two tests about the value of the mean of the normal distribution. The first test assumes that the variance is known, while the second test will not assume the variance to be known.

    * Then build a test about the value of the variance of the normal distribution with unknown mean.

The test statistics obtained by strictly following the LRT methodology are usually a bit complicated and have messy distributions. But we'll be able to transform these statistics into new random variables whose distributions can be calculated more easily.

We won't even have to calculate these distributions, because some reasoning about these new variables will allow us to establish that the three tests are in fact equivalent, respectively, to :

    * The one-sample z-test (the counterpart of the one-sample t-test when the variance is known),

    * The one sample t-test when the variance is unknown.

    * The F-test when the mean is unknown.
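The second equivalence can be checked numerically. The sketch below assumes the normal model with unknown variance and H0 : µ = µ0. It computes Λ directly from the two maximized likelihoods, then again from the usual one-sample t statistic through the relation Λ = (1 + t²/(n - 1))^(-n/2). The data and function names are made up for the illustration.

```python
import math
import statistics

def lrt_lambda(sample, mu0):
    """Likelihood ratio for H0: mu = mu0, normal model, variance unknown."""
    n = len(sample)
    mu_hat = sum(sample) / n
    var_h0 = sum((x - mu0) ** 2 for x in sample) / n       # MLE under H0
    var_full = sum((x - mu_hat) ** 2 for x in sample) / n  # MLE under H0 U H1
    # After maximizing over sigma^2 the ratio reduces to a ratio of variances.
    return (var_full / var_h0) ** (n / 2)

def t_statistic(sample, mu0):
    """Usual one-sample t statistic (standard deviation with n - 1 divisor)."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

# Hypothetical sample and null value.
sample = [4.8, 5.3, 5.1, 4.6, 5.4, 4.9, 5.2, 5.0, 5.5, 4.7]
mu0 = 4.8
n = len(sample)

lam = lrt_lambda(sample, mu0)
t = t_statistic(sample, mu0)
lam_from_t = (1 + t ** 2 / (n - 1)) ** (-n / 2)
print(lam, lam_from_t)
```

Because Λ is a decreasing function of t², rejecting H0 for small values of Λ is the same as rejecting it for large values of |t|.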

 

 

 

EXAMPLES OF LIKELIHOOD RATIO TESTS

Mean of normal, variance known

The parameter space

Maximum Likelihood under H0

Maximum Likelihood under H0 U H1

Likelihood ratio

Transformation of the Λ test statistic

Equivalence with the z-test

Mean of normal, variance unknown

The parameter space

Maximum Likelihood under H0

Maximum Likelihood under H0 U H1

Likelihood ratio

Transformation of the Λ test statistic

Equivalence with the t-test

Variance of normal, mean unknown

The parameter space

Maximum Likelihood under H0

Maximum Likelihood under H0 U H1

Likelihood ratio

Transformation of the Λ test statistic

Equivalence with the F-test


 ______________________________________________________

 

Related readings :

Likelihood

Test

Neyman-Pearson lemma

Bartlett's test (Homogeneity of variances)
