Fisher's F distribution

Also known as the Fisher-Snedecor distribution.

Rationale of the F distribution

Given two samples drawn from two independent normal distributions, the question is:

Are the variances of these two normal distributions equal?

 

 

1) The samples may have different numbers of observations.
2) The means of the populations play no role in what follows.

 

If the variances of the two normal distributions are indeed equal, one expects the variances of the two samples to be approximately equal as well. On the other hand, if the two samples have vastly different variances (lower image of the illustration), one would suspect that the two parent normal distributions also have different variances.

This line of thinking clearly leads to a test based on the comparison of the two sample variances. All we have to do is to identify an appropriate test statistic.

The F test

Let the two normal distributions be $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ (again, the means are irrelevant).

We want to test:

    * The null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$

    * Against the alternative hypothesis $H_1: \sigma_1^2 \neq \sigma_2^2$.

The test statistic

We tentatively consider the ratio of the two sample variances as the test statistic F. Recall that, for a sample $x_1, \dots, x_n$ drawn from any distribution, the quantity $s^2$ defined by:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

where $\bar{x}$ is the sample mean, is called the "corrected sample variance", and is an unbiased estimator of the distribution variance.

So, with obvious notation, we now consider the quantity:

$$F = \frac{s_1^2}{s_2^2}$$

which is an attractive candidate as a test statistic: we would naturally tend to reject $H_0$ if the value of F is too different from 1.
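As a minimal sketch, the corrected sample variance and the candidate statistic F can be computed with Python's standard library. The function name `f_statistic` and the sample values are hypothetical, chosen only for illustration; `statistics.variance` uses the n - 1 denominator.

```python
from statistics import variance  # corrected sample variance (divides by n - 1)


def f_statistic(sample_1, sample_2):
    """Return the ratio of the corrected sample variances s1^2 / s2^2."""
    return variance(sample_1) / variance(sample_2)


# Hypothetical data: the two samples need not have the same size.
sample_1 = [4.1, 5.3, 6.8, 3.9, 7.2, 5.5]
sample_2 = [5.0, 5.2, 4.9, 5.1, 4.8]

print(f_statistic(sample_1, sample_2))
```

Adding a constant to every observation of a sample leaves the result unchanged, which is the sense in which the means play no role.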

Distribution of the test statistic

Denote :

    * n the size of the first sample,

    * m the size of the second sample.

 

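We know that:

    * $(n-1)\,s_1^2 / \sigma_1^2 \sim \chi^2_{n-1}$

    * $(m-1)\,s_2^2 / \sigma_2^2 \sim \chi^2_{m-1}$

and these two quantities are independent because the two samples are, so that, under the null hypothesis $\sigma_1^2 = \sigma_2^2$, the quantity:

$$F = \frac{s_1^2}{s_2^2} \quad \text{is distributed as} \quad \frac{\chi^2_{n-1}/(n-1)}{\chi^2_{m-1}/(m-1)}$$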

 

which does not depend on the common variance $\sigma^2$ of the two normal distributions, and can therefore be used as a test statistic.

It happens that the distribution of F can be calculated explicitly, and is known as Fisher's F distribution $F_{n-1, m-1}$. It depends on two indices, known as its two degrees of freedom (here $n-1$ for the numerator and $m-1$ for the denominator).
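The claim that the distribution of F does not depend on the common variance can be checked by simulation. The sketch below is only an illustration: the helper names, sample sizes (6 and 11, hence 5 and 10 degrees of freedom), seeds and number of simulations are all arbitrary assumptions. Whatever the common standard deviation, the empirical median and 95% quantile should stay about the same, the latter close to the tabulated value of about 3.33 for 5 and 10 degrees of freedom.

```python
import random


def sample_variance(xs):
    """Corrected sample variance (n - 1 denominator)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def simulate_f(n, m, sigma, n_sims=20000, seed=0):
    """Simulate F = s1^2 / s2^2 for two normal samples sharing the same sigma."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_sims):
        s1 = sample_variance([rng.gauss(0.0, sigma) for _ in range(n)])
        s2 = sample_variance([rng.gauss(0.0, sigma) for _ in range(m)])
        ratios.append(s1 / s2)
    return ratios


def quantile(values, p):
    """Crude empirical quantile: the value at rank p in the sorted list."""
    ordered = sorted(values)
    return ordered[int(p * (len(ordered) - 1))]


# Different seeds are used on purpose, so the agreement is not an artifact
# of reusing the same random numbers.
for sigma in (0.5, 1.0, 20.0):
    ratios = simulate_f(n=6, m=11, sigma=sigma, seed=int(sigma * 10))
    print(sigma, round(quantile(ratios, 0.5), 2), round(quantile(ratios, 0.95), 2))
```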

You'll find here an interactive animation that describes the various shapes of the F distribution depending on the values of its degrees of freedom.

The F test

It is easy to show that, for a given probability level $\alpha$, the quantiles of the F distribution satisfy:

$$F_{\alpha, n, m} = \frac{1}{F_{1-\alpha, m, n}}$$

This symmetry is used to turn a naturally two-sided test into a more convenient one-sided test. In practice, one always considers the ordering of the two samples that leads to a value of F larger than 1.
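The relation between the quantiles of $F_{n,m}$ and those of $F_{m,n}$ can be derived in a few lines. This is a sketch that assumes $F_{\alpha, n, m}$ denotes the value exceeded with probability $\alpha$ by a variable $X \sim F_{n,m}$ (the same argument works for lower-tail quantiles). It uses only the fact that $1/X$ is a ratio of the same kind with numerator and denominator exchanged, so that $1/X \sim F_{m,n}$.

```latex
\begin{align*}
\alpha &= P\left(X > F_{\alpha, n, m}\right)
        = P\left(\frac{1}{X} < \frac{1}{F_{\alpha, n, m}}\right) \\
\Longrightarrow \quad 1 - \alpha
       &= P\left(\frac{1}{X} > \frac{1}{F_{\alpha, n, m}}\right),
        \qquad \frac{1}{X} \sim F_{m, n} \\
\Longrightarrow \quad F_{1-\alpha, m, n} &= \frac{1}{F_{\alpha, n, m}}
\end{align*}
```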

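For a given significance level $\alpha$, the null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ is then rejected if the observed F is larger than $F_{\alpha, n-1, m-1}$, the value exceeded with probability $\alpha$ under $H_0$. This value is the left limit of the yellow area in the illustration. Note that, because the ordering that makes F larger than 1 is chosen after seeing the data, many textbooks take the upper-tail probability to be $\alpha/2$ so that the overall two-sided test has level $\alpha$.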
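The decision rule can be sketched in Python with the standard library only. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, and the F cumulative distribution function is obtained from the regularized incomplete beta function, evaluated here with a classical continued-fraction method. In practice a statistics library would normally be used instead.

```python
from math import exp, lgamma, log
from statistics import variance


def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for k in range(1, max_iter + 1):
        # Even step, then odd step, of the continued fraction.
        for num in (
            k * (b - k) * x / ((a + 2 * k - 1.0) * (a + 2 * k)),
            -(a + k) * (a + b + k) * x / ((a + 2 * k) * (a + 2 * k + 1.0)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def regularized_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_cdf(x, d1, d2):
    """P(F <= x) for an F distribution with (d1, d2) degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return regularized_beta(d1 / 2.0, d2 / 2.0, d1 * x / (d1 * x + d2))


def f_test(sample_1, sample_2):
    """Variance-ratio test with the larger sample variance in the numerator.

    Returns (F, numerator df, denominator df, upper-tail probability).
    """
    v1, v2 = variance(sample_1), variance(sample_2)
    d1, d2 = len(sample_1) - 1, len(sample_2) - 1
    if v1 < v2:  # reorder the samples so that F >= 1
        v1, v2, d1, d2 = v2, v1, d2, d1
    f = v1 / v2
    return f, d1, d2, 1.0 - f_cdf(f, d1, d2)


# Hypothetical data, for illustration only.
f, d1, d2, upper_tail = f_test([4.1, 5.3, 6.8, 3.9, 7.2, 5.5],
                               [5.0, 5.2, 4.9, 5.1, 4.8])
print(f, d1, d2, upper_tail)
```

The null hypothesis is rejected when the returned upper-tail probability is smaller than the chosen tail probability, which is equivalent to comparing the observed F with the corresponding critical value.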

 

 

 

Formal definition of the F distribution

We may now give a more formal definition of the $F_{n,m}$ distribution without making reference to the problem that led to its identification.

By definition:

$$F_{n,m} = \frac{X^2/n}{Y^2/m}$$

with:

* $X^2 \sim \chi^2_n$

* $Y^2 \sim \chi^2_m$

* $X^2$ and $Y^2$ independent.
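The formal definition translates directly into a way of simulating the distribution. The sketch below is illustrative only: the function name `f_variate`, the degrees of freedom and the seed are arbitrary assumptions. It relies on the fact that a chi-square variable with k degrees of freedom is a Gamma variable with shape k/2 and scale 2, and it compares the empirical mean with m / (m - 2), the known mean of the F distribution when the denominator degrees of freedom m exceed 2.

```python
import random


def f_variate(n, m, rng):
    """One draw of (X2 / n) / (Y2 / m) with independent chi-square X2 and Y2."""
    # A chi-square variable with k degrees of freedom is Gamma(shape=k/2, scale=2).
    x2 = rng.gammavariate(n / 2.0, 2.0)
    y2 = rng.gammavariate(m / 2.0, 2.0)
    return (x2 / n) / (y2 / m)


rng = random.Random(42)
n, m = 4, 12
draws = [f_variate(n, m, rng) for _ in range(50000)]

# Empirical mean next to the theoretical mean m / (m - 2).
print(sum(draws) / len(draws), m / (m - 2))
```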

 

 

This formal definition is useful because several important quantities that are not related to estimated variances turn out to be the ratio of two independent chi-square variables, each divided by its own number of degrees of freedom. This is the case of:

    * The F statistic of ANOVA.

    * The F statistic of the global test of validity of a Simple or Multiple Linear Regression.

__________________________________________________________

 

 

Tutorial

 

We establish the (rather complex) analytical form of Fisher's distribution.

This Tutorial is technical, and is of little use for applications. It is part of a more general Tutorial on functions of random variables, and more particularly on the ratio of two random variables.

 

ANALYTICAL FORM OF FISHER'S F DISTRIBUTION


 

 

 

* Shape of Fisher's F distribution.
* Numbers of degrees of freedom are adjustable.

* Progressive histogram of the ratio of two estimated variances.

 

__________________________________________________________

 

Related readings:

ANOVA

Chi-square distribution
