
Binormal  (Distribution)

If you're not familiar with the (univariate) normal distribution, we suggest that you first refer to the entry on that distribution.

-----

The multivariate normal distribution is by far the most widely used distribution in classical modeling techniques (e.g. Discriminant Analysis), which often explicitly assume the data distributions to be multinormal.

When only two variables are involved, the multivariate normal distribution is called the bivariate normal distribution, or simply the binormal distribution. For example, the joint distribution of the height and weight of individuals in a relatively homogeneous population is approximately binormal.

# The standard bivariate normal distribution

The standard (univariate) normal distribution N(0, 1) plays a central role when studying the general normal distribution because any normal distribution can be transformed into this reference distribution by a linear transformation.

Similarly, we'll start our study of the bivariate normal distribution by focusing on the standard bivariate normal distribution, which is illustrated by the animation below.

This animation shows two (univariate) normal distributions, drawn as a vertical and a horizontal Gaussian curve. Both are centered and initially have unit variance. These normal distributions are the marginal distributions of the standard bivariate normal distribution.


The correlation coefficient of these two normal r.v. can be adjusted with the cursor on the blue background. To each value of this correlation coefficient corresponds one standard bivariate normal distribution. So, whereas the standard (univariate) normal distribution is unique, there is a family of standard bivariate normal distributions. This family is indexed by the value of the correlation coefficient between the two marginal variables of the distribution.

The ellipse is an isodensity curve, that is, a locus of points at which the density takes the same value.

* This ellipse degenerates into a circle when the two marginals are uncorrelated.

* Its major axis is always at 45° to the axes bearing the marginal distributions. This major axis lies in the 1st and 3rd quadrants when the correlation coefficient is positive, and in the other two quadrants when the correlation coefficient is negative.

-----
Click on "Go" and observe the spatial distribution of the observations drawn from a binormal distribution. The value of the correlation coefficient may be changed without interrupting the animation.
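The draws shown by the animation can be imitated with a short simulation. The sketch below is only one possible construction, not necessarily the one used by the animation: it builds Y from X and an independent standard normal variable so that both marginals stay standard normal while their correlation is ρ. The function names are illustrative.

```python
import math
import random


def sample_standard_binormal(rho, n, seed=0):
    """Draw n pairs (x, y) with standard normal marginals and correlation rho.

    Assumed construction: X = Z1, Y = rho*Z1 + sqrt(1 - rho^2)*Z2,
    with Z1 and Z2 independent standard normal variables.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        pairs.append((z1, rho * z1 + scale * z2))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # The sample correlation should be close to the requested value of rho.
    for rho in (-0.8, 0.0, 0.5):
        pairs = sample_standard_binormal(rho, 20_000)
        print(rho, round(sample_correlation(pairs), 3))
```

Plotting the pairs for several values of ρ reproduces the elongated cloud along the first diagonal (ρ > 0) or the second diagonal (ρ < 0).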

# Probability density of the bivariate normal distribution

In the Tutorial below, we calculate the probability density function g(x, y) of the standard bivariate normal distribution with correlation coefficient ρ (|ρ| < 1):

$$g(x, y) = \frac{1}{2\pi\sqrt{1-\rho^2}}\,\exp\left(-\frac{x^2 - 2\rho xy + y^2}{2(1-\rho^2)}\right)$$

as well as the conditional distribution g(Y | X = x0). This last result will be obtained very simply, and will later be rediscovered in a somewhat more complex way when we study the general bivariate normal distribution.
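As a quick numerical companion, the sketch below evaluates the standard bivariate normal density and checks the factorization g(x0, y) = φ(x0) · g(y | X = x0), where φ is the standard normal density and the conditional distribution is taken to be normal with mean ρ·x0 and variance 1 - ρ². The function names are illustrative and not taken from the Tutorial.

```python
import math


def standard_normal_pdf(x):
    """Density of N(0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def standard_binormal_pdf(x, y, rho):
    """Density g(x, y) of the standard bivariate normal, for |rho| < 1."""
    one_minus_r2 = 1.0 - rho * rho
    q = (x * x - 2.0 * rho * x * y + y * y) / one_minus_r2
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(one_minus_r2))


def conditional_pdf(y, x0, rho):
    """Density of Y given X = x0: normal, mean rho*x0, variance 1 - rho^2."""
    var = 1.0 - rho * rho
    return math.exp(-0.5 * (y - rho * x0) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


if __name__ == "__main__":
    x0, y, rho = 0.3, -1.2, 0.6
    joint = standard_binormal_pdf(x0, y, rho)
    product = standard_normal_pdf(x0) * conditional_pdf(y, x0, rho)
    # The two numbers should agree up to rounding error.
    print(joint, product)
```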

# The general bivariate normal distribution

In the above animation, the standard deviations of the marginal distributions can be adjusted with the two cursors on a yellow background. Playing with these controls lets you discover new binormal distributions, which are no longer standard (the variances of the marginals are not equal to 1), but can be made standard by a linear transformation.

The isodensity ellipse is no longer at 45° to the axes, except when the marginals have equal variances. Its major axis is parallel to one of the coordinate axes if and only if the marginals are uncorrelated.
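The orientation of the ellipse can be computed from the covariance matrix of the pair. The sketch below uses the standard closed form for the principal axis of a symmetric 2 × 2 matrix, θ = ½ · atan2(2ρσxσy, σx² - σy²); the function name is illustrative.

```python
import math


def major_axis_angle_deg(sigma_x, sigma_y, rho):
    """Angle (in degrees, from the x axis) of the major axis of the
    isodensity ellipse, i.e. of the leading eigenvector of the
    covariance matrix [[sx^2, rho*sx*sy], [rho*sx*sy, sy^2]]."""
    var_x = sigma_x ** 2
    var_y = sigma_y ** 2
    cov_xy = rho * sigma_x * sigma_y
    return math.degrees(0.5 * math.atan2(2.0 * cov_xy, var_x - var_y))


if __name__ == "__main__":
    print(major_axis_angle_deg(1.0, 1.0, 0.3))   # equal variances, rho > 0
    print(major_axis_angle_deg(2.0, 2.0, -0.7))  # equal variances, rho < 0
    print(major_axis_angle_deg(2.0, 1.0, 0.0))   # uncorrelated marginals
    print(major_axis_angle_deg(2.0, 1.0, 0.5))   # general case
```

With equal variances the angle is ±45° whatever the value of ρ ≠ 0, and with ρ = 0 the major axis lies along the axis of the marginal with the larger variance, in line with the two statements above.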

The most general bivariate normal distribution is obtained by an arbitrary translation of the above distribution in its own plane (not implemented).

-----

So let X and Y be two random variables (that we do not assume to be normal) :

* With respective means µx and µy,

* and respective variances σx² and σy².

By definition, the joint distribution of the pair (X, Y) is said to be bivariate normal if the joint distribution of the standardized variables :

• X' = (X - µx) / σx
• Y' = (Y - µy) / σy

is a standard bivariate normal distribution.

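We'll show that the density of this distribution is:

$$f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}\,\exp\left(-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_x)^2}{\sigma_x^2} - \frac{2\rho\,(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} + \frac{(y-\mu_y)^2}{\sigma_y^2}\right]\right)$$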

Although this expression looks impressive, a close look shows that it is only a rather intuitive modification of the standard binormal distribution, which can be obtained by setting the means to 0 and the standard deviations to 1.

Note that because a correlation coefficient is invariant under a linear transformation with positive scale factors (as is the case for standardization), ρ is the correlation coefficient of the pair (X, Y) as well as that of the standardized pair (X', Y').

We'll then show that the variables X and Y are both normal, and we'll calculate the parameters of their distributions.
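The normality of the marginals can be checked numerically before it is proved. The sketch below evaluates the general bivariate normal density by standardizing the arguments, integrates it over y with the trapezoidal rule, and compares the result with the N(µx, σx²) density. The function names, the integration range (µy ± 10σy) and the parameter values are arbitrary choices made for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def binormal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """General bivariate normal density, written via the standardized
    variables x' = (x - mu_x)/sigma_x and y' = (y - mu_y)/sigma_y."""
    xs = (x - mu_x) / sigma_x
    ys = (y - mu_y) / sigma_y
    one_minus_r2 = 1.0 - rho * rho
    q = (xs * xs - 2.0 * rho * xs * ys + ys * ys) / one_minus_r2
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_r2)
    return math.exp(-0.5 * q) / norm


def marginal_x_numeric(x, mu_x, mu_y, sigma_x, sigma_y, rho, steps=4000):
    """Integrate the joint density over y (trapezoidal rule)."""
    lo = mu_y - 10.0 * sigma_y
    hi = mu_y + 10.0 * sigma_y
    h = (hi - lo) / steps

    def f(y):
        return binormal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho)

    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


if __name__ == "__main__":
    params = dict(mu_x=1.0, mu_y=-2.0, sigma_x=2.0, sigma_y=0.5, rho=0.7)
    # Both values should agree: the marginal of X is N(mu_x, sigma_x^2).
    print(marginal_x_numeric(1.5, **params), normal_pdf(1.5, 1.0, 2.0))
```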

# Conditional distributions of the bivariate normal distribution

This illustration shows a general bivariate normal distribution.

For an arbitrary x0, draw the vertical line x = x0. Then normalize the values of the joint density as encountered on this line so that the area under the red curve is equal to 1.

By definition, the result is the density of Y conditional on X = x0.

-----

We'll show the important following points :

1) This conditional distribution is normal.

2) Its variance always has the same value irrespective of the chosen x0. This value is :

$$\operatorname{Var}(Y \mid X = x_0) = (1 - \rho^2)\,\sigma_y^2$$

Note that this variance :

* Indeed does not depend on x0. All the vertical cuts have the same variance (with a similar result for the horizontal cuts).

* Is never larger than the variance of Y (resp. X), the marginal variable, and is strictly smaller whenever ρ ≠ 0.

3) The mean of this distribution (top of the red curve) is on a straight (blue) line called the regression line. In other words, the mean of the distribution of Y conditional on X = x is a linear function of x. The equation of this line is:

$$E[Y \mid X = x] = \mu_y + \rho\,\frac{\sigma_y}{\sigma_x}\,(x - \mu_x)$$

The expression "regression line" is justified by the very definition of the term "regression", which is the kind of modeling that consists in calculating the expectation of a r.v. conditional on the value of another variable.

Note the formal similarity between the above expression, and the equation of the Least Squares Line in Simple Linear Regression.
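This similarity can be observed on simulated data. The sketch below draws a sample from a general bivariate normal distribution, fits the ordinary least squares line of y on x, and compares the fitted slope with ρσy/σx and the residual variance with (1 - ρ²)σy². The sampling construction, the function names and the parameter values are assumptions made for illustration.

```python
import math
import random


def sample_binormal(mu_x, mu_y, sigma_x, sigma_y, rho, n, seed=0):
    """Draw n pairs from a bivariate normal distribution by rescaling and
    shifting a standard binormal pair (assumed construction)."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        pairs.append((mu_x + sigma_x * z1,
                      mu_y + sigma_y * (rho * z1 + scale * z2)))
    return pairs


def least_squares_line(pairs):
    """Slope and intercept of the least squares line of y on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_variance(pairs, slope, intercept):
    """Mean squared deviation of y from the fitted line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    mu_x, mu_y, sigma_x, sigma_y, rho = 1.0, -2.0, 2.0, 3.0, 0.6
    pairs = sample_binormal(mu_x, mu_y, sigma_x, sigma_y, rho, 50_000)
    slope, intercept = least_squares_line(pairs)
    print("fitted slope:", slope, "theory:", rho * sigma_y / sigma_x)
    print("residual variance:", residual_variance(pairs, slope, intercept),
          "theory:", (1.0 - rho ** 2) * sigma_y ** 2)
```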

The slope of the regression line is always less (in absolute value) than that of the major axis (or "First Principal Component") of the distribution. This implies that as you move along the vertical line x = x0, the largest value of the density is not to be found on the major axis of the distribution, a fact that is sometimes perceived as counterintuitive.
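The comparison between the two slopes can be checked directly. In the sketch below, the regression slope is ρσy/σx and the slope of the major axis is the tangent of the angle ½ · atan2(2ρσxσy, σx² - σy²) of the leading eigenvector of the covariance matrix. The function names are illustrative, and the check assumes ρ ≠ 0.

```python
import math


def regression_slope(sigma_x, sigma_y, rho):
    """Slope of the regression line of Y on X."""
    return rho * sigma_y / sigma_x


def major_axis_slope(sigma_x, sigma_y, rho):
    """Slope of the major axis of the isodensity ellipse (rho != 0)."""
    theta = 0.5 * math.atan2(2.0 * rho * sigma_x * sigma_y,
                             sigma_x ** 2 - sigma_y ** 2)
    return math.tan(theta)


if __name__ == "__main__":
    for sigma_x, sigma_y, rho in [(1.0, 1.0, 0.5), (2.0, 1.0, -0.8), (1.0, 3.0, 0.2)]:
        r = regression_slope(sigma_x, sigma_y, rho)
        m = major_axis_slope(sigma_x, sigma_y, rho)
        # In each case |r| is expected to be smaller than |m|.
        print(sigma_x, sigma_y, rho, r, m)
```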

-----

Of course, there are analogous results for the distribution of X conditional on Y = y0.

# Binormal distribution and multinormal distribution

The binormal distribution is just a special case of the general multivariate normal distribution, and should therefore require no special treatment. Yet, it is useful to devote a special entry to this distribution for at least two reasons :

* First, it is quite often met in practical applications. The equations pertaining to the bivariate case are simpler than those of the general multivariate case, and are therefore to be preferred.

* From a pedagogical standpoint, it has two nice features :

- Any result pertaining to the binormal distribution can be easily visualized (see above animation).

- The equations describing the properties of the bivariate normal distribution are more complex than those describing the univariate normal distribution, but they are still quite manageable. Further on, when an arbitrary number of variables is considered, these equations become very cumbersome and are advantageously replaced by matrix equations. Therefore, the binormal distribution offers the opportunity for a "soft" introduction to some aspects of Linear Algebra by first establishing results as "ordinary equations", then translating these equations into matrix notation.

_________________________________________________________________

 Tutorial 1

In this Tutorial :

1) We first define the standard bivariate normal distribution. We do so not by defining its probability density, but rather from its two marginal distributions, which we build as two standard (univariate) normal distributions with an adjustable correlation coefficient.

Only then do we derive the probability density of the standard bivariate normal distribution.

We further review the most important properties of this standard bivariate normal distribution :

* We show that the probability density can be expressed with the inverse of the covariance matrix of the marginals. This result is not of much interest for bivariate normal distributions, but it will be generalized to the multivariate case, where it is then essential.

* The marginals are independent if and only if they are uncorrelated.

* We'll calculate the conditional distributions, which are also normal, and whose properties are important.

* We'll show that a rotation of the reference frame allows expressing the standard bivariate normal distribution as the joint distribution of two independent normal variables, whose properties we'll establish.

* We'll finally show that the isodensity curves are ellipses (also called "covariance ellipses"). We'll calculate the orientation and lengths of the major and minor axes of these ellipses.

2) We then define the general bivariate normal distribution as the joint distribution of two r.v. X and Y such that the joint distribution of their standardized versions is a standard bivariate normal distribution. X and Y need not be assumed normal.

It is now necessary to show that the marginal distributions (those of X and Y) are indeed normal. So the normality of the marginals X and Y will appear as a consequence of the binormality of (X, Y). In the process, we'll also incidentally calculate the conditional distributions of the general binormal distribution.

We'll confirm this last result by calculating the conditional distributions of the general binormal distribution from first principles. This will turn out to be just a little more difficult than for the standard binormal distribution.
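For the standard bivariate normal distribution, the matrix form of the density and the effect of the rotation listed in point 1) can be summarized as follows. This is a compact sketch of the standard results, not a reproduction of the Tutorial's derivation; the names u and v for the rotated coordinates are chosen here for illustration.

```latex
% Covariance matrix of the standard bivariate normal and its inverse
\[
\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}, \qquad
\det\Sigma = 1 - \rho^2, \qquad
\Sigma^{-1} = \frac{1}{1-\rho^2}\begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}
\]
% Density written with the inverse covariance matrix
\[
g(x, y) = \frac{1}{2\pi\sqrt{\det\Sigma}}\,
\exp\left(-\frac{1}{2}\begin{pmatrix} x & y \end{pmatrix}\Sigma^{-1}\begin{pmatrix} x \\ y \end{pmatrix}\right)
\]
% Rotation of the axes by 45 degrees
\[
u = \frac{x + y}{\sqrt{2}}, \qquad v = \frac{y - x}{\sqrt{2}}
\quad\Longrightarrow\quad
\frac{x^2 - 2\rho xy + y^2}{1-\rho^2} = \frac{u^2}{1+\rho} + \frac{v^2}{1-\rho}
\]
% The density factors into two independent centered normal densities
\[
g = \frac{1}{\sqrt{2\pi(1+\rho)}}\,e^{-u^2/(2(1+\rho))}
\times \frac{1}{\sqrt{2\pi(1-\rho)}}\,e^{-v^2/(2(1-\rho))}
\]
```

In the rotated frame the two coordinates are therefore independent normal variables with variances 1 + ρ and 1 - ρ, and the isodensity curves are ellipses whose semi-axes are proportional to √(1 + ρ) and √(1 - ρ).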

THE BIVARIATE NORMAL DISTRIBUTION

* Standard bivariate normal distribution

- The two marginal distributions
- The standard binormal distribution with a given correlation coefficient ρ
- Standard binormal distribution and covariance matrix
- Uncorrelation and independence
- Conditional distributions
- Rotation of the axes and independent marginal distributions
- Ellipses of constant probability density

* General bivariate normal distribution

- Probability density
- Marginal distributions
- Conditional distributions

_________________________________________________________

 Tutorial 2

We mentioned that if the two normal components of a binormal distribution are uncorrelated, then they are also independent. The terms "uncorrelated" and "independent" are often associated when it comes to normal r.v., and it is commonly believed that uncorrelated normal r.v. are independent.

As such, the statement is incomplete, and therefore false.

In this Tutorial, we give two counter-examples that dispel this misconception. For each one, we exhibit a pair (X, Y) of standard normal r.v. that are uncorrelated, yet not independent. This "uncorrelated yet not independent" combination is somewhat unusual for normal variables, and has equally unusual consequences:

* The sum of two normal r.v. may not be normally distributed (whereas this sum is always normal if the two normal variables are independent).

* A bivariate distribution may have normal marginal distributions, yet not be binormal.

In the Tutorial, we give the correct version of the link between "uncorrelated" and "independent" for normal variables.
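To make the idea concrete, here is a sketch of one classic construction of such a pair. It is an assumption chosen for illustration and is not necessarily either of the two examples used in the Tutorial: X is standard normal and Y = S·X, where S is a random sign (+1 or -1 with probability 1/2 each) drawn independently of X.

```python
import math
import random


def sample_sign_flip_pairs(n, seed=0):
    """Draw n pairs (x, y) with y = s*x, s a fair random sign independent of x.

    Assumed illustrative construction: Y is standard normal by symmetry,
    Cov(X, Y) = E[S] * E[X^2] = 0, yet |Y| = |X| so the variables are
    clearly not independent.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        sign = 1.0 if rng.random() < 0.5 else -1.0
        pairs.append((x, sign * x))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def fraction_sum_zero(pairs):
    """Fraction of draws for which X + Y is exactly 0."""
    return sum(1 for x, y in pairs if x + y == 0.0) / len(pairs)


if __name__ == "__main__":
    pairs = sample_sign_flip_pairs(50_000)
    print("correlation:", sample_correlation(pairs))
    print("P(X + Y = 0) estimate:", fraction_sum_zero(pairs))
```

In this construction X + Y equals 0 about half of the time, which a normal variable cannot do, so the sum is not normal and the pair (X, Y) is not binormal even though both marginals are standard normal.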

-----

The two counter-examples are illustrated by an interactive animation.

UNCORRELATED YET NOT INDEPENDENT NORMAL R.V.

* First example

- Definition of Y
- Y is standard normal
- X and Y are uncorrelated
- X and Y are not independent
- X + Y is not normally distributed

* Second example

- Definition of Y
- Y is standard normal
- X and Y are uncorrelated
- X and Y are not independent
- X + Y is not normally distributed

_______________________________________________________

See also: Univariate normal distribution, Correlation coefficient, Covariance matrix, Multivariate normal distribution.