Covariance

As the name implies, covariance is a measure of the strength of the link between two (numerical) random variables.

# The concept of "link" between two r.v.

Given two such r.v. X1 and X2, two extreme circumstances can be encountered:

1. There is no link whatsoever between X1 and X2. Knowing the value of X1 gives no clue to what the value of X2 might be. The two variables are said to be independent.
2. The link is so strong that it is in fact functional. There is a completely deterministic function y = f(x) such that:

X2 = f(X1)

Knowing the value of X1 then determines the value of X2 without any uncertainty.

Most often, the link between two r.v. is "somewhere in between": knowing the value taken by X1 reduces, to a certain extent, the uncertainty about the value that X2 will take.

-----

There is no universal way to define and measure the strength of the link in this intermediary situation. Covariance is one way to do it, and is very useful in many practical situations despite its limitations.

# Definition of the Covariance

If X1 and X2 are strongly (positively) linked, then we could think of defining covariance in a way that would embody the following idea:

* Whenever X1 is positive, then X2 is likely to be positive too.

* Whenever X1 is negative,  then X2 is likely to be negative too.

This will not do, because we want the covariance to be unchanged when both probability distributions are translated by arbitrary quantities. So instead of measuring the values of X1 and X2 from "0", we will measure them from reference points that translate along with the probability distributions, for example their respective means µ1 and µ2. Our original idea now reads:

* Whenever (X1 - µ1) is positive, then (X2 - µ2) is likely to be positive too.

* Whenever (X1 - µ1) is negative, then (X2 - µ2) is likely to be negative too.

So if X1 and X2 are strongly (positively) linked, then, more often than not, X1 - µ1 and X2 - µ2 are:

* simultaneously positive,

* or simultaneously negative.

The product (X1 - µ1).(X2 - µ2) is then likely to be very often positive:

* Either because both quantities are positive,

* Or because both quantities are negative.

Yet, the product (X1 - µ1).(X2 - µ2) is a random variable, and we want a fixed number. But a random variable that spends most of its time taking positive values is likely to have a positive expectation. So we will consider the expectation of (X1 - µ1).(X2 - µ2), and call it the covariance of X1 and X2:

Cov(X1, X2) = E[(X1 - µ1).(X2 - µ2)]

We'll show that this expression is equivalent to this other one, more convenient in practice:

Cov(X1, X2) = E[X1X2] - E[X1].E[X2]
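As a quick numerical sanity check (a sketch using NumPy with a simulated, positively linked pair; the distributions and sample size are arbitrary choices), we can verify that the two expressions agree, and that the product of deviations is indeed positive most of the time:

```python
import numpy as np

rng = np.random.default_rng(0)

# A positively linked pair: X2 is X1 plus independent noise.
x1 = rng.normal(loc=5.0, scale=2.0, size=100_000)
x2 = x1 + rng.normal(loc=3.0, scale=1.0, size=100_000)

# Definition: expectation of the product of deviations from the means.
cov_def = np.mean((x1 - x1.mean()) * (x2 - x2.mean()))

# Convenient equivalent form: E[X1 X2] - E[X1] E[X2].
cov_alt = np.mean(x1 * x2) - x1.mean() * x2.mean()

# The two sample versions coincide (the identity is algebraic, not approximate).
print(cov_def, cov_alt)

# Fraction of the sample where the two deviations have the same sign: well above 1/2.
print(np.mean((x1 - x1.mean()) * (x2 - x2.mean()) > 0))
```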

# Interpretation of the Covariance

## Absolute value of the Covariance

Staying with purely qualitative arguments, we notice that, for given and fixed probability distributions of X1 and of X2:

* A large positive value of the Covariance is an indication that (X1 - µ1) and (X2 - µ2) often take large positive or large negative values simultaneously, a circumstance that strengthens our belief that the variables are indeed tightly linked.

* Whereas a smaller positive value of the covariance is an indication that one of the variables has a fair chance of being close to its mean when the other takes large (positive or negative) values.

So, a large positive value of the covariance is a good detector of a strong link between two r.v. It can be shown that this link is then necessarily linear.

What conclusions may be drawn from a low value (close to 0) of the covariance? In general, none.

* The Covariance may be low because, indeed, the link between the two variables is weak.

* But there may exist a strong, nonlinear link between the two variables, the nature of this link making the Covariance low (see the example below).

## Sign of the Covariance

We developed the argument leading to the definition of the covariance on the basis of a positive link between X1 and X2. But it applies just as well in the case of a negative link. We can use the same line of reasoning if X1 - µ1 taking large positive values makes it likely that X2 - µ2 will take large negative values. In this case, the covariance is a large negative number.
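As a small illustration of this negative case (again a sketch with NumPy and an arbitrary simulated sample), a variable built to move opposite to another yields a clearly negative covariance:

```python
import numpy as np

rng = np.random.default_rng(1)

# A negatively linked pair: when X1 is above its mean, X2 tends to be below its own.
x1 = rng.normal(size=100_000)
x2 = -2.0 * x1 + rng.normal(size=100_000)

cov = np.mean(x1 * x2) - x1.mean() * x2.mean()
print(cov)  # clearly negative (close to -2 for this simulated sample)
```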

# Covariance and correlation coefficient

A drawback of covariance is that its value depends on the units used to express the values of X1 and X2, whereas a practical measure of the strength of the link between two variables certainly shouldn't. Therefore, attractive as it is for the theoretician, the application-oriented analyst will usually prefer its standardized version, the correlation coefficient, whose value does not depend on measurement units, and which we describe here.
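This unit dependence is easy to see numerically. In this sketch (simulated heights and weights; the numbers are illustrative assumptions), re-expressing heights in centimeters instead of meters multiplies the covariance by 100, while the correlation coefficient is unchanged:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated heights (meters) and weights (kilograms), positively linked.
height_m = rng.normal(1.75, 0.10, size=100_000)
weight = 60.0 + 40.0 * (height_m - 1.75) + rng.normal(0.0, 5.0, size=100_000)

def cov(a, b):
    return np.mean((a - a.mean()) * (b - b.mean()))

def corr(a, b):
    return cov(a, b) / (a.std() * b.std())

height_cm = 100.0 * height_m  # same data, expressed in a different unit

print(cov(height_m, weight), cov(height_cm, weight))    # second value is 100x the first
print(corr(height_m, weight), corr(height_cm, weight))  # identical: unit-free
```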

# Covariance and independence

If the two random variables X and Y are independent, then their covariance is 0.

But the converse is not true : two random variables may have 0 covariance, and yet not be independent. For example, let :

* X be uniformly distributed in [-1, +1].

* Y = X².

We leave it as an exercise to show that

Cov(X, Y) = 0

while X and Y are clearly not independent.
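The exercise is quick: Cov(X, Y) = E[X³] - E[X]E[X²], and both terms vanish by the symmetry of the uniform distribution on [-1, +1]. A simulation (a NumPy sketch with an arbitrary sample size) confirms this, and also exhibits the dependence:

```python
import numpy as np

rng = np.random.default_rng(3)

x = rng.uniform(-1.0, 1.0, size=1_000_000)
y = x ** 2

# Cov(X, Y) = E[X^3] - E[X] E[X^2]; both terms are ~0 by symmetry.
cov_xy = np.mean(x * y) - x.mean() * y.mean()
print(cov_xy)  # close to 0

# Yet Y is completely determined by X. Conditioning on X changes Y's mean:
print(y[np.abs(x) > 0.9].mean())  # about 0.90, far from the overall mean
print(y.mean())                   # about 1/3
```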

____________________________________________________________

 Tutorial

The basic properties of the Covariance are listed and demonstrated in the following Tutorial:

BASIC PROPERTIES OF THE COVARIANCE

* Basic properties

* Symmetry

* Variance is self-covariance

* Another expression for the Covariance

* An interpretation of the covariance

* Linearity with respect to constants

* Linearity with respect to variables

* Variance of a sum of random variables

* Covariance and independence

____________________________________________________

* Correlation coefficient

* Covariance matrix

* Variance

* Independent random variables

Interesting examples of calculation of a covariance:

* Covariance of two modalities of a multinomial distribution

* Covariance of two observations when sampling without replacement

* Covariance of the numbers of events in two initial time intervals of a Poisson process

* Covariance of two observations from a normal distribution conditionally to the value of the sample mean