Distribution function (Cumulative)
Let X be a numerical random variable. It is completely described by the probability that a realization of the variable is less than or equal to x, for every x. This probability is denoted by F(x) :
F(x) = P{X ≤ x}
F(x) is called the (cumulative) distribution function (or c.d.f.) of the variable X. It can be regarded as the proportion of the population whose value is less than or equal to x.
The c.d.f. of a random variable is a monotonically increasing (or more precisely, non-decreasing) function that rises from 0 (as x tends to -∞) to 1 (as x tends to +∞).
-----
The two events :
* X ≤ x and
* X > x
are mutually exclusive and exhaustive (exactly one of them occurs). Therefore :
P{X ≤ x} + P{X > x} = 1
and
P{X > x} = 1 - F(x)
More generally, for any two numbers a and b with a < b, we have :
P{a < X ≤ b} = F(b) - F(a)
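The two identities above can be checked numerically. The following is a minimal sketch, assuming a standard normal variable as the example (the source does not name a distribution); the helper names `normal_cdf`, `prob_greater` and `prob_between` are illustrative only.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """C.d.f. of a normal variable, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_greater(x):
    """P{X > x} = 1 - F(x)."""
    return 1.0 - normal_cdf(x)

def prob_between(a, b):
    """P{a < X <= b} = F(b) - F(a), for a < b."""
    return normal_cdf(b) - normal_cdf(a)

print(prob_greater(0.0))        # 0.5, by symmetry of the normal distribution
print(prob_between(-1.0, 1.0))  # about 0.6827
```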
The c.d.f. is by no means constrained to be continuous. For example, the c.d.f. of a r.v. that can take only a finite number of values is a "staircase" function :

We then have :
F(xi) = Σ_{j=1}^{i} P{X = xj}
where the possible values are labeled in increasing order : x1 < x2 < ...
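As an illustration, the staircase c.d.f. of a finite discrete variable can be built from cumulative sums of its probabilities. This is only a sketch, assuming a fair six-sided die as the example; `values`, `probs` and `discrete_cdf` are hypothetical names.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical example: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]   # possible values, sorted in increasing order
probs = [1 / 6] * 6           # P{X = xj} for each value

# Running totals: cum[i] is the sum of the first i + 1 probabilities.
cum = list(accumulate(probs))

def discrete_cdf(x):
    """F(x) = sum of P{X = xj} over all values xj <= x."""
    k = bisect_right(values, x)   # number of values that are <= x
    return cum[k - 1] if k > 0 else 0.0

for x in (0.5, 1, 3, 3.7, 6):
    print(x, discrete_cdf(x))
```

The function is constant between two consecutive values (for example at 3 and at 3.7) and jumps by P{X = xj} at each value xj.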
-----
The same applies to discrete r.v. that can take an infinite number of values, like a Poisson variable. The "staircase" then has an infinite number of steps.
Recall that a r.v. X is said to have a probability density function p(x) if, for any two numbers a and b with a < b, we have :
P{a < X ≤ b} = ∫_a^b p(x) dx
The cumulative distribution function F(x) is then continuous, and moreover :
* It can be differentiated (at every point where p is continuous),
* and its derivative F '(x) is just p(x).
We then have :
F(x) = ∫_{-∞}^{x} p(t) dt
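The link between density and c.d.f. can be verified numerically: integrating the density recovers the c.d.f., and differentiating the c.d.f. recovers the density. The sketch below assumes an exponential distribution with a hypothetical rate `RATE`; the trapezoidal rule and the central difference are simple numerical stand-ins, not anything prescribed by the text.

```python
import math

RATE = 2.0  # hypothetical rate parameter of an exponential distribution

def pdf(x):
    """Density p(x) of the exponential distribution (zero for x < 0)."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def cdf(x):
    """Closed-form c.d.f. F(x) of the same distribution."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

def integrate(f, a, b, steps=10_000):
    """Trapezoidal approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h

def derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7
# The density vanishes below 0, so integrating from 0 equals integrating from -infinity.
print(integrate(pdf, 0.0, x), cdf(x))   # the two numbers should nearly agree
print(derivative(cdf, x), pdf(x))       # the two numbers should nearly agree
```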
The relationship between :
* Cumulative distribution function, and
* Probability density function
is illustrated by the upper and lower images of this illustration :
You'll also find here an interactive animation illustrating this relationship.
The following illustration represents :
* A probability density,
* And a sample of size n drawn from this probability density.
The observations in the sample are labeled by increasing order of values.
The empirical distribution function Fn(x) is defined as follows (lower image of above illustration). It is a staircase function :
* That is equal to 0 for x < x1,
* That is equal to 1 for x ≥ xn,
* Which is constant on the half-open intervals [xi, xi+1),
* And such that the height of each "step" is 1/n.
There are n steps (assuming no two observations are equal). The function is non-decreasing from 0 to 1.
This function is not to be confused with the c.d.f. of a discrete probability distribution as in the foregoing paragraph.
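The definition above translates directly into code: Fn(x) is the fraction of observations that are less than or equal to x. The following is a minimal sketch; `make_ecdf` and the five-point sample are hypothetical and serve only as an illustration.

```python
from bisect import bisect_right

def make_ecdf(sample):
    """Return the empirical distribution function Fn of a sample."""
    xs = sorted(sample)   # observations labeled by increasing value
    n = len(xs)

    def ecdf(x):
        # Number of observations <= x, divided by the sample size.
        return bisect_right(xs, x) / n

    return ecdf

sample = [2.3, 0.7, 1.9, 3.1, 1.2]   # hypothetical sample of size n = 5
Fn = make_ecdf(sample)

for x in (0.5, 0.7, 1.0, 1.9, 3.1, 10.0):
    print(x, Fn(x))
```

Each observation raises Fn by 1/n, so the function is 0 to the left of the smallest observation and 1 from the largest observation onward.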
The ultimate goal of Statistics is to derive, from the sample itself, the probability distribution that generated it. This goal is of course unattainable, but the major achievement of Statistics is to provide practitioners with partial, probabilistic versions of it (mainly estimation and tests).
This achievement is made possible by the fact that the sample is an incomplete, but hopefully rather faithful image of the true probability distribution (that we now assume to be continuous) :
* Observations are more densely packed in regions of high probability density,
* But are few and far between in regions of low probability density,
this image being somewhat distorted in an unpredictable way by the random nature of population sampling.
The empirical distribution function is an excellent tool for measuring how faithful the sample is to the probability distribution. Where observations are densely packed, this function grows rapidly, which is exactly what is expected from the true distribution function : where the distribution function grows rapidly, the probability density (its derivative) is large, and a large density favors a high concentration of observations.
-----
These intuitive remarks are justified by the Fundamental Theorem of Statistics, which states that :
The empirical distribution function Fn(x) converges to the true distribution function F(x) as the sample size grows without limit.
This convergence is of course to be understood in the sense of the convergence of random variables.
* We demonstrate here that Fn(x) converges to F(x) in probability for every x (we'll use a generalization of the Weak Law of Large Numbers).
* In fact, the convergence is stronger than that, as for every x, the convergence is almost sure (difficult).
* In fact, the convergence is even stronger than almost sure convergence for every x : it can be shown (Glivenko-Cantelli theorem) that if we denote :
Xn = sup_x |Fn(x) - F(x)|
the r.v. defined as the largest absolute difference between Fn(x) and F(x) for all x (for a given sample), then Xn converges almost surely to 0.
Note that Xn is the statistic of the Kolmogorov-Smirnov test.
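The convergence of Xn to 0 can be observed by simulation. The sketch below assumes a uniform distribution on [0, 1] as the true distribution and a fixed random seed; `sup_distance` is a hypothetical helper. It relies on the fact that, for a continuous F, the largest gap between the staircase Fn and F occurs at an observation, just before or just after the jump.

```python
import random

def uniform_cdf(x):
    """C.d.f. of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)

def sup_distance(sample, cdf):
    """Largest absolute difference between Fn and a continuous cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Fn equals (i - 1)/n just before x and i/n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

rng = random.Random(0)   # fixed seed so that runs are repeatable
for n in (10, 100, 1_000, 10_000):
    sample = [rng.random() for _ in range(n)]
    print(n, sup_distance(sample, uniform_cdf))
```

The printed distances should tend to shrink as n grows, although a single run is only an illustration and not a proof.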
This property makes the empirical distribution function very useful in many circumstances.
The cumulative distribution function may be defined the same way in the multivariate case. For example, if we have two random variables X and Y, their (bivariate) cumulative distribution function F(x, y) is defined, for any pair of values x0 and y0, as the probability for a realization of the pair {X, Y} to be such that :
* -∞ < X ≤ x0,
* -∞ < Y ≤ y0,
that is, for {X, Y} to be in the green area of this illustration :

F(x0, y0) = P{X ≤ x0, Y ≤ y0}
If the joint distribution of {X, Y} has a joint probability density f (x, y), then
F(x0, y0) = ∫_{-∞}^{x0} ∫_{-∞}^{y0} f(x, y) dy dx
which is the bivariate generalization of the relationship between cdf and pdf in the univariate case.
We show here that two random variables X and Y with respective distribution functions FX (x) and FY (y) and (bivariate) joint cumulative distribution function FXY (x, y) are independent if and only if :
FXY(x, y) = FX(x) · FY(y)
This result generalizes to any number of variables.
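The factorization can be illustrated empirically by comparing the joint empirical distribution function with the product of the marginal ones. This is a sketch under stated assumptions: uniform variables on [0, 1], a fixed seed, and hypothetical helpers `joint_ecdf` and `marginal_ecdf`. It illustrates the criterion at one point (x0, y0) and is not a statistical test of independence.

```python
import random

def joint_ecdf(pairs, x0, y0):
    """Fraction of pairs (x, y) with x <= x0 and y <= y0."""
    return sum(1 for x, y in pairs if x <= x0 and y <= y0) / len(pairs)

def marginal_ecdf(values, t):
    """Fraction of values that are <= t."""
    return sum(1 for v in values if v <= t) / len(values)

rng = random.Random(0)   # fixed seed so that runs are repeatable
n = 20_000
xs = [rng.random() for _ in range(n)]
ys = [rng.random() for _ in range(n)]   # drawn independently of xs

x0, y0 = 0.3, 0.6

# Independent pair: joint value and product should nearly agree
# (the true value here is 0.3 * 0.6 = 0.18).
joint_ind = joint_ecdf(list(zip(xs, ys)), x0, y0)
prod_ind = marginal_ecdf(xs, x0) * marginal_ecdf(ys, y0)
print(joint_ind, prod_ind)

# Fully dependent pair (Y = X): the factorization fails.
joint_dep = joint_ecdf(list(zip(xs, xs)), x0, y0)
prod_dep = marginal_ecdf(xs, x0) * marginal_ecdf(xs, y0)
print(joint_dep, prod_dep)
```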