Exponential distribution

Definition of the exponential distribution

By definition, the exponential distribution f(x) is:

    * f(x) = 0   for x < 0

    * and for x ≥ 0:

$$f(x) = \lambda e^{-\lambda x}$$

where λ is a positive parameter.

We'll show that the mean µ of the distribution is equal to 1/λ, so the distribution is often written as:

$$f(x) = \frac{1}{\mu}\, e^{-x/\mu}$$
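
As a quick numerical illustration of the two equivalent ways of writing the density, here is a minimal Python sketch. The names `exp_pdf`, `simulated_mean` and `lam` are ours, chosen for illustration only, and the parameter value 0.5 is arbitrary.

```python
import math
import random


def exp_pdf(x, lam):
    """Density f(x) of the exponential distribution with parameter lam > 0."""
    if x < 0:
        return 0.0
    return lam * math.exp(-lam * x)


def simulated_mean(lam, n=100_000, seed=0):
    """Average of n simulated lifetimes; should be close to 1 / lam."""
    rng = random.Random(seed)
    return sum(rng.expovariate(lam) for _ in range(n)) / n


lam = 0.5          # illustrative value of the parameter
mu = 1.0 / lam     # corresponding mean

# The two ways of writing the density give the same number.
print(exp_pdf(1.0, lam), (1.0 / mu) * math.exp(-1.0 / mu))
# The empirical mean is close to mu.
print(simulated_mean(lam), mu)
```

The simulation only suggests that the mean is 1/λ; the actual proof is the subject of the first Tutorial.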

 

This page presents:

    * the basic properties of the exponential distribution,

    * an interactive animation that illustrates these basic properties.

____________________

 

The exponential distribution plays a central role in a large class of problems related to the concept of "lifetime". For example, an electronic component might be known to have a lifetime of, say, 10,000 hours. This means that the component is expected to fail after about 10,000 hours of use. But of course, this is an average value: some components from the same batch will last less than 10,000 hours, while others will last longer. So the lifetime of a component is a random variable.

The exponential distribution is "memoryless"

The "weak" memoryless property

Some lifetime problems exhibit a very important property called the "memoryless property". For example, consider a very large number of identical radioactive atoms, and observe their decay.

    * During the first t seconds, a proportion p of these atoms disintegrate.

    * Now leave the lab, and come back any time later. Consider the atoms that have not disintegrated yet, and observe the radioactive process for another t seconds. During this time, the proportion of these atoms that will disintegrate is also p.

 

Now translate these observations into terms of probabilities for individual atoms.

    * We first state that the probability for an atom to disintegrate in the first t seconds is p.

    * We return to the lab after s seconds. Then the probability for a (still not disintegrated) atom to disintegrate within the next t seconds is also p.

 

So the probability that an atom that is already s seconds old will last another t seconds is the same as the probability for a "new" atom to last t seconds. It is as if the atom had absolutely no memory of how long it has been around. For radioactive atoms, there is no aging. As long as such an atom is alive, it remains absolutely identical to itself. And then, without any warning, it decays.


Physicists have long given up the hope of finding a more deterministic model of radioactivity.

The memoryless property is expressed mathematically by writing that the two following probabilities are equal :

    * The probability that a new atom will live for at least t seconds.

    * The probability for an atom that is still alive at time s to live at least an extra t seconds, that is to be still alive at time t + s.
 

P{X > t}= P{X > s + t | X > s}
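
The equality above can be checked empirically. The following is a minimal simulation sketch, not the site's animation: the function name `memoryless_check` and the values of the parameter, s and t are arbitrary choices of ours. It estimates both probabilities from the same set of simulated exponential lifetimes.

```python
import math
import random


def memoryless_check(lam, s, t, n=200_000, seed=1):
    """Estimate P{X > t} and P{X > s + t | X > s} for X exponential(lam)."""
    rng = random.Random(seed)
    draws = [rng.expovariate(lam) for _ in range(n)]
    # Probability that a "new" atom lives at least t.
    p_new = sum(x > t for x in draws) / n
    # Keep only the atoms still alive at time s, then ask for t more.
    survivors = [x for x in draws if x > s]
    p_old = sum(x > s + t for x in survivors) / len(survivors)
    return p_new, p_old


p_new, p_old = memoryless_check(lam=1.0, s=0.7, t=0.5)
# Both estimates should be close to exp(-lam * t).
print(p_new, p_old, math.exp(-1.0 * 0.5))
```

Changing s leaves the second estimate essentially unchanged, which is exactly what "no memory" means.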

 

 

We show here that the exponential distribution has the memoryless property. This means that if a device's lifetime is exponentially distributed, then it will fail in exactly the same way as a radioactive atom does.

We also show that the memoryless property is unique to the exponential distribution. More specifically, we show that the exponential distribution is the only continuous distribution with the memoryless property, which explains why it is so important in practice.

In the discrete domain, the geometric distribution also has the memoryless property, and is the only one to have this property.

-----

 

We'll also show that, as a consequence of the memoryless property of the exponential distribution, the lifetime distribution of those atoms that survived up to time s is, from this point on, identical to their original lifetime distribution. In other words, the new lifetime distribution of these "survivors" is just their original lifetime distribution right-shifted by a quantity s.

The strong memoryless property

The foregoing property can be generalized as follows. The "weak" memoryless property refers to survival after a reference date s that is arbitrary but fixed. We show here that the property is still true if the reference date s is not fixed, but instead is itself an exponentially distributed random variable. More precisely, we'll show that if X1 and X2 are two independent exponentially distributed random variables, then whenever X2 > X1, the distribution of the excess lifetime of X2 over that of X1 does not depend on the value of X1.

This translates into the expression :
 

P{X2 > t } = P{X2 > X1 + t | X2 > X1}

 

 

which is identical to the expression describing the weak memoryless property, except that the fixed date s is now replaced by the random variable X1.

This property is called the strong memoryless property of the exponential distribution.
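
The strong property can be explored with the same kind of simulation as the weak one. In this sketch (the names and the parameter values are illustrative assumptions of ours), the fixed date s is replaced by an independent exponential draw X1, and only the draws with X2 > X1 are retained.

```python
import math
import random


def strong_memoryless_check(lam1, lam2, t, n=200_000, seed=2):
    """Estimate P{X2 > t} and P{X2 > X1 + t | X2 > X1}.

    X1 and X2 are independent exponentials with parameters lam1 and lam2.
    """
    rng = random.Random(seed)
    plain = kept = exceed = 0
    for _ in range(n):
        x1 = rng.expovariate(lam1)
        x2 = rng.expovariate(lam2)
        if x2 > t:
            plain += 1
        if x2 > x1:              # condition X2 > X1
            kept += 1
            if x2 > x1 + t:      # excess lifetime larger than t
                exceed += 1
    return plain / n, exceed / kept


p_plain, p_cond = strong_memoryless_check(lam1=2.0, lam2=1.0, t=0.5)
# Both estimates should be close to exp(-lam2 * t).
print(p_plain, p_cond, math.exp(-1.0 * 0.5))
```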

-----

We'll also show that the distribution of the difference X2 - X1 conditional on X2 > X1 is identical to that of X2, and we'll illustrate this result with an interactive animation.

-----

In practical terms, the strong memoryless property describes the following situation. Let M1 and M2 be two machines whose lifetimes are both exponentially distributed and independent. Then, after one machine fails, the survival time of the other machine beyond this failure has the same distribution as the original lifetime of that machine.

The memoryless property is not universal

Of course, not all lifetime problems can be expressed in terms of the memoryless property. For example, all living organisms go through the aging process, and although their lifetimes are more or less random variables, we should not expect their probability density functions to be the same as that of radioactive atoms. Still, many complex devices have lifetimes that behave in a nearly memoryless fashion, which explains why the exponential distribution is so important in many engineering applications.

Hazard rate, failure rate

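If a machine is still operational at time t, what is the (infinitely small) probability dP that it will fail during the next time interval dt? Writing f for the probability density function of the lifetime and F for its cumulative distribution function, this probability is:

$$dP = \frac{f(t)}{1 - F(t)}\, dt$$

So, whatever the nature of the distribution of the lifetime of the machine (exponential or not), we have:

$$dP = h(t)\, dt$$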

The function h(t) is called the hazard rate function, or "failure rate function" of the distribution.

We describe below the properties of the hazard rate function, and show that the hazard rate function of the exponential distribution is constant (does not depend on t). In fact, this is a characteristic property of the exponential distribution, and is therefore equivalent to the (weak) memoryless property.
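
To make the constant hazard rate concrete, here is a small sketch that assumes the usual definition h(t) = f(t) / (1 - F(t)), where f is the density and F the cumulative distribution function. The helper name `hazard_rate` and the parameter value 0.25 are ours.

```python
import math


def hazard_rate(pdf, cdf, t):
    """Hazard rate h(t) = f(t) / (1 - F(t)) for a lifetime distribution."""
    return pdf(t) / (1.0 - cdf(t))


lam = 0.25  # illustrative parameter of the exponential distribution


def exp_pdf(t):
    return lam * math.exp(-lam * t)


def exp_cdf(t):
    return 1.0 - math.exp(-lam * t)


# Evaluate the hazard rate at several ages of the machine.
rates = [hazard_rate(exp_pdf, exp_cdf, t) for t in (0.0, 1.0, 5.0, 20.0)]
print(rates)  # every entry equals lam, up to rounding
```

Passing the density and distribution function of any other lifetime distribution to `hazard_rate` shows how its failure rate changes with age.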

Sufficient statistic

We identify here a sufficient statistic for the parameter λ of the exponential distribution.

Exponential family

We show here that the exponential distribution belongs to the exponential family. From this result, we'll deduce that the sample mean is an efficient estimator of the mean µ.
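
For orientation, the two facts announced in this section and the previous one can be sketched in a few lines. This is the standard textbook route (the factorization criterion, then the canonical exponential-family form) and not necessarily the one followed on the linked pages; the symbols η, T and A are notation introduced here.

```latex
% Joint density of an i.i.d. sample x_1, ..., x_n (all x_i >= 0)
\[
  \prod_{i=1}^{n} \lambda e^{-\lambda x_i}
  = \lambda^{n} \exp\!\Big(-\lambda \sum_{i=1}^{n} x_i\Big)
\]
% The data enter only through T = \sum x_i, so T is sufficient for \lambda.

% Exponential-family form of a single density, for x >= 0
\[
  f(x) = \exp\big(\eta\, T(x) - A(\eta)\big),
  \qquad \eta = -\lambda,\quad T(x) = x,\quad A(\eta) = -\ln(-\eta)
\]
```

Substituting η = -λ gives exp(-λx + ln λ), which is the density of the definition.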

______________________________________________________

 

 

Tutorial 1


In this first Tutorial, we describe the elementary properties of the exponential distribution.

We also clarify the relationship between the exponential and the geometric distributions by showing that an exponential r.v. may be considered as the limit of a series of geometric r.v. (the reverse relation is addressed here).
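
The limit can be illustrated numerically without any simulation. The sketch below assumes one common construction, which may differ in detail from the Tutorial's: G is geometric on {1, 2, ...} with success probability λ/n, and we look at the rescaled variable G/n as n grows. The function name `geometric_tail` is ours.

```python
import math


def geometric_tail(lam, n, x):
    """P{G / n > x}, where G is geometric on {1, 2, ...} with p = lam / n."""
    p = lam / n
    # G is an integer, so G > n * x is the same event as G > floor(n * x),
    # and P{G > k} = (1 - p) ** k for any integer k >= 0.
    return (1.0 - p) ** math.floor(n * x)


lam, x = 2.0, 0.8
for n in (10, 100, 10_000):
    print(n, geometric_tail(lam, n, x))
# Tail of the exponential distribution at the same point.
print("limit", math.exp(-lam * x))
```

As n grows, the printed tail probabilities approach exp(-λx), the probability that an exponential r.v. exceeds x.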

 

 

 

BASIC PROPERTIES OF THE EXPONENTIAL DISTRIBUTION

Probability density function

Cumulative distribution function

Quantile function Q(p)

General

Median

Exponential r.v. as the limit of a series of geometric r.v.

Moment generating function

Moments

Mean

Direct calculation

Moment generating function

Variance

Direct calculation

Moment generating function

All moments
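
Several items of this outline have simple closed forms. The sketch below uses the standard formulas (quantile Q(p) = -ln(1 - p)/λ, moment of order k equal to k!/λ^k); they are stated here as well-known results rather than quoted from the Tutorial, and the function names are ours.

```python
import math


def exp_cdf(x, lam):
    """Cumulative distribution function F(x) = 1 - exp(-lam * x) for x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def exp_quantile(p, lam):
    """Quantile function Q(p): the value x such that F(x) = p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p) / lam


def exp_raw_moment(k, lam):
    """Moment of order k, E[X^k] = k! / lam^k."""
    return math.factorial(k) / lam ** k


lam = 2.0
median = exp_quantile(0.5, lam)                       # ln(2) / lam
mean = exp_raw_moment(1, lam)                         # 1 / lam
variance = exp_raw_moment(2, lam) - mean ** 2         # 1 / lam^2
print(median, mean, variance)
```

Note that the median ln(2)/λ is smaller than the mean 1/λ, as expected for a right-skewed distribution.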

_________________________________________________

 Interactive animation

* Adjustable exponential.

* Mean, mode, standard deviation.

TUTORIAL

 _________________________________________________________________

 

 

Tutorial 2


We then calculate the Maximum Likelihood estimator of the parameter λ of the exponential distribution. It will turn out to be easier to find the Maximum Likelihood estimator of µ = 1/λ, that is, of the mean of the distribution. We also identify the distribution of this estimator, which is closely related to the Gamma distribution.

We illustrate this result with a two-fold interactive animation :

 

 

MAXIMUM LIKELIHOOD ESTIMATION OF THE MEAN OF THE EXPONENTIAL DISTRIBUTION

 Maximum Likelihood estimation of the mean of the exponential distribution

The Likelihood

The Log-Likelihood

Maximum of the Log-Likelihood

The Maximum Likelihood estimator

Distribution of the estimator

Expectation of the estimator
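
The steps of this outline can be mirrored numerically. The sketch below relies on the standard result that the Maximum Likelihood estimator of µ is the sample mean, and checks that the log-likelihood is indeed lower on either side of it. The function names, the sample size and the true mean are illustrative assumptions of ours.

```python
import math
import random


def log_likelihood(mu, sample):
    """Log-likelihood of the mean mu for an exponential sample.

    Each observation x contributes ln((1 / mu) * exp(-x / mu)).
    """
    n = len(sample)
    return -n * math.log(mu) - sum(sample) / mu


def ml_estimate_mu(sample):
    """Maximum Likelihood estimate of the mean: the sample mean."""
    return sum(sample) / len(sample)


rng = random.Random(3)
true_mu = 4.0
sample = [rng.expovariate(1.0 / true_mu) for _ in range(50)]

mu_hat = ml_estimate_mu(sample)
# The log-likelihood peaks at mu_hat: nearby values score lower.
print(mu_hat)
print(log_likelihood(0.9 * mu_hat, sample),
      log_likelihood(mu_hat, sample),
      log_likelihood(1.1 * mu_hat, sample))
```

This is the numerical counterpart of manually tuning an exponential curve until the likelihood is as large as possible.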

_________________________________________________

 

Interactive animation

* Manual tuning of an exponential to the ML.
* Progressive histogram of the ML estimator.

 

TUTORIAL

 __________________________________________________________________

 

 

Tutorial 3


In this Tutorial, we demonstrate that the exponential distribution is memoryless, and that it is the only continuous distribution with this property. This result is illustrated by an interactive animation.
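
The first half of the statement takes only two lines. The following sketch uses the survival probability of the exponential distribution and the definition of conditional probability; the Tutorial's own presentation may differ.

```latex
% Survival probability, obtained by integrating the density from x to infinity
\[
  P\{X > x\} = \int_x^{\infty} \lambda e^{-\lambda u}\,du = e^{-\lambda x},
  \qquad x \ge 0
\]
% The event {X > s + t} is contained in {X > s}, hence
\[
  P\{X > s + t \mid X > s\}
  = \frac{P\{X > s + t\}}{P\{X > s\}}
  = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}}
  = e^{-\lambda t}
  = P\{X > t\}
\]
```

The converse (no other continuous distribution is memoryless) is the harder part and is not sketched here.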

 

 

 

THE MEMORYLESS PROPERTY

The exponential distribution is memoryless

The exponential distribution is the only continuous memoryless distribution

Distribution of the observations whose values are larger than a threshold value s

____________________________________________________________
 

Interactive animation

* Adjustable threshold s and extra-life t.
* Progressive estimation of P{X > t} and P{X > s + t | X > s}
* Both are equal to exp(-λt)

* See also next animation below...

TUTORIAL

 _______________________________________________________________________

 

 

Tutorial 4


In the next Tutorial, we demonstrate the strong memoryless property of the exponential distribution as described here. This demonstration is more difficult than that of the weak property, but it exploits useful probabilistic techniques, and we made it detailed enough to be accessible with only a basic background in probability theory.

The strong memoryless property generalizes to n independent exponential distributions. We give this generalized result without demonstration.

-----

We illustrate this important result with a twin interactive animation :

 

 

STRONG MEMORYLESS PROPERTY OF THE EXPONENTIAL DISTRIBUTION

The strong memoryless property

Theory

Consequences

Strong memoryless property

Independence with respect to X1

Distribution

Generalization to n exponential distributions (no demonstration)

___________________________________________

 

Interactive animation 

* Memoryless property (weak) :

       - Progressive histogram of X  for X > s

* Strong memoryless property :

       - Progressive histogram of  X2 - X1 for X2 > X1

 

TUTORIAL

  _________________________________________________________

 

 

Tutorial 5


In this Tutorial, we define and determine the properties of the Hazard Rate Function (HRF) (also known as "Failure Rate Function"), a fundamental quantity in survival issues. We show that it is a characteristic of a distribution (just as is the moment generating function).

The HRF of the exponential distribution is constant, a circumstance that is therefore equivalent to the memoryless property. Real devices (not to mention people) do wear out, and a constant HRF is then an unrealistic assumption. Various assumptions about the time evolution of the HRF lead to lifetime probability distributions other than the exponential, which are only briefly touched upon here.

 

 

 

THE HAZARD RATE FUNCTION

Hazard rate function

The general case

Special case : the exponential distribution

The hazard rate function uniquely determines a distribution

The general case

Increasingly varying hazard rates

Constant rate : the exponential distribution

Rate increasing linearly with time : the Rayleigh distribution

Rate increasing as a power of time : the Weibull distribution

Rate increasing exponentially with time : the Gompertz distribution
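
The link between a hazard rate and the distribution it determines can be sketched numerically. The code below uses the standard relation S(t) = exp(-∫ h(u) du), where S(t) = P{X > t} and the integral runs from 0 to t. The four hazard functions and their coefficients are arbitrary illustrations of ours for the four cases of this outline.

```python
import math


def survival_from_hazard(hazard, t, steps=10_000):
    """S(t) = exp(-integral of hazard over [0, t]), by the midpoint rule."""
    du = t / steps
    integral = sum(hazard((i + 0.5) * du) for i in range(steps)) * du
    return math.exp(-integral)


def constant(u):          # constant rate: exponential distribution
    return 0.5


def linear(u):            # rate increasing linearly: Rayleigh
    return 2.0 * u


def power(u):             # rate increasing as a power of time: Weibull
    return 3.0 * u ** 2


def gompertz(u):          # rate increasing exponentially: Gompertz
    return 0.1 * math.exp(0.5 * u)


for name, h in [("constant", constant), ("linear", linear),
                ("power", power), ("gompertz", gompertz)]:
    print(name, survival_from_hazard(h, 1.0))
```

For the constant rate, the result is exp(-0.5 t), which is the exponential survival probability with parameter 0.5.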

 

TUTORIAL

  _________________________________________________________

 

 

Tutorial 6


In the next Tutorial, we consider a set of independent devices, whose lifetimes are all exponentially distributed. At time 0, all the devices are operational. But sooner or later, one device will fail. We calculate the probability for any of these devices to be the first one to fail.
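
The question can be explored by simulation before reading the theory. In the sketch below, the three parameters are arbitrary, the names are ours, and the closed form used for comparison, λi / (λ1 + ... + λn), is the standard result for independent exponentials rather than a formula quoted from the Tutorial.

```python
import random


def first_to_fail_frequencies(rates, n=200_000, seed=4):
    """Fraction of runs in which each device is the first one to fail."""
    rng = random.Random(seed)
    wins = [0] * len(rates)
    for _ in range(n):
        lifetimes = [rng.expovariate(r) for r in rates]
        # Index of the device with the shortest lifetime in this run.
        wins[lifetimes.index(min(lifetimes))] += 1
    return [w / n for w in wins]


rates = [1.0, 2.0, 3.0]                       # parameters of the three devices
freqs = first_to_fail_frequencies(rates)
theory = [r / sum(rates) for r in rates]      # standard closed form
print(freqs)
print(theory)
```

The device with the largest parameter (shortest mean lifetime) is, unsurprisingly, the most likely to fail first.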

 

 

 

FIRST DEVICE TO FAIL

Probability for any device to be the first one to fail

Theory

__________________________________________

 

Interactive animation

* Three adjustable exponentials.

* Progressive estimation of the probability for each one to be the min.

TUTORIAL

  ______________________________________________________

 

 

Tutorial 7


We now examine three classical ways to assemble devices or components into one system (in series, in parallel, and with one device on stand-by), and we calculate the lifetime distribution and expected lifetime of each of these three set-ups.
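
The three set-ups can be compared with a short simulation. This sketch makes assumptions that the Tutorial may state differently: two independent devices, a series system that fails at min(X1, X2), a parallel system that fails at max(X1, X2), and a stand-by device that does not age while waiting and takes over instantly, so that the system fails at X1 + X2. All names and parameter values are ours.

```python
import random


def simulate_setups(lam1, lam2, n=200_000, seed=5):
    """Average lifetime of two devices in series, parallel and stand-by."""
    rng = random.Random(seed)
    series = parallel = standby = 0.0
    for _ in range(n):
        x1 = rng.expovariate(lam1)
        x2 = rng.expovariate(lam2)
        series += min(x1, x2)      # system fails at the first failure
        parallel += max(x1, x2)    # system fails at the last failure
        standby += x1 + x2         # second device starts when the first fails
    return series / n, parallel / n, standby / n


lam1, lam2 = 1.0, 2.0
series, parallel, standby = simulate_setups(lam1, lam2)

# Standard closed forms for the expected lifetimes, for comparison.
e_series = 1.0 / (lam1 + lam2)
e_standby = 1.0 / lam1 + 1.0 / lam2
e_parallel = e_standby - e_series   # because max + min = X1 + X2
print(series, e_series)
print(parallel, e_parallel)
print(standby, e_standby)
```

The identity max(X1, X2) + min(X1, X2) = X1 + X2 explains why the three expected lifetimes are so simply related.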

 

 

 

DEVICES IN SERIES, PARALLEL AND STAND-BY

Minimum of exponentials, devices in series

The problem

Distribution of min(X1, X2)

Expectation of min(X1, X2)

Interactive animation

Maximum of exponentials, devices in parallel

The problem

Distribution of max(X1, X2)

Expectation of max(X1, X2)

One device on line, another one on stand-by

The problem

Identical lifetimes

Distribution

Expected lifetime

Different lifetimes

Distribution

Expected lifetime

___________________________________________________________________________ 

Interactive animation

* X1, X2 and X3 adjustable exponentials.

* Progressive histogram of the distribution of  min(X1, X2, X3 )

TUTORIAL

 _________________________________________________________________________________

The next two Tutorials are dedicated to applications of the strong memoryless property.

__________________________________________________________________________________

 

 

Tutorial 8


In this Tutorial, we revisit the problem of components in parallel. We previously analyzed the case of only two components in parallel. We didn't go any further because, although straightforward in principle, the calculation of the distribution of the max of independent exponential r.v. is intractable in practice.

We now consider any number of independent devices in parallel that are constrained to have identical distributions: they are all ~Exp(λ) for some λ. Even with this simplification, we won't attempt to calculate the distribution of the lifetime of the system, but only concentrate on a more modest goal: calculating the expected lifetime of the system.

The strong memoryless property, in its most general form for n exponential distributions, will lead us to the result. We'll then discover that increasing the number of identical components in parallel is a very inefficient way of increasing the lifetime of a system: the expected lifetime increases only very slowly with the number of components, and all the more slowly when the number of components is already large. This sad fact is often called "the law of diminishing returns", as investing in more components brings about only a vanishingly small increase in expected lifetime.
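
The slow growth is easy to see numerically. The sketch below uses the standard closed form for the expected maximum of n i.i.d. exponentials, (1/λ)(1 + 1/2 + ... + 1/n); it is given here as a known result, not as a quotation from the Tutorial, and the function name is ours.

```python
def expected_parallel_lifetime(n, lam):
    """Expected lifetime of n identical Exp(lam) components in parallel."""
    return sum(1.0 / k for k in range(1, n + 1)) / lam


lam = 1.0   # illustrative parameter: each component lasts 1 on average
for n in (1, 2, 5, 10, 100, 1000):
    print(n, expected_parallel_lifetime(n, lam))

# Gain obtained by adding one more component to a system of 100.
gain = expected_parallel_lifetime(101, lam) - expected_parallel_lifetime(100, lam)
print(gain)
```

Adding the n-th component increases the expected lifetime by only 1/(nλ), which is the diminishing return described above.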

-----

In the process, we'll also calculate the distribution of the difference between two consecutive order statistics of the exponential distribution, a quantity known as a "spacing" in the Theory of Distributions.

-----

We illustrate these two results with an interactive animation.

 

 

 

THE "LAW OF DIMINISHING RETURNS"

SPACINGS OF THE EXPONENTIAL DISTRIBUTION

Expectation of the max Zn of n iid exponentials

The "law of diminishing returns"

Warm standbys

Cold standbys

Distribution of spacings

___________________________________________


 

Interactive animation 

* X is an exponential distribution.
* Sample size adjustable, spacing selectable.

* Progressive histogram of the selected spacing.
* Expectation of rightmost observation.

TUTORIAL

 __________________________________________________

 

 

Tutorial 9

 

 

 

We now spend some time on a  little problem that we dub "Two identical components in parallel versus one component", or "2// vs. 1" for short.
 

The question is :

"What is the probability for A to fail before B ?"

 

 


The problem looks deceptively simple. We propose three solutions :

    * The first solution is short, elegant and requires virtually no calculation. Yet it calls twice on the strong memoryless property, and on two important results established in the preceding Tutorials, so we find it useful to go over it in some detail.

    * The second solution is more straightforward, although it requires some pedestrian calculations. Because it calls on material developed elsewhere on this site, we only outline the solution and leave the actual calculations as an exercise.

    * The third solution is often perceived by beginners as the most intuitive one. Unfortunately, it is wrong. We give the "solution" but leave it as a teaser to find the flaw in the reasoning.

 

 

THE "2// vs. 1" PROBLEM

The problem

First solution

Second solution (Outline only)

Third solution (WRONG !)

 

TUTORIAL

 __________________________________

 

 

Tutorial 10


We now establish the distribution of the r.v. Z, defined as the sum of a random number of i.i.d. exponential variables. The number of variables in the sum is assumed to be geometrically distributed. The problem looks complicated, and the solution does rely on somewhat advanced material, but the final result is very simple. This is fortunate because, although the problem looks rather academic, it accurately describes a realistic situation encountered in Reliability Theory.
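
A simulation gives a feel for the result. The sketch below assumes that the number of terms N is geometric on {1, 2, ...} with success probability p and is independent of the terms, each of which is Exp(λ). Under these assumptions, the standard result is that Z is itself exponential with parameter pλ; the code compares the simulation with that claim. Names and parameter values are ours.

```python
import math
import random


def random_sum(rng, lam, p):
    """One draw of Z = X_1 + ... + X_N, with N geometric on {1, 2, ...}."""
    total = rng.expovariate(lam)      # there is always at least one term
    while rng.random() >= p:          # continue with probability 1 - p
        total += rng.expovariate(lam)
    return total


rng = random.Random(6)
lam, p, n = 2.0, 0.25, 100_000
draws = [random_sum(rng, lam, p) for _ in range(n)]

z_mean = sum(draws) / n
z_tail = sum(z > 1.0 for z in draws) / n
# Compare with an exponential distribution of parameter p * lam.
print(z_mean, 1.0 / (p * lam))
print(z_tail, math.exp(-p * lam * 1.0))
```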

-----

The mean and variance of this distribution were already calculated elsewhere. Recall, however, that we also established expressions for the mean and variance of a random sum of i.i.d. r.v. in a more general context. We now apply these expressions to the problem at hand, just as an exercise.

 

 

DISTRIBUTION OF THE SUM OF A RANDOM NUMBER OF EXPONENTIAL R.V.

Distribution of Z

Example of application

Calculating the mean and the variance of Z by the general method

 

TUTORIAL

  ________________________________________________________________

 

 

Tutorial 11

 

We conclude with two illustrated exercises.

The strong memoryless property states that :

    * If X1 and X2 are two independent exponential r.v.,

    * Then the distribution of X2 - X1 conditional on X2 > X1 is identical to the (unconditional) distribution of X2.

But it says nothing about the respective distributions of X1 and of X2 under the same condition.

The goal of the following two exercises is to calculate these distributions.

-----

The upper frame of the following animation displays two exponential distributions: a red one (that of X1) and a green one (that of X2). You can change the value of the parameter of the distribution of X1 by using your mouse to slide horizontally the small ball at the upper end of the vertical segment marking the mean of the red distribution.

Distribution of X1 conditional on X1 < X2

* By default, the lower frame displays the distribution of X1 conditional on X1 < X2.

* Click on "Go" and observe the build-up of the histogram of this distribution. Note that only draws with X1 < X2 contribute to this histogram, the other draws being ignored (click on "Pause", then several times on "Next").

* The question is : "What is this distribution ?".

    We answer the question (as well as the next one) in the Tutorial below.

 

 

 


 

 

Distribution of X2 conditional on X1 < X2

Click on "Reset", then select "X2".

The lower frame now displays two curves :

    * The green curve is the distribution of X2 conditional on X1 < X2.

    * The blue curve is the distribution of max(X1, X2). It plays no active role in the animation, and is displayed for the sole purpose of convincing you that the distribution we are trying to identify, although similar to that of max(X1, X2), is in general not identical to it.

    * Also, although "Gamma looking", it's not a Gamma distribution.

    * Click on "Go" and observe the build-up of the histogram of the distribution. As before, only those draws resulting in X1 < X2 are retained.

 

What is this distribution ? 

-----

In this Tutorial, we give two solutions to this second exercise :

    * The first one is a direct calculation.

    * The second one calls on the result of the first exercise, and on the strong memoryless property.

 

 

SOLUTIONS OF THE TWO EXERCISES

Distribution of X1 conditional on X1 < X2

Animation

Distribution of X2 conditional on X1 < X2

TUTORIAL

 

_________________________________________________________________ 

Related readings :

Gamma distribution

Geometric distribution

 
