Exponential distribution

# Definition of the exponential distribution

By definition, the density f(x) of the exponential distribution is:

* f(x) = 0   for x < 0

* and for x ≥ 0 :

 f(x) = λe^(-λx)

where λ is a positive parameter (verify that this is indeed a probability density function).

We'll show that the mean µ of the distribution is equal to 1/λ, and the distribution is often written as :

f(x) = (1/µ)·e^(-x/µ)

____________________

The exponential distribution plays a central role in a large class of problems related to the concept of "lifetime". For example, an electronic component might be known to have a lifetime of, say, 10,000 hours. This means that the component is expected to fail after about 10,000 hours of use. But of course, this is an average value, and some components from the same batch will last less than 10,000 hours, while others will last longer. So the lifetime of a component is a random variable.

# Basic properties of the exponential distribution

We'll establish the following basic properties of the exponential distribution :

## Mean

 µ = 1/λ

## Variance

 σ² =  1/λ²
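These two results are easy to check numerically. Below is a minimal Python sketch (our own illustration, not part of the tutorial; the choice λ = 2 and all variable names are ours) that draws a large exponential sample and compares its empirical mean and variance with 1/λ and 1/λ².

```python
import random
import statistics

# Illustrative sketch: Monte Carlo check that the mean of Exp(lam)
# is 1/lam and its variance is 1/lam**2.
random.seed(0)
lam = 2.0                      # rate parameter lambda (arbitrary choice)
n = 200_000
sample = [random.expovariate(lam) for _ in range(n)]

mean = statistics.fmean(sample)     # should be close to 1/lam = 0.5
var = statistics.pvariance(sample)  # should be close to 1/lam**2 = 0.25
print(round(mean, 2), round(var, 2))
```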

You'll find here an interactive animation that illustrates these basic properties of the exponential distribution.

## Moments of all orders

We'll show that the order n moment of the exponential distribution is E[X^n] = n!/λ^n.

## Exponential distribution and Gamma distribution

We establish here that the sum of n iid Exp(λ) distributed random variables is distributed as Gamma(n, 1/λ) (shape n, scale 1/λ).
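As a hedged numerical illustration of this result, the sketch below (ours, with arbitrary parameter values) compares the empirical mean of sums of n iid Exp(λ) draws with that of direct Gamma draws; note that Python's `random.gammavariate` takes the shape first and the scale second.

```python
import random
import statistics

# Sketch: the sum of n_terms iid Exp(lam) variables should behave like
# a Gamma(n_terms, scale=1/lam) variable; both have mean n_terms/lam.
random.seed(1)
lam, n_terms, n_rep = 1.5, 5, 100_000

sums = [sum(random.expovariate(lam) for _ in range(n_terms)) for _ in range(n_rep)]
gammas = [random.gammavariate(n_terms, 1.0 / lam) for _ in range(n_rep)]

mean_sums = statistics.fmean(sums)      # close to n_terms/lam
mean_gammas = statistics.fmean(gammas)  # close to n_terms/lam
print(round(mean_sums, 2), round(mean_gammas, 2))
```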

## Random sum of exponential random variables

We address here the important question of the distribution of the sum of a random number of iid exponential variables when the number of variables in the sum is geometrically distributed.

# The exponential distribution is "memoryless"

## The "weak" memoryless property

Some lifetime problems exhibit a very important property called the "memoryless property". For example, consider a very large number of identical radioactive atoms, and observe their decay.

* During the first t seconds, a proportion p of these atoms disintegrate.

* Now leave the lab, and come back any time later. Consider the atoms that have not disintegrated yet, and observe the radioactive process for another t seconds. During this time, the proportion of these atoms that will disintegrate is also p.

Now translate these observations into terms of probabilities for individual atoms.

* We first state that the probability for an atom to disintegrate in the first t seconds is p.

* We return to the lab after s seconds. Then the probability for a (still not disintegrated) atom to disintegrate within the next t seconds is also p.

So the probability that an s second old atom will last another t seconds is the same as the probability for a "new" atom to last t seconds. It is as if the atom had absolutely no memory of how long it has been around. For radioactive atoms, there is no aging. As long as such an atom is alive, it remains absolutely identical to itself. And then, without any warning, it decays.

Physicists have long given up the hope of finding a more deterministic model of radioactivity.

The memoryless property is expressed mathematically by writing that the two following probabilities are equal :

* The probability that a new atom will live for at least t seconds.

* The probability for an atom that is still alive at time s to live at least an extra t seconds, that is, to be still alive at time s + t.

 P{X > t}= P{X > s + t | X > s}
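This equality is easy to probe by simulation. The sketch below (our own illustration; the values of λ, s and t are arbitrary) estimates both probabilities from the same exponential sample.

```python
import random

# Sketch: empirical check of P{X > t} = P{X > s + t | X > s} for X ~ Exp(lam).
random.seed(2)
lam, s, t, n = 1.0, 0.7, 1.2, 500_000
draws = [random.expovariate(lam) for _ in range(n)]

p_t = sum(x > t for x in draws) / n
survivors = [x for x in draws if x > s]                      # atoms alive at time s
p_cond = sum(x > s + t for x in survivors) / len(survivors)

# Both estimates should be close to exp(-lam * t) ~ 0.301.
print(round(p_t, 3), round(p_cond, 3))
```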

We show here that the exponential distribution has the memoryless property. This means that if a device's lifetime is exponentially distributed, then it will fail exactly the same way as a radioactive atom does.

We also show that the memoryless property is unique to the exponential distribution. More specifically, we show that the exponential distribution is the only continuous distribution with the memoryless property, which explains why it is so important in practice.

In the discrete domain, the geometric distribution also has the memoryless property, and is the only one to have this property.

• You'll find here a first interactive animation illustrating the memoryless property.

-----

We'll also show that, as a consequence of the memoryless property of the exponential distribution, the lifetime distribution of those atoms that survived up to time s is, from this point on, identical to their original lifetime distribution. In other words, the new lifetime distribution of these "survivors" is just their original lifetime distribution right-shifted by a quantity s.

• We illustrate this property here with an interactive animation, which also demonstrates a similar property related to the so-called "strong memoryless property" (see below).

## The strong memoryless property

The foregoing property can be generalized as follows. The "weak" memoryless property refers to survival after a reference date s that is arbitrary but fixed. We show here that the property is still true if the reference date s is not fixed, but instead is itself an exponentially distributed random variable. More precisely, we'll show that if :

• X1 is Exp(λ1),
• X2 is Exp(λ2),
• X1 and X2 are independent,

then whenever X2 > X1, the distribution of the excess lifetime of X2 over that of X1 does not depend on the value of X1.

This translates into the expression :

 P{X2 > t } = P{X2 > X1 + t | X2 > X1}

which is identical to the expression describing the weak memoryless property, except that the fixed date s is now replaced by the random variable X1.

This property is called the strong memoryless property of the exponential distribution.

-----

We'll also show that the distribution of the difference X2 - X1 conditionally to X2 > X1 is identical to that of X2, and we'll illustrate this result by an interactive animation.
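This identity of distributions can be probed numerically. The sketch below (our illustration; the rates λ1 = 1 and λ2 = 2 are arbitrary) retains only the draws with X2 > X1 and checks that the conditional difference has the mean 1/λ2 of X2.

```python
import random
import statistics

# Sketch: with X1 ~ Exp(l1) and X2 ~ Exp(l2) independent, the difference
# X2 - X1, conditionally on X2 > X1, should be distributed as X2 itself.
random.seed(3)
l1, l2, n = 1.0, 2.0, 400_000

diffs = []
for _ in range(n):
    x1, x2 = random.expovariate(l1), random.expovariate(l2)
    if x2 > x1:                      # keep only the draws with X2 > X1
        diffs.append(x2 - x1)

mean_diff = statistics.fmean(diffs)  # should be close to 1/l2 = 0.5
print(round(mean_diff, 2))
```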

-----

In practical terms, the strong memoryless property describes the following situation. Let M1 and M2 be two machines whose lifetimes are both exponentially distributed and independent. Then, after one machine fails, the survival time of the other machine beyond this failure has the same distribution as the original lifetime of the machine.

## The memoryless property is not universal

Of course, not all lifetime problems can be expressed in terms of the memoryless property. For example, all living organisms go through the aging process, and although their lifetimes are more or less random variables, we should not expect their probability density functions to be the same as that of radioactive atoms. Still, many complex devices have lifetimes that behave in a nearly memoryless fashion, which explains why the exponential distribution is so important in many engineering applications.

# Hazard rate, failure rate

If a machine is still operational at time t, what is the (infinitely small) probability dP that it will fail during the next dt time interval ? This probability is :

• Proportional to dt,
• And depends a priori on t.

So, whatever the nature of the distribution of the lifetime of the machine (exponential or not), we have :

dP = h(t)·dt

The function h(t) is called the hazard rate function, or "failure rate function" of the distribution.

We describe below the properties of the hazard rate function, and show that the hazard rate function of the exponential distribution is constant (does not depend on t). In fact, this is a characteristic property of the exponential distribution, and is therefore equivalent to the (weak) memoryless property.
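Since h(t) = f(t) / P{X > t}, the constancy in the exponential case can be verified directly. A minimal sketch (our illustration; λ = 0.8 is arbitrary):

```python
import math

# Sketch: for Exp(lam), f(t) = lam*exp(-lam*t) and P{X > t} = exp(-lam*t),
# so the hazard rate h(t) = f(t) / P{X > t} is constant, equal to lam.
lam = 0.8

def hazard(t: float) -> float:
    density = lam * math.exp(-lam * t)
    survival = math.exp(-lam * t)
    return density / survival

print([round(hazard(t), 6) for t in (0.1, 1.0, 5.0, 20.0)])
```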

# Estimation of the parameter λ

We identify here a sufficient statistic for the parameter λ of the exponential distribution, then show that this statistic is minimal sufficient, and finally that it is complete.

From this result, we'll deduce a Minimum Variance Unbiased Estimator of λ, which will be shown not to be efficient (its variance is larger than the Cramér-Rao lower bound).

# Exponential family

We show here that the exponential distribution belongs to the exponential family. From this result, we'll deduce that the sample mean is an efficient estimator of the mean µ.

# Exponential distribution and Record Values

The exponential distribution plays a central role in studying record values, because it may well be the only distribution for which the distributions of the record values can be calculated directly.

The Probability Integral Transformation then allows linking the distributions of the record values of any other absolutely continuous distribution (i.e. a distribution with a probability density) to those of the exponential distribution.
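A hedged sketch of that transformation: if X has continuous cdf F, then −log(1 − F(X)) is Exp(1). Below we take X uniform on (0, 1), so F(x) = x; the choice of the uniform is ours, purely for illustration.

```python
import math
import random
import statistics

# Sketch of the Probability Integral Transformation: if X has continuous
# cdf F, then -log(1 - F(X)) is Exp(1). Here X is uniform on (0, 1),
# so F(x) = x and the transform reduces to -log(1 - X).
random.seed(4)
n = 300_000
transformed = [-math.log(1.0 - random.random()) for _ in range(n)]

mean_t = statistics.fmean(transformed)     # Exp(1) has mean 1
var_t = statistics.pvariance(transformed)  # and variance 1
print(round(mean_t, 2), round(var_t, 2))
```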

______________________________________________________

 Tutorial 1

In this first Tutorial, we describe the elementary properties of the exponential distribution.

We also clarify the relationship between the exponential and the geometric distributions by showing that an exponential r.v. may be considered as the limit of a sequence of geometric r.v. (the reverse relation is addressed here).

BASIC PROPERTIES OF THE EXPONENTIAL DISTRIBUTION

Probability density function

Cumulative distribution function

Quantile function Q(p)

General

Median

Exponential r.v. as the limit of a sequence of geometric r.v.

Moment generating function

Moments

Mean

Direct calculation

Moment generating function

Variance

Direct calculation

Moment generating function

All moments

_________________________________________________

Interactive animation :

• Adjustable exponential.
• Mean, mode, standard deviation.

TUTORIAL

_________________________________________________________________

 Tutorial 2

We then calculate the Maximum Likelihood estimator of the parameter λ of the exponential distribution. It will turn out to be easier to find the Maximum Likelihood estimator of µ = 1/λ, that is, of the mean of the distribution. We also identify the distribution of this estimator, which is closely related to the Gamma distribution.

We illustrate this result with a two-fold interactive animation :

• Manual tuning of an adjustable candidate exponential until one identifies visually the Maximum Likelihood version.
• Building the histogram of the distribution of this estimator, thus verifying visually that it is indeed unbiased.
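The unbiasedness illustrated by the second animation can also be sketched numerically (our illustration; µ = 2 and the sample size are arbitrary): the ML estimator of µ is the sample mean, and averaging many independent estimates recovers µ.

```python
import random
import statistics

# Sketch: the ML estimator of the mean mu of an Exp(1/mu) distribution is
# the sample mean; averaging many independent estimates shows it is unbiased.
random.seed(5)
mu, sample_size, n_rep = 2.0, 10, 50_000

estimates = [
    statistics.fmean(random.expovariate(1.0 / mu) for _ in range(sample_size))
    for _ in range(n_rep)
]

avg_estimate = statistics.fmean(estimates)  # should be close to mu = 2.0
print(round(avg_estimate, 2))
```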

MAXIMUM LIKELIHOOD ESTIMATION OF THE

MEAN OF THE EXPONENTIAL DISTRIBUTION

Maximum Likelihood estimation of the mean of the exponential distribution

The Likelihood

The Log-Likelihood

Maximum of the Log-Likelihood

The Maximum Likelihood estimator

Distribution of the estimator

Expectation of the estimator

_________________________________________________

Interactive animation :

• Manual tuning of an exponential to the ML.
• Progressive histogram of the ML estimator.

TUTORIAL

__________________________________________________________________

 Tutorial 3

In this Tutorial, we demonstrate that the exponential distribution is memoryless, and that it is the only continuous distribution with this property. This fundamental result is illustrated by an interactive animation.

THE MEMORYLESS PROPERTY

The exponential distribution is memoryless

The exponential distribution is the only continuous memoryless distribution

Distribution of the observations whose values are larger than a threshold value s

____________________________________________________________

Interactive animation :

• Adjustable threshold s and extra life t.
• Progressive estimation of P{X > t} and P{X > s + t | X > s}.
• Both are equal to exp(-λt).
• See also next animation below...

TUTORIAL

_______________________________________________________________________

 Tutorial 4

In the next Tutorial, we demonstrate the strong memoryless property of the exponential distribution as described here. This demonstration is more difficult than that of the weak property, but it exploits useful probabilistic techniques, and we made it detailed enough to be accessible with only a basic background in probability theory.

The strong memoryless property generalizes to n independent exponential distributions. We give this generalized result without demonstration.

-----

We illustrate this important result with a twin interactive animation :

• We first return to the weak memoryless property, and build the histogram of the distribution of the values of an exponential r.v. that are larger than an arbitrary (but adjustable) threshold s. This animation complements that of the foregoing Tutorial, which described the weak memoryless property in terms of probabilities (and not of distributions).
• We then illustrate the distribution of the difference X2 - X1 when X2 > X1, a direct consequence of the strong memoryless property of the exponential distribution.

STRONG MEMORYLESS PROPERTY

OF THE EXPONENTIAL DISTRIBUTION

The strong memoryless property

Theory

Consequences

Strong memoryless property

Independence with respect to X1

Distribution

Generalization to n exponential distributions      (No demonstration)

___________________________________________

Interactive animation :

• Memoryless property (weak) : progressive histogram of X for X > s.
• Strong memoryless property : progressive histogram of X2 - X1 for X2 > X1.

TUTORIAL

_________________________________________________________

 Tutorial 5

In this Tutorial, we define and determine the properties of the Hazard Rate Function (HRF), also known as the "Failure Rate Function", a fundamental quantity in survival analysis. We show that it characterizes a distribution (just as the moment generating function does).

The HRF of the exponential distribution is constant, a circumstance that is therefore equivalent to the memoryless property. Real devices (not to mention people) do wear out, and a constant HRF is then an unrealistic assumption. Various assumptions about the time evolution of the HRF lead to lifetime probability distributions other than the exponential, which are only briefly touched upon here.

THE HAZARD RATE FUNCTION

Hazard rate function

The general case

Special case : the exponential distribution

The hazard rate function uniquely determines a distribution

The general case

Increasingly varying hazard rates

Constant rate : the exponential distribution

Rate increasing linearly with time : the Rayleigh distribution

Rate increasing as a power of time : the Weibull distribution

Rate increasing exponentially with time : the Gompertz distribution

TUTORIAL

_________________________________________________________

 Tutorial 6

In the next Tutorial, we consider a set of independent devices, whose lifetimes are all exponentially distributed. At time 0, all the devices are operational. But sooner or later, one device will fail. We calculate the probability for any of these devices to be the first one to fail.

We'll also use this result in a different context when we study the superposition of Poisson processes.
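The result in question is that device i, with lifetime Exp(λi), is the first to fail with probability λi / (λ1 + ... + λn). A minimal simulation sketch (our illustration; the three rates are arbitrary):

```python
import random

# Sketch: with independent lifetimes Xi ~ Exp(rates[i]), device i is the
# first to fail with probability rates[i] / sum(rates).
random.seed(6)
rates = [0.5, 1.0, 1.5]          # arbitrary rate parameters
n = 300_000

wins = [0, 0, 0]
for _ in range(n):
    draws = [random.expovariate(l) for l in rates]
    wins[draws.index(min(draws))] += 1

freqs = [w / n for w in wins]    # should be close to [1/6, 1/3, 1/2]
print([round(f, 2) for f in freqs])
```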

FIRST DEVICE TO FAIL

Probability for any device to be the first one to fail

Theory

__________________________________________

Interactive animation :

• Three adjustable exponentials.
• Progressive estimation of the probability for each one to be the min.

TUTORIAL

______________________________________________________

 Tutorial 7

We now examine three classical ways to assemble devices or components into one system :

• Devices in series (simple),
• Devices in parallel (a bit more tricky),
• One device on line, replacement devices on stand-by.

and we calculate the lifetime distribution and expected lifetime of each of these three set-ups.
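For the special case of two iid Exp(λ) components (a simplifying assumption of ours; the Tutorial also treats non-identical rates), the three set-ups have expected lifetimes 1/(2λ), 3/(2λ) and 2/λ respectively, which the sketch below estimates by simulation.

```python
import random
import statistics

# Sketch: with two iid Exp(lam) components, the system lifetimes are
#   series   -> min(X1, X2), expected lifetime 1/(2*lam)
#   parallel -> max(X1, X2), expected lifetime (1 + 1/2)/lam
#   stand-by -> X1 + X2,     expected lifetime 2/lam
random.seed(7)
lam, n = 1.0, 200_000

series, parallel, standby = [], [], []
for _ in range(n):
    x1, x2 = random.expovariate(lam), random.expovariate(lam)
    series.append(min(x1, x2))
    parallel.append(max(x1, x2))
    standby.append(x1 + x2)

m_series = statistics.fmean(series)      # close to 0.5
m_parallel = statistics.fmean(parallel)  # close to 1.5
m_standby = statistics.fmean(standby)    # close to 2.0
print(round(m_series, 2), round(m_parallel, 2), round(m_standby, 2))
```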

DEVICES IN SERIES, PARALLEL AND STAND-BY

Minimum of exponentials, devices in series

The problem

Distribution of min(X1, X2)

Expectation of min(X1, X2)

Interactive animation

Maximum of exponentials, devices in parallel

The problem

Distribution of max(X1, X2)

Expectation of max(X1, X2)

One device on line, another one on stand-by

The problem

Distribution

Distribution

___________________________________________________________________________

Interactive animation :

• X1, X2 and X3 adjustable exponentials.
• Progressive histogram of the distribution of min(X1, X2, X3).

TUTORIAL

_________________________________________________________________________________

The next two Tutorials are dedicated to applications of the strong memoryless property.

__________________________________________________________________________________

 Tutorial 8

In this Tutorial, we revisit the problem of components in parallel. We previously analyzed only the setting of two components in parallel, and didn't go any further because, although straightforward in principle, the calculation of the distribution of the max of independent exponential r.v. is intractable in practice.

We now consider any number of independent devices in parallel, but that are constrained to have identical distributions : they are all Exp(λ) for some λ. Even with this simplification, we won't attempt to calculate the distribution of the lifetime of the system, but only concentrate on a more modest goal : calculating the expected lifetime of the system.

The strong memoryless property, in its most general form relative to n exponential distributions, will lead us to the result. We'll then discover that increasing the number of identical components in parallel is a very inefficient way of increasing the lifetime of a system, as the expected lifetime increases only very slowly with the number of components, all the more so when the number of components is already large. This sad fact is often called "the law of diminishing returns", as investing in more components brings about only a vanishingly small increase in expected lifetime.
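The expected lifetime in question is (1/λ)(1 + 1/2 + ... + 1/n), the harmonic sum, which grows only logarithmically with n. Tabulating it makes the diminishing returns visible (a sketch under the stated iid Exp(λ) assumption; λ = 1 is our arbitrary choice):

```python
# Sketch: the expected lifetime of n identical Exp(lam) components in
# parallel is (1/lam) * (1 + 1/2 + ... + 1/n), which grows only
# logarithmically in n -- the "law of diminishing returns".
lam = 1.0

def expected_parallel_lifetime(n: int) -> float:
    return sum(1.0 / k for k in range(1, n + 1)) / lam

for n in (1, 2, 4, 8, 16):
    print(n, round(expected_parallel_lifetime(n), 3))
```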

-----

In the process, we'll also calculate the distribution of the difference between two consecutive order statistics of the exponential distribution, a quantity known as a "spacing" in the Theory of Distributions.

-----

We illustrate these two results with an interactive animation.

THE "LAW OF DIMINISHING RETURNS"

SPACINGS OF THE EXPONENTIAL DISTRIBUTION

Expectation of the max Zn of n iid exponentials

The "law of diminishing returns"

Warm standbys

Cold standbys

Distribution of spacings

___________________________________________

Interactive animation :

• X is an exponential distribution.
• Sample size adjustable, spacing selectable.
• Progressive histogram of the selected spacing.
• Expectation of rightmost observation.

TUTORIAL

__________________________________________________

 Tutorial 9

We now spend some time on a little problem that we dub "Two identical components in parallel versus one component", or "2// vs. 1" for short.

The question is :

"What is the probability for A to fail before B ?"

The problem looks deceptively simple. We propose three solutions :

* The first solution is short, elegant and requires virtually no calculation. Yet it calls twice on the strong memoryless property, and on two important results established in the preceding Tutorials, so we find it useful to go over it in some detail.

* The second solution is more straightforward, although it requires some pedestrian calculations. Because it calls on material developed elsewhere on this site, we only outline the solution and leave the actual calculations as an exercise.

* The third solution is often perceived by beginners as the most intuitive one. Unfortunately, it is wrong. We give the "solution" but leave it as a teaser to find the flaw in the reasoning.

THE "2// vs. 1" PROBLEM

The problem

First solution

Second solution (Outline only)

Third solution (WRONG !)

TUTORIAL

__________________________________

 Tutorial 10

We now establish the distribution of the r.v. Z, defined as the sum of a random number of iid exponential variables, where the number of variables in the sum is geometrically distributed. The problem looks complicated, and the solution indeed relies on somewhat advanced material, but the final result is very simple. This is fortunate because, although the problem looks rather academic, it accurately describes a realistic situation encountered in Reliability Theory.

1) We'll also use this result in a different context when we study the splitting of a Poisson process.
2) The distribution of the sum of a fixed number of iid exponential rvs is addressed here.
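The very simple result alluded to above is that a Geometric(p) sum of iid Exp(λ) variables is again exponential, with rate pλ. A simulation sketch (our illustration; the parameter values are arbitrary, and the geometric count has support 1, 2, ...):

```python
import random
import statistics

# Sketch: if N is geometric with success probability p (support 1, 2, ...)
# and the Xi are iid Exp(lam), then Z = X1 + ... + XN is Exp(p*lam),
# whose mean is 1/(p*lam).
random.seed(8)
p, lam, n_rep = 0.25, 2.0, 200_000

def geometric(prob: float) -> int:
    n = 1
    while random.random() >= prob:   # count trials until first success
        n += 1
    return n

sums = [sum(random.expovariate(lam) for _ in range(geometric(p)))
        for _ in range(n_rep)]

avg_sum = statistics.fmean(sums)     # should be close to 1/(p*lam) = 2.0
print(round(avg_sum, 2))
```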

-----

The mean and variance of this distribution were already calculated elsewhere, but recall that we also established expressions for the mean and variance of a random sum of iid r.v. in a more general context. As an exercise, we now apply these expressions to the problem at hand.

DISTRIBUTION OF THE SUM OF A

RANDOM NUMBER OF EXPONENTIAL R.V.

Distribution of Z

Example of application

Calculating the mean and the variance of Z by the general method

TUTORIAL

________________________________________________________________

 Tutorial 11

We conclude with two illustrated exercises.

The strong memoryless property states that :

* If X1 and X2 are two independent exponential r.v.,

* Then the distribution of X2 - X1 conditionally to X2 > X1 is identical to the (unconditional) distribution of X2.

But it says nothing about the respective distributions of X1 and of X2 under the same condition.

The goal of the following two exercises is to calculate these distributions.

-----

The upper frame of the following animation displays two exponential distributions : a red one (that of X1) and a green one (that of X2). You can change the value of the parameter of the distribution of X1 by sliding horizontally, with your mouse, the small ball at the upper end of the vertical segment marking the mean of the red distribution.

# Distribution of X1 conditionally to X1 < X2

* By default, the lower frame displays the distribution of X1 conditionally to X1 < X2.

* Click on "Go" and observe the build-up of the histogram of this distribution. Note that only draws with X1 < X2 contribute to this histogram, the other draws being ignored (click on "Pause", then several times on "Next").

* The question is : "What is this distribution ?".

We answer the question (as well as the next one) in the Tutorial below.

 The "Book of Animations" on your computer

# Distribution of X2 conditionally to X1 < X2

Click on "Reset", then select "X2".

The lower frame now displays two curves :

* The green curve is the distribution of X2 conditionally to X1 < X2.

* The blue curve is the distribution of max(X1, X2). It plays no active role in the animation, and is displayed for the sole purpose of convincing you that the distribution we are trying to identify, although similar to that of max(X1, X2), is in general not identical to it.

* Also, although "Gamma looking", it's not a Gamma distribution.

* Click on "Go" and observe the build-up of the histogram of the distribution. As before, only those draws resulting in X1 < X2 are retained.

What is this distribution ?

-----

In this Tutorial, we give two solutions to this second exercise :

* The first one is a direct calculation.

* The second one calls on the result of the first exercise, and on the strong memoryless property.

SOLUTIONS OF THE TWO EXERCISES

Distribution of X1 conditionally to X1 < X2

Animation

Distribution of X2 conditionally to X1 < X2

TUTORIAL

_________________________________________________________________

Gamma distribution

Geometric distribution

Poisson processes

Record values

Truncated exponential

Shifted exponential