Interval estimation

Point estimation

If you're not familiar with estimation, we suggest that you first read the entry on "Point estimation".

Confidence interval

Given an n-sample drawn from a distribution f(x, θ), a point estimator θ* delivers a single number, the point estimate of the parameter θ. If this estimator has the properties expected from a good estimator, this estimate is our best bet about the true value θ0 of θ. Yet θ* is a raw number that carries no information about how close it may be to the true value θ0. All we know is that, in vague terms, θ* is probably not very far from θ0, but we are unable to quantify the terms "probably" or "far". For example, the sample mean has all the virtues you would expect from a good estimator of the distribution mean µ. Yet, given the value of the sample mean, there is no way to evaluate how far it is from µ.
What we would like is a procedure that gives us a feel, in probabilistic terms, of where the true value θ0 actually lies. This is the objective pursued by interval estimation.

Confidence interval and confidence level

More precisely, given a sample, we would like to identify an interval of the line that bears the parameter:

* Whose left end point L(x) and right end point R(x) are determined by the sample only (and are therefore random variables, and more precisely, statistics),

* And such that we can state with certainty:

"The probability for this interval to cover the true value θ0 is equal to P."

where P is an arbitrary probability chosen by the analyst (for example, .90, or .95).

This deserves some explanation. Suppose we know how to build the interval. We then imagine that a large number of n-samples are drawn from the distribution. For each new sample, a new interval is constructed with our procedure. This (random) interval will then cover θ0 (the true and unknown value of the parameter) 100.P% of the time (upper and lower images of this illustration):

Then, for any given sample, we'll have identified a limited (random) region that covers the true value θ0 with probability P. If P is large, we'll have then found a limited region that is very likely to cover θ0.
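This coverage interpretation can be checked by simulation. The sketch below (a minimal illustration, assuming numpy is available, with arbitrarily chosen distribution parameters) repeatedly draws n-samples from a normal distribution with known variance, builds the standard interval m ± z·σ/√n for the mean (this interval is derived later in this entry), and counts how often it covers the true mean:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 5.0, 2.0, 30        # true mean, known std dev, sample size
z = 1.96                           # z_{alpha/2} for a confidence level of 0.95
half_width = z * sigma / np.sqrt(n)

trials = 10_000
samples = rng.normal(mu, sigma, size=(trials, n))
means = samples.mean(axis=1)       # one sample mean per simulated sample
covered = np.abs(means - mu) <= half_width
print(covered.mean())              # close to 0.95
```

The observed coverage frequency settles near the nominal confidence level as the number of trials grows, which is exactly the frequentist reading of "the interval covers θ0 with probability P".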

Note that we resisted the temptation to say "The probability for θ0 to be inside the interval is P", for θ0 is unknown but is not a random variable. The interval, though, is random.

An interval as described above, when it can be identified, is called a confidence interval, and the probability P is called the confidence level of the confidence interval.

Note that building a confidence interval does not require that a point estimate of the parameter be calculated first.

Formal definition of a confidence interval

We mentioned that a confidence interval (or interval estimate) is a random interval defined by the sample only. This means that there exist two functions:

* L(x) and

* R(x)

such that L(x) is the abscissa of the left end of the interval, and R(x) the abscissa of its right end.

So, just as the most general definition of a point estimator is "A function of the sample (a statistic)", the most general definition of a confidence interval is:

A pair [L(x), R(x)] of functions of the sample such that for all x, one has L(x) < R(x).

In point estimation, whether a particular statistic is useful for estimating a parameter depends on the properties of the statistic (consistency, unbiasedness, etc.). In interval estimation, whether a particular pair of functions as defined above is of any interest for the purpose of interval estimating a parameter depends:

* On the possibility of assigning a covering probability relative to θ0 (i.e., a confidence level) for any possible sample x,

* And on the fact that the intervals thus defined will be reasonably short for a given value of the confidence level 1 - α (see below).

Generalization of the concept of "Confidence level"

The above definition of a confidence level implicitly assumes that the probability for the interval to cover θ0 does not depend on the value of θ0. If this probability depends on θ0, then the very concept of a "confidence level" becomes meaningless, because θ0 is unknown.

Unfortunately, it often happens that the covering probabilities Pθ depend on θ0 and therefore cannot be numerically calculated. Yet the situation may still be partially salvaged if it is possible to calculate the smallest value P that Pθ can take over the range of θ:

P = inf {Pθ : θ over the range of θ}

Given a sample, it is then possible to construct an interval and assert that the probability for this interval to cover θ0 is at least P.

Confidence interval on the mean of a normal distribution (variance is known)

Let us now give what may be the simplest example of a confidence interval.

Let N(µ, σ²) be a normal distribution (red curve in the illustration below) from which an n-sample is drawn. The variance σ² is supposed to be known, but not the mean µ. We use the sample mean m as an estimate of the distribution mean µ. It is distributed as N(µ, σ²/n) (blue curve).

The distribution of m can be turned into a standard normal distribution (mean 0, unit variance) by the simple transformation T:

m' = T(m) = (m - µ) / (σ/√n)

We then have m' ~ N(0, 1) (black curve).

Now position two cut-off points L' and R' on this standard normal distribution in such a way that the area under each wing of the standard gaussian is, say, 0.025 (red areas). These points are changed into L and R by the inverse of the transformation T. The transform m' of the mean has a 0.95 probability of falling between L' and R', and m therefore has a 0.95 probability of falling between L and R.

More generally, if we denote the probability P (arbitrarily chosen by the analyst) by 1 - α, then:

P( -zα/2 < (m - µ) / (σ/√n) < zα/2 ) = 1 - α

where zα/2 denotes the number such that the area under the standard gaussian curve to the right of zα/2 is α/2.

The inverse of transformation T gives:

P( µ - zα/2.σ.n-1/2 < m < µ + zα/2.σ.n-1/2 ) = 1 - α

This expression is equivalent to:

P( m - zα/2.σ.n-1/2 < µ < m + zα/2.σ.n-1/2 ) = 1 - α

-----
These two expressions have different interpretations :

* The first expression is the probability for the random variable m to be inside a fixed interval centered on µ and of length 2(zα/2.σ.n-1/2).

* The second expression is the probability for a certain random interval centered on m, of the same fixed length 2(zα/2.σ.n-1/2), to cover µ.

So, indeed, we have found a (random) interval:

* Whose end points are completely determined by the sample,

* And that covers the distribution mean µ with a given, arbitrary probability P = 1 - α,

which is exactly what we were looking for.

This interval is called the confidence interval for µ (green interval in the illustration above) at the confidence level 1 - α.
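This construction takes only a few lines of code. The function below is a minimal sketch (numpy and scipy assumed available; the function name and the simulated sample are illustrative only) that returns the interval m ± zα/2·σ/√n:

```python
import numpy as np
from scipy.stats import norm

def normal_mean_ci(sample, sigma, level=0.95):
    """Confidence interval for the mean of a normal distribution
    whose standard deviation sigma is known."""
    m = np.mean(sample)
    z = norm.ppf(1 - (1 - level) / 2)        # z_{alpha/2}
    delta = z * sigma / np.sqrt(len(sample)) # half-length of the interval
    return m - delta, m + delta

rng = np.random.default_rng(1)
sample = rng.normal(10.0, 3.0, 50)           # true mean 10, known sigma 3
lo, hi = normal_mean_ci(sample, sigma=3.0)
print(lo, hi)
```

Note that the half-length delta depends only on sigma, the sample size and the confidence level, not on the observed values; only the center m is random.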

Note that 2(zα/2.σ.n-1/2), the length of the confidence interval, is proportional to the standard deviation of the original normal distribution, and therefore to the standard deviation of m, the point estimator of µ. So it can be said that:

* Point estimation tells us about the central tendency of the distribution of the estimator of the parameter,

* Interval estimation tells us about both the central tendency and the spread of the distribution of the estimator of the parameter.

-----

Note that in this particularly simple example, the length of the confidence interval (for a given confidence level) depends only on the sample size, and not on the details of the sample. It is therefore possible, for a given confidence level, to calculate the number of observations to be collected in order for the confidence intervals to have a given length. Conversely, if the number of observations is determined by practical considerations (e.g. time, budget), then the length of the confidence intervals is entirely determined by the chosen confidence level.
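The sample-size calculation just mentioned can be made explicit: requiring 2·zα/2·σ/√n ≤ length and solving for n gives n ≥ (2·zα/2·σ/length)². A small helper (hypothetical name, scipy assumed) computes the smallest such n:

```python
import math
from scipy.stats import norm

def sample_size_for_length(sigma, length, level=0.95):
    """Smallest n such that the CI for a normal mean (known sigma)
    has at most the requested length at the given confidence level."""
    z = norm.ppf(1 - (1 - level) / 2)        # z_{alpha/2}
    return math.ceil((2 * z * sigma / length) ** 2)

print(sample_size_for_length(sigma=2.0, length=1.0, level=0.95))  # → 62
```

Halving the requested length quadruples the required number of observations, a direct consequence of the 1/√n factor.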

Pivot

In the above example, what made the discovery of a confidence interval easy was the fact that both end points of the interval clearly have distributions that do not depend on the value of the estimated parameter µ, because the distribution of the quantity

(m - µ) / (σ/√n)

does not depend on µ.

Quite generally, a quantity whose distribution does not depend on the value of any of the parameters of the original distribution is called a pivotal quantity, or a pivot. Note that a pivot is not a statistic because its definition involves not just the observations in the sample, but also the true value θ0 (here, µ) of the parameter.

Any pivotal quantity:

* Involving only one parameter, and

* Whose distribution is known,

can be used for building confidence intervals for this parameter. Any pair of numbers L' and R' defines two "rejection" regions (the red regions in the above example) under the probability distribution curve of the pivot. The sum of the areas of these two regions is a number α, and the same argument as developed above shows that the back-transforms of L' and R', call them L and R, define a confidence interval with confidence level 1 - α.

Length, confidence level and sample size

When it is possible to assign a covering probability (confidence level) to an interval, the three quantities:

1) Confidence level,

2) Length of the confidence interval,

3) Sample size,

are intimately related.

Interval length and confidence level

We mentioned that the value of the confidence level is generally defined arbitrarily. For a given sample, if a larger confidence level is chosen, then the confidence interval will become longer. This is because increasing the probability for the interval to cover the true value of the parameter can be obtained only by making the interval longer, and consequently by increasing the uncertainty about the whereabouts of this value.

Conversely, the confidence interval can be made shorter only by settling for a lower confidence level: the shorter the interval, the less likely it is to cover the true value of the parameter (lower image of this illustration):

If the confidence level is made to tend to 0, so does the length of the confidence interval. Its right and left end points then converge to a common limit known as the (point) Hodges-Lehmann estimator of the parameter θ.

Length and sample size

Although the principle of interval estimation does not demand that a point estimation be made first, many classical confidence intervals are obtained by covering a point estimate θ* by an interval whose middle point is θ*. The confidence interval then has the form

θ* ± Δ

and the length of the interval is 2Δ.

Δ depends on the confidence level (see above), but also on the sample size for a given confidence level. More precisely, Δ decreases (on the average) when the sample size increases. This could certainly be expected : as more and more observations are collected, more and more information about the distribution is also collected, and the values of its parameters can be determined with greater and greater accuracy, leading to shorter and shorter confidence intervals.
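For the normal-mean interval, the decrease of Δ with the sample size is exactly 1/√n, and can be tabulated directly (a small sketch with arbitrarily chosen σ and confidence level, scipy assumed):

```python
from math import sqrt
from scipy.stats import norm

sigma, level = 2.0, 0.95
z = norm.ppf(1 - (1 - level) / 2)                  # z_{alpha/2}
deltas = {n: z * sigma / sqrt(n) for n in (10, 40, 160, 640)}
for n, delta in deltas.items():
    print(n, round(delta, 3))                      # quadrupling n halves delta
```

Each fourfold increase of the sample size halves the half-length Δ, so accuracy is bought at an ever-increasing price in observations.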

Confidence level and sample size

By the same token, if the confidence intervals are constrained to always have the same length, the confidence levels tend to increase as larger and larger samples are considered. It is then sometimes possible to calculate how many observations have to be collected in order to obtain both a given confidence level and a given length of the confidence intervals.

Shortest confidence interval

Given a sample and an arbitrary confidence level 1 - α, there is usually an infinity of confidence intervals (i.e. pairs [L(x), R(x)]) at this given level of confidence. In the above example (normal mean), we chose the confidence interval to be symmetric about m, but any interval such that the sum of the red areas to the left and to the right of the gaussian curve is equal to α is a bona fide confidence interval at the 1 - α confidence level (lower image of this illustration):

Of all these candidate confidence intervals, which one should we choose?

In the above example, we would like the mean µ to be as narrowly localized as possible, meaning that we want the confidence interval to be as short as possible for a given confidence level.

As it happens, the symmetric interval identified above is in this particular case indeed also the shortest for a given confidence level.

Indeed, it can be shown that whenever the distribution f(x) of θ* is :

* Unimodal,

* And such that, for a given α:

f(-zα/2 ) = f(zα/2)

then the symmetric interval [-zα/2, zα/2] is always the shortest confidence interval.
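This can be checked numerically for the standard normal pivot: among all splits of the total tail mass α between the two tails, the symmetric split α/2 gives the shortest interval (a small sketch, scipy assumed; the grid of candidate splits is arbitrary):

```python
import numpy as np
from scipy.stats import norm

alpha = 0.05
# candidate ways of splitting the tail mass alpha between the two tails
left_tails = np.linspace(0.005, 0.045, 9)
lengths = [norm.ppf(1 - (alpha - a)) - norm.ppf(a) for a in left_tails]
best = left_tails[int(np.argmin(lengths))]
print(best)   # 0.025: the symmetric split gives the shortest interval
```

The minimum length is 2·zα/2, attained when both tails carry the same mass α/2, in agreement with the unimodality argument above.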

-----

Most often, the pair [L(x), R(x)] is chosen so as to minimize the length of the interval estimate. When this proves difficult, another possibility is to relax the condition of "shortest length" and settle for the less constraining "shortest expected length", meaning that the expectation of the random variable "length of the interval estimate" is rendered as small as possible by an appropriate choice of the pair [L(x), R(x)].

-----

In some cases even the above condition may be hard to satisfy, and as a last resort one may attempt to minimize the length not of the confidence interval itself, but of its image in the pivotal space (blue segment in the above illustration).

-----

In simple problems, such as finding a confidence interval for the normal mean, the three definitions are equivalent, but this is not always the case in more complex settings.

Approximate confidence intervals

Most of the time, given a parameter, no exact pivot is known for this parameter. But it is then often possible to identify quantities that allow the construction of approximate confidence intervals.

Asymptotic pivots

It is sometimes possible to identify a quantity that cannot be used as a pivot:

• Either because its distribution depends on the value of the estimated parameter,
• Or because its distribution is not known,

but whose distribution can be shown to converge, when the sample size grows without limit, to a known limit distribution that does not depend on the value of the parameter, and is therefore an asymptotic pivot.

An approximate confidence interval is then built using this asymptotic pivot. This interval is not exact, but becomes more and more accurate when the sample size tends to infinity.
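As a simple numerical illustration (not the example treated in the tutorials), the standardized mean (m - µ)/(s/√n), with s the sample standard deviation, is an asymptotic pivot: by the Central Limit Theorem, its distribution tends to N(0, 1) for any parent distribution with finite variance. The sketch below (numpy and scipy assumed) checks the coverage of the resulting approximate interval on exponential samples:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
z = norm.ppf(0.975)                 # nominal confidence level 0.95
n, trials = 500, 5000
true_mean = 1.0                     # mean of the Exponential(1) distribution

covered = 0
for _ in range(trials):
    x = rng.exponential(true_mean, n)
    m, s = x.mean(), x.std(ddof=1)  # sample mean and standard deviation
    if abs(m - true_mean) <= z * s / np.sqrt(n):
        covered += 1

print(covered / trials)             # close to 0.95 for large n
```

For small n the observed coverage would drift away from the nominal level, which is precisely why such intervals are only asymptotic.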

An example of an asymptotic pivot is given below (without the demonstration, which is difficult).

Approximate pivot

It is also sometimes possible to identify a quantity Q whose distribution converges to that of an asymptotic pivot for large samples, but for which a pivotal quantity can be found whose distribution is close to the distribution of Q for any sample size. This pivot is then used for building an approximate confidence interval.

A classical example of this approach is Welch's approximation (see below), which is used to build an approximate confidence interval for the difference of the means of two independent normal distributions whose variances are unknown and not assumed to be equal.
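Welch's interval is short enough to sketch. The function below is an illustration, not the tutorial's derivation (numpy and scipy assumed; the function name and the simulated samples are hypothetical); it uses the Welch-Satterthwaite approximate degrees of freedom for the Student quantile:

```python
import numpy as np
from scipy.stats import t

def welch_ci(x, y, level=0.95):
    """Approximate CI for the difference of two normal means,
    variances unknown and not assumed to be equal."""
    nx, ny = len(x), len(y)
    vx, vy = np.var(x, ddof=1) / nx, np.var(y, ddof=1) / ny
    # Welch-Satterthwaite approximate degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    delta = t.ppf(1 - (1 - level) / 2, df) * np.sqrt(vx + vy)
    d = np.mean(x) - np.mean(y)
    return d - delta, d + delta

rng = np.random.default_rng(3)
x = rng.normal(5.0, 1.0, 25)        # two samples with unequal variances
y = rng.normal(3.0, 4.0, 40)
lo, hi = welch_ci(x, y)
print(lo, hi)
```

The approximate degrees of freedom always fall between min(nx, ny) - 1 and nx + ny - 2, interpolating between the worst and best cases.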

Confidence intervals in data modeling

One of the main applications of interval estimation is to calculate confidence intervals:

* For the parameters of a model. The values of these parameters are point estimates of the true values of these parameters. Confidence intervals for these estimates indicate how trustworthy these estimates are.

See for example the calculation of confidence intervals for the parameters of a Multiple Linear Regression model.

* For the model predictions. For example, in Regression, the uncertainties about the model predictions can be visualized as a "confidence strip" (lower image in the illustration below).

The model predictions are more reliable in regions where the strip is narrow than in regions where the strip is wide.

Local data density plays a major role in defining the strip width: the uncertainty about the model predictions is larger in low density areas than in high density areas.

____________________________________________________________

 These Tutorials are the same as those you'll find on the page about confidence intervals.

 Tutorial 1

The first Tutorial describes the methods used to obtain exact confidence intervals for the means of normal distributions under various circumstances.

EXACT CONFIDENCE INTERVALS

FOR THE MEANS OF NORMAL DISTRIBUTIONS

* Confidence interval of a mean
    * Variance is known
    * Variance is unknown
* Difference of a mean and a reference value
* Comparing two means
    * Paired samples
    * Independent samples
        * Variances are known
        * Variances are unknown but known to be equal
        * Variances are unknown, and NOT assumed to be equal: a failure

TUTORIAL

_________________________________________________________________

 Tutorial 2

In the general case (variances unknown and not assumed to be equal), no exact confidence interval is known for the difference of the means of two independent normal distributions. But it is possible to calculate two types of approximate intervals:

• An asymptotic interval, which is almost exact for large samples.
• An approximate interval based on "Welch's approximation", which delivers reasonably accurate intervals even for moderately sized samples.

ASYMPTOTIC CONFIDENCE INTERVALS

AND WELCH'S APPROXIMATION

* Asymptotic confidence interval (no demonstration)
* Welch's approximation

TUTORIAL

• The asymptotic confidence interval is simple enough, but demonstrating its validity is difficult, and this is not done in the tutorial.
• Welch's approximation is also simple, but needs some (very instructive) calculations, which are given in full.

__________________________________
