Quantiles

Also called "fractile".

Distributions

Let p(x) be the density of a continuous probability distribution. The area under the curve representing p(x) is equal to 1.

Let a be any number between 0 and 1 :

0 < a < 1

Provided p(x) is strictly positive over the range of the distribution, there is a unique point x_a such that the area under p(x) between -∞ and x_a is equal to a. This point is the quantile of order a of the distribution.

 

 

Quantiles are often expressed as percentages (%), and x_a is then called the 100a % quantile of the distribution.

If F(x) is the distribution function of p(x), then a = F(x_a), and (lower image of the above illustration) :

x_a = F^{-1}(a)

and x_a is then defined as the number such that Pr{x ≤ x_a} = a.


Some authors define x_a by Pr{x > x_a} = a. This ambiguity is not a problem, as the context usually makes it clear which definition is used.
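The two conventions can be compared numerically. The following is a minimal sketch, assuming the standard normal distribution as the example and Python's standard-library `statistics.NormalDist` as the tool; the variable names are illustrative only.

```python
from statistics import NormalDist

# Example distribution (an assumption for illustration): standard normal.
std_normal = NormalDist(mu=0.0, sigma=1.0)
a = 0.95

# Lower-tail convention: x_a = F^{-1}(a), so that Pr{x <= x_a} = a.
x_lower = std_normal.inv_cdf(a)

# Upper-tail convention: Pr{x > x_a} = a, which amounts to x_a = F^{-1}(1 - a).
x_upper = std_normal.inv_cdf(1 - a)

print(f"lower-tail convention, a = {a}: x_a = {x_lower:.4f}")
print(f"upper-tail convention, a = {a}: x_a = {x_upper:.4f}")

# Check that applying F to the quantile gives back the order a.
print(f"F(x_lower) = {std_normal.cdf(x_lower):.4f}")
```

For a symmetric distribution such as this one, the two conventions give values of opposite sign, which is why the context has to say which one is meant.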

-----

For most probability distributions, quantiles cannot be expressed in closed form. They are therefore computed once and for all by numerical approximations, which are accurate enough for most applications. The values are then tabulated and made available in printed tables or in software, so that the analyst (or a program) can look up the value of the quantile for a given order a.
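One simple way to approximate a quantile when F has no closed-form inverse is to solve F(x) = a numerically. The sketch below uses bisection on the standard normal distribution function. The helper names, the search bracket [-10, 10] and the tolerance are assumptions made for this illustration; statistical software generally relies on faster, more refined approximations.

```python
import math


def normal_cdf(x):
    """Distribution function F(x) of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def quantile_by_bisection(cdf, a, lo=-10.0, hi=10.0, tol=1e-10):
    """Approximate x_a such that cdf(x_a) = a, for a non-decreasing cdf.

    Assumes the quantile lies inside the bracket [lo, hi].
    """
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < a:
            lo = mid  # the quantile is to the right of mid
        else:
            hi = mid  # the quantile is at mid or to its left
    return 0.5 * (lo + hi)


# Build a small table of quantiles, as a printed table would list them.
table = {a: quantile_by_bisection(normal_cdf, a) for a in (0.90, 0.95, 0.975, 0.99)}
for a, x_a in table.items():
    print(f"a = {a:<5}  x_a = {x_a:.4f}")
```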

-----

Quantiles are necessary for interval estimation and for tests. Both rely on establishing that the probability that a certain statistic falls outside a certain range is less than a given probability a. The limits of the range are directly related to the quantiles of the distribution of this statistic.

For example, let m be the mean of a sample of size n drawn from a normal distribution whose variance s² is known, but whose mean µ is not. The mean µ is estimated by m. A confidence interval with confidence level (1 - a) is given by :

m - z_{a/2} s/√n  ≤  µ  ≤  m + z_{a/2} s/√n

where z_{a/2} denotes the (1 - a/2) quantile of the standard normal distribution.
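As an illustration, the usual known-variance interval m ± z_{a/2} s/√n can be computed as follows. This is a sketch only: the function name, the sample values and the value of s are hypothetical and chosen purely for the example.

```python
import math
from statistics import NormalDist, fmean


def mean_confidence_interval(sample, s, a=0.05):
    """Confidence interval of level (1 - a) for the mean µ of a normal
    distribution whose standard deviation s is assumed known."""
    n = len(sample)
    m = fmean(sample)
    # z is the (1 - a/2) quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - a / 2)
    half_width = z * s / math.sqrt(n)
    return m - half_width, m + half_width


# Hypothetical measurements, with a hypothetical known s = 0.25.
sample = [9.8, 10.4, 10.1, 9.9, 10.3, 10.0, 9.7, 10.2]
low, high = mean_confidence_interval(sample, s=0.25, a=0.05)
print(f"95 % confidence interval for µ: [{low:.3f}, {high:.3f}]")
```

A smaller a (a higher confidence level) pushes the quantile z further into the tail and therefore widens the interval.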

 

 

Sample

Quantiles may also be defined for samples (or any finite set of points).
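Several conventions exist for sample quantiles, and they differ in how they interpolate between ordered observations. As a hedged illustration, the sketch below uses the `quantiles` function of Python's standard-library `statistics` module with its two built-in methods; the data are hypothetical, and neither method should be read as the definition intended here.

```python
from statistics import median, quantiles

# Hypothetical sample (any finite set of points would do).
sample = [7, 1, 9, 3, 5, 2, 8, 4, 6]

# Quartiles: the three cut points that split the ordered sample in four parts.
# The default method is "exclusive".
quartiles = quantiles(sample, n=4)

# Deciles with the "inclusive" method, which treats the sample minimum and
# maximum as the 0 % and 100 % quantiles.
deciles = quantiles(sample, n=10, method="inclusive")

print("quartiles:", quartiles)
print("deciles:  ", deciles)
print("median:   ", median(sample))
```

The second quartile coincides with the sample median under either convention.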

 

Q-Q plots

Quantiles are helpful for graphical representations of samples, and more specifically :

These questions are developed below.

_____________________________________________________________

 

 

Tutorial 1

 

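 A "Quantile-Quantile plot", or "Q-Q plot", is a visualization technique that allows comparing a sample with a reference distribution (typically the normal distribution), or two samples with each other.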

In both applications, the analyst's judgement is essential. Conclusions drawn from a QQ plot are therefore necessarily subjective. Only tests can reject the normality hypothesis about a distribution known through a sample, or the identity hypothesis about the two distributions behind two samples. But tests do not provide any detailed insight about the sample(s) as QQ plots do.
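The points of a normal QQ plot can be sketched as follows: the sample is standardized and sorted, and each ordered value is paired with the standard normal quantile of a matching order. The plotting positions (i - 0.5)/n used here are one common convention and are an assumption of this example, as are the function name and the data; the tutorial may use a different convention.

```python
from statistics import NormalDist, fmean, stdev


def normal_qq_points(sample):
    """Return (theoretical quantile, standardized observation) pairs
    for a QQ plot of a sample against the standard normal distribution."""
    n = len(sample)
    m, s = fmean(sample), stdev(sample)
    # Standardize the data, then sort them in increasing order.
    observed = sorted((x - m) / s for x in sample)
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Hypothetical sample.
sample = [4.1, 5.3, 4.8, 6.0, 5.1, 4.5, 5.6, 4.9, 5.2, 5.8]
points = normal_qq_points(sample)
for t, o in points:
    print(f"theoretical = {t:+.3f}   observed = {o:+.3f}")
```

If the sample comes from a normal distribution, the points should lie close to the line observed = theoretical. Judging how close is close enough is the subjective part.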

 

 

QQ-PLOTS

Sample vs a reference distribution

Standardizing the data

Why quantiles ?

The QQ plot

Systematic departure from the ideal configuration

Rankit plot

Sample vs Sample

TUTORIAL

_________________________________________________________________

  

 

Tutorial 2

 

Quantile Inter-quartile (QIQ) transformations can be used to isolate the shape of an empirical distribution relative to any theoretical distribution that is not completely specified (such as a Normal distribution with an unknown mean µ and/or standard deviation s).

In addition, QIQ transformations can be used to classify the tails of an empirical distribution (long, medium, or short) so as to help guide the choice of theoretical distributions to consider. Before turning to QIQ transformations, it is important to understand two lesser-known statistical concepts on which the QIQ methods are based : the mid-distribution function and the continuous quantile function.
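To see why such a transformation removes location and scale, here is a minimal sketch. It assumes the quantile/inter-quartile form QIQ(u) = (Q(u) - Q(0.5)) / (2 (Q(0.75) - Q(0.25))), where Q is the quantile function. This formula is an assumption of the example and is not stated in the text above, so the tutorial's exact definition may differ.

```python
from statistics import NormalDist


def qiq(quantile_fn, u):
    """Assumed QIQ form: center Q(u) on the median and scale it by
    twice the inter-quartile range."""
    med = quantile_fn(0.5)
    iqr = quantile_fn(0.75) - quantile_fn(0.25)
    return (quantile_fn(u) - med) / (2.0 * iqr)


# Two normal distributions with different (hypothetical) mean and scale.
standard = NormalDist(mu=0.0, sigma=1.0)
shifted = NormalDist(mu=10.0, sigma=3.0)

for u in (0.05, 0.25, 0.5, 0.75, 0.95):
    print(f"u = {u:<4}  standard: {qiq(standard.inv_cdf, u):+.4f}"
          f"   shifted: {qiq(shifted.inv_cdf, u):+.4f}")
```

Both columns agree: under this assumed form, the transformed quantile function depends only on the shape of the distribution, so µ and s need not be known to compare a sample with the Normal family.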

 

 

THE QUANTILE INTER-QUARTILE TRANSFORMATION

The mid-distribution function

The continuous quantile function

The Quantile Inter-quartile (QIQ) transformation

TUTORIAL

 

We thank Mr. Matthew BATES for this Tutorial

______________________________________________________________

 

 

 

Case study

 

The purpose of this example is to demonstrate both the importance and use of modeling data with a proper selection of common statistical distributions.  Some major topics used to accomplish this end are:

 

The following case study is based on a real industrial problem : a car manufacturer is faced with the problem of fitting parts that were machined independently. The distributions of the dimensions of the parts will have to be estimated, and the adjustment of the machine-tools will have to be specified so as to minimize the probability of misfit between two parts that need to be assembled.

The probability density functions will be estimated by empirical quantile methods in order to determine the form of each distribution (for example Normal or Exponential). In particular, we will demonstrate how to choose a model without first estimating the usual parameters (location and scale) of a given theoretical distribution.

 

 

QUANTILE MODELING FOR IMPROVED PROCESS CONTROL

The problem

The data

The solution

Step 1  : Identification of the distributions with QIQ-plots

Step 2 :  Estimating the parameters of the distributions

Normal distribution

Exponential distribution

Estimating the mean

Estimating the lower bound

Step 3 : Simulating the distributions

CASE STUDY

 

We thank Mr. Matthew BATES for this Case Study

 

______________________________________

 

Related readings

 Distribution function
