Given a sample from an unknown probability distribution p(x), a histogram is a simple model meant to give a reasonably faithful graphical representation of this distribution.
It is built as follows: the range covered by the sample is divided into contiguous, equal-width intervals (the "brackets"), and the number of observations falling into each bracket is counted.
So, a histogram is a series of bins placed side by side on the x axis, the height of each bin being the number of observations in the corresponding bracket.
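The construction above can be sketched in a few lines of Python (a minimal illustration; the function name and the Gaussian test sample are ours, not from this page):

```python
import random

def histogram_counts(sample, n_bins):
    """Partition the sample's range into n_bins equal-width brackets
    and count the observations falling into each one."""
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in sample:
        # The maximum observation is assigned to the last bracket.
        i = min(int((x - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]
counts = histogram_counts(sample, 10)
```

The list `counts` holds the bin heights: every observation lands in exactly one bracket, so the heights sum to the sample size.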
The sample is, as usual, hoped to be a faithful
representation of p(x): many observations are expected to be
found in regions where p(x) is large, and few observations
are expected to be found in regions where p(x) is small.
Bins are expected to be tall where p(x) is large, and short
where p(x) is small. So, the "skyline" of the (appropriately
normalized) histogram is expected to be a faithful, discretized, staircase-like representation
of p(x) itself (lower image of the illustration below).
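The normalization in question divides each count by (sample size × bin width), so that the bar areas sum to 1 and the skyline is directly comparable to the density p(x). A small sketch (names and the Gaussian example are illustrative assumptions):

```python
import random

def normalized_histogram(sample, n_bins):
    """Bin heights divided by (n * bin_width) so the bar areas sum
    to 1, making the skyline comparable to the density p(x)."""
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in sample:
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    n = len(sample)
    return [c / (n * width) for c in counts], lo, width

random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(5000)]
heights, lo, width = normalized_histogram(sample, 20)
# Like a density, the total area under the bars is 1.
total_area = sum(h * width for h in heights)
```

For this Gaussian sample the tallest bars sit near the center and the extreme bins are short, as the text leads us to expect.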
A histogram is:
Simple as it is, the histogram provides a very good illustration of one of the most fundamental aspects of practical, down-to-earth statistical data modeling: the so-called bias-variance tradeoff. For histograms, the bias-variance tradeoff reads as follows: if the bins are too wide (too few bins), the histogram is over-smoothed and misses genuine features of p(x) (high bias); if the bins are too narrow (too many bins), each bin receives only a few observations, and the bin heights fluctuate wildly from one sample to the next (high variance).
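The "variance" half of the tradeoff can be observed numerically: across repeated samples from the same distribution, the height of a given bin fluctuates much more when the bins are narrow. A Monte Carlo sketch (all names and parameter values below are our own illustrative choices):

```python
import random
import statistics

def hist_heights(sample, n_bins, lo=-4.0, hi=4.0):
    """Density-normalized bin heights on a fixed range, so that
    bins are comparable across samples."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in sample:
        if lo <= x < hi:
            counts[int((x - lo) / width)] += 1
    n = len(sample)
    return [c / (n * width) for c in counts]

def central_bin_sd(n_bins, n_samples=200, n_obs=500, seed=2):
    """Standard deviation, across repeated samples, of the height
    of the central bin: a direct measure of 'variance'."""
    rng = random.Random(seed)
    heights = []
    for _ in range(n_samples):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        heights.append(hist_heights(sample, n_bins)[n_bins // 2])
    return statistics.pstdev(heights)

sd_few = central_bin_sd(8)     # wide bins: stable heights
sd_many = central_bin_sd(128)  # narrow bins: noisy heights
```

With wide bins the central height barely moves from sample to sample; with narrow bins it jumps around, exactly the instability the tradeoff describes.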
So a good histogram must have just the right number of bins. What is this number? The answer is disappointing: it cannot be calculated exactly. But it can be roughly estimated by validation techniques.
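One such validation technique, sketched below under our own assumptions (a held-out split, a Gaussian sample, and a small floor on empty bins to avoid log(0); none of this is prescribed by the page): fit histograms with various bin counts on a training half, then keep the bin count whose normalized histogram assigns the highest average log-density to the held-out half.

```python
import math
import random

def held_out_score(train, test, n_bins, lo=-4.0, hi=4.0):
    """Average log-density that a histogram fitted on `train`
    assigns to the `test` observations (higher is better)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in train:
        if lo <= x < hi:
            counts[int((x - lo) / width)] += 1
    n = len(train)
    # Small floor on empty bins avoids log(0).
    heights = [max(c, 0.5) / (n * width) for c in counts]
    score, used = 0.0, 0
    for x in test:
        if lo <= x < hi:
            score += math.log(heights[int((x - lo) / width)])
            used += 1
    return score / used

rng = random.Random(3)
data = [rng.gauss(0.0, 1.0) for _ in range(2000)]
train, test = data[:1000], data[1000:]
candidates = (2, 5, 10, 20, 50, 200, 1000)
scores = {b: held_out_score(train, test, b) for b in candidates}
best = max(scores, key=scores.get)
```

Extremely coarse and extremely fine histograms both score poorly on held-out data; an intermediate bin count wins, which is the practical face of the tradeoff.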
This question of great practical importance is developed in more detail in the next page. It is also illustrated in an interactive animation that you'll find here.
For more on the bias-variance tradeoff, please see here.