Replacement (Sampling without)
Virtually all the probability distributions in this Glossary refer to infinite populations (the hypergeometric distribution being the exception). Therefore, once an observation has been taken out of the population and put away in a sample, the population is unchanged. All the draws are thus made from the same population, which makes the drawn observations independent. In turn, this independence makes subsequent calculations "simple", and the calculations you'll encounter rely on it repeatedly.
Things are quite different when the population is finite.
Take an individual out of the population to measure one of its characteristics (the random variable). After the measurement is made, you can either :
* Return the individual to the population, from which it may be selected again in a later draw. This is the sampling with replacement scheme. All the individuals are then selected from the same population, and the situation is no different from sampling from an infinite population.
* Or, as in the most common sampling scheme for finite populations, take one individual at a time from the population, put it in the sample, and not return it to the population. This is the sampling without replacement scheme.
The consequences of not returning an individual to a finite population after measurement are profound. The next individual will be randomly selected from a new population, which is the previous population reduced by the previously selected individual, which itself had been randomly selected.
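To make the shrinking population concrete, here is a minimal Python sketch. The four-value population and the helper name `remaining_mean` are illustrative assumptions, not part of the Tutorial. It shows that once a value x has been drawn, the next draw is made from a population whose mean is (Nμ - x)/(N - 1) rather than μ.

```python
from fractions import Fraction

# Hypothetical finite population (illustrative values only).
population = [2, 4, 6, 8]
N = len(population)
mu = Fraction(sum(population), N)  # population mean before any draw


def remaining_mean(pop, index):
    """Mean of the population once the individual at `index` is removed."""
    rest = pop[:index] + pop[index + 1:]
    return Fraction(sum(rest), len(rest))


for i, x in enumerate(population):
    # The mean seen by the second draw depends on the first drawn value x.
    print(f"first draw = {x}: mean of remaining population = {remaining_mean(population, i)}")
```

Every first draw leaves a different population behind, which is why the second observation cannot be independent of the first.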
From a technical standpoint :
* In infinite populations, it is common to consider an n-sample as the realization of a random vector {X1, X2, ..., Xn}, where the Xi are independent and identically distributed random variables.
* The same paradigm can be used with finite populations without replacement, except that the Xi are now not independent.
This lack of independence of the Xi makes life substantially more complicated. This will be made clear in this Tutorial, which addresses two of the most common activities of the statistician: estimating a mean, and estimating a variance.
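The loss of independence can be checked by exact enumeration on a toy case. The sketch below is only an illustration: the population values and the helper `covariance` are assumptions. It lists every equally likely ordered pair (X1, X2), with and without replacement, and compares the covariance with the standard closed form -σ²/(N - 1), where σ² is the population variance with divisor N.

```python
from fractions import Fraction
from itertools import permutations, product

# Hypothetical finite population (illustrative values only).
population = [2, 4, 6, 8]
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance, divisor N


def covariance(pairs):
    """Covariance of (X1, X2) when every pair in `pairs` is equally likely."""
    pairs = list(pairs)
    m = len(pairs)
    e1 = Fraction(sum(a for a, _ in pairs), m)
    e2 = Fraction(sum(b for _, b in pairs), m)
    e12 = Fraction(sum(a * b for a, b in pairs), m)
    return e12 - e1 * e2


# With replacement: all N * N ordered pairs are possible.
cov_with = covariance(product(population, repeat=2))
# Without replacement: only the N * (N - 1) pairs of distinct individuals.
cov_without = covariance(permutations(population, 2))

print("with replacement:   ", cov_with)
print("without replacement:", cov_without, "vs -sigma2/(N-1) =", -sigma2 / (N - 1))
```

The covariance is zero with replacement and negative without: drawing a large value first makes a large value slightly less likely on the next draw.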
-----
Estimating a proportion (for example, the proportion of individuals that prefer brand A over any other brand) will be comparatively simple, as it will turn out to be a special case of estimating a mean for an appropriately defined distribution in the population.
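The reduction to a mean can be written out explicitly. The notation below (y_k, p, μ, σ², N) is assumed for illustration and may differ from the Tutorial's own symbols. Assign to each individual of the population the value 1 if it prefers brand A and 0 otherwise. Since y_k² = y_k, the population mean is the proportion p and the population variance is p(1 - p):

```latex
\[
y_k =
\begin{cases}
1 & \text{if individual } k \text{ prefers brand A} \\
0 & \text{otherwise}
\end{cases}
\qquad
\mu = \frac{1}{N}\sum_{k=1}^{N} y_k = p,
\qquad
\sigma^2 = \frac{1}{N}\sum_{k=1}^{N} (y_k - p)^2 = p(1 - p).
\]
```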
_______________________________________________________
Tutorial
In this Tutorial :
1) We first show how the mean of a finite population can be estimated when sampling without replacement.
* We demonstrate that the sample mean is an unbiased estimator of the population mean, a not too surprising result.
* We then calculate the variance of this estimator. This calculation is quite involved, but also very instructive as we'll have to establish two intermediary results during the course of the demonstration :
- We'll see that as a consequence of the lack of independence between observations, the covariance between two such observations is not zero (as it would be in an infinite population), and we'll calculate it.
- In turn, establishing this result will require calculating the expectation of one observation conditional on the observed value of another observation. Again, this expectation is not the mean of the population, because the observations are not independent.
2) The expression for the variance of the sample mean incorporates σ², the variance of the population, which is unknown. But we'll then identify an unbiased estimator of this variance. Again, the result is far from being as simple as it is for infinite populations.
3) We calculate the mean and variance of a proportion in the sample. This will turn out to be a special case of estimating a mean for an appropriately defined distribution in the population.
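The three steps above can be checked numerically on a toy case before reading the proofs. The sketch below is a hedged illustration, not the Tutorial's derivation: the population values, the sample size and the helper names are assumptions, and the closed forms it tests are the standard textbook ones, namely Var(sample mean) = (σ²/n)(N - n)/(N - 1) and E[s²] = Nσ²/(N - 1), with σ² defined with divisor N and s² with divisor n - 1. It enumerates every possible sample of size n drawn without replacement, all equally likely.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite population and sample size (illustrative values only).
population = [2, 4, 6, 8, 10]
N, n = len(population), 3
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance, divisor N


def mean(values):
    """Exact arithmetic mean of a sequence of integers or Fractions."""
    return Fraction(sum(values), len(values))


def s2(values):
    """Sample variance with divisor n - 1."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


# Every subset of size n is an equally likely sample without replacement.
samples = list(combinations(population, n))

expected_mean = mean([mean(s) for s in samples])
var_mean = mean([(mean(s) - mu) ** 2 for s in samples])
expected_s2 = mean([s2(s) for s in samples])

print("E[sample mean]   =", expected_mean, " (mu =", mu, ")")
print("Var[sample mean] =", var_mean, " (formula:", sigma2 / n * Fraction(N - n, N - 1), ")")
print("E[s2]            =", expected_s2, " (N*sigma2/(N-1) =", N * sigma2 / (N - 1), ")")
```

Under these assumptions, the sample mean is unbiased, its variance carries the finite population factor (N - n)/(N - 1), and s² must be multiplied by (N - 1)/N to estimate σ² without bias.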
ESTIMATION IN A FINITE POPULATION
BY SAMPLING WITHOUT REPLACEMENT
Estimation of the population mean
* Unbiased estimator of the population mean
* Variance of the estimator
- Outline of the demonstration
- Conditional expectation of observations
- Covariance of two observations
- Variance of the estimator
Estimation of the population variance
* First form
* Second form
Estimation of a proportion
* Unbiased estimator of the proportion
* Variance of the estimator
______________________________________________________
Related readings :