Sanderson M. Smith

UNBIASED ESTIMATORS

Consider the number set P = {2,4,6}.

If P is a population, then

Mean = $\mu = (2+4+6)/3 = 4$.

Variance = $\sigma^2 = [(2-4)^2 + (4-4)^2 + (6-4)^2]/3 = 8/3 \approx 2.666667$.

Standard deviation = $\sigma = \sqrt{\sigma^2} = \sqrt{8/3} \approx 1.632993$.

NOTE: The formula for $\sigma^2$ involves dividing by the population size n. In this case, n = 3.

If P is a sample, then

Sample mean = $\bar{x} = (2+4+6)/3 = 4$.

Sample variance = $s^2 = [(2-4)^2 + (4-4)^2 + (6-4)^2]/2 = 8/2 = 4$.

Sample standard deviation = $s = \sqrt{s^2} = \sqrt{4} = 2$.

NOTE: The formula for $s^2$ involves dividing by n-1. In this case, n = 3. Hence n-1 = 2.
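The two sets of calculations differ only in the divisor. A minimal Python sketch of that difference follows; the function name `variance` and its `ddof` argument are illustrative choices, not part of the original notes.

```python
import math

def variance(values, ddof):
    """Sum of squared deviations from the mean, divided by len(values) - ddof."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)

P = [2, 4, 6]

pop_var = variance(P, ddof=0)    # P treated as a population: divide by n = 3
samp_var = variance(P, ddof=1)   # P treated as a sample: divide by n - 1 = 2

pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

print(f"population: variance = {pop_var:.6f}, std dev = {pop_sd:.6f}")
print(f"sample:     variance = {samp_var:.6f}, std dev = {samp_sd:.6f}")
```

Python's standard `statistics` module offers the same pair of conventions as `pvariance`/`pstdev` (divide by n) and `variance`/`stdev` (divide by n-1).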

Now, let's consider P to be a population.

The table below shows all possible samples of size 2 chosen from P, with replacement. There would be 3x3 = 9 samples.

Sample         $\bar{x}$   $s^2$ (n-1)   $s$ (n-1)   $\sigma^2$ (n)   $\sigma$ (n)
2,2            2           0             0           0                0
2,4            3           2             1.414214    1                1
2,6            4           8             2.828427    4                2
4,2            3           2             1.414214    1                1
4,4            4           0             0           0                0
4,6            5           2             1.414214    1                1
6,2            4           8             2.828427    4                2
6,4            5           2             1.414214    1                1
6,6            6           0             0           0                0
Column Means   4           2.666667      1.257079    1.333333         0.888889

Columns 3 and 4 divide by n-1 = 1. Columns 5 and 6 apply the population formulas to each sample and divide by n = 2.
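The table can be reproduced by enumerating the 9 ordered samples and averaging each column. The sketch below is one way to do this in Python; the helper name `row` and the variable names are assumptions made for illustration.

```python
from itertools import product
from math import sqrt
from statistics import mean

P = [2, 4, 6]

def row(sample):
    """Return (x-bar, s^2, s, sigma^2, sigma) for one sample."""
    n = len(sample)
    xbar = sum(sample) / n
    ss = sum((v - xbar) ** 2 for v in sample)  # sum of squared deviations
    s2 = ss / (n - 1)       # sample formula: divide by n - 1
    sigma2 = ss / n         # population formula applied to the sample
    return (xbar, s2, sqrt(s2), sigma2, sqrt(sigma2))

# All ordered samples of size 2, drawn with replacement: 3 x 3 = 9 samples.
samples = list(product(P, repeat=2))
rows = [row(s) for s in samples]

# Average each statistic over the 9 samples (the "Column Means" row).
column_means = [mean(col) for col in zip(*rows)]

for s, r in zip(samples, rows):
    print(s, " ".join(f"{v:.6f}" for v in r))
print("means", " ".join(f"{v:.6f}" for v in column_means))
```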

To summarize, we have listed all samples of size 2 (with replacement) from a population P of size 3. We have calculated statistics for each sample of size 2. Here is an important definition:

A statistic used to estimate a population parameter is unbiased if the mean of the sampling distribution of the statistic is equal to the true value of the parameter being estimated.
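In symbols, the definition can be sketched as follows. The notation is an assumption introduced here for illustration and is not from the original: $\theta$ stands for the population parameter, $\hat{\theta}$ for the statistic, and $\hat{\theta}_i$ for its value on the i-th of the 9 equally likely samples in the table.

```latex
\[
E[\hat{\theta}] \;=\; \frac{1}{9}\sum_{i=1}^{9} \hat{\theta}_i \;=\; \theta
\]
```

Each entry in the Column Means row of the table is the middle quantity for one statistic, so checking for bias in this example amounts to comparing that entry with the corresponding population parameter.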

Note that

The mean of the sample means (4) is equal to $\mu$, the mean of the population P. This illustrates that the sample mean $\bar{x}$ is an unbiased statistic. It is sometimes stated that $\bar{x}$ is an unbiased estimator for the population parameter $\mu$.

The mean of the sample values of $s^2$ (2.666667) is equal to $\sigma^2$, the variance of the population P. This illustrates that the sample variance $s^2$, computed by dividing by n-1, is an unbiased statistic. It is sometimes stated that $s^2$ is an unbiased estimator for the population variance $\sigma^2$.

Note carefully that the sample standard deviation $s$ is not an unbiased statistic. That is, the mean of the $s$ column in the table (1.257079) is not equal to the population standard deviation $\sigma$ = 1.632993.

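Also, if you apply the population formula for $\sigma^2$ (dividing by n) to the samples, the resulting statistics are not unbiased estimators of the population parameters. Note that the means for the last two columns in the table (1.333333 and 0.888889) are not equal to the population variance 2.666667 or the population standard deviation 1.632993.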
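The value 1.333333 can be traced to the divisor. For any sample of size n, the divide-by-n variance equals the divide-by-(n-1) variance scaled by (n-1)/n. A short sketch, writing $\hat{\sigma}^2$ (a label assumed here, not used in the original) for the divide-by-n statistic and using the table's result that the mean of $s^2$ equals $\sigma^2$:

```latex
\[
\hat{\sigma}^2 = \frac{n-1}{n}\, s^2
\quad\Longrightarrow\quad
E[\hat{\sigma}^2] = \frac{n-1}{n}\,\sigma^2
= \frac{1}{2}\cdot\frac{8}{3} = \frac{4}{3} \approx 1.333333
\]
```

With n = 2, the divide-by-n statistic averages only half of the population variance, which is why dividing by n-1 is used for samples.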

In summary, the sample statistics $\bar{x}$ and $s^2$ are unbiased estimators for the population mean $\mu$ and population variance $\sigma^2$, respectively.