SOMETIMES A CONFIDENCE INTERVAL MAKES NO SENSE

Sanderson M. Smith


It is sometimes useful to demonstrate that there are times when statistical computations or constructions make no sense. Below are two samples of size 12. For each sample, we will construct a 95% confidence interval for the population mean. In one case the construction makes sense, and in the other case it does not!

Since the sample sizes are small, the t distribution would be used. Here are some general rules for using a t distribution.

It is used for small sample sizes (generally less than 30) when population parameters are not known.
If sample size is less than 15, the t distribution can be used if the data are close to normal.
If sample size is at least 15, the t distribution can be used except in the presence of outliers or strong skewness.
If sample size is large (40 or more), the t distribution can be used for clearly skewed distributions.

Using these guidelines, a confidence interval construction makes sense for Sample #1, but not for Sample #2. However, we will construct 95% confidence intervals for both samples.

                                     SAMPLE #1           SAMPLE #2
                                     98                  0
                                     78                  100
                                     88                  100
                                     77                  100
                                     88                  100
                                     89                  100
                                     82                  100
                                     92                  100
                                     93                  100
                                     88                  100
                                     84                  100
                                     86                  100

N = sample size                      12                  12
Mean                                 86.917              91.667
s                                    6.067               28.868
SE = standard error = s/√N           1.751               8.333
Degrees of freedom                   11                  11
t value (95% CI)                     2.201               2.201
95% CI = (Mean) plus/minus t*(SE)    83.062 to 90.772    73.325 to 110.008

Now, assume that I was not aware of the actual sample values, but I knew that [83.062, 90.772] was a 95% confidence interval for a mean calculated from a sample of size 12. Assuming that everything was done properly, and that the conditions for use of the t distribution were met, it would not be unreasonable to think that the mean of the population from which the sample came was, for example, 85. Note that I am not suggesting the mean is 85. I'm simply saying that 85 would not be an unreasonable guess for the mean, whereas a value of 78 (for instance) would not be statistically reasonable.

Now consider Sample #2. Assume I was not aware of the actual sample values, but I knew that [73.325, 110.008] was a 95% confidence interval for the mean calculated from a sample of size 12. If I made the (incorrect) assumption that the conditions for use of the t distribution were met, then it would not be unreasonable to think that the mean of the population from which the sample came was 105. However, one need only look at the sample values, none of which exceeds 100, to realize that the value of 105 is not a reasonable estimate of the population mean.
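
For readers who want to check the arithmetic, here is a minimal Python sketch of the construction (Mean) plus/minus t*(SE) applied to both samples. It uses only the standard library. The critical value 2.201 is copied from the table above rather than computed, and the function name t_interval is just an illustrative choice.

```python
import math
import statistics

# Data copied from the table above.
sample_1 = [98, 78, 88, 77, 88, 89, 82, 92, 93, 88, 84, 86]
sample_2 = [0] + [100] * 11

# Two-sided 95% critical value of t with 11 degrees of freedom,
# taken from the table above (an input here, not computed).
T_CRIT_DF11 = 2.201


def t_interval(data, t_crit):
    """Return (mean, s, se, lower, upper) for mean +/- t_crit * SE."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    se = s / math.sqrt(n)       # standard error of the mean
    margin = t_crit * se
    return mean, s, se, mean - margin, mean + margin


for name, data in (("Sample #1", sample_1), ("Sample #2", sample_2)):
    mean, s, se, lower, upper = t_interval(data, T_CRIT_DF11)
    print(f"{name}: mean={mean:.3f} s={s:.3f} SE={se:.3f} "
          f"CI=({lower:.3f}, {upper:.3f}) largest value={max(data)}")
```

For Sample #2 the computed upper limit is about 110.008, which is larger than the largest value in the sample (100). The arithmetic runs without complaint either way; nothing in the formula warns that the conditions for using it were not met.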

The moral of the story is that meaningful use of statistical methods often depends on certain conditions being met. If these conditions are not met, then statistical computations, while they can be done, can be very misleading or inaccurate.
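
One way to act on this is to screen a small sample before building the interval. The sketch below is only a rough illustration: it flags values lying more than 1.5 interquartile ranges beyond the quartiles. That fence is a common convention that I am assuming here as a crude stand-in for "close to normal"; it is not part of the guidelines listed earlier. The function name iqr_outliers is hypothetical.

```python
import statistics

# Data copied from the table of sample values.
sample_1 = [98, 78, 88, 77, 88, 89, 82, 92, 93, 88, 84, 86]
sample_2 = [0] + [100] * 11


def iqr_outliers(data):
    """Return values more than 1.5 * IQR below Q1 or above Q3.

    The 1.5 * IQR fence is an assumed rule of thumb, used here only
    as a rough screen for outliers in a small sample.
    """
    q1, _median, q3 = statistics.quantiles(data, n=4)
    spread = q3 - q1
    low_fence = q1 - 1.5 * spread
    high_fence = q3 + 1.5 * spread
    return [x for x in data if x < low_fence or x > high_fence]


for name, data in (("Sample #1", sample_1), ("Sample #2", sample_2)):
    flagged = iqr_outliers(data)
    if len(data) < 15 and flagged:
        print(f"{name}: outliers {flagged}; a t interval is questionable")
    else:
        print(f"{name}: no outliers flagged by this screen")
```

With these data the screen flags nothing in Sample #1 and flags the single 0 in Sample #2. A plot of the data would show the same thing at a glance, so a check like this is no substitute for looking at the sample.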

"He uses statistics like a drunken man uses a lamppost...for support rather than illumination."

ANDREW LANG

