"If you believe in miracles, head for the Keno lounge."

Jimmy the Greek

10.2 TESTS OF SIGNIFICANCE (Pages 531 - 556)

OVERVIEW: This is certainly one of the most important sections in the book. It contains what I would call the "meat" of a first-year statistics course. There is a lot of important material in this section, and it is difficult to summarize it in a complete and thorough manner. Two examples will be used to illustrate the concepts introduced. Be willing to spend time understanding the text material, using the summary below as a supplement.

Example 1:

A tire manufacturer advertised that a new brand of tires has a mean life of 40,000 miles with a standard deviation of 1,500 miles. A research team tested a random sample of 100 of these tires and obtained a mean life of 39,500 miles.

Given the sample results, we can now ask whether the manufacturer's claim is reasonable by asking how likely it is that one would obtain a random sample of 100 tires with a mean life of 39,500 miles or less from a population with a mean life of 40,000 miles and a standard deviation of 1,500 miles. If we consider the set of means of all samples of size 100, the Central Limit Theorem says that the set is approximately normal with a mean of 40,000 and a standard deviation of 1,500/sqrt(100) = 150. For the sample mean of 39,500, we have z = (39,500 - 40,000)/150 = -3.33. The normal distribution table shows that it is highly unlikely that one would obtain such a sample if the population is as indicated. Using the TI-83, we have normalcdf(-1E99,39500,40000,150) = .000429, or about .0429%. This is the P-value of the test. Since the probability is so small, we would likely conclude that the manufacturer's claim is incorrect and that the mean life of the tires is something less than 40,000 miles.
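The arithmetic in Example 1 can also be checked away from the calculator. The following is a minimal sketch in Python, assuming Python 3.8 or later so that the standard library's statistics.NormalDist is available; the variable names are chosen here for illustration and do not come from the text.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 40_000          # mean life claimed by the manufacturer (miles)
sigma = 1_500         # claimed population standard deviation (miles)
n = 100               # sample size
sample_mean = 39_500  # mean life observed in the sample (miles)

# Standard deviation of the sample means (Central Limit Theorem).
std_error = sigma / sqrt(n)

# z-score of the observed sample mean under the manufacturer's claim.
z = (sample_mean - mu0) / std_error

# Left-tail probability of a sample mean this small or smaller.
# This plays the role of normalcdf(-1E99, 39500, 40000, 150) on the TI-83.
p_value = NormalDist(mu0, std_error).cdf(sample_mean)

print(f"standard error = {std_error:.1f}")
print(f"z = {z:.2f}")
print(f"P-value = {p_value:.6f}")
```

The printed P-value should agree with the calculator result of about .000429.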

OK, let's formalize this presentation a bit:

Null hypothesis H0: μ = 40,000.

Alternate hypothesis Ha: μ < 40,000.

Type of test: 1-tail.

Level of significance: 1% (usually determined before the test).

Critical region: z < -2.33 (From normal distribution table, or invNorm(.01,0,1) = -2.326347877).

Calculated sample mean: 39,500.

Mean of sample means: 40,000 (Central Limit Theorem).

Standard deviation of sample means: 1500/sqrt(100) = 150 (Central Limit Theorem).

Calculated z for sample: z = (39,500 - 40,000)/150 = -3.33.

Since the calculated z is in the critical region, we reject H0.

P-value for test: .0429%.

Note: Since the P-value is less than 1%, this supports the rejection of H0.
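The formal steps above give two ways to reach the decision: compare the calculated z with the critical value, or compare the P-value with the level of significance. The sketch below, again assuming Python 3.8+ and using illustrative variable names, shows that the two rules agree for Example 1. Here inv_cdf stands in for the TI-83's invNorm.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.01                     # level of significance, set before the test
std_normal = NormalDist()        # standard normal: mean 0, standard deviation 1

# Boundary of the left-tail critical region, like invNorm(.01, 0, 1).
z_critical = std_normal.inv_cdf(alpha)

# Calculated z for the sample in Example 1.
z = (39_500 - 40_000) / (1_500 / sqrt(100))

# Left-tail P-value for the calculated z.
p_value = std_normal.cdf(z)

# Rule 1: reject H0 if z falls in the critical region.
reject_by_region = z < z_critical
# Rule 2: reject H0 if the P-value is below the level of significance.
reject_by_p_value = p_value < alpha

print(f"critical value = {z_critical:.3f}, z = {z:.2f}")
print(f"reject by critical region: {reject_by_region}")
print(f"reject by P-value: {reject_by_p_value}")
```

For a 1-tail test the two rules always lead to the same decision, because z lies below the critical value exactly when the area to its left is smaller than the level of significance.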

The example above involved a 1-tail test. We were interested in a deviation in one direction only. It isn't a concern to consumers if the mean life of the tires is more than the advertised value of 40,000 miles. In the next example, a 2-tail test will be illustrated. That is, we are interested in deviations in two directions.

Example 2:

An automatic opening device for parachutes has a stated mean release time of 10 seconds with a standard deviation of 3 seconds. To test this claim, a parachute club tested a random sample of 36 of these devices and found the mean release time to be 10.6 seconds. Is this result statistically significant at the 5% level of significance?

Null hypothesis H0: μ = 10

Alternate hypothesis Ha: μ is not equal to 10

Type of test: 2-tail

Level of significance: 5% (usually determined before the test).

Critical region: z < -1.96 or z > 1.96 (From normal distribution table, or invNorm(.025,0,1) = -1.959963986).

Calculated sample mean: 10.6.

Mean of sample means: 10 (Central Limit Theorem).

Standard deviation of sample means: 3/sqrt(36) = 0.5 (Central Limit Theorem).

Calculated z for sample: z = (10.6 - 10)/0.5 = 1.2.

Since the calculated z is not in the critical region, we do not reject H0.

P-value for test: 2*normalcdf(10.6,1E99,10,0.5) = 2(.11507) = .23014, or about 23%.

Note: Since the P-value is greater than 5%, this supports the decision to not reject H0. That is, there is not strong evidence to suggest that this sample did not come from a population with mean = 10. (Careful here: This doesn't mean that there is strong evidence to suggest that the mean of the population is 10. Think about this.)
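Example 2 can be worked the same way. The sketch below (Python 3.8+ assumed, illustrative variable names) differs from the 1-tail case in two places: the 5% is split between the two tails when finding the critical value, and the tail area is doubled to get the P-value.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 10.0          # stated mean release time (seconds)
sigma = 3.0         # stated standard deviation (seconds)
n = 36              # sample size
sample_mean = 10.6  # observed mean release time (seconds)
alpha = 0.05        # level of significance

std_normal = NormalDist()
std_error = sigma / sqrt(n)            # standard deviation of sample means
z = (sample_mean - mu0) / std_error    # calculated z for the sample

# Two tails: alpha/2 in each tail, so the critical region is |z| > z_critical.
z_critical = std_normal.inv_cdf(1 - alpha / 2)

# Two-tail P-value: twice the area beyond |z| in one tail.
p_value = 2 * (1 - std_normal.cdf(abs(z)))

reject = abs(z) > z_critical

print(f"z = {z:.2f}, critical values = -{z_critical:.2f} and {z_critical:.2f}")
print(f"P-value = {p_value:.4f}")
print(f"reject H0: {reject}")
```

The P-value printed here should match the calculator value of about .2301, and the decision is to not reject H0.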

Things to note:

(1) A null hypothesis is basically a hypothesis of no change. If one does not reject a null hypothesis, then the test results are not statistically significant.

(2) Accepting a null hypothesis at a low level of significance is not strong evidence that it is true. Acceptance of a null hypothesis such as μ = 10 simply means that it is not unreasonable to assume that the population mean is 10. For all you know, it might be some other number that is close to 10.

(3) Rejecting a null hypothesis is equivalent to saying that the test statistic is statistically significant. That is, the calculated test statistic is not a likely result of pure chance.

(4) The null and alternate hypotheses are both stated in terms of population parameters, not sample statistics. You are attempting to use sample statistics to come to reasonable conclusions about population parameters.

(5) A P-value is the probability that one would obtain a statistic at least as "extreme" as that which was calculated from the sample, assuming the null hypothesis is true. A small P-value, such as .01, means that the statistic is not likely the result of pure chance. A P-value such as .35 means that the statistic is not an unusual or unexpected result.

(6) The level of significance for a test is usually set beforehand. If a calculated P-value is smaller than the level of significance, then the test statistic is statistically significant.

(7) A statistical test could be 2-tail or 1-tail. Which one is used depends on the purpose and nature of the test. If both positive and negative deviations from a parameter are important, then one should use a 2-tail test. If only positive (or negative) deviations are important, then one should use a 1-tail test.
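Notes (2), (6), and (7) can be seen together in one small sketch. The function z_test below is a hypothetical helper written for this summary, not something from the text or the TI-83; it assumes Python 3.8+ and a known population standard deviation. It is applied to the Example 2 sample with three different null values to show that "not rejected" is a much weaker statement than "true".

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha, tail):
    """Return (z, P-value, reject H0?) for a z test.

    tail is "left", "right", or "two" (hypothetical labels chosen here).
    """
    std_normal = NormalDist()
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    if tail == "left":        # Ha: mean is less than mu0
        p_value = std_normal.cdf(z)
    elif tail == "right":     # Ha: mean is greater than mu0
        p_value = 1 - std_normal.cdf(z)
    elif tail == "two":       # Ha: mean is not equal to mu0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    # Note (6): significant when the P-value is below the level set beforehand.
    return z, p_value, p_value < alpha


# Note (2): the Example 2 sample (mean 10.6, n = 36, sigma = 3) fails to
# reject several different null values, so it cannot single out 10 as "the" mean.
for mu0 in (10.0, 10.5, 11.0):
    z, p_value, reject = z_test(10.6, mu0, 3, 36, 0.05, "two")
    print(f"H0 mean = {mu0}: z = {z:.2f}, P-value = {p_value:.3f}, reject = {reject}")
```

None of the three null values is rejected at the 5% level. This is the point of the "Careful here" remark in Example 2.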