The binomial distributions

"Research is the process of going up alleys to see if they are blind."

Marston Bates

12.2 COMPARING TWO PROPORTIONS (Pages 678 - 689)

OVERVIEW: This section, although not lengthy in terms of pages, has lots of "meat" in it. There are two standard error formulas that are commonly used when comparing proportions from two independent samples. Both formulas are listed on the formula sheet that is provided to those who take the Advanced Placement Statistics Examination. On page 679, the authors remind readers that "variances add" when you are talking about differences. This thought is important in understanding the standard error formula listed on page 681. Just a reminder... the "variances add" idea goes back to page 400. Review, if necessary.
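The "variances add" idea can be checked directly on a small case. The sketch below is not from the textbook; it assumes two independent binomial counts with hypothetical sample sizes and proportions (chosen only to keep the enumeration short), computes the exact variance of the difference of the two sample proportions by listing every outcome, and compares it with the sum of the two individual variances.

```python
from math import comb, isclose

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def var_of_difference(n1, p1, n2, p2):
    """Exact variance of X/n1 - Y/n2 for independent binomial counts X and Y."""
    mean = 0.0
    mean_sq = 0.0
    for x in range(n1 + 1):
        for y in range(n2 + 1):
            prob = binom_pmf(x, n1, p1) * binom_pmf(y, n2, p2)
            d = x / n1 - y / n2  # difference of the two sample proportions
            mean += prob * d
            mean_sq += prob * d * d
    return mean_sq - mean**2

# Hypothetical values, not taken from the text.
n1, p1, n2, p2 = 5, 0.3, 7, 0.6

exact = var_of_difference(n1, p1, n2, p2)
summed = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2  # "variances add"

print(exact, summed, isclose(exact, summed, rel_tol=1e-9))
```

Even though the proportions are subtracted, the variances are added; taking the square root of that sum gives the standard error used for the confidence interval.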

I'll attempt to summarize this important section with two examples:

Example 1 (In which we have no reason to assume that the samples come from populations with equal variances.) We will use formulas shown on page 681.

Do voters want State Proposition A? Assume that the following represents the results found when examining two random samples from two different towns in the state.

	Sample Size	# wanting Prop. A	Sample Proportion
Sample #1	N₁ = 93	74	p₁(hat) = .796
Sample #2	N₂ = 87	54	p₂(hat) = .621

SE = sqrt[(.796)(1-.796)/93 + (.621)(1-.621)/87] = 0.06672
Since the sampling distribution of p₁(hat) - p₂(hat) is approximately normal, the 95% confidence interval for p₁ - p₂ is
(.796-.621) plus/minus 1.96(0.06672) = 0.175 plus/minus 0.1308 = [4.42%, 30.58%].
Since this is a 95% confidence interval, we know that, in the long run, about 19 out of 20 confidence intervals constructed in this fashion would contain the true difference in proportions. In other words, it is statistically reasonable to assume that the actual value of the parameter p₁ - p₂ is between about 4% and 31%. Since this interval does not contain 0%, it is reasonable to conclude that these samples came from populations that differ in their support for Proposition A.
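The interval in Example 1 can be reproduced with a few lines of code. This is a minimal sketch, not the textbook's method verbatim: the function name `two_proportion_ci` is made up for illustration, and it works from the raw counts rather than the rounded proportions, so the last digit may differ slightly from the hand calculation.

```python
from math import sqrt

def two_proportion_ci(x1, n1, x2, n2, z_star=1.96):
    """Confidence interval for p1 - p2 using the unpooled standard error.

    x1, x2 are the success counts; n1, n2 are the sample sizes.
    z_star = 1.96 corresponds to 95% confidence.
    """
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    margin = z_star * se
    return diff - margin, diff + margin, se

# Counts from Example 1: 74 of 93 and 54 of 87 want Proposition A.
low, high, se = two_proportion_ci(74, 93, 54, 87)
print(f"SE = {se:.5f}, 95% CI = [{low:.4f}, {high:.4f}]")

# The interval supports a real difference only if it excludes 0.
print("Interval excludes 0:", low > 0 or high < 0)
```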

Example 2 (In which we initially assume that the samples come from populations with equal variances.) We will use formulas shown on page 684.

Assume we have the following statistical results from a research project.

	Sample Size	# successes	Sample Proportion
Test group	N₁ = 1,865	103	p₁(hat) = .0552
Placebo group	N₂ = 1,712	79	p₂(hat) = .0461

Null hypothesis H₀: p₁ = p₂
Alternate hypothesis Ha: p₁ is not equal to p₂.
Pooling the data, we obtain p(hat) = (103+79)/(1865+1712) = .0509.
z = (.0552-.0461)/sqrt[(.0509)(1-.0509)(1/1865 + 1/1712)] = 1.24.
At the 5% level of significance (2-tail), the critical region is z < -1.96 or z > 1.96. Since our calculated z of 1.24 is not in this region, we would not reject H₀.
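The pooled test statistic and the critical-region decision can be sketched in code as follows. The function name `pooled_z` is hypothetical, and the calculation uses the raw counts instead of the rounded proportions, so the result comes out very slightly below the hand-computed 1.24; the conclusion is the same.

```python
from math import sqrt

def pooled_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # combine both samples under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Counts from Example 2: 103 of 1,865 (test) and 79 of 1,712 (placebo).
z = pooled_z(103, 1865, 79, 1712)

# Two-tailed test at the 5% level: reject H0 only if |z| > 1.96.
reject = abs(z) > 1.96
print(f"z = {z:.3f}, reject H0: {reject}")
```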
Alternate approach: Using the TI-83, normalcdf(1.24,1E99,0,1) = .1074877610. Since this is a 2-tail situation, the P-value is 2(.1074877610), or about 21.5%. There is not strong evidence to reject H₀.
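For readers without a TI-83, the same upper-tail area can be approximated with Python's standard library. This is only a sketch: the helper name `upper_tail` is made up here, and it relies on the identity P(Z > z) = 0.5 * erfc(z / sqrt(2)) for a standard normal variable rather than on anything in the textbook.

```python
from math import erfc, sqrt

def upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * erfc(z / sqrt(2))

z = 1.24  # test statistic from Example 2
one_tail = upper_tail(z)  # plays the role of normalcdf(1.24, 1E99, 0, 1)
p_value = 2 * one_tail    # double it for a two-tailed test

print(f"one-tail area = {one_tail:.4f}, two-tail P-value = {p_value:.3f}")
print("Reject H0 at the 5% level:", p_value < 0.05)
```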

