"In God we trust. All others mustbring data."

Robert Hayden, Plymouth StateCollege


OVERVIEW: The binomial distribution is frequently useful in situations where there are two outcomes of interest, such as success or failure. It is often used to model real-life situations, and it finds its way into many extremely useful and important statistical applications and computations.

The binomial setting:

(1) Each observation is in one of two categories: success or failure.

(2) A fixed number, N, of observations.

(3) Observations are independent. (Knowing the result of one observation tells you nothing about the other observations.)

(4) The probability of success is the same for each observation.

If a count, X, has a binomial distribution with number of observations, N, and probability of success, p, then

Mean(X) = $\mu_X = Np$

Standard Deviation(X) = $\sigma_X = \sqrt{Np(1-p)}$

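The probability that one will get exactly k successes is ${}_{N}C_{k}\,p^{k}(1-p)^{N-k}$, where ${}_{N}C_{k} = \frac{N!}{k!\,(N-k)!}$ is the number of ways to choose which k of the N observations are the successes.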
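
For readers who would like to check these formulas on a computer, here is a minimal Python sketch. The function names (binomial_mean, binomial_sd, binomial_pmf) are made up for this illustration and are not part of the handout or the TI-83; the values N = 12 and p = 0.4 in the check at the end are arbitrary.

```python
from math import comb, sqrt


def binomial_mean(n: int, p: float) -> float:
    """Mean of a binomial count: N * p."""
    return n * p


def binomial_sd(n: int, p: float) -> float:
    """Standard deviation of a binomial count: sqrt(N * p * (1 - p))."""
    return sqrt(n * p * (1 - p))


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes: NCk * p^k * (1 - p)^(N - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Sanity check with arbitrary values: the probabilities for
# k = 0, 1, ..., N should add up to 1 (up to floating-point rounding).
n, p = 12, 0.4
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
print(total)
```

math.comb(n, k) computes NCk exactly as an integer, so the only rounding comes from the powers of p and 1 - p.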


I roll a single die 60 times. If X represents the number of times I roll a "3", then

$\mu_X = 60(1/6) = 10$.

$\sigma_X = \sqrt{60(1/6)(5/6)} = 2.89$.

The probability that I will roll exactly ten 3's is ${}_{60}C_{10}\,(1/6)^{10}(5/6)^{50}$ = .1370 = 13.7%.

On the TI-83, binompdf(60,1/6,10) = .1370
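
The die example can also be reproduced without a calculator. The following is a small Python sketch of that arithmetic; the variable names are chosen for this illustration only, and the die is assumed to be fair.

```python
from math import comb, sqrt

n = 60      # number of rolls
p = 1 / 6   # chance of a "3" on one roll of a fair die (assumption)
k = 10      # number of 3's we are asking about

mean = n * p
sd = sqrt(n * p * (1 - p))
prob_exactly_k = comb(n, k) * p**k * (1 - p) ** (n - k)

print(f"mean = {mean:.2f}")                   # mean = 10.00
print(f"sd = {sd:.2f}")                       # sd = 2.89
print(f"P(X = 10) = {prob_exactly_k:.4f}")    # P(X = 10) = 0.1370
```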


Example (Small sample size from large population. Use of binomial distribution is appropriate.):

Assume that 30% of a population is Hispanic. A random sample of size 4 is chosen from this population. If X is the number of Hispanics in the sample, then

$\mu_X = (4)(.3) = 1.2$

$\sigma_X = \sqrt{(4)(.3)(.7)} = 0.9165$

Prob(X=0) = ${}_{4}C_{0}\,(.3)^{0}(.7)^{4}$ = 0.2401

Prob(X=1) = ${}_{4}C_{1}\,(.3)^{1}(.7)^{3}$ = 0.4116

Prob(X=2) = ${}_{4}C_{2}\,(.3)^{2}(.7)^{2}$ = 0.2646

Prob(X=3) = ${}_{4}C_{3}\,(.3)^{3}(.7)^{1}$ = 0.0756

Prob(X=4) = ${}_{4}C_{4}\,(.3)^{4}(.7)^{0}$ = 0.0081

The probability that a sample would contain two or fewer Hispanics is Prob(X=0) + Prob(X=1) + Prob(X=2) = 0.2401 + 0.4116 + 0.2646 = 0.9163 = 91.63%.

The TI-83 can be very useful in calculating binomial probabilities. For instance, the probability that the sample contains exactly 2 Hispanics is binompdf(4,.3,2) = .2646.

The probability that the sample contains 2 or fewer Hispanics is binomcdf(4,.3,2) = .9163.
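
If a TI-83 is not at hand, the same two calculations can be imitated in Python. The sketch below defines two helper functions named after the calculator commands, with the same argument order (N, p, k). These are illustrative stand-ins written for this handout's example, not the calculator's own routines.

```python
from math import comb


def binompdf(n: int, p: float, k: int) -> float:
    """P(X = k): probability of exactly k successes in n observations."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomcdf(n: int, p: float, k: int) -> float:
    """P(X <= k): add up binompdf for 0, 1, ..., k successes."""
    return sum(binompdf(n, p, j) for j in range(k + 1))


# Sample of size 4 from a population that is 30% Hispanic.
for k in range(5):
    print(k, round(binompdf(4, 0.3, k), 4))
# 0 0.2401
# 1 0.4116
# 2 0.2646
# 3 0.0756
# 4 0.0081

print(round(binomcdf(4, 0.3, 2), 4))   # 0.9163
```

Writing binomcdf as a sum of binompdf values mirrors the hand calculation Prob(X=0) + Prob(X=1) + Prob(X=2).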


NOTE: It is important to understand when one has a binomial setting, and when one doesn't have such a setting. For instance, consider a regular shuffled deck of 52 cards.

Setting #1: I pick a card at random and note whether or not it is a heart. I put the card back in the deck, thoroughly shuffle the deck, and then randomly pick a card. Again, I note whether or not it is a heart. I repeat this process ten times. If X is the total number of hearts I obtained in the ten trials, then this is a binomial setting. Possible values for X are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. [N = 10, p = 0.25, and each observation is independent of the previous ones.]

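Setting #2: Basically the same situation, except that I do not put the randomly picked card back in the deck before each reshuffling. After ten trials, possible values for X are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. However, this is not a binomial setting. Each observation after the first is not independent of the previous ones, and the probability of a heart is no longer the same for each observation. For example, if the first card is a heart, the probability that the second card is a heart is 12/51 rather than 13/52.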
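
To see that the two settings really give different answers, one can compare a single probability in each: the chance of getting no hearts at all in the ten draws. The Python sketch below works this out with exact fractions. The variable names are made up for this illustration, and "no hearts in ten draws" is simply a convenient event to compare.

```python
from fractions import Fraction

HEARTS, DECK, DRAWS = 13, 52, 10

# Setting #1 (card replaced each time): every draw misses a heart with
# the same probability 39/52 = 3/4, so the binomial formula applies.
p_heart = Fraction(HEARTS, DECK)
no_hearts_with_replacement = (1 - p_heart) ** DRAWS

# Setting #2 (card not replaced): the chance of missing a heart changes
# from draw to draw, because both the deck and the non-hearts shrink.
no_hearts_without_replacement = Fraction(1)
for i in range(DRAWS):
    non_hearts_left = (DECK - HEARTS) - i
    cards_left = DECK - i
    no_hearts_without_replacement *= Fraction(non_hearts_left, cards_left)

print(float(no_hearts_with_replacement))      # about 0.056
print(float(no_hearts_without_replacement))   # about 0.040
```

The first number is what binompdf(10,.25,0) would give; the second is smaller, so using the binomial formula in Setting #2 would give the wrong probability.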