Sanderson M. Smith

# BAYES' THEOREM...A SIMPLE EXAMPLE

Thomas Bayes (died 1761, year of birth believed to be 1701) was a skillful mathematician, but he lived most of his life in England as an ordained nonconformist minister. None of his works on mathematics were published during his lifetime. He is credited with a theorem which has had a major influence on the development of modern statistics.

Notation: Prob(A) means "the probability of event A" and Prob(A|B) is "the probability of event A, given that event B has happened."

Bayes' Theorem: Prob(A|B)xProb(B) = Prob(B|A)xProb(A)
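As a brief sketch of why the identity holds (using the standard definition of conditional probability, which the text above does not state explicitly), both sides are two ways of writing the probability that A and B happen together:

```latex
\begin{align*}
\operatorname{Prob}(A \mid B)\,\operatorname{Prob}(B)
  &= \operatorname{Prob}(A \text{ and } B)
   = \operatorname{Prob}(B \mid A)\,\operatorname{Prob}(A), \\[4pt]
\text{so}\quad
\operatorname{Prob}(A \mid B)
  &= \frac{\operatorname{Prob}(B \mid A)\,\operatorname{Prob}(A)}{\operatorname{Prob}(B)}
  \qquad \text{provided } \operatorname{Prob}(B) > 0 .
\end{align*}
```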

Now, Prob(A|B) and Prob(B|A) are often confused by even the most intelligent of people. The confusion often appears in legal cases and is sometimes called the Prosecutor's Fallacy. Bayes' Theorem relates these two distinct conditional probabilities.

Here is a simple example to illustrate Bayes' Theorem.

The situation: A serious crime has been committed. It is known that only one person was involved. There are only 25 people who could be considered as suspects. It is also known for certain that the individual who committed the crime has red hair.

The table shown further below, after the calculations, represents the population of individuals who could be considered as suspects.

In a random selection of an individual from the population, events A and B are defined as follows:

• A: person is innocent
• B: person has red hair

Using the table, and remembering that exactly one of the two red-haired individuals is guilty, the probabilities are calculated as follows.

• Prob(A) = 24/25 = 96%
• Prob(B) = 2/25 = 8%
• Prob(A|B) = 1/2 = 50%
• Prob(B|A) = 1/24 = 4.17%

We now note that

• Prob(A|B)xProb(B) = (1/2)(2/25) = 1/25
• Prob(B|A)xProb(A) = (1/24)(24/25) = 1/25

Bayes' Theorem does indeed work in this simple example.
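The same check can be sketched in a few lines of Python by counting directly from the 25-person table. The variable names are illustrative only. The choice of which redhead is the guilty one is an assumption made for the sketch, since the table does not say, and either choice gives the same probabilities.

```python
from fractions import Fraction

# The 25 suspects, in the order they appear in the table.
suspects = ["NR"] * 6 + ["RED"] + ["NR"] * 15 + ["RED"] + ["NR"] * 2

# Assumption for illustration: the guilty person is the first redhead listed.
guilty_index = suspects.index("RED")

total = len(suspects)
innocent = [i for i in range(total) if i != guilty_index]        # event A
red = [i for i in range(total) if suspects[i] == "RED"]          # event B
innocent_and_red = [i for i in innocent if suspects[i] == "RED"]

p_a = Fraction(len(innocent), total)                        # Prob(A)
p_b = Fraction(len(red), total)                             # Prob(B)
p_a_given_b = Fraction(len(innocent_and_red), len(red))     # Prob(A|B)
p_b_given_a = Fraction(len(innocent_and_red), len(innocent))  # Prob(B|A)

lhs = p_a_given_b * p_b   # Prob(A|B) x Prob(B)
rhs = p_b_given_a * p_a   # Prob(B|A) x Prob(A)

print(p_a, p_b, p_a_given_b, p_b_given_a)  # 24/25 2/25 1/2 1/24
print(lhs, rhs, lhs == rhs)                # 1/25 1/25 True
```

Exact fractions are used instead of floating-point numbers so that the two sides of Bayes' Theorem can be compared for exact equality.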

RED: Individual with red hair.

NR: Individual whose hair color is not red.

NR NR NR NR NR
NR RED NR NR NR
NR NR NR NR NR
NR NR NR NR NR
NR NR RED NR NR

We can conclude that the probability an innocent person has red hair is only 4.17%. This calculation probably won't be pleasing to the innocent person with red hair. The innocent redhead gets a better deal if it is reported that the probability that a person with red hair is innocent is 50%. Obviously, there is a big difference between 4% and 50%, yet the stated conditional probabilities frequently sound very much alike to those not used to using probability to evaluate uncertainties.

The example provided may be too simple to appreciate the power of Bayes' Theorem. Let's make an attempt to jazz it up a bit. Suppose my buddy, Herkimer, has statistical power and is assisting a lawyer defending a client with red hair who has been accused of a serious crime. The working premises are:

• The population of suspects is very large.
• Only one person committed the crime.
• The person who committed the crime has red hair.
• The prosecutor tells the jury that the probability an innocent person has red hair is .0099, or 0.99%.

Wow! That .0099 could certainly sway the jury into thinking the individual on trial is guilty. But Herky now brings his statistical knowledge into the picture. The prosecutor has stated

Prob(red hair|innocent) = .0099

Herky's research shows that 1% of the population has red hair.

Bayes' Theorem tells Herky the following:

Prob(innocent|red hair)xProb(red hair) = Prob(red hair|innocent)xProb(innocent)

Using basic algebra, Herky obtains

Prob(innocent|red hair) = [Prob(red hair|innocent)xProb(innocent)]/Prob(red hair)

Since the population is large and only one person is guilty, the probability that a randomly chosen individual is innocent is extremely close to 100%. In other words, it is reasonable to use the approximation Prob(innocent) = 1. Substitution into Bayes' Theorem yields

Prob(innocent|red hair) = [(.0099)(1)]/(.01) = .99, or 99%.

Double wow! The probability that an individual with red hair is innocent is 99%. What a contrast to the .0099 presented by the prosecutor, which, of course, represents an entirely different conditional probability.
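Herky's substitution can be reproduced with a short Python sketch. The function name and parameter names are hypothetical, chosen only for this illustration. The value Prob(innocent) = 1 is the approximation made in the text, not an exact figure.

```python
from fractions import Fraction


def prob_innocent_given_red(p_red_given_innocent, p_innocent, p_red):
    """Rearranged Bayes' Theorem:
    Prob(innocent|red) = Prob(red|innocent) x Prob(innocent) / Prob(red).
    """
    return p_red_given_innocent * p_innocent / p_red


# Values taken from the example; Prob(innocent) = 1 is the text's approximation.
p_red_given_innocent = Fraction("0.0099")  # the prosecutor's figure
p_innocent = Fraction(1)                   # approximation for a large population
p_red = Fraction("0.01")                   # Herky's research: 1% have red hair

result = prob_innocent_given_red(p_red_given_innocent, p_innocent, p_red)
print(result, float(result))  # 99/100 0.99
```

The two conditional probabilities differ by a factor of Prob(innocent)/Prob(red), which here is about 100. That factor is why .0099 and .99 can describe the same situation.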

If you want a model for this last presentation, consider a population of 10,000 with 100 redheads. This might make the mathematics involved in the use of Bayes' Theorem less abstract. And, think about this. The defense lawyer being assisted by Herky might want to present a model to the jury to convince them that Bayes' Theorem does produce meaningful results.
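One way to build such a model is sketched below in Python. It assumes, as the premises state, that exactly one person in the population is guilty and that this person has red hair. The variable names are illustrative.

```python
from fractions import Fraction

population = 10_000
redheads = 100
guilty = 1  # the one guilty person, who has red hair

innocent = population - guilty         # 9,999 innocent people
innocent_redheads = redheads - guilty  # 99 innocent redheads

p_red = Fraction(redheads, population)                        # 1/100
p_innocent = Fraction(innocent, population)                   # 9999/10000
p_red_given_innocent = Fraction(innocent_redheads, innocent)  # 99/9999
p_innocent_given_red = Fraction(innocent_redheads, redheads)  # 99/100

print(round(float(p_red_given_innocent), 4))  # 0.0099
print(float(p_innocent_given_red))            # 0.99

# Bayes' Theorem holds exactly in the model, with no approximation needed.
print(p_innocent_given_red * p_red == p_red_given_innocent * p_innocent)  # True
```

In this model the prosecutor's .0099 is 99/9999 rounded to four decimal places, and Prob(innocent) is 0.9999 instead of exactly 1, which shows how mild the approximation in the calculation is.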

Here's an incredible story provided by Brian S. Everitt in his book Chance Rules: An Informal Guide to Probability, Risk, and Statistics (Springer-Verlag New York, Inc., 1999). In 1997, the Court of Appeal in the United Kingdom issued the following statement:

Introducing Bayes' Theorem, or any similar method, into a criminal trial plunges the jury into inappropriate and unnecessary realms of complexity, deflecting them from their proper tasks. Reliance on evidence of this kind is a recipe for confusion, misunderstanding, and misjudgement.

Not surprisingly, these ill-considered comments were severely criticized by professional statisticians and organizations, including the Royal Statistical Society of London. The Society condemned the arrogance of the legal profession for banning disciplined, scientific reasoning in court, and asked if the Pythagorean Theorem or Ohm's Law should be equally open to rejection.

There are three kinds of lies: lies, damned lies, and statistics.

Benjamin Disraeli