I don’t think this is very good, but I’m tired of working on it and I said I’d write it, so here it is. See http://yudkowsky.net/rational/bayes for a better essay on thinking about probability.

A friend asked me about this oft-cited probability problem:

A woman and a man (who are unrelated) each have two children. We know that at least one of the woman’s children is a boy and that the man’s oldest child is a boy. Can you explain why the chances that the woman has two boys do not equal the chances that the man has two boys?

(text copied from here, which attributes the problem’s popularity to an “Ask Marilyn” column in 1996)

Given a tricky problem, some people are stumped, clever people find a tricky answer, and brilliant people find a way of thinking so the problem isn’t tricky any more. This problem is about interpreting evidence, and the brilliant person who discovered how to think about evidence is Laplace. (His method is called Bayes’ Rule. Bayes came first, but Laplace seems more important based on my cursory reading of the history.) To illustrate it, let’s start with a different problem, related only by the fact that it’s also about using evidence.

I picked a day of the week by first choosing randomly between weekend and weekday, then picking randomly inside that set. I wrote my chosen day down and picked one of the letters in its name at random. The letter was ‘d’. What is your best guess for the day that I picked? How confident are you? (“At random” means there was uniform probability.)

First, write out all the possibilities. These are called the “hypotheses” (plural of “hypothesis”).

- Saturday
- Sunday
- Monday
- Tuesday
- Wednesday
- Thursday
- Friday

Next, find their starting probabilities, the ones we have before learning what letter was picked. Half the probability goes to the weekend, so Saturday and Sunday get 1/4 each. The rest goes to the five weekdays, so they get 1/10 each. This assignment is called the “prior”. The picture shows lengths proportional to probability.

Let’s imagine we did this experiment 10,080 times (a number I’m choosing since it will work out well later). Then the weekend days come up 2,520 times each and the weekdays 1,008 times each. (These are the expected values for how many times the days would show up. In a real experiment there would be random deviations.)

Next we look at the evidence – the letter ‘d’. Out of our 10,080 experiments, in how many does the letter ‘d’ come up? It turns out this will happen 1,565 times – 315 times on Saturday, 420 times on Sunday, etc.
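The counts quoted above can be reproduced with a short script. This is a minimal sketch, assuming the letter is drawn uniformly from the day’s English name; the names `TRIALS`, `day_counts`, `letter_chance` and `d_counts` are made up for the illustration.

```python
from fractions import Fraction

TRIALS = 10_080
WEEKEND = ["Saturday", "Sunday"]
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]

# Expected number of trials in which each day is picked:
# a quarter of the trials per weekend day, a tenth per weekday.
day_counts = {day: TRIALS // 4 for day in WEEKEND}
day_counts.update({day: TRIALS // 10 for day in WEEKDAYS})

def letter_chance(day, letter="d"):
    """Chance that a uniformly random letter of the day's name is `letter`.

    Case is ignored (an assumption; it makes no difference for 'd').
    """
    return Fraction(day.lower().count(letter), len(day))

# Expected number of trials in which the day is picked AND the letter is 'd'.
d_counts = {day: int(n * letter_chance(day)) for day, n in day_counts.items()}
total_d = sum(d_counts.values())

for day, n in d_counts.items():
    print(f"{day:<9} {n}")
print("total    ", total_d)  # 1565
```

Wednesday is the only name with two d’s, which is why its count (224) is higher than the other weekdays’.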

The illustration looks like this (partially filled in)

The bottom bar represents all the trials where the letter ‘d’ popped up. It is too small to read, so we’ll blow it up.

We can divide by the total number of times the letter ‘d’ came up to get the probabilities. (Individually this step is called “normalizing”, but it’s really part of updating.)

We don’t really need to consider doing the experiment 10,080 times; I just thought that made it more convenient to visualize. What’s important is the probability distribution at the end. This solves the problem. We now know, given that ‘d’ was the chosen letter, the probability for each day of the week.
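Here is a sketch of the same calculation done directly with probabilities, skipping the 10,080 imagined trials. It assumes, as before, that the letter is drawn uniformly from the day’s English name; the variable names are only illustrative.

```python
from fractions import Fraction

# Prior: 1/4 for each weekend day, 1/10 for each weekday.
prior = {"Saturday": Fraction(1, 4), "Sunday": Fraction(1, 4)}
for day in ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]:
    prior[day] = Fraction(1, 10)

# Likelihood: chance of drawing 'd' from each day's name.
likelihood = {day: Fraction(day.lower().count("d"), len(day)) for day in prior}

# Update: multiply prior by likelihood, then normalize.
unnormalized = {day: prior[day] * likelihood[day] for day in prior}
evidence = sum(unnormalized.values())
posterior = {day: p / evidence for day, p in unnormalized.items()}

best = max(posterior, key=posterior.get)
for day, p in posterior.items():
    print(f"{day:<9} {p}  (about {float(p):.3f})")
print("best guess:", best)  # Sunday, with probability 84/313
```

Under these assumptions the best guess is Sunday, but only with about 27% confidence, so the guess is more likely wrong than right.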

To recap, an outline for the procedure is

- Find all the possibilities (i.e. define the hypotheses).
- Determine how likely the hypotheses are beforehand (i.e. choose the prior).
- Update the hypotheses based on how well they explain the data (i.e. multiply them by their probabilities of producing the evidence).
- Finish updating by making the hypotheses’ probabilities add up to one (i.e. normalize).
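The four steps above can be sketched as one small function. The name `bayes_update` and its dictionary-based interface are made up for this illustration. The usage example applies it to the children problem quoted at the top, assuming boys and girls are equally likely and independent.

```python
from fractions import Fraction

def bayes_update(prior, likelihood):
    """Return the posterior distribution over hypotheses.

    prior:      dict mapping hypothesis -> probability before the evidence
    likelihood: dict mapping hypothesis -> probability of seeing the
                evidence if that hypothesis is true
    """
    # Multiply each prior by how well the hypothesis explains the evidence.
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no hypothesis can produce this evidence")
    # Normalize so the probabilities add up to one.
    return {h: p / total for h, p in unnormalized.items()}

# Hypotheses for two children, older child first (B = boy, G = girl).
quarter = Fraction(1, 4)
prior = {"BB": quarter, "BG": quarter, "GB": quarter, "GG": quarter}

# Woman: "at least one is a boy" is explained by everything except GG.
woman = bayes_update(prior, {"BB": 1, "BG": 1, "GB": 1, "GG": 0})
# Man: "the oldest is a boy" is explained only by BB and BG.
man = bayes_update(prior, {"BB": 1, "BG": 1, "GB": 0, "GG": 0})

print(woman["BB"], man["BB"])  # 1/3 1/2
```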

Let’s reword the original problem to make it clear exactly what evidence we’re collecting, then apply the method:

There are two bros who like to tan their balls. Unfortunately, this can cause testicular cancer. Given their amount of ball-tanning, each testicle of each bro has a 50% chance of having cancer. (The testicles are all statistically independent of each other.) The two bros decide to conduct testicular exams to see whether they have cancer, and their self-administered exams are perfectly accurate. The first bro decides to examine both his testicles, then report whether or not at least one of them has cancer. The second bro decides to examine only his left testicle because he thinks that examining both would count as cradling them and be gay. Suppose the first bro reports that indeed, at least one of his balls has cancer. The second bro reports that his left ball has cancer. Do they have the same probability of having two balls with cancer?

The bros go about collecting different data. The evidence they bring to bear is different, so a good way of handling evidence should be able to show us that the probabilities are different. But there is no need to be clever about finding the solution when you have a general method at hand.

The hypotheses we’ll use are: both cancerous, left cancerous/right healthy, left healthy/right cancerous, and both healthy. They are all equally likely, at 1/4 each.

In this case, updating is very simple. Each hypothesis explains the results either perfectly or not at all. Here is what updating looks like for the bro who tested both balls:

Here it is for the bro who tested on the left ball:

So the bro who tested both balls has a 1/3 chance of having cancer in both balls, while the bro who tested the left ball has a 1/2 chance. Both came back with a positive result, but the bro who tested both balls has weaker evidence. It is easier to satisfy “at least one ball has cancer” than to satisfy “the left ball has cancer”, so when he comes back and reports that at least one ball has cancer, he hasn’t given as much information, and his probability doesn’t shift upwards as much. A strong test is one which the hypothesis of interest passes, but which eliminates the competing hypotheses. More hypotheses pass the test “at least one ball has cancer”, so that test doesn’t eliminate as much and is not as strong.
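Since every hypothesis starts out equally likely and explains each report either perfectly or not at all, updating here reduces to crossing out hypotheses and counting the survivors. A minimal sketch of that counting follows; the helper name `chance_of_both` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Each hypothesis is (left, right); True means that testicle is cancerous.
hypotheses = list(product([True, False], repeat=2))

def chance_of_both(evidence):
    """P(both cancerous | report), by counting the surviving hypotheses.

    `evidence(left, right)` returns True if the hypothesis fits the report.
    Counting is valid only because all four hypotheses are equally likely.
    """
    survivors = [h for h in hypotheses if evidence(*h)]
    both = sum(1 for left, right in survivors if left and right)
    return Fraction(both, len(survivors))

# First bro reports "at least one has cancer": three hypotheses survive.
first_bro = chance_of_both(lambda left, right: left or right)
# Second bro reports "the left one has cancer": two hypotheses survive.
second_bro = chance_of_both(lambda left, right: left)

print(first_bro, second_bro)  # 1/3 1/2
```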

Let’s apply the same method to the Monty Hall problem. Your hypotheses are that the prize is behind door one, door two, or door three. These are equally likely to begin with. You choose door one, and Monty Hall opens door two. That’s the evidence.

If the prize is behind door one (the door you chose), Monty Hall could have opened either door two or door three, so there was a 50% chance to see the observed evidence. If the prize was behind door two, there was a 0% chance, so that hypothesis is gone. If the prize was behind door three, Monty Hall was forced to open door two, so that hypothesis explains the evidence 100% and becomes more likely.

Here’s a diagram for the probabilities after updating:

So you should switch doors. The usefulness of this method is that you don’t need a clever trick specific to the Monty Hall problem. As long as you understand how evidence works, you can solve the problem without thinking and spend your limited thinking power somewhere else.
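For reference, here is the Monty Hall update written out as a short script. It uses the likelihoods given above (50%, 0%, 100%), which assume Monty picks at random between doors two and three when the prize is behind your door.

```python
from fractions import Fraction

# Prior: the prize is equally likely to be behind each door.
third = Fraction(1, 3)
prior = {1: third, 2: third, 3: third}

# Chance that Monty opens door two, given you chose door one.
likelihood = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1)}

# Multiply prior by likelihood, then normalize.
unnormalized = {door: prior[door] * likelihood[door] for door in prior}
total = sum(unnormalized.values())
posterior = {door: p / total for door, p in unnormalized.items()}

for door, p in posterior.items():
    print(f"door {door}: {p}")
# door 1: 1/3
# door 2: 0
# door 3: 2/3
```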

Within this framework, we can notice some things that help make the problem more intuitive, even after it’s solved. Suppose you have a hypothesis and you collect some evidence. How much more likely does the hypothesis become? What matters is how much better the hypothesis explains the data than the other hypotheses, not just how well it does by itself. So, if you have two hypotheses and they both explain the data equally well, that data isn’t evidence either way.

You can apply this idea to the bro who tested his left testicle. Whether the right testicle has cancer or not, the data on the left testicle is equally well explained. Therefore, the right testicle’s probability remains unchanged from 50%, so the bro has a 50% chance of two cancerous balls.

The same is not true for the bro who tested both testicles. If that bro has cancer in his right testicle, that explains the result better than if he doesn’t. That is to say, if the bro has cancer in his right testicle, that explains the result perfectly. But if he doesn’t, there’s only a 50% chance of explaining the result. As a result, the 1:1 odds for the right testicle get multiplied by 2:1 to give 2:1 odds. The probability for his right ball to have cancer increases to 2/3. (The probability for his left testicle to have cancer is also 2/3, but they are not independent.)
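The odds calculation can be written out as a worked equation. The symbols are labels introduced for this illustration: $R$ means “the right testicle has cancer” and $E$ means “at least one testicle has cancer”.

```latex
\frac{P(R \mid E)}{P(\neg R \mid E)}
  = \frac{P(R)}{P(\neg R)} \times \frac{P(E \mid R)}{P(E \mid \neg R)}
  = \frac{1/2}{1/2} \times \frac{1}{1/2}
  = \frac{2}{1},
\qquad
P(R \mid E) = \frac{2}{2 + 1} = \frac{2}{3}.
```

The first factor is the prior odds (1:1) and the second is the ratio of how well each hypothesis explains the report (2:1).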

In the Monty Hall problem, Monty Hall will reveal an empty door no matter what, so the hypothesis “the prize was behind your original door” explains the data just as well as “the prize was not behind your original door”. The evidence therefore does not favor either one, and the probability that the prize was behind your original door stays at 1/3. Good Bayesians are not confused by the Monty Hall problem.
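A quick check of that claim, as a sketch: it compares how well “prize behind your original door” and “prize elsewhere” explain Monty opening door two, assuming you chose door one and Monty picks at random when he has a choice.

```python
from fractions import Fraction

half = Fraction(1, 2)

# P(Monty opens door two | prize location), given you chose door one.
opens_two = {1: half, 2: Fraction(0), 3: Fraction(1)}

# Hypothesis A: the prize is behind your original door (door one).
likelihood_original = opens_two[1]

# Hypothesis B: the prize is elsewhere. Under B, doors two and three
# are equally likely, so average their likelihoods.
likelihood_elsewhere = half * opens_two[2] + half * opens_two[3]

print(likelihood_original, likelihood_elsewhere)  # 1/2 1/2
```

Both hypotheses give the evidence the same probability, so the 1/3 versus 2/3 split between them is unchanged.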

