The Drunkard’s Walk

In one speech, Charlie Munger said that if you do not understand elementary probability, you go through a long life like a one-legged man in an ass-kicking contest. I recently finished reading the book The Drunkard’s Walk: How Randomness Rules Our Lives by Leonard Mlodinow. The author teaches elementary probability and statistics using real-life examples from fields like sports, psychology, medicine, and financial markets. In the process he also shows how chance plays an important role in our lives. In this book, I came across a few interesting problems in probability that I am sharing in this post.

The Monty Hall Problem

The Monty Hall problem is a probability puzzle, loosely based on the American television game show Let’s Make a Deal and named after its original host, Monty Hall. Given below is the problem statement.

Suppose the contestants on a game show are given the choice of three doors: Behind one door is a car; behind the others, goats. After a contestant picks a door, the host, who knows what’s behind all the doors, opens one of the unchosen doors, which reveals a goat. He then says to the contestant, “Do you want to switch to the other unopened door?” Is it to the contestant’s advantage to make the switch?

The problem appears very easy, but it is not. One Harvard professor who specializes in probability and statistics said, “Our brains are just not wired to do probability problems very well.” The video contains the solution to this problem.
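The two-thirds answer can also be checked by listing every case. The following is a minimal Python sketch, not taken from the book or the video. It assumes that the car and the contestant's first pick are each uniformly random, and that the host opens either goat door with equal probability when he has two to choose from. The function name is made up for illustration.

```python
from fractions import Fraction


def monty_hall_win_probabilities():
    """Return (P(win if staying), P(win if switching)) by exact enumeration."""
    doors = (0, 1, 2)
    stay = Fraction(0)
    switch = Fraction(0)
    for car in doors:
        for pick in doors:
            # Car position and first pick are assumed independent and uniform.
            p_setup = Fraction(1, 9)
            # The host may only open a door that is neither picked nor the car.
            options = [d for d in doors if d != pick and d != car]
            for opened in options:
                # Assumption: the host chooses uniformly among his options.
                p = p_setup / len(options)
                remaining = next(d for d in doors if d != pick and d != opened)
                if pick == car:
                    stay += p
                if remaining == car:
                    switch += p
    return stay, switch


stay, switch = monty_hall_win_probabilities()
print(stay, switch)  # 1/3 2/3
```

Staying wins only when the first pick was the car, which is why switching wins in the remaining two thirds of the cases.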

Mammogram Test

In Germany and the United States, researchers asked physicians to estimate the probability that an asymptomatic woman between the ages of 40 and 50 who has a positive mammogram actually has breast cancer if 7 percent of mammograms show cancer when there is none. In addition, the doctors were told that the actual incidence was about 0.8 percent and that the false-negative rate was about 10 percent.

Given that the mammogram test is positive, what is the probability of the woman having cancer?

In Germany several physicians estimated the probability to be 90 percent, and the median estimate was 70 percent. In the American group, 95 out of 100 physicians estimated the probability to be around 75 percent. The actual answer is around 9 percent. You can solve this problem by using Bayes’ theorem.

Let us assume that the total population = 1000

0.8% of the population might have cancer = 8 [1000 * 0.008]
10% is false-negative (negative test but they have cancer) = 0.8 [8 * 0.1]
Mammogram test is positive and having cancer = 7.2 [8 - 0.8]
7% is false-positive (positive test but they do not have cancer) = 70 [1000 * 0.07]

P(cancer with positive test) = Test positive and have cancer / Test positive
                             = 7.2 / (7.2 + 70)
                             = 7.2 / 77.2
                             ≈ 9.33%
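The same calculation can be written as a small function. This is a sketch under one stated assumption: it reads the 7 percent as a rate among women who do not have cancer (992 of the 1000), whereas the count above applies it to all 1000. The result is therefore roughly 9.4 percent rather than 9.3 percent, which is still "around 9 percent". The function and parameter names are illustrative.

```python
def posterior_given_positive(prevalence, false_negative_rate, false_positive_rate):
    """P(cancer | positive test) via Bayes' theorem.

    Assumption: false_positive_rate is P(positive | no cancer).
    """
    # Women who have cancer and test positive.
    true_positive = prevalence * (1 - false_negative_rate)
    # Women who do not have cancer but still test positive.
    false_positive = (1 - prevalence) * false_positive_rate
    return true_positive / (true_positive + false_positive)


p = posterior_given_positive(prevalence=0.008,
                             false_negative_rate=0.10,
                             false_positive_rate=0.07)
print(f"{p:.1%}")
```

Most of the positive tests come from the large group of healthy women, which is why the answer is so much lower than the physicians' estimates.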

Two-Daughter Problem

In a family there are 2 kids. If one of them is a girl then what is the probability that the other one is also a girl?

I came up with the answer as 50% without thinking, but that is incorrect. There are 2 kids and hence we have 4 equally likely possibilities: {boy, boy}, {boy, girl}, {girl, boy}, and {girl, girl}. We know that at least one of the children is a girl, hence we can eliminate {boy, boy}. We are left with 3 possibilities: {boy, girl}, {girl, boy}, and {girl, girl}. What we need is {girl, girl}. Hence the probability is 1/3 or 33.33%.

A small variation to the previous problem

In a family there are 2 kids. If the first one is a girl then what is the probability that the other one is also a girl?

This time the answer is 50%. As in the first case there are 4 possibilities: {boy, boy}, {boy, girl}, {girl, boy}, and {girl, girl}. Since we know the first child is a girl, we are left with 2 possibilities: {girl, girl} and {girl, boy}. What we need is {girl, girl}. Hence the probability is 1/2 or 50%.
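Both versions of the problem so far can be checked by enumerating the four families. Here is a minimal Python sketch, assuming boys and girls are equally likely and the two births are independent. The helper name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def conditional_probability(event, condition):
    """P(event | condition) over the four equally likely two-child families."""
    families = list(product(("boy", "girl"), repeat=2))  # (first, second)
    matching = [f for f in families if condition(f)]
    favorable = [f for f in matching if event(f)]
    return Fraction(len(favorable), len(matching))


def both_girls(family):
    return family == ("girl", "girl")


# "One of them is a girl": at least one girl in the family.
p_at_least_one = conditional_probability(both_girls, lambda f: "girl" in f)
# "The first one is a girl": the first child specifically.
p_first_is_girl = conditional_probability(both_girls, lambda f: f[0] == "girl")

print(p_at_least_one)   # 1/3
print(p_first_is_girl)  # 1/2
```

The only difference between the two calls is the condition, which is exactly what changes the size of the sample space from 3 to 2.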

Here is an interesting variation to the problem

In a family there are 2 kids. If one of the children is a girl named Florida, then what is the probability that the other one is also a girl?

Let a girl named Florida be girl-F and any other girl be girl-NF. Now we have 9 possibilities: {boy, boy}, {boy, girl-NF}, {boy, girl-F}, {girl-NF, boy}, {girl-NF, girl-NF}, {girl-NF, girl-F}, {girl-F, boy}, {girl-F, girl-NF}, and {girl-F, girl-F}. We know that one of the girls is Florida, and hence there are 5 possibilities: {boy, girl-F}, {girl-NF, girl-F}, {girl-F, boy}, {girl-F, girl-NF}, and {girl-F, girl-F}. Let us assume that no parents will name both their daughters Florida. Hence we can remove {girl-F, girl-F}, and now we have 4 possibilities: {boy, girl-F}, {girl-NF, girl-F}, {girl-F, boy}, and {girl-F, girl-NF}. Because Florida is a rare name, these 4 cases are very nearly equally likely. What we need is {girl-NF, girl-F} or {girl-F, girl-NF}. Hence the probability is 2 / 4 or 50%.
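The counting above treats the remaining cases as equally likely, which holds only approximately. The sketch below makes that explicit by weighting each case. It introduces a hypothetical parameter `f`, the fraction of girls named Florida, which is not given in the problem, and it assumes names are assigned independently of sex and birth order. The flag `allow_two_floridas` switches the "no two daughters named Florida" assumption on or off.

```python
from fractions import Fraction
from itertools import product


def p_two_girls_given_florida(f, allow_two_floridas=True):
    """P(both children are girls | at least one is a girl named Florida).

    f is the (assumed) fraction of girls named Florida, given as a Fraction.
    """
    child_types = {
        "boy": Fraction(1, 2),
        "girl-F": f / 2,
        "girl-NF": (1 - f) / 2,
    }
    total = Fraction(0)
    two_girls = Fraction(0)
    for first, second in product(child_types, repeat=2):
        if "girl-F" not in (first, second):
            continue  # no Florida in this family
        if not allow_two_floridas and first == second == "girl-F":
            continue  # drop {girl-F, girl-F} as in the text
        p = child_types[first] * child_types[second]
        total += p
        if "boy" not in (first, second):
            two_girls += p
    return two_girls / total


for f in (Fraction(1, 10), Fraction(1, 1000)):
    print(f, float(p_two_girls_given_florida(f)))
```

Under these assumptions the answer works out to (2 - f) / (4 - f), which is slightly below 1/2 and approaches 1/2 as the name becomes rarer.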

Closing Thoughts

Knowing the basics of elementary probability is not very hard. But having it in a usable form and applying it in our lives every day is very hard. The only way to acquire it is to learn it properly and use it every day. Here is some advice from Charlie Munger.

By and large, as it works out, people can't naturally and automatically do this. If you understand elementary psychology, the reason they can't is really quite simple: The basic neural network of the brain is there through broad genetic and cultural evolution. And it's not Fermat/Pascal. It uses a very crude, shortcut-type of approximation. It's got elements of Fermat/Pascal in it. However, it's not good. So you have to learn in a very usable way this very elementary math and use it routinely in life—just the way if you want to become a golfer, you can't use the natural swing that broad evolution gave you. You have to learn—to have a certain grip and swing in a different way to realize your full potential as a golfer.

15 thoughts on “The Drunkard’s Walk”

  1. If I toss two dice, and one of them is a 6, then what is the probability that the other one is also a 6?

    Obviously 1/6, because the two dice are independent. Neither one has any effect on the other.

    Equally, if there are two kids and one of them is a girl, the probability that the other one is a girl is 50%, because the gender of each is unaffected by the gender of the other.

    You could just as well frame the problem like this: if there are two kids, and one of them is “a mystery to be revealed later”, then what is the probability that the other one is a girl?… 50%, every time.

    Also, the idea that the name of the child would affect the probability is ludicrous.

    • Noel,

      Great points and I will try my best to clarify.

      Instead of a single die, suppose I roll two dice, tell you that one of them is a 6, and ask you what the probability is that the other one is also a 6. Now what is the probability? It is not 1/6. Let me show you how.

      Since there are 2 dice there are 36 possible outcomes: {1,1}, {1,2}, … {6,5}, {6,6}. But since I told you that one of them is a six, your sample space is {1,6}, {2,6}, {3,6}, {4,6}, {5,6}, {6,1}, {6,2}, {6,3}, {6,4}, {6,5}, and {6,6}. Your starting sample space has 11 possibilities. The one that we are interested in is {6,6}. So the probability is Favorable Outcomes / Total Number of Outcomes, which means it is 1 / 11 or 9.09%. You can use the same logic to derive it for the {boy, girl} problem.
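The 11-outcome sample space can be listed mechanically. This is a short Python sketch, assuming two fair dice and reading "one of them is a 6" as "at least one of them is a 6".

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of two fair dice.
rolls = list(product(range(1, 7), repeat=2))
# Outcomes consistent with "at least one die shows a 6".
with_a_six = [r for r in rolls if 6 in r]
# Of those, the outcomes where both dice show a 6.
double_six = [r for r in with_a_six if r == (6, 6)]

print(len(with_a_six))                             # 11
print(Fraction(len(double_six), len(with_a_six)))  # 1/11
```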

      For the girl named Florida, some numbers will explain it clearly. Imagine there are 1000 families with 2 kids. Let us assume 25% (250 families) of them have 2 boys, 50% (500 families) have 1 boy and 1 girl, and 25% (250 families) have 2 girls. I told you that one of the girls is named Florida. Not every girl will be named Florida, so let us assume that 10% of girls have the name Florida and that each family will have only one Florida. The first thing to find out is the sample space.

      Since we know one of them is Florida, the 25% of families with 2 boys can be removed. Also 10% of the names are Florida, so…

      In the 500 families with 1 boy and 1 girl there are 50 girls named Florida (500 * 0.1) and hence there are 50 families. In the 250 families with 2 girls there are 50 girls named Florida (250 * 2 * 0.1) and hence there are 50 families.

      So your starting sample space is 100 (50 + 50) families. Of these, the ones we want are the families with 2 girls, of which there are 50. Hence the probability is 50 / 100 => 1 / 2

      Initially I thought the name Florida did not matter. But after working it out on paper with some numbers, the concept started to make sense.

      Regards,
      Jana

      • The third example, the girl named Florida, is not legit. Note that you only stipulated in your answer, “let us assume no parent will name both their children Florida”. Without that stipulation, the probability is 1/3, just as in the first problem. No fair adding a new condition in the answer; that’s changing the rules of the game!

  2. Following up on my previous comment, here is the fallacy in the article author’s reasoning:

    When setting out permutations where no element is known ahead of time, the elements of the permutation are distinct even if they look the same. So instead of the possible permutations being

    {boy, boy}, {boy, girl}, {girl, boy}, and {girl, girl}

    as the author claimed, the possible permutations should be

    {boy1, boy2}, {boy2, boy1}, {boy, girl}, {girl, boy}, {girl1, girl2}, and {girl2, girl1}

    Then if it is known that one child is a girl, the valid permutations are reduced to

    {boy, girl}, {girl, boy}, {girl1, girl2}, and {girl2, girl1}

    and so the probability that the other child is a girl is 2/4 or 50%.

    Further, if it is known that the firstborn child is a girl, the valid permutations are

    {girl1, boy} and {girl1, girl2}

    and so the probability that the secondborn child is a girl is 1/2 or 50%.

    • Noel is correct. If the problem specifies ordering, then the possibilities become:
      When the girl is first, {girl,girl}, {girl,boy} and when girl is second, {girl,girl}, {boy,girl}
      If ordering then ordering is probable.

  3. Upon reflection, I should have used different terminology in the last part of that comment, as follows:

    Further, if it is known that the firstborn child is a girl, the valid permutations are

    {X, boy} and {X, girl}, where X is the specified firstborn girl,

    and so the probability that the secondborn child is a girl is 1/2 or 50%.

  4. It is a great book, I read it a few years ago. If you like this one then “The Black Swan” is another that you will like to read and re-read. Dr. Steve

  5. “A girl named Florida” is not guaranteed to be distinct from “a girl”. Let’s imagine that the couple, on their first date, talking about baby names, say “We should name our first boy Dakota and our first girl Florida.” In that case there are the following options: (let B-D be Boy named Dakota, B-ND be Boy not named Dakota, G-F and G-NF be Girl named Florida and Girl not named Florida): {B-D, B-ND}, {B-D, G-F}, {G-F, G-NF}, {G-F, B-D}. In this specific case you still get 2:1 that the other child is a boy.

    If they pick names out of a hat like nuclear weapon tests, your analysis is valid. “Ivy Mike”, that’s a pretty name!

  6. IMHO: I agree with the first 2 results. But the result of the 3rd problem (Florida girl) is wrong.
    It’s wrong because when you created the set of possibilities, getting a total of 9, you assumed there are 4 times fewer boy-boy pairs of brothers in this world than girl-girl pairs of sisters (because you have 1 {boy, boy} and 4 girl-girl pairs: {girl-NF, girl-NF}, {girl-NF, girl-F}, {girl-F, girl-NF}, {girl-F, girl-F}), which is wrong. The % of BB in a set should be equal to GG, and not 1/9 for BB vs 4/9 for GG like in your set.

    The problem can not be solved because we have no information on the % of girls named Florida in the total number of girls.

    • Adrian,

      Yes I agree with you that the problem needs percentage of girls named Florida. I used some numbers to give some clarity to this. I am assuming 10% of the names are Florida.

      For the girl named Florida, some numbers will explain it clearly. Imagine there are 1000 families with 2 kids. Let us assume 25% (250 families) of them have 2 boys, 50% (500 families) have 1 boy and 1 girl, and 25% (250 families) have 2 girls. I told you that one of the girls is named Florida. Not every girl will be named Florida, so let us assume that 10% of girls have the name Florida and that each family will have only one Florida. The first thing to find out is the sample space.

      Since we know one of them is Florida, the 25% of families with 2 boys can be removed. Also 10% of the names are Florida, so…

      In the 500 families with 1 boy and 1 girl there are 50 girls named Florida (500 * 0.1) and hence there are 50 families. In the 250 families with 2 girls there are 50 girls named Florida (250 * 2 * 0.1) and hence there are 50 families.

      So your starting sample space is 100 (50 + 50) families. Of these, the ones we want are the families with 2 girls, of which there are 50. Hence the probability is 50 / 100 => 1 / 2

      Regards,
      Jana

      • Hmm.. It is very counter intuitive for me to have different results (no matter what results) for these 2 problems: “In a family there are 2 kids. If one of them is a girl then what is the probability that the other one is also a girl?” [1] and “In a family there are 2 kids. If one of the children is a girl named Florida, then what is the probability that the other one is also a girl?” [2]. [1] can be rephrased like “In a family there are 2 kids. If one of them is a girl which has a certain name, then what is the probability that the other one is also a girl?” What result will we have for this case?

        If I am not wrong, using the same steps as in your last comment, I will get the same “1/2” result, no matter if the % of girls named Florida is 10% (like you assumed), or 20%, 30% etc. Can you confirm that? If this is correct, it means we can solve the problem without knowing the % of girls named Florida.

        Still, I think there is something wrong in the way you solved the problem, even if I can’t point that now.

        I bookmarked this page, and when I have some time I’ll come back and rethink all of this.

  7. The video for the Monty Hall Problem gives the correct answer; but although it is better than most explanations, it does not give the complete solution. Monty Hall gives you some information by which door he opens – but it is only useful if he is biased. The solution given is incomplete because it does not use that information.

    Say you somehow know that he always opens Door #3 on Mondays (and today is Monday), and that you picked Door #1. If you see him open Door #2, the car must be behind Door #3, and there is a 100% chance that switching doors wins. But if you see him open Door #3, there is only a 50% chance that switching wins. Note that this does not contradict what the video says – the first case happens only in the 33% of the cases where the car is behind Door #3, while the second happens in the other 67%. So the two average out to the 67% claimed in the video.

    I’m not suggesting you should assume such a bias exists, just that the complete solution should use the information contained in Monty Hall’s choice. If your original choice was a goat (a 2/3 chance), the car is equally likely to be behind either of the other two doors. But then Monty has no choice. That makes a 1/3 chance that you have a goat AND Monty Hall would open the door he did. If you originally picked the car (a 1/3 chance), Monty must choose. If we let B represent the probability he would open the door he did, there is a B/3 chance that you have the car AND Monty Hall would open the door he did. So the chance that switching will win is the first of these probabilities, divided by their sum. That’s 1/(1+B). However, you can only assume B=1/2, which makes this 2/3.
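The 1/(1+B) expression can be reproduced from the two joint probabilities named in this comment. The sketch below is only an illustration of that arithmetic. The function name and the parameter `b` (the commenter's B) are chosen here, and `b` is expected to be a `Fraction` or an integer.

```python
from fractions import Fraction


def switch_win_probability(b):
    """P(switching wins | host opened the door he did).

    b = P(host opens this particular door | contestant already holds the car).
    """
    third = Fraction(1, 3)
    # Contestant holds a goat and the car is behind the other closed door:
    # the host has no choice, so he opens this door with probability 1.
    goat_and_opened = third * 1
    # Contestant holds the car: the host opens this door with probability b.
    car_and_opened = third * b
    return goat_and_opened / (goat_and_opened + car_and_opened)


print(switch_win_probability(Fraction(1, 2)))  # 2/3
print(switch_win_probability(1))               # 1/2
print(switch_win_probability(0))               # 1
```

The last two calls correspond to the "always opens Door #3 on Mondays" example: B=1 when he opens his favored door and B=0 when he is seen opening the other one.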

    I’m not bringing this up to criticize the video, but because the Two-Daughter Problem works the same way. The answer depends on why you know that one of them is a girl in exactly the same way the previous answer depends on Monty Hall’s theoretical bias. If, in the cases where the family has a son AND a daughter, there is a 100% chance you would know about the daughter and a 0% chance you would know about the son, then Leonard Mlodinow’s solution is correct and there is a 1/3 chance of two daughters. But if the probability is B for knowing “daughter” and (1-B) for knowing “son,” the answer is 1/(1+2B). This is 1/2 if you assume B=1/2. The question is, why should we assume anything else? Specifically, why should we assume B=1 as Mlodinow does?
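The 1/(1+2B) expression follows from the same kind of weighting. This is a hedged sketch of that calculation, assuming the four boy/girl orderings are equally likely. The function name and the parameter `b` (the commenter's B, the chance of learning about the daughter in a mixed family) are labels chosen for this example.

```python
from fractions import Fraction


def p_two_daughters(b):
    """P(two daughters | you were told about a daughter).

    b = P(told "daughter" | family has one son and one daughter).
    """
    # Two daughters (prior 1/4): you are always told about a daughter.
    two_girls_and_told = Fraction(1, 4) * 1
    # One of each (prior 1/2): you are told about the daughter with probability b.
    mixed_and_told = Fraction(1, 2) * b
    # Two sons (prior 1/4) can never lead to being told about a daughter.
    return two_girls_and_told / (two_girls_and_told + mixed_and_told)


print(p_two_daughters(1))               # 1/3
print(p_two_daughters(Fraction(1, 2)))  # 1/2
```

With B=1 this gives the 1/3 answer, and with B=1/2 it gives the 1/2 answer argued for in this comment.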

    There are several things wrong with Mlodinow’s “Florida” version that I won’t go into. The answer changes in his scenario, where he assumes B=1, because he is applying a similar bias for the name. That is, he assumes that you can only know about Florida no matter what gender, or name, her sibling has. His answer is really “a little less than 1/2,” and it changes because a family of two girls is almost twice as likely to meet his requirement of having one named “Florida.” When you make your knowledge unbiased, the answer is 1/2 no matter what else you know about a daughter.
