Archive for the ‘math’ Category

My Brown Big Spiders

March 21, 2011

Professor: You have to learn to be able to play it blindfolded. The page, for God’s sake! The notes!

David: I’m sorry I was, uh, forgetting them, Professor.

Professor: Would it be asking too much to learn them first?

David: And-And then forget them?

Professor: Precisely.

from the movie Shine

If I want to find the volume and surface area of a sphere, I do it with calculus:

V = \int_{r = 0}^R\int^{2\pi}_{\phi = 0}\int_{\theta = 0}^\pi r^2\sin\theta \textrm{d}\theta \textrm{d}\phi \textrm{d}r  =  \frac{4}{3}\pi R^3


S = \int_{\theta = 0}^\pi\int_{\phi = 0}^{2\pi} R^2 \sin\theta\textrm{d}\theta\textrm{d}\phi = 4\pi R^2

This is correct, but I can’t use it with high school geometry students because they don’t know what an integral is, much less a Jacobian.

However, Archimedes came up with a beautiful way of discovering the volume and surface area of a sphere. He did it by relating the sphere to a known shape – a cylinder with a cone cut out of it.

He drew a picture like this:

Archimedes' illustration of the geometry of a sphere

On the left there’s a hemisphere with radius R. On the right, there’s a cylinder with radius and height both also R, so that the hemisphere would fit perfectly inside the cylinder. The cylinder has had a cone cut out of it, starting at the top and tapering down to the center of the bottom. First, we’ll show that these two shapes have the same volume.

We imagine slicing the hemisphere horizontally at some certain height h. This would reveal a circle as seen in the picture. Call its radius r.

At the same height, we also slice the cylinder, leaving us with a disk. We’ll find the areas of this circle and disk.

The area of the circle is \pi r^2, which by the Pythagorean theorem is also \pi (R^2 - h^2).

Looking at the cylinder, the outer edge of the disk has radius R and the inner edge has radius h, so the area of the disk is also \pi (R^2 - h^2).

Because every horizontal slice of the hemisphere has the same area as the corresponding horizontal slice of the drilled-out cylinder, they must have the same volume. The volume of the cylinder is its original volume minus the volume of the cone, or \pi R^3 - 1/3 \pi R^3 = 2/3 \pi R^3. Hence, the volume of a full sphere is

V = 4/3 \pi R^3
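For readers who would rather check the slicing argument numerically, here is a minimal Python sketch. The function names and the choice of R = 2 are my own, and the midpoint-rule sum is only a stand-in for "adding up the slices"; it is not part of Archimedes' argument.

```python
import math

def hemisphere_slice_area(R, h):
    """Area of the circular slice of a hemisphere of radius R at height h."""
    return math.pi * (R**2 - h**2)

def drilled_cylinder_slice_area(R, h):
    """Area of the ring left at height h once the cone is removed.

    The outer radius is R and the hole has radius h.
    """
    return math.pi * R**2 - math.pi * h**2

def volume_by_slices(slice_area, R, n=10_000):
    """Add up n thin slices from h = 0 to h = R (midpoint rule)."""
    dh = R / n
    return sum(slice_area(R, (i + 0.5) * dh) for i in range(n)) * dh

R = 2.0  # arbitrary radius chosen for the check
hemisphere_volume = volume_by_slices(hemisphere_slice_area, R)
cylinder_volume = volume_by_slices(drilled_cylinder_slice_area, R)
print(hemisphere_volume, cylinder_volume, 2 / 3 * math.pi * R**3)
```

All three printed values should agree closely, which is the slice-by-slice equality in numerical form.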

Next, we’ll show that the hemisphere has the same surface area as the outside of the cylinder (the cone is now unimportant).

Take a slice of the outside of the cylinder at height h and of thickness \textrm{d}h. This forms a band around the cylinder whose area is

\textrm{d}S = 2 \pi R \textrm{d}h

Now slice the sphere at the same height with the same \textrm{d}h. This also forms a band. The band is a shorter distance around, but due to the slant of the edge of the circle, it’s also thicker. Let’s call the thickness of this band \textrm{d}x.

Slices of equal thickness dh at equal heights h on a cylinder and sphere.

The area of the band around the hemisphere is the circumference at height h multiplied by the thickness \textrm{d}x.

\textrm{d}S = 2\pi\sqrt{R^2 - h^2}\textrm{d}x

If we draw a tangent line on the sphere, it’s perpendicular to the radius. This gives us similar triangles.


\frac{\textrm{d}x}{\textrm{d}h} = \frac{R}{\sqrt{R^2 - h^2}}

Plugging back into the previous expression,

\textrm{d}S = 2\pi\sqrt{R^2 - h^2} \cdot \textrm{d}h \cdot \frac{R}{\sqrt{R^2 - h^2}} = 2\pi R \textrm{d}h

So the bands around the outside of the cylinder and the sphere have the same area at every height, which means the entire shapes have the same surface area. That makes the surface area of a sphere

S = 4 \pi R^2
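The band argument can also be checked numerically. The sketch below is my own construction, not Archimedes': it approximates each thin band of the hemisphere by a cone frustum (slant length times average circumference) and compares the total with the side of the cylinder, 2πR·R. The names and the choice R = 1 are assumptions for illustration.

```python
import math

def frustum_band_area(R, h, dh):
    """Approximate area of the hemisphere's band between heights h and h + dh.

    The band is treated as a cone frustum: pi * (r1 + r2) * slant length.
    """
    r1 = math.sqrt(R**2 - h**2)
    r2 = math.sqrt(max(R**2 - (h + dh)**2, 0.0))  # guard against round-off at the top
    slant = math.hypot(dh, r1 - r2)
    return math.pi * (r1 + r2) * slant

R = 1.0
n = 10_000
dh = R / n
hemisphere_area = sum(frustum_band_area(R, i * dh, dh) for i in range(n))
cylinder_side_area = 2 * math.pi * R * R  # bands of area 2*pi*R*dh stacked to height R
print(hemisphere_area, cylinder_side_area)
```

The two numbers should match to several decimal places, and doubling either one gives 4πR² for the full sphere.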

This is a really lovely argument. The problem is pretty hard, but the solution is simple. (I’m not sure if this is quite how Archimedes did it. To be honest I never even met the guy. I learned the idea from this animation).

I was reviewing solid geometry with a high school junior the other day, so I showed her this argument (but only the volume part). I was proud of myself for offering this little example of how interesting mathematical ideas can be. At least, I was as we began.

“It’s all so complicated!” she moaned a few minutes later when I asked her to identify a certain quantity in our sketch.

Complicated? I had thought the argument was remarkably simple – just draw a sphere and a cylinder next to each other and you’re practically done. What could be simpler? Somehow my student was getting entangled in brambles I couldn’t even see.

I did not draw quite the same picture for her that I drew earlier in this post. I didn’t want to give it all away, so I drew something more like this and asked for r:

Finding r is a simple application of something she knew well – the Pythagorean theorem. She didn’t see it, though, so I showed her this right triangle:

But then she didn’t see how long the new line I just drew was. It’s just R because it’s a radius of the sphere, but although she knew that all radii of a sphere have the same length, she couldn’t easily identify the two lines as radii and call up the relevant information. So I showed her that step, too.

After a bit more prodding, she wrote down r = \sqrt{R^2 + h^2}, a mistake that comes from applying the Pythagorean theorem incorrectly. She knows better, and should have found r^2 = R^2 - h^2, but by this point she was already flustered from her earlier mistakes, confused about what we were trying to do, self-conscious, and generally unable to approach the problem equanimously.

When she realized she had applied the Pythagorean theorem wrong, her frustration mounted, and moments later, at my next question, I was shocked with, “It’s all so complicated!”

Why did this happen? Why did I so horribly misjudge the difficulty of the exercise?

The other day I read this comment on an essay on teaching:

I used to teach English as a second language. It was a mind trip.

I remember one of my students saying something like “I saw a brown big spider”. I responded “No, it should be ‘big brown spider'”. He asked why. Not only did I not know the rule involved, I had never even imagined that anyone would ever say it the other way until that moment.

Tutoring has been exposing my own brown big spiders – the little steps and bits of knowledge that I take for granted – for years. I’ve rarely stopped to notice it.

Just to follow each step in the Archimedes argument, you must make an enormous number of mathematical connections behind the scenes in your mind. Here’s a partial list:

  • A “sphere” is a round three-dimensional object, like a ball
  • Every point on the surface of a sphere is the same distance from the center
  • The “surface” of the sphere means its outside edge, or skin
  • A “point” is a little dot with no size at all. It simply marks a place.
  • You can represent three-dimensional figures in two dimensions with certain types of drawing.
  • The point of doing this drawing is to make things easier to visualize.
  • A “hemisphere” is half a sphere – the top half in this case
  • A “cylinder” is basically a tube with constant width.
  • The center of the bottom of the hemisphere is the same point as the center of the sphere it came from.
  • The height of the hemisphere is the same as the distance from the center to the edge horizontally.
  • This means that the cylinder drawn is twice as wide as it is tall.
  • The volume of a cone is one third the area of its base times its height.
  • The volume of a cylinder is its base times its height
  • The area of a circle is \pi times the square of its radius

And so on. I only stopped writing so that I’d eventually finish the rest of this post. Each item I added to that list sparked off several new ones I hadn’t considered.

Try writing your own list and you’ll quickly be overwhelmed by the exponentially-proliferating leaves on your conceptual tree. We didn’t even get close to things like Cavalieri’s principle.

The items on my brown big spider list are not supposed to be mathematical facts so much as cognitive patterns the reader is required to have. For example, mathematically a point is not, “a little dot with no size at all,” as I called it. It’s a primitive notion and has no definition. The list still calls a point a dot, though, because the mathematically-accurate description isn’t helpful to a student, and isn’t the way most people think of it even when they’ve already learned geometry well.

When I started writing the list, I found myself wanting to say, “A sphere is a set of all points equidistant…”, but that’s no good. It uses the significant brown big spiders of “set” and “equidistant”, as well as the general idea of giving mathematical definitions, something most high schoolers don’t yet understand well. Then I wanted to say, “A sphere is a shape that’s symmetric with respect to rotations about any axis…” but this has all the same problems.

Ultimately, I chose “a sphere is a ball.” It’s imprecise, but it’s the way you think about a sphere before you’ve packaged the concept away so tightly you don’t need to think about it any more. Anyone who tells you a sphere is the two-dimensional manifold S^2 is someone who has forgotten how much they actually know about spheres. They’ve forgotten it in the good way, of course – the way David was supposed to forget the notes to Rachmaninoff. Unfortunately, I experience a crippling side effect when I forget things this way. I forget that other people haven’t yet forgotten them.

This forgetting is the psychological phenomenon of “chunking”. The most famous example involves chess players. Give expert chess players a position from a game between grandmasters and they can easily memorize the positions of thirty pieces. Give them pieces strewn randomly about the board and they’ll remember just a few – no more, in fact, than your average Joe who knows little more about chess than what the real name of the horsey is.

A position from a real game has lots of meaning, if you’re an expert. If you’re an expert you extract order from the position automatically, without consciously processing every detail. The entire task must seem quite simple to a grandmaster. Similarly, the experienced mathematician sees all the important properties of the sphere and the cylinder and the cone without having to list them out one by one, and the process is so automatic they don’t even realize they’re doing it.

In “Simple” Isn’t “Easy”, I learned not to judge the difficulty of new ideas by how simple they are, but by how familiar to the student. Despite this, I have continued to make a similar mistake when dealing with ideas the students have already learned.

“Learned” isn’t “chunked”. My student understood the meaning of “hemisphere” and the formula for the volume of a cone, but she still needed conscious effort to recall and wield those bits of knowledge. Each sat in its own corner in her mind, accessible only by dint of concerted effort, and certainly not ready to flow into a flood of beautiful ideas.

I was trying to dictate a soliloquy for her to transcribe, but I was assuming that because she could see the letters on her keyboard, she could touch-type. It turned out that the effort to hunt-and-peck was so great, all the artistry of the speech was lost.

I want to watch out for my brown big spiders in the future. I want to be more patient when they are discovered and more studious in cataloging, remembering, and working with them. Most of all, I want to look back later, and remember my students forgetting them.


Divisibility By 7 Revisited

February 7, 2011

One of the most-viewed posts on this blog describes a rule for checking whether a number is divisible by seven.

This post is about another, simpler way to do it.

Take a long number, say

960,937,563,483

One way to check if it’s divisible by 7 is to subtract multiples of seven until you get down to something small. For example, 910,000,000,000 is a multiple of seven because 91 is. So subtract this number from the original to get 50,937,563,483. Now subtract 49,000,000,000 to get 1,937,563,483, etc.

This procedure is fine for small numbers, but you’ll only eliminate one or two digits per subtraction. Here’s a method useful for very long numbers.

First, turn all the 7’s into 0’s, all the 8’s into 1’s, and all the 9’s into 2’s. This is just a simple case of the rule above – subtracting some multiples of 7. The original number becomes

(260)(230)(563)(413)

(If you want, you can take this step further by turning 6 into -1, etc. I won’t do that here.)

Now break apart each group of three digits separated by parentheses, starting from the right. Put a negative sign on every other group of three digits, then add them all up.

\begin{array}{cc} {} & +413 \\ {} & -563 \\ {} & +230 \\ + & \underline{-260} \\ {} & -180 \end{array}

This number is divisible by 7 if and only if the original is.

We need to check -180 for divisibility by 7. Here, adding and subtracting multiples of 7 is easy. For example, add 210 to get 30. Now subtract 28 to get 2. The remainder when you divide 960,937,563,483 by 7 is 2.

If you want to use this rule, but the number of digits isn’t a multiple of three, you can simply add some zeros in front. For example,

45,338,017

turns into

(045)(331)(010)

and we get

\begin{array}{cc} {} & +010 \\ {} & -331 \\+ & \underline{+045} \\ {} & -276 \end{array}

-276 + 280 = 4, so the remainder when you divide 45,338,017 by 7 is 4.

This rule works due to a convenient pattern in the remainders of the powers of ten.

Starting with n = 0, the remainders when you divide 10^n by 7 (writing 6, 4, and 5 as -1, -3, and -2) are

1, 3, 2, -1, -3, -2, 1, 3, 2, -1, -3, -2,  \ldots

Each group of three digits, after alternating groups are multiplied by -1, has the same rule for divisibility by 7. For example, the remainder when 124,000,000,000 is divided by seven is the same as the remainder for -124. So we just take those groups of three and add them, simplifying the task greatly.
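The whole procedure fits in a few lines of Python. This is a minimal sketch of the rule as described above; the function names are my own, and it assumes a non-negative integer as input.

```python
def alternating_group_sum(n: int) -> int:
    """Signed sum of three-digit groups, after replacing 7, 8, 9 with 0, 1, 2."""
    digits = str(n)
    # Pad with zeros in front so the length is a multiple of three.
    width = -(-len(digits) // 3) * 3
    digits = digits.zfill(width)
    # Subtract 7 from each digit that is 7 or larger.
    digits = digits.translate(str.maketrans("789", "012"))
    groups = [int(digits[i:i + 3]) for i in range(0, len(digits), 3)]
    # Starting from the right, groups get signs +, -, +, -, ...
    return sum(g if k % 2 == 0 else -g for k, g in enumerate(reversed(groups)))

def remainder_mod_7(n: int) -> int:
    """Remainder of n on division by 7, computed from the signed group sum."""
    return alternating_group_sum(n) % 7

print(alternating_group_sum(960937563483), remainder_mod_7(960937563483))  # -180 2
print(alternating_group_sum(45338017), remainder_mod_7(45338017))          # -276 4
```

Python's `%` already returns a non-negative remainder for a negative left-hand side, so the last step of adding multiples of 7 by hand is done for us.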

Collecting the Harmonic Numbers

January 1, 2011

When I was a kid, I collected baseball cards. I’ve stopped long since. I can no longer recite streams of players’ yearly RBI totals, and I think my parents threw the cards out when I was away at college. Still, one problem related to those baseball cards still interests me.

Cards were sold in packs of twelve, each containing a random sample selected from the complete set of seven hundred. With my one-dollar allowance and the fifty-cent price of a pack of cards, how long should I have expected to invest my income before getting a complete set?

To get a complete set of 700 cards, I first need to get a card that’s not a repeat, then get another one that’s not a repeat, and so on 700 times. The first card I pull out of the first pack can’t be a repeat. After that, the second card has a 699/700 chance of being new. Once I have c different cards, the probability of the next one being distinct is (700-c)/700.

We need to know how many cards I expect to draw before getting a new one, when the probability of the next card being new is p. This is similar to the classic problem of how many times one expects to roll a die before getting a 4. The answer is just what you’d think – six times. One argument for this is that if I roll the die many times, then one sixth of the rolls will be 4’s, so the average number of rolls from one 4 to the next is six. That means the expected number of rolls to get a 4 is six. The general answer is 1/p.

So once I have c different cards, the expected number of cards I have to look at to get another new one is 700/(700-c). This makes my expected number of cards purchased to complete the set

\frac{700}{700} + \frac{700}{699} + \frac{700}{698} + \ldots + \frac{700}{2} + \frac{700}{1}

Factoring out 700 and writing the terms backwards, this is

700 \left(\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{700}\right)

In other words, if there are n possible equally-likely outcomes to a random event, the expected value of the number of trials to get every value to occur is n times the nth harmonic number.

An image from Wikipedia shows that the harmonic numbers are approximated by the integral of 1/x, so that the answer scales with n as n*log(n).

For my baseball card problem, n=700 and I’d need about 5000 cards (700 times the 700th harmonic number is roughly 4990), or around 4 years’ allowance, to complete the set (I never made it).
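Here is a small Python sketch that evaluates the sum directly and compares it with a simulation. It assumes, as the argument above does, that every card is an independent uniform draw (so repeats within a pack are allowed); the function names, the seed, and the number of trials are my own choices.

```python
import math
import random

def expected_draws(n):
    """n times the nth harmonic number: expected draws to see all n cards."""
    return n * sum(1.0 / k for k in range(1, n + 1))

def simulate_draws(n, rng):
    """Draw uniformly random cards until all n distinct cards have appeared."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

n = 700
exact = expected_draws(n)          # roughly 4990
log_estimate = n * math.log(n)     # the n*log(n) scaling, roughly 4586

rng = random.Random(0)             # fixed seed so the run is repeatable
trials = 200
average = sum(simulate_draws(n, rng) for _ in range(trials)) / trials
print(exact, log_estimate, average)
```

The simulated average fluctuates from run to run (a single collection varies by several hundred cards), but it should land in the neighborhood of the exact sum.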

The integral picture immediately suggests an improvement. The curve drawn is an underestimate – the rectangles always pop up higher than the curve. What is the approximate area left over above the curve?

A simple guess comes from taking each of those left-over areas to be a triangle. These triangles all have a base of 1. Their heights vary, but the nice thing is that the top of one triangle begins at the bottom of the next. Imagine sliding all the triangles over so they’re stacked on top of each other. This would give a total height of 1-1/(n+1) for the triangles out to the nth harmonic number. So a better guess for the nth harmonic number H_n is

H_n \approx \ln(n) + \frac{1 - 1/(n+1)}{2}

This is also an underestimate because the area above the curve doesn’t really consist of triangles, as you can see. For large n, H_n approaches \ln(n) + \gamma, where \gamma is the Euler-Mascheroni constant. It’s now known to 30 billion digits, but the most important ones are 0.577…
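To see how the two guesses compare with the real thing, here is a short Python sketch. The function names are mine, and the value of the Euler-Mascheroni constant is typed in by hand since Python's standard library does not provide it.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, entered manually

def harmonic(n):
    """The nth harmonic number, summed directly."""
    return sum(1.0 / k for k in range(1, n + 1))

def triangle_estimate(n):
    """ln(n) plus half the stacked height of the triangles."""
    return math.log(n) + (1 - 1 / (n + 1)) / 2

def gamma_estimate(n):
    """ln(n) plus the Euler-Mascheroni constant."""
    return math.log(n) + EULER_GAMMA

for n in (10, 100, 700):
    print(n, harmonic(n), triangle_estimate(n), gamma_estimate(n))
```

For each n the triangle guess comes out below the true harmonic number, and the gap between H_n and the ln(n) + γ estimate shrinks as n grows.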

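Browsing Wikipedia reveals a trove of information about these mathematical entities. For example, if you have ever tried to sum a geometric series, you saw that

\frac{1 - x^n}{1 - x} = 1 + x + x^2 + \ldots + x^{n-1}

If you integrate x^{k-1} from zero to one, you get 1/k, so

H_n = \int_0^1 \frac{1 - x^n}{1 - x} \textrm{d}x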

This lets us define fractional harmonic numbers, and for simple fractions they involve things like \pi and \sqrt{2}. Further down that page we learn that harmonic numbers are related to the Riemann zeta function and the Riemann hypothesis and a bunch of other things I don’t know about.

Why might I find this stuff interesting? If you have a closed physical system with many particles, there are many possible physical states of those particles. The ergodic hypothesis assumes that as the system evolves in time, it passes through all those states, and all states are equally likely to be accessed at some far future time (assuming we restrict attention to those states that obey conservation laws, like conservation of energy). So the harmonic numbers might give some insight into how long I have to wait to observe a particular configuration of the system. Would the harmonic numbers be discovered by a Boltzmann brain?

Solution: Halving the Triangle

November 2, 2010

The problem asked

Suppose you have an equilateral triangle and you want to cut it in half using a pizza cutter (which can cut curves, not just straight lines). What is the shortest cut you can make?

Two commenters got the correct answer. The cut is one sixth of a circle whose center is at a vertex of the triangle.

It’s not immediately obvious to me why this is true. However, the following picture from Mahajan’s book makes it pretty clear.

Image from Sanjoy Mahajan's "Street Fighting Mathematics"

Assuming the cut goes from one side of the triangle to another, we could tile six triangles around to make a hexagon, and find a minimum-length cut to divide all the triangles in half. As long as you believe a circle has the minimum perimeter for a fixed area, this should be pretty convincing.
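To put numbers on it, here is a small Python sketch comparing the circular cut with two straight cuts that also halve the triangle. The side length of 1 and the variable names are my own choices, and the two straight cuts are just natural candidates I picked for comparison.

```python
import math

SIDE = 1.0
triangle_area = math.sqrt(3) / 4 * SIDE**2

# Arc centered at a vertex: a 60-degree sector (area = radius^2 * pi / 6)
# that holds half the triangle's area.
arc_radius = math.sqrt((triangle_area / 2) / (math.pi / 6))
arc_cut = arc_radius * math.pi / 3

# Straight cut parallel to a side: it leaves a similar triangle with half
# the area, so its length is SIDE / sqrt(2).
parallel_cut = SIDE / math.sqrt(2)

# Straight cut along an altitude, from a vertex to the opposite midpoint.
altitude_cut = math.sqrt(3) / 2 * SIDE

print(arc_cut, parallel_cut, altitude_cut)
```

With a side of 1 the arc is about 0.673 long, against about 0.707 for the parallel cut and about 0.866 for the altitude.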

The Entropy of the SAT

November 2, 2010

If you get a question wrong on the SAT, you get a 1/4 point deduction. If you skip that question, there’s no penalty. This way, the test is “guessing-neutral”. Each question has five choices, so if you guess randomly your expected score increase is 1/5*1 + 4/5 * (-1/4) = 0. On average, you neither win nor lose points by guessing.

However, guessing still increases the variance of your score by 0.25 per guess (a standard deviation of 0.5 raw points). Maybe there is a better way to account for test-takers getting some correct answers at random?
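The mean and variance of a single random guess can be worked out exactly with fractions. This is a minimal sketch; the function name and its default arguments (five choices, a quarter-point penalty) are just the numbers from the paragraph above.

```python
from fractions import Fraction

def guess_stats(choices=5, penalty=Fraction(1, 4)):
    """Exact mean and variance of the score change from one random guess."""
    p_right = Fraction(1, choices)
    outcomes = [(p_right, Fraction(1)), (1 - p_right, -penalty)]
    mean = sum(prob * score for prob, score in outcomes)
    variance = sum(prob * (score - mean) ** 2 for prob, score in outcomes)
    return mean, variance

mean, variance = guess_stats()
print(mean, variance)  # 0 1/4
```

Under this model the expected gain is exactly zero, and each guess adds a variance of 1/4, which is a standard deviation of half a raw point.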

I remember the words of my 9th grade geometry teacher. We had to answer some True/False questions, and he was wise to the strategy of writing an ambiguous sort of pencil scratch which, if it were marked wrong, you could later come back and complain had been misread.

This ambigram is the intellectual property of the internet.
I stole this one, too.

Mr. Allen told us, “Be very clear about your T’s and F’s. If I can’t tell whether what you wrote is a T or an F, I’m going to guess. And I always guess whichever one is the wrong answer. I am the worst guesser ever.” He paused a moment after that and added, “Well, I guess that means I’m the best guesser ever, too.”

What he meant was that if you guess randomly, you’ll guess that the student got it right 50% of the time. If your guess has the student getting it wrong 100% of the time, that means you, the guesser, definitely know the answers. From the student’s point of view, getting all the answers wrong on a T/F test is just as hard as getting them all right.

This makes a T/F test very easy to pass. Even if you only know the answers to 20% of the questions, you can still guess on the rest and get an expected score of 60%.

We can adjust the grading of the test to get an estimate of the student’s knowledge by analyzing the amount of information they put into the test. By “information” what I mean is the reduction in the entropy.

Say we have a T/F test with 100 questions. There are 100891344545564193334812497256 ways to get exactly 50/100 on the test, but only 1 way to get 100/100 or 0/100. There are 100 ways to get 99/100 or 1/100.

If we take the binary log of these numbers, we find the entropy of a score 50/100 is 96.3 bits. The entropy of a perfect score is 0 bits. To convert to a reasonable test score, we multiply this entropy by -1 and add 96.3. Then to get on a scale of 0 – 100 multiply by 100/96.3. (This is equivalent to using not a binary log, but a log with base 2^0.963 = 1.95). Since this score is based on information, let’s call it the iScore. The random guessing iScore is zero, which reflects the test-taker’s complete lack of knowledge. The perfect iScore is 100. A raw score of 99/100 is worth 93.1. One answer wrong means a lot! Here’s a plot of the iScore as a function of the raw score.

In the general case of n questions with c choices each and a raw score r, the formula for the iScore is

iS(r) = \left(-\log\frac{n!(c-1)^{n-r}}{r!(n-r)!} + \log\frac{n!(c-1)^{n-n/c}}{(n/c)!(n-n/c)!}\right)\frac{100}{\log\frac{n!(c-1)^{n-n/c}}{(n/c)!(n - n/c)!}}

For example, here’s the iScore plot for a test with 100 questions, each with 4 choices.

The more choices there are, the less difference between the iScore and the raw score. In the limit where c \to \infty, the iScore and the raw score are the same thing. That means that if you want to give a test so the raw score accurately determines how much students know, make it free response.
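Here is a rough Python sketch of the iScore formula, in case you want to try other values of n and c. The function names are mine. One assumption worth flagging: I use the log-gamma function in place of factorials so that the chance-level score n/c does not have to be a whole number.

```python
import math

def log2_ways(n, c, r):
    """Binary log of the number of answer sheets with exactly r of n correct.

    Counts C(n, r) * (c - 1)^(n - r); lgamma stands in for the factorials
    (an assumption made so that non-integer r = n/c also works).
    """
    log_binomial = math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
    return (log_binomial + (n - r) * math.log(c - 1)) / math.log(2)

def iscore(n, c, r):
    """iScore on a 0 to 100 scale: 0 at the chance score n/c, 100 at r = n."""
    baseline = log2_ways(n, c, n / c)
    return (baseline - log2_ways(n, c, r)) * 100 / baseline

print(log2_ways(100, 2, 50))  # entropy of a 50/100 score, about 96.3 bits
print(iscore(100, 2, 99))     # about 93.1
print(iscore(100, 4, 99))
```

For the true/false case (c = 2) this reproduces the numbers quoted earlier: about 96.3 bits at 50/100 and an iScore of about 93.1 for a raw score of 99/100.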

The SAT does not report your raw score to colleges, even after its guessing adjustment. Instead it reports a scaled score, indicating how well you did relative to the other test takers. I have a suggestion for an improved algorithm for relative scoring, too. The basic idea is that you should be rewarded for getting very difficult questions correct. Meanwhile, if you make a boo boo on one of the easy questions, it shouldn’t mess you up too much because it was probably not indicative of a gap in your knowledge, and might be due to something dumb like accidentally marking the wrong bubble on the answer sheet.

When you get your SAT score report, you get to see every question on the test along with the answers you gave and the percent of test takers who got that question right. If 50% of test takers got a certain question right, that question is only as hard as a normal true/false question. We could construct an effective number of choices for each question by taking 1/(proportion correct answers). Let’s call that the question’s “quality”. The question that 50% of people got right has quality 2, while a question that only 10% of people got right has quality 10.

We take the binary log of the quality of each question and call that the question’s “weight”. A question’s weight is the number of true/false questions it’s worth. Now we multiply your raw score on each question (1 or 0) by the question’s weight and sum to create a weighted raw score. This is a different metric than the iScore, and I’m not sure which I like better. This score allows us to account for the SAT’s free response math questions, though.
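As a sketch of how the weighted raw score would be computed, here is a few lines of Python. The function names and the three-question example are made up for illustration; only the quality and weight definitions come from the text.

```python
import math

def question_weight(proportion_correct):
    """Binary log of the question's quality, 1 / (proportion who got it right)."""
    return math.log2(1 / proportion_correct)

def weighted_raw_score(answers, proportions):
    """Sum of weights over the questions answered correctly.

    answers holds 1 (right) or 0 (wrong) per question; proportions holds
    the fraction of test takers who got each question right.
    """
    return sum(a * question_weight(p) for a, p in zip(answers, proportions))

# Hypothetical three-question test: right, wrong, right.
print(weighted_raw_score([1, 0, 1], [0.5, 0.9, 0.25]))  # 1 + 0 + 2 = 3.0
```

A question half the test takers get right is worth one true/false question, and one that a quarter get right is worth two.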

Of course, none of this actually removes the advantage of guessing. To answer that problem, I propose that the test’s answer choices be randomized before the test is administered, then before scoring someone’s test, all the answers they left blank are automatically filled in with choice A, or filled in randomly. This simply takes the guesswork out of the test-taker’s hands and makes guessing compulsory, but it seems fair enough to me.

Viete’s Formula and Spinning Pizza

September 17, 2010

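Have you seen Viete’s formula?

\frac{2}{\pi} = \frac{\sqrt{2}}{2} \cdot \frac{\sqrt{2 + \sqrt{2}}}{2} \cdot \frac{\sqrt{2 + \sqrt{2 + \sqrt{2}}}}{2} \cdots

It’s a special case of a trig identity found by Euler:

\frac{\sin(\theta)}{\theta} = \cos(\theta/2)\cos(\theta/4)\cos(\theta/8)\ldots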

If you plug in \pi/2 to the trig identity and use the half-angle formula for cosine over and over, you get Viete’s formula.

But why would you want to consecutively cut angles in half and multiply their cosines? Well, you might be eating pizza.

You have a slice of pizza that is too hot to hold, so you want to balance it on your fork and gnaw at it instead. There’s a precise spot on the underside of the slice where the fork should go.

Balancing pizza on a fork.

This point is called the slice’s center of mass, and we’re going to find it. By symmetry, it must be on the line cutting the slice in half lengthwise, but we don’t yet know how far down. It depends on the shape of the slice, which we measure by \theta, the angle its edges make.

The center of mass of the slice of pizza is a green dot. It lies on the line cutting the slice in half vertically.

A bigger slice will have its center of mass closer to the tip. We would like to know r(\theta), the distance from the tip to the center of mass as a function of \theta.

We want to know the distance to the center of mass.

There are two limiting cases – a very skinny slice and a whole pie. A very skinny slice is basically an isosceles triangle. Its center of mass is 2/3 the way from the tip to the edge (see note 1 at the end), so

\lim_{\theta \to 0} r(\theta) = \frac{2}{3}R .

Let’s choose the radius of the pizza as unit of length, so R = 1 from here on.

In the other limiting case, an entire pizza has its center of mass right at the tip (i.e. center), so

r(2\pi) = 0 .

To investigate intermediate cases, we start with a slice of angle \theta and imagine cutting it in half lengthwise, creating two skinny pieces of angle \theta/2. These have their own centers of mass at r(\theta/2).

The center of mass of the big piece is on the line connecting the smaller pieces’ centers of mass.

A bit of trigonometry tells us

r(\theta) = r(\theta/2)\cos(\theta/4)

If we take this formula and divide all angles by 2, we get a formula for r(\theta/2). We substitute this for where r(\theta/2) appeared in the original. We obtain

r(\theta) = \left[r(\theta/4)\cos(\theta/8)\right]\cos(\theta/4)

Repeat the process ad infinitum. Rearranging the order of the terms and substituting the limiting value of r for small \theta, we get

r(\theta) =\frac{2}{3} \cos(\theta/4)\cos(\theta/8)\cos(\theta/16)\ldots

It involves one half of Euler’s trig identity. If we find r(\theta) by a different method and get a different expression for it, we can set our two expressions for r(\theta) equal to each other, and prove Euler’s identity. We’ll do this by invoking some physics ideas.

Suppose you’re spinning some pizza dough in the air. You know, like this:

If the pizza is spinning, each little bit of dough undergoes centripetal acceleration. Where there’s acceleration, there’s force. The pizza isn’t touching anything, so the force on any one piece of pizza must be coming from the rest of the pizza.

Let’s again examine a slice of size \theta, this time still attached to the spinning pizza. It has two forces of size F acting on it; one force is exerted by the slice to its left and one by the slice to its right.

There are two forces on the slice - one from the pizza to the left and one from the pizza to the right. They're both drawn originating from the center of mass. The slice is accelerating towards its tip (red arrow).

The sum of these forces is the mass of the slice times the acceleration of its center of mass. That acceleration is \omega^2 r(\theta). Hence, if we determine the forces we can deduce r(\theta).

Some trigonometry shows that the net force is 2F\sin(\theta/2).

Equating this to mass times acceleration, we get

2F\sin(\theta/2) = \frac{m \theta}{2\pi} \omega^2 r(\theta)

We might as well let m = \omega = 1 and solve for r to get

r(\theta) = 2F\sin(\theta/2)\frac{2 \pi}{\theta} .

We still need to determine F, but we can do that because we know r(\theta) \to 2/3 as \theta \to 0. After a little algebra, we get

r(\theta) = \frac{4}{3} \frac{\sin(\theta/2)}{\theta}

This gives us the sought two expressions for r. We can now equate them and simplify to

\frac{\sin(\theta)}{\theta} = \cos(\theta/2)\cos(\theta/4)\cos(\theta/8)\ldots
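If you would like to watch the infinite product converge, here is a short Python sketch. The function names and the cutoff of 40 factors are my own choices; the nested square roots in the second function come from applying the half-angle formula repeatedly starting from cos(π/4).

```python
import math

def cosine_product(theta, terms=40):
    """cos(theta/2) * cos(theta/4) * ... truncated after `terms` factors."""
    product = 1.0
    for k in range(1, terms + 1):
        product *= math.cos(theta / 2**k)
    return product

def viete_product(terms=40):
    """Viete's nested-radical product, which should approach 2/pi."""
    product = 1.0
    radical = 0.0
    for _ in range(terms):
        radical = math.sqrt(2 + radical)  # sqrt(2), sqrt(2 + sqrt(2)), ...
        product *= radical / 2
    return product

theta = 1.0
print(cosine_product(theta), math.sin(theta) / theta)
print(viete_product(), 2 / math.pi)

# The two pizza expressions for r(theta), with R = 1:
slice_angle = math.pi / 3
print(2 / 3 * cosine_product(slice_angle / 2), 4 / 3 * math.sin(slice_angle / 2) / slice_angle)
```

Each printed pair should agree to many decimal places, and the last pair is the center-of-mass distance computed both ways for a 60-degree slice.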

1) To see why an isosceles triangle’s center of mass is 2/3 up the altitude, first show it’s true for an equilateral triangle. Then explain why all isosceles triangles have their center of mass the same fraction of the way down the altitude.

Flat Priors and Other Improbable Tales

September 8, 2010

Some collected and invented stories about erroneous thinking in probability.

A Visit

It’s night. You are coming downstairs for a glass of water. You hear a creaking sound and look around a corner to see a man in a ski mask opening your front door. “What are the odds?” you think. “Normally that guy would have set off my burglar alarm and been scared off by the loud wailing, but he happened to stop by for a visit just one minute after the power went out.”


You know, the most amazing thing happened to me tonight. I was coming here, on the way to the lecture, and I came in through the parking lot. And you won’t believe what happened. I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!


About 100 billion people have ever lived, and there are about 7 billion people alive now. Therefore about 7% of people are extremely long-lived.


A man makes it through a long battery of physical and psychological tests and finally achieves his lifelong dream of joining the astronaut program. He immediately takes up heavy smoking. “What gives,” asks his friend. “I thought you were a health nut.”

“I am,” replies the man. “Anybody who smokes a lot will probably die of lung cancer.”

“Why would you want to die of lung cancer?” his friend asks.

“A shuttle explosion will kill you in two seconds,” he replies. “But now I’m gonna die of lung cancer, and that’ll take at least forty years.”

Hitchhiker’s Guide

It is known that there is an infinite number of worlds, simply because there is an infinite amount of space for them to be in. However, not every one of them is inhabited. Therefore, there must be a finite number of inhabited worlds. Any finite number divided by infinity is as near to nothing as makes no odds, so the average population of all the planets in the universe can be said to be zero. From this it follows that the population of the universe is also zero, and that any people you may meet from time to time are merely the product of a deranged imagination.
-Douglas Adams

Foreign Lands

In a certain country, people always name their first child a name that starts with “A”, their second child a name that starts with “B”, etc. Families in this country have anywhere from one to ten children; equal numbers of families have each. It is a tradition in this country for each father to randomly select one of his children each day to accompany him on a walk.

When visiting this country, you meet a man out walking with his daughter, whom he introduces as Evelyn. You now know the man has at least five children. If he had exactly five, your chances of meeting the one whose name starts with “E” are 1/5. If he had more, say eight, your chances of meeting the one whose name starts with “E” are 1/8. Therefore, you conclude that it is most likely that Evelyn is the oldest child. You realize there was nothing special about Evelyn, and conclude that any time you meet a man walking with his child in this country, he is most likely to be walking with the oldest one.


Of all the gin joints, in all the towns, in all the world, she walks into mine.


While rummaging around in his parents’ attic, Sean comes across an old love letter to his mother from Rodrigo Valenzuela, a man he never knew. It refers to “nights of fevered frenzy and mornings of muted passion”, is signed, “a mi amor”, and asks when her husband will be away again. The letter is dated eight months before Sean’s birth date. He looks in the mirror, wondering why he didn’t inherit his parents’ fiery orange hair and why salsa music has always stirred his soul.

Sean looks up information on paternity tests, and finds that if you send in one sample of DNA as the suspected father and one as the suspected child, the test will report a probability, which represents the probability that a man with the “father” DNA would sire a child at least as genetically different from him as the “child” DNA. Thus, a low percentage, like 0.001%, means that a true child would have only a small chance of being as different from the father as the “child” sample is. This is the result we expect if the child is not from the father. A high percentage, like 60%, means the “child” and “father” DNA are very close, and is what we expect if the man is the true father.

Secretly, Sean collects a sample of the DNA of the man he’s always called “dad” and one of his own and sends them in for testing. As a control, Sean also collects a sample from his own son and a second sample from himself, and sends these in as well. Finally, Sean hunts down Rodrigo Valenzuela using Facebook, “friends” him, studies his “likes” and “interests”, uses them to befriend Rodrigo in real life, asks to borrow his car, and steals a hair from the headrest. He sends in a third pair of samples, from Rodrigo and himself, for testing.

Two weeks later the test results come back. Sean isn’t shocked. The probability for him and his “dad” is a scant 0.00004%. The probability for Rodrigo and himself is 7%. Finally, the result for his son is 74%. Sean realizes that there’s some natural variation in the test, but the evidence is still clear: Rodrigo is his true father.

The next day the clinic calls and says there’s been a mix-up. They accidentally switched the samples from Sean and his son, so the 74% was actually the result of testing Sean’s son in the “father” role and Sean in the “child” role. Sean is understandably upset. He goes to bed that night thinking that although Rodrigo may be his father, it’s ten times as likely that his own son will, in the course of his life, discover time travel and go back to impregnate Sean’s mother.

Flat Prior

On whether or not the Large Hadron Collider would create a black hole that would consume Earth:

John Oliver: So, roughly speaking, what are the chances that the world is going to be destroyed? Is it one in a million, one in a billion?

Walter Wagner: Well, the best we can say right now is about a one in two chance.

JO: Hold on a second. Is the, if, 50 – 50?

WW: Yeah, 50-50.
WW: It’s a chance. It’s a 50-50 chance.

JO: You keep coming back to this 50-50 thing. It’s weird, Walter.

WW: Well, if you have something that can happen and something that won’t necessarily happen. It’s either gonna happen or it’s gonna not happen. And, so it’s, the best guess is one in two.

JO: I’m not sure that’s how probability works, Walter.

from The Daily Show

A Physical Trig Identity

September 6, 2010

Can a basic physics problem give you insight into math?

For example, mathematically

\cos(x) - \sin(x) = \sqrt{2}\cos(x + \pi/4)

which is easy to verify using the angle addition formula.
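For reference, the verification is a one-line expansion. It uses the standard angle addition formula for cosine together with \cos(\pi/4) = \sin(\pi/4) = 1/\sqrt{2}:

```latex
\sqrt{2}\cos(x + \pi/4)
  = \sqrt{2}\left(\cos x \cos\frac{\pi}{4} - \sin x \sin\frac{\pi}{4}\right)
  = \sqrt{2}\left(\frac{1}{\sqrt{2}}\cos x - \frac{1}{\sqrt{2}}\sin x\right)
  = \cos(x) - \sin(x)
```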

I came across this formula while solving a simple problem in statics.

Imagine the classic “block on an inclined plane”. Gravity (F_g) pulls the block down, and you push (F_p) on it sideways, like this:

Gravity pulls down on the block and you push on it to the right.

What is the minimum coefficient of static friction to keep the block stationary? In order to calculate this, we need to know the component of force parallel to the plane.

First look at gravity. We want to find the green component F_{g1}.

The force of gravity can be decomposed into two components. One points along the plane and the other is normal to it.

Let’s say the positive direction is to the right. Then gravity is pulling backwards some, so F_{g1} is negative. I know it’s either a sine or cosine of \theta, and in the limit as \theta \to 0, I see that F_{g1} = 0, so

F_{g1} = -F_g \sin\theta .

Then we look at the pushing force.

The force from pushing is likewise decomposed.

A similar procedure gives

F_{p1} = F_p \cos\theta .

So the total force in the direction of the ramp is

-F_g \sin\theta + F_p \cos\theta .

In the special case where F_g = F_p = 1 the force is

\cos\theta - \sin\theta .

Now we will find this component of the force another way. We start by tip-to-tail adding the force of gravity and the force of the push. They’re at right angles, and assuming they’re equal in magnitude we get a resultant force with length \sqrt{2} that bisects the angle between the gravity and pushing forces.

We can also find the component of force along the plane by first adding the two vectors...

The angle between this resultant force and the plane is \pi/4 + \theta.

...and then finding a component of the sum.

The component of this force along the plane is then its magnitude times the cosine of \pi/4 + \theta, so the force along the direction of the plane is

\sqrt{2} \cos(\pi/4 + \theta)

and since it’s the same quantity we calculated before, we have

\cos\theta - \sin\theta = \sqrt{2}\cos(\pi/4 + \theta) .
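The two routes to the same component can also be compared numerically. The Python sketch below assumes a coordinate choice that the figures only imply: x points right, y points up, and the plane rises to the right at angle \theta, so its direction is (\cos\theta, \sin\theta). The function names are invented for this illustration.

```python
import math


def along_plane_by_components(theta, f_g=1.0, f_p=1.0):
    """First route: project gravity and the push separately, then add."""
    return -f_g * math.sin(theta) + f_p * math.cos(theta)


def along_plane_by_resultant(theta):
    """Second route: add the two unit forces first, then project the sum."""
    push = (1.0, 0.0)       # unit push to the right
    gravity = (0.0, -1.0)   # unit pull straight down
    rx, ry = push[0] + gravity[0], push[1] + gravity[1]
    magnitude = math.hypot(rx, ry)  # sqrt(2) for equal unit forces
    # The resultant points pi/4 below horizontal and the plane is theta above
    # horizontal, so the angle between them is pi/4 + theta.
    return magnitude * math.cos(math.pi / 4 + theta)


def along_plane_by_dot_product(theta):
    """Cross-check: dot the resultant (1, -1) with the plane direction."""
    return 1.0 * math.cos(theta) + (-1.0) * math.sin(theta)


for degrees in (0, 10, 30, 45, 60, 80):
    theta = math.radians(degrees)
    a = along_plane_by_components(theta)
    b = along_plane_by_resultant(theta)
    print(degrees, round(a, 6), round(b, 6), math.isclose(a, b, abs_tol=1e-12))
```

The three functions agree to floating-point precision for any angle, which is the identity restated in numbers.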

There is no physics in this calculation, but if you had simply asked me to write \cos\theta-\sin\theta as a single trig function, I wouldn’t have thought to approach it like this.

Answer: Lemmings (part 2)

August 13, 2010

Here is another answer to the Lemming problem, complementing yesterday’s answer.

In case you’re behind, the problem was:

On a remote Norwegian mountain top, there is a huge checkerboard, 1000 squares wide and 1000 squares long, surrounded by steep cliffs to the north, south, east, and west. Each square is marked with an arrow pointing in one of the eight compass directions, so (with the possible exception of some squares on the edges), each square has an arrow pointing to one of its eight nearest neighbors. The arrows on squares sharing an edge differ by at most 45 degrees. A lemming is placed randomly on one of the squares, and it jumps from square to square following the arrows.

Prove that the poor creature will eventually plunge from a cliff to its death.

For this proof, we’ll again focus on closed loops, this time defining a “rotation” and proving that no closed loop can have a consistent one, and so the lemming must fall.

As before, we argue that because the number of squares is finite, the lemming will run around a closed loop if and only if it never falls off. Also, the loop must be simple, specifically it cannot cross itself (see previous solution for detail).

Next we introduce the new concept of a path. A path is like a closed loop – a sequence of squares next to or diagonal from each other that you can trace your way around. The difference is that a loop is something the lemming would actually follow because the arrows point from one square on the loop to the next, whereas in creating a path we ignore the arrows. So a loop is a special case of a path. Because we know that any viable loop doesn’t cross itself, we will only consider paths that also do not cross themselves.

An example path.

An example path. It ignores the arrows underneath it, which in this example are in a legal configuration.

Although we don’t look at the arrows when creating a path, we can still notice the arrows once the path is created. That is what we’ll do, using the arrows to define a “rotation” for each path.

Consider an arbitrary path. Trace your way around the path, noting each jump involved. For each jump, the arrow underneath where you are will, in general, rotate. It can rotate up to 45 degrees when moving between squares that share an edge and up to 90 degrees when moving diagonally.

If the arrow rotates 45 degrees clockwise as you jump, call that +45. If it rotates 45 degrees counterclockwise, call that -45. Similarly for 90 degrees. If the arrow remains the same, call that zero. Sum the rotations from all the jumps going around the path and call that the path’s rotation.

A path that closes on itself must have rotation that is a multiple of 360, because it starts and ends with the same arrow.

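Consider a 2×2 path. There are four transitions, each with a rotation of at most 45 degrees, so the total is at most 180 degrees in either direction. Since this cannot let you build up to 360 degrees, and the rotation must be a multiple of 360, any 2×2 path must have rotation zero.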

A 2x2 path.

A 2x2 path must have rotation zero. Red are the arrows. Green is the path. Blue are the jumps.

Similarly for a triangular path like this:

A small triangular path.

A small triangular path like this must also have rotation zero.
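The claim about these small paths can be brute-forced. The Python sketch below assumes an encoding not given in the problem: each arrow is a compass bearing in degrees (0 for north, increasing clockwise), and a path is a list of (row, column) squares. The helper names `signed_turn`, `path_rotation`, and `is_legal` are invented for this illustration, and only one orientation of the triangle is tried.

```python
from itertools import product

BEARINGS = range(0, 360, 45)  # assumed encoding: 0 = north, increasing clockwise


def signed_turn(a, b):
    """Signed rotation in degrees from bearing a to bearing b (clockwise > 0)."""
    d = (b - a) % 360
    return d - 360 if d > 180 else d


def path_rotation(arrows, path):
    """Sum the signed arrow rotations over every jump of a closed path."""
    jumps = zip(path, path[1:] + path[:1])
    return sum(signed_turn(arrows[a], arrows[b]) for a, b in jumps)


def is_legal(arrows):
    """Squares sharing an edge must differ by at most 45 degrees."""
    edges = [((0, 0), (0, 1)), ((1, 0), (1, 1)),
             ((0, 0), (1, 0)), ((0, 1), (1, 1))]
    return all(abs(signed_turn(arrows[a], arrows[b])) <= 45 for a, b in edges)


SQUARE = [(0, 0), (0, 1), (1, 1), (1, 0)]
TRIANGLE = [(0, 0), (0, 1), (1, 0)]  # includes one diagonal jump


def small_path_rotations():
    """Collect every rotation seen on any legal 2x2 block of arrows."""
    cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
    seen = set()
    for values in product(BEARINGS, repeat=4):
        arrows = dict(zip(cells, values))
        if is_legal(arrows):
            seen.add(path_rotation(arrows, SQUARE))
            seen.add(path_rotation(arrows, TRIANGLE))
    return seen


print(small_path_rotations())
```

Running it over all 4096 arrow assignments, the only rotation that ever appears on a legal block is zero, which matches the counting argument.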

How about a 3×2 path? Such a path is built up from two 2×2 paths. When placing the two 2×2 paths on top of each other, the overlapping parts cancel because they are traversed in opposite directions. Hence, the rotation for a 3×2 path is the sum of the rotations for the two 2×2 paths. All 3×2 paths have rotation zero.

A 3x2 composite path.

If you traverse the green path, then the purple path, the transition you do twice cancels, and all the other transitions sum to the same thing as the gold path.
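The cancellation can be seen on one concrete example. In this Python sketch the arrow values are a made-up legal configuration, with arrows encoded as compass bearings in degrees (an assumption, not something from the problem statement). The names `left`, `right`, and `outer` stand in for the two small paths and the composite path in the figure.

```python
def signed_turn(a, b):
    """Signed rotation in degrees from bearing a to bearing b (clockwise > 0)."""
    d = (b - a) % 360
    return d - 360 if d > 180 else d


def path_rotation(arrows, path):
    """Sum the signed arrow rotations over every jump of a closed path."""
    jumps = zip(path, path[1:] + path[:1])
    return sum(signed_turn(arrows[r0][c0], arrows[r1][c1])
               for (r0, c0), (r1, c1) in jumps)


# Hypothetical 2-row, 3-column patch of arrows; edge neighbors differ by <= 45.
arrows = [[0, 45, 0],
          [315, 0, 45]]

left = [(0, 0), (0, 1), (1, 1), (1, 0)]    # first 2x2 path
right = [(0, 1), (0, 2), (1, 2), (1, 1)]   # second 2x2 path
outer = [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]  # composite path

# The shared jump is traversed downward by `left` and upward by `right`.
shared_down = signed_turn(arrows[0][1], arrows[1][1])
shared_up = signed_turn(arrows[1][1], arrows[0][1])

print(shared_down + shared_up)
print(path_rotation(arrows, left) + path_rotation(arrows, right),
      path_rotation(arrows, outer))
```

The shared jump contributes opposite amounts to the two small paths, so their sum equals the rotation of the composite path.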

We can build up an arbitrary non-crossing path from lots of 2×2 squares and triangles. An arbitrary path’s rotation is the sum of its constituent little paths. Since the little paths’ rotations are zero, all big paths have rotation zero, too.

A big composite path.

Our original example path made up of lots of little squares and triangles. Some squares are used twice.

On the other hand, a loop like the lemming would follow is also a path, and it must have rotation +/- 360 since loops cannot cross themselves. A loop cannot have both rotation zero and rotation +/- 360, so these loops cannot exist, and the lemming dies.


June 12, 2010

I took a short break from reading Steven Strogatz’s Sync: How Order Emerges From Chaos In the Universe, Nature, and Daily Life earlier today and checked Facebook. Usually, the status updates of my Facebook friends are a seemingly-random menagerie of links to news stories, jokes, anecdotes, and these things: ^_^. Today, though, I found that in just the last twenty minutes, ten or so of my friends had posted nearly identical messages. They had somehow synced.

In this case, it’s not surprising. They were restating the result of the recently-concluded World Cup soccer game, but with more exclamation points than I’d get from Reuters. (Actually, Facebook status updates are the primary way I keep in touch with mainstream sports.) My Facebook synced today because of a strong, external signal influencing all the individual updates. That’s the way we normally think about synchrony. If you want it, you need some sort of a central clock for everyone to follow. A computer chip’s parts sync this way. Coworkers on a project are synced by a manager. Orchestras have conductors. Tug-of-war teams count to three.

By contrast, Strogatz is interested in spontaneous synchrony – synchrony where you wouldn’t expect it and no one’s in charge. A great visual and audio introduction is Strogatz’s own TED talk.

Sync is a broad survey of nonlinear systems from spirals in oscillatory chemical reactions to synchronized menstruation induced by armpit sweat. What’s captivating about it is the story. Like James Gleick’s Chaos or Kip Thorne’s Black Holes and Time Warps, it carries you along from a few researchers diddling around with a curious idea to the creation of a large scientific field. We explore different branches where the original research led, all the time seeing the different ways scientists and mathematicians approach their problems. From Strogatz, you also get a sense of the way these different approaches contribute to a complete understanding. At different times, Strogatz describes analytical work (solving equations), computer simulations, visualization (including building models from string and clay), laboratory experiments, and field research. Each endeavor feeds back into the others in this story about the science of synchrony.

I was curious, as I read the book, what it would be like if it had been technical as well. What if Strogatz had included didactic discussions of the solvable systems he’d worked on, or outlined the topological proofs he mentioned, or showed the results of the research as he would in a technical scientific talk, all integrated into the same story? A skeptical answer would be that lay readers wouldn’t touch the book and that technical readers would not be interested in the fluff. Strogatz already wrote an introductory textbook on nonlinear dynamics (which I haven’t read, but I’m told it’s good). I’ve seen textbooks that have little biographies inserted here and there, and I’ve seen popular books that use some equations or put technical appendices at the end. I am curious about a book intended to teach an undergraduate course that’s a truly integrated historical story and didactic text. There is an extensive bibliography allowing me to pursue the technical aspect of whatever ideas interest me the most, but that is something quite different from an organized presentation.

I picked up Sync while browsing, and read it because I remembered both the TED talk I linked above and Strogatz’s amusing math columns in the New York Times.