Posts Tagged ‘statistics’

How interesting is that license plate?

September 19, 2013

You know, the most amazing thing happened to me tonight. I was coming here, on the way to the lecture, and I came in through the parking lot. And you won’t believe what happened. I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!

-Richard Feynman (source)

A friend on Facebook commissioned a survey with three conditions, which were to be assigned randomly to participants (uniform distribution). The number of responses was

condition A: 58

condition B: 94

condition C: 108

total: 260

Seeing the lopsided result, she called the survey company to ask what was up. The representative said it was just random chance. How should one react? What sort of reasoning is useful here, and what is not? Is something strange going on with the survey?

If you crunch through some quick math, you’ll see that if the survey is fair, the odds of getting a result as extreme as 58/260 via random chance are a bit less than one in a thousand. (I’m accounting for over- or under-representation in any of the three categories.) How meaningful is that?
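One way to reproduce a number of this size is a chi-square goodness-of-fit test against equal expected counts. This is a sketch of that approach, not necessarily the calculation the post refers to. With three categories there are two degrees of freedom, and for exactly two degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

counts = [58, 94, 108]
n = sum(counts)
expected = n / len(counts)  # a fair process expects 260/3 responses per condition

# Pearson chi-square statistic against equal expected counts.
chi2 = sum((c - expected) ** 2 / expected for c in counts)

# For 2 degrees of freedom the upper-tail probability is exactly exp(-x/2).
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.2f}, p = {p_value:.1e}")
```

The statistic comes out near 15.4 and the tail probability a little under 5 in 10,000, consistent with "a bit less than one in a thousand".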

Suppose you are walking home and find a 20-dollar bill. The odds of that might be about 1/1000, but you likely don’t think anything fishy is going on. You chalk it up to good luck and pocket the bill. But next suppose you remember a time when you found a 20-dollar bill when you were walking home as a little kid, and you realize you found it right outside your grandparents’ house (which you pass on the way), and they happened to be watching from the window when it happened. These circumstances don’t change the probability of finding a 20-dollar bill by random chance, but they change our estimate of the probability that finding the bill was a fluke. How meaningful an unlikely result is depends not only on how unlikely it is, but also on the plausibility of competing alternatives.

This is captured in Bayes’ formula. For the survey, it is

P(random|data) = \frac{P(data | random)P(random)}{P(data)}

where P(random|data) is the probability the survey process was a fair, uniform, random one given the observed 58-94-108 split. P(data|random) is the probability of observing our results given a fair random process. P(random) is the prior probability we assign to the process being a fair random one, and P(data) is the overall chance of seeing our 58-94-108 split under any circumstances, including unfair ones.

The easy one is P(data|random). It comes to 10^{-6} (see calculation here).
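As a rough check on that figure, the probability of exactly this split under a fair process is a multinomial probability: the number of response sequences that give 58-94-108, divided by the 3^260 equally likely sequences. The sketch below uses only Python's exact integer arithmetic.

```python
from math import comb

a, b, c = 58, 94, 108
n = a + b + c

# Number of distinct response sequences that produce exactly this split:
# choose which responses are A, then which of the remainder are B.
arrangements = comb(n, a) * comb(n - a, b)

# Under a fair process every one of the 3**n sequences is equally likely.
p_data_given_random = arrangements / 3 ** n

print(f"P(data | random) = {p_data_given_random:.2e}")
```

The result is very close to one in a million.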

Discussion of the issue, then, ought to focus on estimates for P(random) and P(data). In part, it did:

Are you counting people as they start the survey or as they finish? Because if it’s the latter, and option A is more work than the other two…

(suggests P(random) isn’t very high due to varying attrition rates, and that P(data) isn’t very low because varying-attrition could cause the observed bias.)

mostly I assume that it is random because randomness is pretty easy to code

(P(random) is high)

Actually, what you should be calculating is the Bayes factor, given the observed data, of a uniform distribution vs. a categorical distribution with a Dirichlet prior.

(Focuses on P(data). However, it suggests a slightly-different metric to look at than estimating the probability that the survey process is fair. Seems like a good suggestion, but it’s not my main point here.)

How does one calculate the probability that Qualtrics would make a mistake?

(focusing on P(random))

I believe that if something went wrong with that kind of coding, the outcome would look very different (like it would skip one group altogether)

(P(data) is low)

One thing everyone should keep in mind is that the alternative hypothesis here is NOT “their random number generator is broken.” (I mean, that’s possible, but it’s not on my list of top ten likeliest alternative hypotheses).

The alternative hypotheses here are things like “I misunderstood how to use Qualtrics ‘randomizer’ function.” Or, “Qualtrics intentionally assigns lower probabilities to longer test conditions.” Or “There’s a higher dropout rate in this test condition.” (Although I *think* that last hypothesis has been falsified by now.)

(Suggests why P(random) isn’t necessarily so high, and why P(data) is significant.)

Ultimately, the estimate you generate will be subjective, i.e. based on your priors and your assumptions about how to model the survey process. That’s why we see people using a lot of heuristic reasoning about the calculation – heuristics are how we deal with subjective estimates.
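To make the dependence on priors concrete, here is a minimal sketch that plugs numbers into the formula. It borrows the commenter's suggestion for the alternative: an unfair process with unknown probabilities and a flat Dirichlet(1, 1, 1) prior, under which every possible split of 260 responses is equally likely. That alternative is an assumption chosen because it is easy to compute. A model of attrition or of a misconfigured randomizer would give different numbers.

```python
from math import comb

def posterior_fair(prior_fair, counts=(58, 94, 108)):
    """P(random | data) for a given prior P(random), under the assumed alternative."""
    a, b, c = counts
    n = a + b + c
    # P(data | random): multinomial probability with equal thirds.
    p_data_fair = comb(n, a) * comb(n - a, b) / 3 ** n
    # P(data | alternative): with a flat Dirichlet prior on the three
    # probabilities, all comb(n + 2, 2) possible splits are equally likely.
    p_data_alt = 1 / comb(n + 2, 2)
    # P(data) averages over both possibilities, weighted by the prior.
    p_data = p_data_fair * prior_fair + p_data_alt * (1 - prior_fair)
    return p_data_fair * prior_fair / p_data

for prior in (0.5, 0.9, 0.99):
    print(f"P(random) = {prior:.2f} -> P(random | data) = {posterior_fair(prior):.2f}")
```

Under this particular alternative, someone who starts at 50/50 ends up at a few percent, while someone who starts 99% sure the process is fair still ends up around three in four. The data are the same; the priors do the rest.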

But in addition to discussion aimed at estimating the components that go into the Bayesian calculation, there was an entirely different type of heuristic reasoning, one focusing on human biases:

I think Kahneman and Tversky did research on this? On coin flips people think HTHTTHT is more likely than HHHHHHH because it “looks more random.” Both are equally likely.

The funny thing about this (related to what A—- was saying) is that we’ve got research which shows just how difficult it is for human beings to accept randomness when they see it.

although a chi-square test shows that it’s a highly unlikely result, anything is possible, and it would be unusually NOT to see some unlikely results.

The statistical calculation only gives us the probability that such an outcome would occur. It doesn’t rid us of our preconceived notion that it should not occur, nor does it remind us that even low probability events occur quite often.

What I’m saying is that a program that has been used hundreds (or perhaps thousands?) of times to randomize subjects into conditions will often produce an outcome in the tails of the distribution.

The idea seems to be that one shouldn’t be alarmed just because your model says an event of low probability occurred. Even if your models of the world are in general correct, so many things happen that you’ll observe rare events once in a while. Further, we’re biased to make a big deal out of things, thinking they’re not random when they are. This bias in what we notice is the basis for Feynman’s joke – no one ever points out every mundane thing that occurs; only the few that seem surprising to them. Most, they don’t notice.

But “human biases” doesn’t seem to have any obvious spot in Bayes’ formula. The calculation gives a probability that doesn’t have anything to do with your biases except insofar as they affect your priors. Who cares whether the program has been used hundreds or thousands of times before? We’re only interested in this instance of it, and we don’t have any data on those hundreds or thousands of times. The only extent to which that matters is that if the program has been used many times before, it’s more likely that they’ve caught any bugs or common user errors.

In the end, the “unlikely events are likely to occur” argument doesn’t seem relevant here. If we looked at a large pool of surveys, found one with lopsided results, and said, “Aha! Look how lopsided these are! Must be something wrong with the survey process!” that would be an error, because by picking one special survey out of thousands based on what its data says, we’ve changed P(data|random). That is, it is likely that the most-extreme result of a fair survey process looks unfair. But we didn’t do that here, so why all the admonitions?

Another point made by commenters was that HTHTTHT is equally-likely with HHHHHHH given a fair coin, but only the second one raises an eyebrow. This is because HTHTTHT is one of a set of a great many similar sequences while HHHHHHH is unique. But this doesn’t seem relevant here, either. We didn’t look at the exact sequence of 260 responses (BBCABACCABAC…) and claim it was unlikely. All sequences are equally unlikely given a fair random process. Instead we looked at a computed statistic – the distribution of A, B, and C, which captures most of what we’re interested in. So again, why did commenters bring this up?

Maybe I’m missing an important point, but my guess is that it’s just pattern matching. “Oh, someone is talking about an unlikely thing that happened. Better warn them about Feynman’s license plate.” Of course, we do pattern-matching all the time because it usually works. But we also need to get feedback whenever our pattern-matching fails, then try to figure out why it failed, then try to update the pattern-matching software to work better next time, gradually giving fewer false positives and more true positives. There’s a tradeoff between them, and I’d guess it’s better to err on the side of committing false positives, since you can build a general habit of going back and carefully checking what you’ve said after initially pattern-matching it, especially in writing.


Bayesian Bro Balls and Monty Hall

April 13, 2013

I don’t think this is very good, but I’m tired of working on it and I said I’d write it, so here it is. See for a better essay on thinking about probability.

A friend asked me about this oft-cited probability problem:

A woman and a man (who are unrelated) each have two children. We know that at least one of the woman’s children is a boy and that the man’s oldest child is a boy. Can you explain why the chances that the woman has two boys do not equal the chances that the man has two boys?

(text copied from here, which attributes the problem’s popularity to an “Ask Marilyn” column in 1996)

Given a tricky problem, some people are stumped, clever people find a tricky answer, and brilliant people find a way of thinking so the problem isn’t tricky any more. This problem is about interpreting evidence, and the brilliant person who discovered how to think about evidence is Laplace. (His method is called Bayes’ Rule. Bayes came first, but Laplace seems more important based on my cursory reading of the history.) To illustrate it, let’s start with a different problem, related only by the fact that it’s also about using evidence.

I picked a day of the week by first choosing randomly between weekend and weekday, then picking randomly inside that set. I wrote my chosen day down and picked a letter at random from its name. The letter was ‘d’. What is your best guess for the day that I picked? How confident are you? (“At random” means there was uniform probability.)

First, write out all the possibilities. These are called the “hypotheses” (plural of “hypothesis”).

  • Saturday
  • Sunday
  • Monday
  • Tuesday
  • Wednesday
  • Thursday
  • Friday

Next, find their starting probabilities, the ones we have before learning what letter was picked. Half the probability goes to the weekend, so the two weekend days get 1/4 each. The rest goes to the five weekdays, so they get 1/10 each. This is called the “prior”. The picture shows lengths proportional to probability.

Let’s imagine we did this experiment 10,080 times (a number I’m choosing since it will work out well later). Then the weekend days come up 2520 times each and the weekdays 1008 times each. (This is the expectation value for how many times these days would show up. In a real experiment there would be random deviations.)

Next we look at the evidence – the letter ‘d’. Out of our 10,080 experiments, how many have the letter ‘d’ result? It turns out this will happen 1565 times – 315 times on Saturday, 420 times on Sunday, etc.

The illustration looks like this (partially filled in)

The bottom bar represents all the trials where the letter ‘d’ popped up. It is too small to read, so we’ll blow it up.

We can divide by the total number of times the letter ‘d’ came up to get the probabilities. (Individually this step is called “normalizing”, but it’s really part of updating.)

We don’t really need to consider doing the experiment 10,080 times; I just thought that made it more convenient to visualize. What’s important is the probability distribution at the end. This solves the problem. We now know, given that ‘d’ was the chosen letter, the probability for each day of the week.
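The same update can be sketched in a few lines of Python with exact fractions. The one modeling assumption, taken from the problem statement, is that the letter is drawn uniformly from the letters in the day's name, so the chance of a ‘d’ is the number of d's divided by the length of the name.

```python
from fractions import Fraction

weekend = ["Saturday", "Sunday"]
weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]

# Prior: half the probability split over the weekend, half over the weekdays.
prior = {day: Fraction(1, 4) for day in weekend}
prior.update({day: Fraction(1, 10) for day in weekdays})

def likelihood(day, letter):
    """Chance of drawing `letter` when picking uniformly from the day's name."""
    return Fraction(day.count(letter), len(day))

# Update: weight each hypothesis by how well it explains the evidence...
unnormalized = {day: prior[day] * likelihood(day, "d") for day in prior}
# ...then normalize so the probabilities add up to one.
total = sum(unnormalized.values())
posterior = {day: weight / total for day, weight in unnormalized.items()}

for day, p in posterior.items():
    print(f"{day:<10} {float(p):.3f}")
```

Here `total` is the overall chance of seeing a ‘d’, which matches 1565 out of 10,080, and Sunday ends up as the best guess with probability 420/1565, a bit under 27%.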

To recap, an outline for the procedure is

  1. Find all the possibilities (i.e. define the hypotheses).
  2. Determine how likely the hypotheses are beforehand (i.e. choose the prior).
  3. Update the hypotheses based on how well they explain the data (i.e. multiply them by their probabilities of producing the evidence).
  4. Finish updating by making the hypotheses’ probability add up to one (i.e. normalize).

Let’s reword the original problem to make it clear exactly what evidence we’re collecting, then apply the method:

There are two bros who like to tan their balls. Unfortunately, this can cause testicular cancer. Given their amount of ball-tanning, each testicle of each bro has a 50% chance of having cancer. (The testicles are all statistically independent of each other.) The two bros decide to conduct testicular exams to see whether they have cancer, and their self-administered exams are perfectly accurate. The first bro decides to examine both his testicles, then report whether or not at least one of them has cancer. The second bro decides to examine only his left testicle because he thinks that examining both would count as cradling them and be gay. Suppose the first bro reports that indeed, at least one of his balls has cancer. The second bro reports that his left ball has cancer. Do they have the same probability of having two balls with cancer?

The bros go about collecting different data. The evidence they bring to bear is different, so a good way of handling evidence should be able to show us that the probabilities are different. But there is no need to be clever about finding the solution when you have a general method at hand.

The hypotheses we’ll use are left/right both cancerous, left cancerous/right healthy, left healthy/right cancerous, both healthy. They are all equally likely.

In this case, updating is very simple. Each hypothesis explains the results either perfectly or not at all. Here is what updating looks like for the bro who tested both balls:

Here it is for the bro who tested on the left ball:

So the bro who tested both balls has a 1/3 chance of having cancer in both balls, while the bro who tested the left ball has a 1/2 chance. Both came back with a positive result, but the bro who tested both balls has weaker evidence. It is easier to satisfy “at least one ball has cancer” than to satisfy “the left ball has cancer”, so when he comes back and reports that at least one ball has cancer, he hasn’t given as much information, and his probability doesn’t shift upwards as much. A strong test is one which the hypothesis of interest passes, but which eliminates the competing hypotheses. More hypotheses pass the test “at least one ball has cancer”, so that test doesn’t eliminate as much and is not as strong.
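For readers who want to check the two updates by machine, here is a small sketch. The `update` helper is a hypothetical name for the multiply-then-normalize step from the outline, and each hypothesis is a (left, right) pair of booleans.

```python
from fractions import Fraction
from itertools import product

def update(prior, likelihood):
    """Multiply each hypothesis by P(evidence | hypothesis), then normalize."""
    unnormalized = {h: p * likelihood(h) for h, p in prior.items()}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}

# Hypotheses are (left_has_cancer, right_has_cancer); all four equally likely.
hypotheses = list(product([True, False], repeat=2))
prior = {h: Fraction(1, 4) for h in hypotheses}

# Each hypothesis explains the report either perfectly (1) or not at all (0).
tested_both = update(prior, lambda h: 1 if (h[0] or h[1]) else 0)
tested_left = update(prior, lambda h: 1 if h[0] else 0)

print(tested_both[(True, True)])  # 1/3
print(tested_left[(True, True)])  # 1/2
```

"At least one" leaves three hypotheses standing, while "the left one" leaves only two, which is the whole difference between the answers.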

Let’s apply the same method to the Monty Hall problem. Your hypotheses are that the prize is behind door one, door two, or door three. These are equally-likely to begin. You choose door one, and Monty Hall opens door two. That’s the evidence.

If the prize is behind door one (the door you chose), Monty Hall could have opened either door 2 or door 3, so there was a 50% chance to see the observed evidence. If the prize was behind door 2, there was 0% chance, so that hypothesis is gone. If the prize was behind door 3, Monty Hall was forced to open door two, so that hypothesis explained the evidence 100% and becomes more likely.

Here’s a diagram for the probabilities after updating:

So you should switch doors. The usefulness of this method is you don’t need a clever trick specific to the Monty Hall problem. As long as you understand how evidence works, you can solve the problem without thinking and spend your limited thinking power somewhere else.
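The Monty Hall update is short enough to write out directly. This sketch assumes, as in the text, that you picked door one, that Monty never opens your door or the prize door, and that he chooses uniformly when he has two doors available.

```python
from fractions import Fraction

# Prior: the prize is equally likely to be behind each door.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# P(Monty opens door 2 | prize location), given that you chose door 1.
likelihood = {
    1: Fraction(1, 2),  # he could have opened door 2 or door 3
    2: Fraction(0),     # he never reveals the prize
    3: Fraction(1),     # door 2 was his only option
}

unnormalized = {door: prior[door] * likelihood[door] for door in prior}
total = sum(unnormalized.values())
posterior = {door: weight / total for door, weight in unnormalized.items()}

for door, p in posterior.items():
    print(f"door {door}: {p}")
```

Door one stays at 1/3, door two drops to 0, and door three rises to 2/3.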

Within this framework, we can notice some things that help make the problem more intuitive, even after it’s solved. Suppose you have a hypothesis and you collect some evidence. How much more-likely does the hypothesis become? What matters is how much better the hypothesis explains the data than the other hypotheses, not just how well it does by itself. So, if you have two hypotheses and they both explain the data equally well, that data isn’t evidence either way.

You can apply this idea to the bro who tested his left testicle. Whether the right testicle has cancer or not, the data on the left testicle is equally-well explained. Therefore, the right testicle’s probability remains unchanged from 50%, so the bro has a 50% chance of two cancerous balls.

The same is not true for the bro who tested both testicles. If that bro has cancer in his right testicle, that explains the result better than if he doesn’t. That is to say, if the bro has cancer on his right testicle, that explains the result perfectly. But if he doesn’t, there’s only a 50% chance of explaining the result. As a result, the 1:1 odds for the right testicle get multiplied by 2:1 to give 2:1 odds. The probability for his right ball to have cancer increases to 2/3. (The probability for his left testicle to have cancer is also 2/3, but they are not independent.)

In the Monty Hall problem, Monty Hall will reveal an empty door no matter what, so the hypothesis “the prize was behind your original door” explains the data just as well as “the prize was not behind your original door”. Therefore, the probability that the prize was behind your original door doesn’t change; there was no evidence for it. So your original door still has a 1/3 chance. So good Bayesians are not confused by the Monty Hall problem.

Why Least Squares?

February 7, 2012

Rod and Pegboard

Suppose you have a bunch of pegs scattered around on a wall, like this:

You see a general trend, and you want to take a rod and use it to go through the points in the best possible way, like this:

How do you decide which way is best? Here is a physical solution.

To each peg you attach a spring of zero rest length. You then attach the other side of the spring to the rod. Make sure the springs are all constrained to be vertical.

Now let the rod go. If most of the points are below it, the springs on the bottom will be longer, exert more force, and pull the rod down. Similarly, if the rod’s slope is shallower than that of trend in the points, the rod will be torqued up to a steeper slope. The final resting place of the rod is one sort of estimate of the best straight-line approximation of the pegs.

To see mathematically what this system does, remember that the energy stored in a spring of zero rest length is proportional to the square of its length. The system finds a stable static equilibrium, so it is at a minimum of potential energy. Thus, with identical springs, this best-fit line is the line that minimizes the sum of the squares of the lengths of the springs, or minimizes the sum of the squared residuals, as they’re called.

This picture lets us find a formula for the least-squares line. To be in equilibrium, the rod must have no net force on it. The force exerted by a spring is proportional to its length, so the signed lengths of all the springs must add to zero. (We count length as positive if the spring is above the rod and negative otherwise.)

Mathematically, we’ll write the points as (x_i, y_i) and the line as y = mx+b. Then the no-net-force condition is written

\sum_i \left(y_i - (mx_i+b)\right) = 0

There must also be no net torque on the rod. The torque exerted by a spring (relative to the origin) is its length multiplied by its x_i. That means

\sum_i x_i \left(y_i - (mx_i + b)\right) = 0

These two equations determine the unknowns m and b. The reader will undoubtedly be unable to stop themselves from completing the algebra, finding that if there are N data points

m = \frac{\frac{1}{N}\sum_i x_iy_i - \frac{1}{N}\sum_i y_i \frac{1}{N}\sum_i x_i}{\frac{1}{N} \sum_i x_i^2 - (\frac{1}{N}\sum_i x_i)^2}

b = \frac{1}{N}\sum_i y_i - m \frac{1}{N} \sum_i x_i

These formulas clearly contain some averages. Let’s denote \frac{1}{N}\sum_i x_i = \langle x \rangle and similarly for y and combinations of the two. Then we can rewrite the formulae as

m = \frac{\langle xy\rangle - \langle x \rangle \langle y\rangle }{\langle x^2\rangle - \langle x\rangle ^2}

\langle y \rangle = m \langle x \rangle + b

This is called a least-squares linear regression.
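As a sanity check on the algebra, here is a minimal Python sketch that computes m and b from the averages and then confirms that the resulting line satisfies the no-net-force and no-net-torque conditions. The peg coordinates are made up for illustration (scattered near y = 2x + 1), and `least_squares_line` is a hypothetical helper name.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    """Slope and intercept from the averages <x>, <y>, <xy>, <x^2>."""
    mean_x, mean_y = mean(xs), mean(ys)
    mean_xy = mean(x * y for x, y in zip(xs, ys))
    mean_xx = mean(x * x for x in xs)
    m = (mean_xy - mean_x * mean_y) / (mean_xx - mean_x ** 2)
    b = mean_y - m * mean_x
    return m, b

# Hypothetical pegs, not data from the post.
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

m, b = least_squares_line(xs, ys)

# Signed spring lengths (residuals) for the fitted rod.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
net_force = sum(residuals)
net_torque = sum(x * r for x, r in zip(xs, residuals))

print(f"m = {m:.3f}, b = {b:.3f}")
print(f"net force = {net_force:.1e}, net torque = {net_torque:.1e}")
```

Both sums come out to zero up to floating-point rounding, which is just the two equilibrium equations again.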


The story about the rod and minimizing potential energy is not really the reason we use least-squares regression; it was only a convenient illustration. Students are often curious why we do not, for example, minimize the sum of the absolute values of the residuals.

Take a look at the value \langle x^2\rangle - \langle x\rangle ^2 from the expression for the least-squares regression. This is called the variance of x. It’s a very natural measure of the spread of x – more so than the one you’d get by adding up the absolute values of the errors.

Suppose you have two variables, x and u. Then

\mathrm{var}(x+u) = \mathrm{var}(x) + \mathrm{var}(u) + \langle 2xu\rangle - 2\langle x \rangle \langle u \rangle

The reader is no doubt currently wearing a pencil down to the nub showing this.

If x and u are independent, the last two terms cancel (down to the nub!), and we have

\mathrm{var}(x+u) = \mathrm{var}(x) + \mathrm{var}(u)

In practical terms: flip a coin once and the number of heads has a variance of .25. Flip it a hundred times and the variance is 25, etc. This linearity property does not hold for absolute values.
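The coin-flip claim can be checked exactly rather than by simulation. The sketch below builds the distribution of the number of heads in n fair flips and compares the variance with the mean absolute deviation, taken here as one reasonable stand-in for "adding up the absolute values".

```python
from fractions import Fraction
from math import comb

def heads_distribution(n):
    """Exact distribution of the number of heads in n fair coin flips."""
    return {k: Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}

def variance(dist):
    mean = sum(k * p for k, p in dist.items())
    return sum((k - mean) ** 2 * p for k, p in dist.items())

def mean_abs_deviation(dist):
    mean = sum(k * p for k, p in dist.items())
    return sum(abs(k - mean) * p for k, p in dist.items())

for n in (1, 4, 100):
    dist = heads_distribution(n)
    print(n, variance(dist), float(mean_abs_deviation(dist)))
```

The variance is n/4 every time (1/4, 1, 25), so it adds across independent flips. The mean absolute deviation is 1/2 for one flip but only 3/4 for four flips, not 2, so it does not add.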

So variance is a very natural measure of variation. Simple linear regression is nice, then, because it

  1. makes the mean residual zero
  2. minimizes the variance of the residuals

Defining the covariance as a generalization of the variance \mathrm{cov}(x,y) \equiv \langle xy\rangle - \langle x\rangle \langle y\rangle (so that \mathrm{var}(x) = \mathrm{cov}(x,x)), we can rewrite the slope m in the least-squares formula as

m = \frac{\mathrm{cov}(x,y)}{\mathrm{var}(x)}

The Distance Formula

The distance d of a point (x,y) from the origin is

d^2 = x^2 + y^2

In three dimensions, this becomes

d^2 = x^2 + y^2 + z^2

The generalization to n dimensions is clear.

If we imagine the n residuals as the coordinates of a point in n-dimensional space, the simple linear regression is the line that brings that point in as close to the origin as possible, another cute visualization.

Further Reading

The physical analogy to springs and minimum energy comes from Mark Levi’s book The Mathematical Mechanic (Amazon, Google Books).

The Wikipedia articles on linear regression and simple linear regression are good.

There’s much mathematical insight to be had at Math.Stackexchange, Stats.StackExchange and MathOverflow

My Peers’ Birthdays

May 18, 2011

follow-up to My Friends’ Birthdays

The main conclusion I drew from examining my Facebook friends’ birthdays is that I didn’t have enough data to see the birth month effect – when your month of birth influences your success in a field because it decides your relative age to your peers early on in sports or school.

The birth month effect is real in some circumstances. Just now, I searched for “US junior baseball team” and found this roster.

In Outliers, Malcolm Gladwell explained that the cutoff date for youth baseball leagues in the US is July 31. (It’s now changed to May 1, so in ten years we can do this experiment over and see the effect.) Thirteen players on the roster were born in the half of the year directly following July 31 (August through January), and only five were born in the next half (February through July). With data like that, even a sample of eighteen people is enough to see the strong effects that birth month has on athletic success. The odds of such lopsidedness occurring by random chance are about 5%.
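The 5% figure can be reproduced with a binomial tail sum. This sketch assumes that, absent any birth month effect, each player is equally likely to fall in either half of the year, and it computes the one-sided probability of 13 or more out of 18 landing in the favored half.

```python
from math import comb

def upper_tail(k, n):
    """P(at least k successes in n fair 50/50 trials)."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

# 13 or more of the 18 players born in August through January.
one_sided = upper_tail(13, 18)
# Counting a lopsided result in either direction doubles it.
two_sided = 2 * one_sided

print(f"one-sided: {one_sided:.3f}, two-sided: {two_sided:.3f}")
```

The one-sided value is just under 5%. If a split that lopsided in either direction counts as equally surprising, the figure is closer to 10%.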

If 18 baseball players is enough to see a significant birth month effect in sports, then shouldn’t more than 100 Facebook friends have been plenty to see it in education?

In American education, there is no firm, uniform cut-off date like there is with baseball. Different states have different dates. Also, parents may have a choice about when to send their child to kindergarten if the child is born in a certain window. I was born in December in Maryland, where entering kindergarteners must be five years old by December 31. I could have been one of the youngest students in my grade, but my parents held me back, making me one of the oldest. Their stated reason was that they thought I’d appreciate being one of the first kids with a driver’s license come high school.

Mixed-up birth months, along with other obfuscating factors the reader may imagine, could easily make a real signal difficult to pick up, so I asked the Caltech registrar’s office for data on all the domestic Caltech students. They kindly obliged, with birth months tallied for the 5083 students enrolled since 1985. I was asked not to release the data directly, but I can report on its statistics.

Since September to December babies can be either old or young when entering kindergarten, let’s leave them out. The hypothesis is that entering Caltech students are more likely to be born in the January to April time frame than May to August. (If you want to be a stickler for experimental design, we could say that the null hypothesis is that students are equally likely to be in those categories.)

There were 3399 students whose birth months fell into one of these two ranges. If each student were a simple binomial variable with even probability we’d expect a standard deviation of 29 in the number of students in each range. We should also take into account that these periods aren’t perfectly equal in numbers of births. According to a Google result, a baby born anywhere from January to August has a 51.85% chance of being born in the May-August window, due partially to the three extra days and partially to higher birth rates. Thus, we expect that if domestic Caltech students have birth month patterns that mirror the American population at large, there should be 1762 +/- 30 students born in the May-August window. If there are fewer than 1700, we have evidence that Caltech students are less likely to be born in the summer.

The statistic is 1713 born in those months, compared to 1686 in January – April. The discarded period, from September to December, has 1684. There is no significant evidence to suggest that Caltech students are more likely to be born in any particular month.
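For anyone who wants to redo the arithmetic, here is a short sketch using only the figures quoted above (3399 students, a 51.85% chance of a May-August birth, 1713 observed). It treats each student as an independent binomial draw, which is the same simplification the post makes.

```python
from math import sqrt

n_students = 3399          # students born January through August
p_summer = 0.5185          # quoted chance of a May-August birth in that window
observed_summer = 1713     # Caltech students born May through August

expected_summer = n_students * p_summer
std_dev = sqrt(n_students * p_summer * (1 - p_summer))
z_score = (observed_summer - expected_summer) / std_dev

print(f"expected {expected_summer:.0f} +/- {std_dev:.0f}, "
      f"observed {observed_summer}, z = {z_score:.2f}")
```

The observed count sits about 1.7 standard deviations below expectation, which is short of the roughly two-standard-deviation cutoff of 1700 set above.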

This certainly doesn’t disprove the idea that your month of birth impacts your success in school, but the effect, if present, is not as powerful in education as it is in organized sports.

Let’s Read The Internet! Week 1

October 12, 2008

Earth From Above

Sensory overload.  Thirty photographs of socially-relevant scenes from around the world, each of which could easily launch me on a few hours of reading and comparing.  Taken together, they present an overwhelming mosaic of a vibrant, living, interconnected, diverse, and changing planet.

But, damn, I just noticed that the story has gone from displaying 30 photographs to just ten, and the impact is nowhere near as great.  It seems hypocritical that the coordinator of the exhibit (not the artist) should ask the website to take photographs down, when the whole point of the exhibit, as expressed by the artist, is that it be completely free and displayed out on the streets in cities across the world to reach as broad and diverse an audience as possible.

Not the end of evolution again!

  John Wilkins at “Evolving Thoughts”

You might have heard about some guy telling the media that human evolution is over because we now care for our sick.  Wilkins presents a brief, irate counterargument.

I’ve done some reading on evolution from time to time.  What I’ve learned is that it’s deceptively difficult to understand.  Although the basic idea that heritable variation and selection pressure combined lead to evolution is straightforward, there are an awful lot of intricacies you find when you begin to look more closely.

There are some things you can do – such as study genomes to see how closely-related two species are, or study fossil records to document the evolutionary history of a species.  But there are a lot of things you can’t do, such as say, “Dinosaurs evolved to be really big because they were in an arms race.  Prey got bigger, so predators were forced to get bigger, and then prey got bigger again and off they went.”  That is not a falsifiable hypothesis, because you can’t go back and test it.

You can study evolution mathematically, and you can make falsifiable predictions, and then compare those predictions to observation.  But statements like, “human evolution is over because we care for our sick” are basically pseudoscience.

Extremely simplistic thinking about evolution leads to paradoxes.  For example, now that we first-world men don’t have to worry much about dying in our mothers’ arms during infancy, getting killed in battle at age 16, or starving to death when the buffalo find a new migration route at age 27, shouldn’t the biggest factor left in our reproductive success be how good we are at attracting women?  And therefore, shouldn’t every man spend all his effort spreading his seed far and wide?  Shouldn’t guys just be thinking about sex all the time…  Oh, never mind.

How We Evolve

Benjamin Phelan at Seed Magazine

A lengthy article about the sort of thing I referenced above – collecting data from genomes to study evolution.  Here, the scientists took genomes from humans of varying ethnic background and looked for characteristic differences in their DNA as evidence of evolution.  Bottom line: Yes, people are still evolving.  For example, as a white man, I am a highly-evolved lactase-producing being, unlike those primitive, dairy-bloated Asians.

In Defense of Difference

Maywa Montenegro and Terry Glavin at Seed Magazine

In a companion article to the one above, the authors discuss why we might want to save the rainforest, anyway (because we like rain?).  Not just because we like toucans or want to display a World Wildlife Fund bumper sticker on our Prius.  Because it has economic, social, medical, and scientific value to humans.

The idea is that biology is a huge information-gathering system.  From a protein to an organism to an ecosystem, evolution allows biology to record information about how to live in the world, and also provides a ready-made task force 10^30 cells strong that will do its best to find out how to live in a changing world.  The more stuff we destroy, the more Earth loses the ability to adapt to hard times.  Clear the rainforest to raise crops, and disease or natural disaster or pollution find it much easier to brutally rape the new, homogenized biosphere.

The argument is then extended to such things as preserving human languages, which record the results of thousands of experiments in creating human culture.  So nature is basically a billion billion whatever tiny lab books full of experiments, and instead of reading them, we’re throwing them out.

Nobel Sur-prize

Peter Coles at “In The Dark”

Particles are everywhere.  While you read this, particles are in your home, in your infant child’s crib.  In her anus.

I don’t understand them.  But here’s a fairly simple explanation of the work that won this year’s Nobel Prize in physics.  Basically, the uproar is that this guy Cabibbo had a big idea about physics that helped explain a mystery about particles.  Later, Kobayashi and Maskawa solved a mathematical problem that expanded on Cabibbo’s idea.  Both were important – the original idea and the difficult mathematical extension to it – but only one was awarded the prize.

Also check out a more basic article from Mark Chu-Carroll at “Good Math, Bad Math”

What positive psychology can help you become

Martin Seligman on TED

A talk interesting enough that I watched it twice, the second time while smiling.  Seligman decided to break happiness, or life satisfaction, into three categories.  (To me this is rather arbitrary.  It’s not like the categories of happiness are just sitting out there, waiting to be discovered.  But it’s a persistent plague among people who study such things to break them down into categories they believe are fundamental, e.g. “four personality types”, “two political ideologies” (left/right), or even “four kingdoms of life” (some people now say up to 13, others 2).)

The categories are: surface pleasures like active social life, good love life, and sensual pleasure; “flow”, or the state of intense focus and concentration associated with, say, rock climbing or physics sets; and “meaning”, or finding something greater than yourself to dedicate your life to.

Seligman describes the beginnings of a movement to apply a scientific approach to the study of happiness, as opposed to the traditional psychological model of simply curing mental illness and depression.  He claims that it is possible for people to increase the fulfillment they feel in life by a concentrated effort in the right direction.  It’s a summary of the beginning of a quest to understand people in a new way, and apply that understanding to make life better.


A site you may have seen before.  They use maps of the world to visualize data.  For example, compare a map where each country’s size represents the number of personal computers in that country, to a map showing how many people died of “often preventable deaths”.

There are a lot of technically-interesting things about this project.  How did they get the countries to fit together, when their land area needs to be fixed at some arbitrary size?  What sort of properties of the standard world map’s topology were they trying to preserve?

But more interesting are the social, economic, and political insights you can get.  Compare this project to Gapminder.

Small samples, and the margin of error

Terry Tao at “What’s New”

Terry Tao tones it down a notch to present something even I can understand.  He discusses how a random sampling of just 1000 people can give an accurate picture of the opinions of a nation of 200,000,000 voters.  He also gives a short proof of the accuracy of the sample, without resorting to the binomial distribution.
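To get a feel for the numbers, here is a back-of-the-envelope sketch. It is not Tao's argument: it uses the textbook normal approximation, and it assumes a truly random sample from a population much larger than the sample. The function name is made up for illustration.

```python
from math import sqrt

def worst_case_margin(sample_size, z=1.96):
    """Approximate 95% margin of error for a sampled proportion.

    p = 0.5 maximizes p * (1 - p), so it gives the widest margin.
    """
    return z * sqrt(0.25 / sample_size)

margin = worst_case_margin(1000)
print(f"margin of error for 1000 respondents: +/- {margin:.1%}")
```

The margin is about three percentage points, and the size of the population never enters the formula, which is why 1000 people can stand in for 200,000,000.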