False Positives in Little Brother

Cory Doctorow’s Little Brother is a novel available for free online. Maybe I’ll post about the book in general when I finish reading it, but here I want to discuss a didactic section that’s somewhat imprecise. I think it’s fair to do this, not nit-picky, because the book is clearly intended to disseminate information as much as to be a story.

The narrator, a high schooler fighting the man, gives the following discussion when the Department of Homeland Security begins using computers to monitor almost everything everyone in San Francisco does in the aftermath of terrorist bombings there:

Now, say you’ve got some software that can sift through all the bank-records, or toll-pass records, or public transit records, or phone-call records in the city and catch terrorists 99 percent of the time.

In a pool of twenty million people, a 99 percent accurate test will identify two hundred thousand people as being terrorists. But only ten of them are terrorists. To catch ten bad guys, you have to haul in and investigate two hundred thousand innocent people. (pp 52/53 of PDF version)

The problem with this analysis is revealed in the buildup example that came just before the terrorists, where Doctorow’s character considers the hypothetical “SuperAIDS virus”:

You develop a test for SuperAIDS that’s 99 percent accurate. I mean, 99 percent of the time, it gives the correct result -- true if the subject is infected, and false if the subject is healthy.

Doctorow’s test is characterized by one number – its accuracy. But even in an idealized situation, this is an oversimplification. Such a test should be characterized by two numbers.

Imagine we can very cleanly divide people into two categories: healthy and infected.

Then there are four possible results of a single test:

  1. A healthy person is identified as healthy.
  2. A healthy person is identified as infected.
  3. An infected person is identified as healthy.
  4. An infected person is identified as infected.

So when you want to talk about the accuracy of the test, there are four probabilities involved. However, if you know the probability that a healthy person will test as healthy is p, you automatically know that they have a 1-p probability of testing as infected.

Knowledge of the test’s accuracy for healthy people doesn’t tell you anything about the test’s accuracy for infected people, though. So you need another probability to tell you the accuracy of the test when applied to infected people. Call that q. It’s the probability that an infected person taking the test will read as infected. The novel assumes p=q, but for a real test that is unlikely, as there’s no particular reason that the mechanism for false positives is similar to that for false negatives. If poppy seeds make you fail a heroin test when you’re clean, but breath mints make you pass when you’ve been using, there’s no reason to assume the proportion of clean people eating poppy seed muffins is the same as the proportion of junkies with an affinity for peppermint.
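To make the two-number description concrete, here is a minimal Python sketch that splits a group of people into the four outcomes listed earlier. The function name `expected_outcomes` and the sample figures are my own illustrative assumptions, not anything from the novel; only p and q follow the definitions given here.

```python
def expected_outcomes(n_healthy, n_infected, p, q):
    """Expected head counts for the four possible test outcomes.

    p: probability that a healthy person tests as healthy
    q: probability that an infected person tests as infected
    """
    return {
        "healthy, tests healthy": n_healthy * p,
        "healthy, tests infected": n_healthy * (1 - p),   # false positives
        "infected, tests healthy": n_infected * (1 - q),  # false negatives
        "infected, tests infected": n_infected * q,       # true positives
    }


# Hypothetical numbers: 1,000 healthy people, 100 infected, p = 0.95, q = 0.80.
for outcome, count in expected_outcomes(1000, 100, p=0.95, q=0.80).items():
    print(f"{outcome}: {count:.0f}")
```

Notice that p only ever touches the healthy group and q only the infected group, which is why a single "accuracy" figure cannot stand in for both.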

Doctorow was interested in the ratio of false positives. Presumably, you’ll want to treat (or do further tests, or whatever) everyone who tests positive. But you don’t want to waste your time with false positives. So we want to find the proportion of positive tests that are false positives.

You can’t do this with just knowledge of p and q, even though they completely characterize the test. You also need to know the percentage of people who are infected. As the narrator points out, if a very low percentage of people actually have the disease, there’s a good chance that false positives will dominate. On the other hand, if a lot of people are infected, you’ll get a lot of positives from those people, and a positive result becomes more meaningful (less likely to be false).

To make this precise, let the true probability that a given person is infected be r. Then the probability that a given test is a true positive is r*q: the probability that the person is infected, times the probability that the test gets the right answer given that he/she is infected.

The probability for a false positive is (1-r)*(1-p), because first the person must not be infected (probability (1-r)), and then the person must test positive (probability (1-p)). Therefore the ratio of false positives to true positives is

$$\frac{(1-r)(1-p)}{r\,q}.$$
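As a quick check, this ratio can be evaluated for the novel's own scenario. The sketch below assumes, as the novel does, that p = q = 0.99, and takes r to be 10 terrorists out of 20 million people; the function name is just my own label.

```python
def false_to_true_positive_ratio(r, p, q):
    """Ratio of false positives to true positives: (1-r)(1-p) / (r q)."""
    return (1 - r) * (1 - p) / (r * q)


# The novel's scenario: 10 terrorists among 20 million people, p = q = 0.99.
r = 10 / 20_000_000
ratio = false_to_true_positive_ratio(r, p=0.99, q=0.99)
print(f"about {ratio:,.0f} false positives per true positive")
```

That comes out to roughly twenty thousand innocent people flagged for every terrorist flagged, which matches the narrator's two hundred thousand for ten.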

If q (the accuracy for infected people) is already, say, .9, then improving it to .99 is not going to make a big difference, especially if there are lots of false positives. You’ll only get 10% more correct positives that way, which is no big change. Increase q from .99 to .999999999 and you’ve still done practically nothing to the ratio of false positives. q is still important when it comes to finding all the infected people, but for reducing the false positive ratio it’s mostly useless. The exception is if q is very small. Then increasing q from .01 to .1 makes a difference in the false positive ratio, because you’ve multiplied the number of correct positive tests by ten. On the other hand, a test that only catches 1% or even 10% of infected people is not a very good test.
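A rough numerical illustration of how little q matters once it is reasonably large. The infection rate r = 0.001 and the value p = 0.99 are hypothetical choices of mine, held fixed while q varies.

```python
def false_to_true_positive_ratio(r, p, q):
    """Ratio of false positives to true positives: (1-r)(1-p) / (r q)."""
    return (1 - r) * (1 - p) / (r * q)


# Hypothetical fixed values: 1 person in 1,000 infected, p = 0.99.
r, p = 0.001, 0.99
for q in (0.01, 0.1, 0.9, 0.99, 0.999999999):
    ratio = false_to_true_positive_ratio(r, p, q)
    print(f"q = {q}: {ratio:.2f} false positives per true positive")
```

Going from q = .01 to q = .1 divides the ratio by ten, but going from .9 all the way to .999999999 only shrinks it by about a tenth.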

What you really need to do is reduce the number of false positives by increasing p, the accuracy for healthy people. If p is .9, and you increase it to .99, that does make a big difference, because you have cut the false positives down by 90% (from 10% of healthy people to 1%). Increase p to .9999 and again you cut down the false positives by a huge margin. So in a test where you consider the accuracy for both healthy and infected people, the accuracy for infected people dictates how many lives are lost due to missed detections, but the accuracy for healthy people dictates whether you have to deal with a nightmarish sea of false positives.
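Here is a small sketch of the effect of p, this time expressed as the share of all positive tests that are false. The infection rate r = 0.001 and the value q = 0.99 are hypothetical choices of mine; `false_positive_share` is just an illustrative name.

```python
def false_positive_share(r, p, q):
    """Fraction of all positive tests that come from healthy people."""
    false_positives = (1 - r) * (1 - p)
    true_positives = r * q
    return false_positives / (false_positives + true_positives)


# Hypothetical fixed values: 1 person in 1,000 infected, q = 0.99.
r, q = 0.001, 0.99
for p in (0.9, 0.99, 0.9999):
    share = false_positive_share(r, p, q)
    print(f"p = {p}: {share:.1%} of positives are false")
```

With these numbers, almost every positive is false at p = .9, while at p = .9999 fewer than one positive in ten is false.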

There’s one other way to decrease your ratio of false positives. Just increase r by going around and infecting everybody. If the infection rate reaches 100%, the false positives go to zero!

The distinction between p and q might seem frivolous. Since q is relatively unimportant (unless it’s very small, in which case the test sucks anyway), you could just read all of the novel’s figures as representing p.

The problem is in this final passage, where we’re going back to considering a test for terrorist/nonterrorist rather than healthy/infected.

Guess what? Terrorism tests aren’t anywhere close to 99 percent accurate. More like 60 percent accurate. Even 40 percent accurate, sometimes.

Now it does make a difference, because I’m honestly confused. I don’t know whether the narrator means that only 60% of terrorists will be caught (mostly irrelevant to considering false positives) or that only 60% of nonterrorists will be correctly exculpated (seems absurd for a test to think 40% of people are terrorists).
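To see how far apart the two readings are, here is a rough sketch using the novel's population figures (20 million people, 10 terrorists). In each reading the other parameter is set to 0.99, which is purely my assumption; the novel doesn't say.

```python
POPULATION = 20_000_000  # figures from the novel's example
TERRORISTS = 10


def expected_positives(p, q):
    """Return (expected false positives, expected true positives)."""
    innocents = POPULATION - TERRORISTS
    return innocents * (1 - p), TERRORISTS * q


# Reading 1: "60 percent accurate" describes q; p = 0.99 is an assumed value.
fp1, tp1 = expected_positives(p=0.99, q=0.60)
# Reading 2: "60 percent accurate" describes p; q = 0.99 is an assumed value.
fp2, tp2 = expected_positives(p=0.60, q=0.99)

print(f"q = 0.60: {fp1:,.0f} innocents flagged, {tp1:.0f} terrorists flagged")
print(f"p = 0.60: {fp2:,.0f} innocents flagged, {tp2:.0f} terrorists flagged")
```

Under the first reading the sea of false positives is the same two hundred thousand as before and only the catch drops to about six. Under the second, about eight million innocent people get flagged.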

In the story, the narrator tries to decrease p, that is, create more false positives. This is, as I said earlier, to fight the man. If you’ll excuse me, I need to go find out whether it works.
