14 Billion Cupcakes, Or: Why You Don’t Know What To Do With Your Life

April 28, 2012

So here’s an alternative two-step method for understanding the universe.

Step 1: Remember: Six thousand years ago, God created the Heavens and the Earth.
Step 2: Repeat as necessary.

Isn’t that a whole lot easier than analyzing electromagnetic background for evidence of some “Big Bang” fourteen billion years ago? Fourteen billion is a pretty big number, and God didn’t create us so we could waste time trying to picture fourteen billion cupcakes. (DON’T TRY THIS!)

One, Two, aaargh!

-Stephen Colbert, I Am America (And So Can You!)

You have a stronger mind than Stephen Colbert. If I ask you to picture 14 billion cupcakes, you’ll say, “No problem. Doing it right now.” Little do you realize that the ability to deal with 14 billion cupcakes is the heart of not knowing what to do with life.

So you claim to be picturing 14 billion cupcakes? That’s not what you’re really doing. Instead, you are (perhaps) imagining two cupcakes for every person on Earth. So I ask you to picture every person on Earth. “Okay,” you say, “sure.”

But you’re not. You’re picturing a map of the world, or maybe you see a crowd of faces with different ethnicities. The details vary, but in general your mind constructs a much simpler, more concrete idea that takes the place of “every person on Earth” – you create an icon.

For me, the interesting thing is that I know I can’t picture a billion cupcakes but it still feels as if I can. Our minds can build very high-level abstractions, and we’re so good at it that the process is transparent. That’s where the problem comes in.

What do I want to do with my life?

It takes me only a second to read this question and ten or a hundred seconds to ponder it before my mind wanders. Perhaps I can go a thousand seconds if I’m particularly melancholy or my pet chinchilla just died. But the expanse of time I am considering is a few billion seconds. I cannot imagine them all. The icon I construct for “the rest of my life” inevitably becomes distorted: idealized, homogenized, and definitized beyond reason, and this happens without my conscious recognition.

This isn’t just my personal affliction. The people we overestimate most are our future selves. In 2006, Netflix offered a million-dollar prize to anyone who could improve their algorithm for predicting users’ film ratings. Their goal was to make better recommendations for what to watch next. The prize was won in 2009, but it turns out that Netflix didn’t use the improved algorithm.

Over the years the contest ran, Netflix’s business model shifted. In 2006, users were mostly getting new movies by mail, meaning they were placing orders for movies they wanted to watch several days from now. Why, me? I’m a connoisseur! A few days from now, I will be very interested in watching an intellectually challenging cinematic landmark.

By 2009, Netflix users had shifted most of their watching to streaming over the internet. Suddenly, well, it’s certainly true that I’m a connoisseur, but I didn’t get too much done at work and I feel bad about not calling my mom enough. It’s a bit too late to turn a new leaf today, so I guess I’ll see what Steve Carell is up to in his latest movie, but tomorrow it’s Ingmar Bergman all the way. The difference was so striking that the algorithm based on the 2006 challenge was out of date by 2009.

And this goes on. Aguirre, der Zorn Gottes tomorrow, Rush Hour 3 tonight. Vegetables and whole grains tomorrow, pizza and beer tonight. Ulysses tomorrow, Grand Theft Auto tonight. See the world in a grain of sand tomorrow, masturbate to fetish porn and fall asleep with your shoes on tonight.

When I’m thinking about the future, I occasionally write a To-Do list. It will start off with a mix of errands and the important stuff: go to seminar, visit the bank, read the latest chapter, grade these assignments, get some exercise, check out this paper, etc. But when my list is long enough to fill up the day, I always have a few extra things in mind, so I write those down too. That brings more stuff to mind, and before long my list has items like “learn quantum field theory” and “overthrow the evil empire”. Even though the time scale would be the same, somehow my list never has “buy groceries two thousand times.” My future apparently consists of nothing but pure ideals and great achievements. Every mundane detail is excised, leaving only deep, meaningful stuff. It’s like I expect to start living in an Ayn Rand novel.

So any time I have tried to think about the future, I’ve never been close. Worse, I don’t realize how delusional I am. I can’t see the tricks my mind is playing on me. I become obsessed with the wonderful, abstract existence I’m about to create for myself.

How many times have you thought, “once I find a new job, everything will get better?” And if not that, we fantasize the turning point will be moving to a new place, graduating, falling in love, breaking an addiction, finishing a project, having a successful IPO, etc. Once I get over this hump, it will all get better.

That’s not true. “Happily ever after” isn’t how it works. I don’t mean we can’t be happy. I mean it’s an insufficient description of “ever after”. Our brains can’t hold an entire, rich future in view at once, so we compress it down to something like “let’s grow old together”. That’s a bad icon, but brains basically work like a man stumbling around a dark garage and grabbing things off the shelf at random. It’s the first available solution, not the best one, that gets thrown at the problem. The result? Three months after the Disney movie ends, the princess is homesick and Prince Charming is eyeing the chambermaid. The grass is always greener…

At long last, we find what we sought, only to realize it is not quite how we imagined.

Cruelly, the more optimistic you are, the harder you’re hit by this. Don’t trust your retirement portfolio to a happy person.

We tend to handle the big questions with small answers: aphorisms, epitaphs, haiku, koans, parables, quotations. The briefer the wiser. This seems backwards of how it ought to be. Beware of any medium in which the message seems to say more the shorter it is. It’s a sign you’re not getting advice so much as having your far-view blindness hacked by a platitude. It’s the journey, not the destination, man.

You can’t act on a wise saying, but I don’t have any more-specific advice for you either. Once I start claiming that such-and-such thing will solve this problem, it’s a lot easier for me to be wrong. The best you can hope for in this business is to get people to pay thousands of dollars before you tell them what to do. That way they’ll be sure to convince themselves it worked; it’s the only way they can keep from having wasted their money.

Even writing this essay has not released me from my poor grip on the future. Somehow, I still have that same feeling. Once I find what to do with my life, it will all get better. But I wouldn’t bet 14 billion cupcakes on it.

Further Reading:
Cobbled together from stuff in Dan Ariely’s Predictably Irrational, Dan Gilbert’s Stumbling on Happiness, and Robin Hanson’s blog Overcoming Bias.

My apologies to anyone reading this the night before their wedding.

taken from my answer on Quora: http://www.quora.com/Life/How-can-I-figure-out-what-I-really-want-to-do-with-my-life

Three Things Every Man Should Have and Know

April 26, 2012

I turn around and all of a sudden my Facebook friends are getting excited about becoming 30-year-old women. So in response to 30 Things Every Woman Should blah blah blah, here are some things a man should have and know before he turns 30.


Something he’s good at (preferably marketable).
Some self-confidence.
Training in at least five ways to exterminate a zombie.


How to eat well, exercise, and manage money.
Who his friends are.
Where the clit is.

I think that should pretty much cover it.

Do The Math

April 24, 2012

In a follow-up to yesterday’s post, I want to point out a blog by astrophysicist Tom Murphy at UC San Diego (I don’t know him).

Do the Math looks at back-of-the-envelope calculations related to energy, environmentalism, and related issues. Tom produces more high-quality material than I’ve been able to absorb, but what I’ve read has consistently been insightful and, thankfully, sane. Check out his post index for some food for thought.

Earth to Humans: You’re Doing It Wrong.

April 24, 2012

Here’s my Earth Day article. You may notice it’s late. That’s because I didn’t realize it was Earth Day until a few hours after midnight when somebody said something dumb. Here it is:

The founder of a popular British festival has even said that he would consider powering the event on beer piss, should science find a way. Don’t laugh — human beings collectively produce around 6.4 trillion liters of urine a day, so an effective way of harvesting energy from this golden wonder-fuel might end our fossil fuel dependency overnight, as well as mitigating the effects of one more way we go about polluting the environment.

We do not produce 6.4 trillion liters of urine a day, even on a steady diet of coffee, alcohol, and the vague first-world boredom that leads to a bathroom break every half hour or ten games of Draw Something, whichever comes first. The 6.4 trillion figure is around 250 gallons of urine per person per day. If that were so, your urine would fill two midsize cars every week. At an average flow rate of 20 mL/sec, you’d have to pee for fourteen hours every day to get it all out.
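As a rough check on those numbers, here is a minimal sketch of the arithmetic. The world population of 7 billion and the liters-per-gallon conversion are assumptions supplied for illustration, not figures from the quoted article; the 20 mL/sec flow rate comes from the paragraph above.

```python
# Back-of-the-envelope check of the "6.4 trillion liters a day" claim.
CLAIMED_LITERS_PER_DAY = 6.4e12   # figure from the quoted article
WORLD_POPULATION = 7.0e9          # assumption: roughly the 2012 world population
LITERS_PER_GALLON = 3.785         # US gallon
FLOW_RATE_L_PER_S = 0.020         # 20 mL/sec, as in the text

liters_per_person = CLAIMED_LITERS_PER_DAY / WORLD_POPULATION
gallons_per_person = liters_per_person / LITERS_PER_GALLON
hours_peeing = liters_per_person / FLOW_RATE_L_PER_S / 3600

print(f"{liters_per_person:.0f} L per person per day")
print(f"{gallons_per_person:.0f} gallons per person per day")
print(f"{hours_peeing:.1f} hours of peeing per day")
```

Under these assumptions the claim works out to roughly 240 gallons per person per day and somewhere around thirteen hours at the urinal. The exact figure shifts a little with the population you plug in, but the conclusion does not.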

That’s the dumb part – a silly gaffe. But there’s a stupid part, too. You can’t get more energy out of beer urine than you can get out of beer. You can’t get more energy out of beer than you can get out of beer plants. You can’t get more energy out of beer plants than you can get from the sunshine they absorbed. Processing your sunlight by way of barley seeds, the digestive system of yeast, and a human liver is, as a thermodynamic strategy, piss poor.

Humans are not energy producers. Any energy we output came from our food and represents our bodies’ inefficiency. Only a fraction of the energy we eat can be reharvested, and the energy we eat is about one percent of the energy we use on all our gadgets and things. Measured purely by energy consumption, it’s as if every person in the US has 100 personal servants. Recapturing energy from our bodies is like realizing our 100 servants are too expensive, so we make one of them give us a percent or two of their wages back. That means we can only ever get a minuscule fraction of the power we need from any human activity – urination, generators inside exercise equipment, piezoelectric thingymabobbers in the floor, engines run on body heat, etc.

Even if you crush your enemies and drive them before you, the lamentation of their women will not provide much power.

Why bother, then? Why is there a dance club whose floor generates electricity for lighting as revelers hop around on it? Why don’t they just dance during the day?

Human-generated electrical power could make sense in special circumstances (charging your bicycle light with energy from the bicycle, for instance), but as a general plan it’s insane. The floor in that club is not about generating electricity. It’s very unlikely that the energy generated could ever recoup the cost of the installation – if you exercise for an hour, you’ll generate around a penny’s worth of electricity, and that’s with high efficiency. Instead, the floor is about advertising that it generates electricity.
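The penny figure is easy to reproduce. This is only a sketch: the 100-watt sustained output and the ten-cents-per-kilowatt-hour price are both assumed round numbers, not values given in the post.

```python
# Value of the electricity from one hour of exercise.
POWER_WATTS = 100.0     # assumption: sustained output of a reasonably fit person
HOURS = 1.0
PRICE_PER_KWH = 0.10    # assumption: rough US retail price, in dollars

energy_kwh = POWER_WATTS * HOURS / 1000.0
value_dollars = energy_kwh * PRICE_PER_KWH
print(f"{energy_kwh:.2f} kWh, worth about ${value_dollars:.2f}")
```

Even if both assumptions are off by a factor of two or three, an hour of hard pedaling is still worth only pennies.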

This is what we’ve done with energy conservation – made it into a luxury item more about social signalling than ecological benefit. How many people, proud of their environmentally-conscious Prius, have any idea how much energy went into the car’s manufacture? How many of them drive it alone? (Though Prius owners may deny it, the car’s popularity is mostly about social signalling. For cars that come in gas-only or hybrid variants, the hybrids don’t sell well. If it’s not a hybrid-only brand, it’s a lot harder for people to recognize how environmentally-conscious you are.)

No one would tie a helium party balloon to a hippopotamus and say, “See? I did my part to help it fly!” Yet they feel just like that when they bring their own bags to the grocery store. On Earth Day, people turn their lights out for an hour. (Did that happen this year? Or is it some other day? Whatever.) If everyone turned all their lights out in their homes all the time, it would reduce power consumption in the US by about two percent.

The lights-out thing is symbolic, of course. It’s there to remind you of the importance of energy conservation, and to show other people you think energy conservation is important. The problem we’re facing is that everything is symbolic – our efforts at conservation are almost random, showing no systematic effort to focus on the big-ticket items, or even knowing what they are. How many cell phone chargers would you have to unplug to make up for the energy spent on one cross-country plane flight? Most people don’t know, and so most effort put into energy conservation is wasted.

Worse, if you’re conserving energy because you want the warm fuzzies associated with it, you get your warm fuzzies based on how much you inconvenience yourself and how much you show off, not on how much energy you actually save. You feel just as good about unplugging cell phone chargers as deciding to stay local on vacation. Our emotions have no sense of scale.

Even worse than that: when we talk about energy conservation and environmentalism, we’re largely bullshitting, and people pick up on that. That’s the thing with signalling to your tribe. It gets the other tribe pissed off. (And as we’ve learned, piss is not very productive.) The worst part about energy conservation and environmentalism is that they’ve been wrapped up into one issue and shipped off to the place where good debates go to die – politics.

If we could separate our conservation efforts from our warm fuzzies, we’d send out fewer of the pheromones that rile up political associations and drive out even the possibility of reasonable discourse. Fewer news stories. Fewer buzz words and applause lights. More Sustainable Energy Without the Hot Air and The Azimuth Project. That is how you get a hippopotamus to fly.

Electric Fields in a Wire

March 8, 2012

A student asked me the following insightful question:

Suppose we have a situation like this:

The lines are wires. The circle is a light bulb. Current runs from left to right. Because the wire has negligible resistance, essentially all the current runs through the top section of wire, skipping around the light bulb, and the bulb doesn’t light.

But what if we look only at this red region?

At the junction, we see current coming in from the left, and deciding to go up rather than continue on to the right. But inside the box, both regions are just made of wire. How does the current know to go up rather than continue straight through?

The answer is that, at the junction, the electric field points mostly up, and very little to the right. But why is that?

Consider two resistors in series. Each one has a voltage drop proportional to its resistance. If the resistors are the same length, the electric field strength inside the resistors is then proportional to the resistance. If one resistor has very high resistance compared to the other, the electric field is much stronger in that resistor.

Go back to the bottom path of our picture. We essentially have three resistors – a wire, a light bulb, and another wire – in series. The light bulb has significant resistance, while the wires do not. Therefore the electric field is much stronger in the light bulb than it is in the wires. It’s weak everywhere, since the bulb isn’t lit. But still, as weak as it is in the bulb, it’s much weaker still in the surrounding wire, by an amount that is proportional to the ratio of the bulb’s resistance to the wire’s resistance.

On the other hand, in the top path, the electric field is the same strength everywhere because it is all just a wire.

Both paths have the same voltage drop because they are in parallel. If they are the same length (they aren’t drawn the same length, but it is easy to imagine), the average electric field strength in them must be the same.

So the average electric field strength is the same in bottom and top. But the electric field in the bottom is localized almost entirely in the light bulb. That means that right at the junction, the electric field is much weaker in the wire heading right than in the wire heading up. Hence, the current almost entirely goes up, making a decision about where to go based only on the local electric field.
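To put numbers on this argument, here is a small sketch. Every value in it (wire resistance per meter, bulb resistance, segment lengths, and the tiny voltage across the two paths) is a made-up assumption chosen for illustration. The field in each segment is its voltage drop divided by its length.

```python
# Two parallel paths of equal total length, with the same voltage across both.
VOLTAGE = 1.0e-3          # assumption: small drop across the parallel section, in volts
PATH_LENGTH = 1.0         # assumption: each path is 1 m long
WIRE_OHMS_PER_M = 0.01    # assumption: wire resistance per meter
BULB_OHMS = 100.0         # assumption: bulb resistance
BULB_LENGTH = 0.10        # assumption: length of the bulb segment, in meters

# Top path: nothing but wire, so the field is uniform.
e_top = VOLTAGE / PATH_LENGTH

# Bottom path: wire, bulb, wire in series.
wire_length_bottom = PATH_LENGTH - BULB_LENGTH
r_bottom = WIRE_OHMS_PER_M * wire_length_bottom + BULB_OHMS
i_bottom = VOLTAGE / r_bottom

e_wire_bottom = i_bottom * WIRE_OHMS_PER_M       # drop per meter in the wire
e_bulb = i_bottom * BULB_OHMS / BULB_LENGTH      # drop per meter in the bulb

# Length-weighted average field along the bottom path.
avg_bottom = (e_wire_bottom * wire_length_bottom + e_bulb * BULB_LENGTH) / PATH_LENGTH

print(f"top wire field:     {e_top:.3e} V/m")
print(f"bottom wire field:  {e_wire_bottom:.3e} V/m")
print(f"bulb field:         {e_bulb:.3e} V/m")
print(f"bottom average:     {avg_bottom:.3e} V/m")
print(f"top/bottom wire field ratio: {e_top / e_wire_bottom:.0f}")
```

With these numbers the averages along the two paths agree, but at the junction the field in the wire heading up is thousands of times stronger than in the wire heading right, which is all the current has to go on.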

Giving Grades

February 10, 2012

Last semester, I was the TA for a small course on the physics of waves aimed at biophysics majors. Since I was the only TA, I was completely in charge of grading homework and writing homework solutions.

I don’t like the idea of letter grades. They’re a pretty clear example of Goodhart’s law. As such, I especially don’t like arguing over grades, something pre-meds are apt to do.

So I tried out a strategy that worked pretty well. I announced at my first recitation section that due to the inevitable errors I would make while grading homework, I was adding 5% to each homework grade (except where that would take the score over 100%). This would hopefully even out any errors I made over the course of the semester.

However, if the students wanted to find all the grading errors I made and point them out to me, I would still be happy to add back in those missing points. Of course, if that were the case, it would be clear I wouldn’t need to give that student the extra 5%, since it was only there to compensate for errors, and the student planned on catching those errors themself.

By essentially bribing students with a small bonus, I managed to go the entire semester without playing picky-points with anyone. (Although I can’t say for certain that I would have had to play it without the policy, since this was my first course here.)

This semester I’m in a 200-student introductory course and the homework is graded automatically by computer. I do have to grade lab reports, though.

At the TA organizational meeting today, they asked whether we TAs prefer labs to be graded out of 30 points or 100 points. 30 points is the clear winner. The problem with a 100-point scale is that you’re so used to it that the number is instantaneously and unconsciously compared against your expectations. A less-common scale throws that off, as if you were hearing the temperature in Celsius or the price in yen. (For someone with my US biases, of course.)

26/30 and 87/100 are the same grade, but when you see 26/30, you say, “oh, I lost four points.” When you see 87/100 you say, “a B, what the hell! Where’s that TA?”

That’s the theory, at least. It remains to be seen if this semester will go as smoothly as the last.

Why Least Squares?

February 7, 2012

Rod and Pegboard

Suppose you have a bunch of pegs scattered around on a wall, like this:

You see a general trend, and you want to take a rod and use it to go through the points in the best possible way, like this:

How do you decide which way is best? Here is a physical solution.

To each peg you attach a spring of zero rest length. You then attach the other side of the spring to the rod. Make sure the springs are all constrained to be vertical.

Now let the rod go. If most of the points are below it, the springs on the bottom will be longer, exert more force, and pull the rod down. Similarly, if the rod’s slope is shallower than that of the trend in the points, the rod will be torqued up to a steeper slope. The final resting place of the rod is one sort of estimate of the best straight-line approximation of the pegs.

To see mathematically what this system does, remember that the energy stored in a spring of zero rest length is proportional to the square of its length. The system finds a stable static equilibrium, so it is at a minimum of potential energy. Thus, this best-fit line is the line that minimizes the sum of the squares of the lengths of the springs, or minimizes the sum of the squares of the residuals, as they’re called.

This picture lets us find a formula for the least-squares line. To be in equilibrium, the rod must have no net force on it. The force exerted by a spring is proportional to its length, so the signed lengths of all the springs must add to zero. (We count length as negative if the spring is above the rod and positive otherwise.)

Mathematically, we’ll write the points as (x_i, y_i) and the line as y = mx+b. Then the no-net-force condition is written

\sum_i \left(y_i - (mx_i+b)\right) = 0

There must also be no net torque on the rod. The torque exerted by a spring (relative to the origin) is its length multiplied by its x_i. That means

\sum_i x_i \left(y_i - (mx_i + b)\right) = 0

These two equations determine the unknowns m and b. The reader will undoubtedly be unable to stop themselves from completing the algebra, finding that if there are N data points

m = \frac{\frac{1}{N}\sum_i x_iy_i - \frac{1}{N}\sum_i y_i \frac{1}{N}\sum_i x_i}{\frac{1}{N} \sum_i x_i^2 - (\frac{1}{N}\sum_i x_i)^2}

b = \frac{1}{N}\sum_i y_i - m \frac{1}{N} \sum_i x_i

These formulas clearly contain some averages. Let’s denote \frac{1}{N}\sum_i x_i = \langle x \rangle and similarly for y and combinations of the two. Then we can rewrite the formulae as

m = \frac{\langle xy\rangle - \langle x \rangle \langle y\rangle }{\langle x^2\rangle - \langle x\rangle ^2}

\langle y \rangle = m \langle x \rangle + b

This is called a least-squares linear regression.
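The averaged formulas translate almost word for word into code. The sketch below uses made-up peg positions (the function and variable names are mine, chosen for illustration) and then checks the two equilibrium conditions: the residuals should sum to zero (no net force), and so should the residuals weighted by x (no net torque).

```python
def mean(values):
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    """Return (m, b) for the line y = m*x + b using the averaged formulas."""
    x_avg = mean(xs)
    y_avg = mean(ys)
    xy_avg = mean([x * y for x, y in zip(xs, ys)])
    x2_avg = mean([x * x for x in xs])
    m = (xy_avg - x_avg * y_avg) / (x2_avg - x_avg ** 2)
    b = y_avg - m * x_avg
    return m, b

# Hypothetical peg positions, made up for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
m, b = least_squares_line(xs, ys)

residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
net_force = sum(residuals)                               # no-net-force condition
net_torque = sum(x * r for x, r in zip(xs, residuals))   # no-net-torque condition
print(m, b, net_force, net_torque)
```

Both sums come out to zero up to floating-point rounding, which is just the statement that the rod has stopped moving.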


The story about the rod and minimizing potential energy is not really the reason we use least-squares regression; it was only a convenient illustration. Students are often curious why we do not, for example, minimize the sum of the absolute values of the residuals.

Take a look at the value \langle x^2\rangle - \langle x\rangle ^2 from the expression for the least-squares regression. This is called the variance of x. It’s a very natural measure of the spread of x – more so than the one you’d get by adding up the absolute values of the errors.

Suppose you have two variables, x and u. Then

\mathrm{var}(x+u) = \mathrm{var}(x) + \mathrm{var}(u) + 2\langle xu\rangle - 2\langle x \rangle \langle u \rangle

The reader is no doubt currently wearing a pencil down to the nub showing this.

If x and u are independent, the last two terms cancel (down to the nub!), and we have

\mathrm{var}(x+u) = \mathrm{var}(x) + \mathrm{var}(u)

In practical terms: flip a coin once and the number of heads has a variance of .25. Flip it a hundred times and the variance is 25, etc. This linearity property does not hold for absolute values.
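The coin example can be checked exactly by listing outcomes. In this sketch (the helper names are mine), independence is modeled by making every pair of outcomes equally likely, and the mean absolute deviation stands in for "adding up the absolute values of the errors".

```python
from itertools import product

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """<x^2> - <x>^2, treating every listed outcome as equally likely."""
    return mean([v * v for v in values]) - mean(values) ** 2

def mean_abs_deviation(values):
    center = mean(values)
    return mean([abs(v - center) for v in values])

coin = [0, 1]  # number of heads from one fair flip

# Independence: every pair of outcomes is equally likely.
two_flips = [x + u for x, u in product(coin, coin)]

print(variance(coin), variance(two_flips))                      # the second is twice the first
print(mean_abs_deviation(coin), mean_abs_deviation(two_flips))  # no such doubling here
```

The variance of two flips is exactly the sum of the single-flip variances, while the mean absolute deviation of two flips is not the sum of the single-flip values.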

So variance is a very natural measure of variation. Simple linear regression is nice, then, because it

  1. makes the mean residual zero
  2. minimizes the variance of the residuals

Defining the covariance as a generalization of the variance \mathrm{cov}(x,y) \equiv \langle xy\rangle - \langle x\rangle \langle y\rangle (so that \mathrm{var}(x) = \mathrm{cov}(x,x)), we can rewrite the slope m in the least-squares formula as

m = \frac{\mathrm{cov}(x,y)}{\mathrm{var}(x)}

The Distance Formula

The distance d of a point (x,y) from the origin is

d^2 = x^2 + y^2

In three dimensions, this becomes

d^2 = x^2 + y^2 + z^2

The generalization to n dimensions is clear.

If we imagine the residuals as coordinates of a point in n-dimensional space, the simple linear regression is the line that brings that point in as close to the origin as possible, another cute visualization.

Further Reading

The physical analogy to springs and minimum energy comes from Mark Levi’s book The Mathematical Mechanic. Amazon Google Books

The Wikipedia articles on linear regression and simple linear regression are good.

There’s much mathematical insight to be had at Math.StackExchange, Stats.StackExchange, and MathOverflow.


January 30, 2012

Outrun a Tiger

Alice and Bob were walking in the woods when a snarling tiger jumped out in front of them.

Alice bent down and started changing into running shoes.

“Why are you doing that?” asked Bob. “You can’t outrun a tiger.”

“I don’t have to out…” said Alice before the tiger sank its razor-sharp death teeth into the soft flesh around her jugular. It takes at least a minute to change shoes, and the tiger was only, oh, let’s say 20 meters away to begin with.

Then the tiger killed Bob, too. Not because it was hungry. Just because it lived for the moment when it saw the life go out of its victims’ eyes.

Moral: Tigers are nature’s perfect killing machine. By the time you see one, it’s already too late.

Looking For Keys

A drunk man was in the parking lot outside a bar, looking intently at the pavement under a streetlight. A woman came out of the bar, tottering back and forth some as she walked over to the man and asked, “Oh, did you lose your keys here?”

“I don’t know where I lost them. Probably over there by my car, I guess,” said the man.

“Then why are you looking under the streetlight?” asked the woman.

“Because there’s light here,” said the man.

The woman seemed to think this was ridiculous.

“Look,” said the man. “I suppose there’s about a five percent chance I lost my keys under this streetlight, but if I did lose them here, there’s a ninety percent chance I’ll find them. That makes four and a half percent chance that I’ll find my keys by looking here. On the other hand, there’s a thirty percent chance I lost them in a similarly-sized area around the vicinity of my car, but it’s so dark that even if they are there, there’s only a ten percent chance I’ll find them. If I search near my car I only have a three percent chance of success. Therefore I’m acting logically by looking under this streetlight, even though I don’t think this is where my keys are.”

“Oh, I um…” said the woman.

“Hey,” said the man. “Why don’t you just give me a ride? My place is only two miles away, and I would gladly pay you a fair price for your inconvenience. I can come back tomorrow and look some more when there’s light.”

“You’re weird,” said the woman. Then she shot pepper spray in the man’s eyes.

Moral: Everyone hates nerds.

Zen and the Teacup

A Westerner wanted to learn Zen, so he went to visit an old Zen master in a humble, secluded hut.

The Zen master, on hearing the man wanted to learn, invited his guest in for tea. The master filled the man’s cup all the way up, and the tea started pouring out over the brim and onto the table.

“Stop!” said the Westerner. “You’re overfilling it.”

The Zen master calmly replied, “Like this cup, you are full of your own opinions and speculations. How can I show you Zen unless you first empty your cup?”

“This was a bad idea,” said the Westerner. “You’re crazy.” Then he went back home and tried to live his life as best as he could. He still had good times and bad times, but he was a little less likely to believe any given person had all the answers. Also, before he flew home he bought a samurai sword that looked really cool and authentic and stuff and once it even helped him get laid.

Moral: Just because you act super-calm while you’re doing something doesn’t make it wise.

What is a determinant?

January 24, 2012

A simple introduction to a determinant is that it’s the area of a box.

Working in two dimensions, I’ll outline

  • the geometric picture of a linear transformation
  • the geometric picture of a determinant as an area
  • how the geometric picture leads to a few important properties of determinants
  • how the geometric picture of linear transformations can be expressed with matrices
  • how the geometric picture leads us to a formula for the determinant of a matrix

This post is long already. To keep it from becoming even longer, in some places I have had to leave out certain steps in the logic.

We’ll start with the coordinate plane. It’s a grid of points.


Then we scissor it or blow it up or shrink it down. Here are some examples:

To make them, I took the original image and applied the “shear”, “rotate”, and “scale” tools in GIMP (an open-source Photoshop equivalent). You can try it yourself on any image just by using those tools.

These are called “linear transformations”. To simplify the way we picture them, we can just look at what they do to a box at the origin. I’ll make the box 3×3 so it’s visible, but imagine that each line represents a distance 1/3, so the sides of the box are length 1.

If we wanted, we could use something more complicated:

But since the apple is made from little boxes and all little boxes get treated the same way, we might as well focus on what happens to just one box.

Under any linear transformation, the box turns into a parallelogram.

The area of that parallelogram is called the determinant of the linear transformation.

There’s one extra rule. If the red and blue sides switch (as they would if I used the “flip” tool in GIMP), the determinant is negative. Here’s an example:

Since any area is made from little boxes and each little box’s area gets multiplied by the determinant, the area of any shape at all gets multiplied by the determinant. So for the apple, the determinant is the area of the apple on the right divided by the area of the apple on the left.

So that’s what a determinant is. What remains is to show what it’s about and what it has to do with the matrices you were wondering about.

Let’s look at some properties first. Imagine doing two transformations in a row. We’ll call this “multiplying the transformations”. The result is just another linear transformation.

When we do these sequential transformations, the area of our box gets multiplied by the determinant each time. If the determinant of the first transformation is 3 and the determinant of the second transformation is 5, the area gets multiplied by 15 overall, so the determinant of the combined transformation is 15. Multiplying transformations means multiplying determinants.

Next we’ll think about inverses. An inverse is a transformation that takes you back to where you started. The inverse of a transformation that rotates 45 degrees clockwise and multiplies everything by 2 is a transformation that rotates 45 degrees counterclockwise and cuts everything in half.

The determinant of the first transformation is 4 because each side of the box is doubled. The determinant of the second transformation is 1/4.

This is a general rule. Suppose two transformations are inverses. Then their determinants must multiply to 1, because the area of the box doesn’t change overall.

Next suppose a transformation’s determinant is zero. Then it doesn’t have an inverse because any number times zero is still zero, so there’s no transformation that takes the determinant back to one.

Geometrically, a transformation with zero determinant collapses everything to a line.

The line doesn’t have to be flat like this. It could be at any angle. Also, I didn’t collapse this completely to a line, since then you couldn’t see it. Transformations with zero determinant are bad news.

To review

  • Linear transformations are some combination of the “scale”, “rotate”, “shear”, and “flip” tools in Photoshop.
  • The determinant of a linear transformation is the factor by which the transformation changes the area.
  • The determinants of inverse transformations multiply to 1.
  • If the determinant is zero, the transformation doesn’t have an inverse. (The converse of this also holds, although we didn’t discuss it.)

Let’s move on to matrices. Take a linear transformation like this:

If we superimpose the original onto the final, we can see the coordinates of the new parallelogram in terms of the original grid.

We can describe the transformation completely using four numbers, two for the coordinates of the blue side and two for the coordinates of the red side. We’ll call those numbers a, b, c, d.

We’ll represent points with column matrices. So the point (a,b) will be represented by the matrix \left[ \begin{array}{c} a \\ b \end{array} \right]. (A matrix doesn’t have to be square. This is a 2×1 matrix.)

With this notation, we can represent our linear transformation by

\left[ \begin{array}{c} 1 \\ 0 \end{array} \right] \to   \left[ \begin{array}{c} a \\ b \end{array} \right]

\left[ \begin{array}{c} 0 \\ 1 \end{array} \right] \to  \left[ \begin{array}{c} c \\ d \end{array} \right]

This actually represents the entire transformation, even though it looks like we’ve only looked at two points. The reason is that any other point is made up out of the two we’ve already examined. For example

\left[ \begin{array}{c} 4 \\ 7 \end{array} \right] =    \left[ \begin{array}{c} 4 \\ 0 \end{array} \right] +    \left[ \begin{array}{c} 0 \\ 7 \end{array} \right] \to    \left[ \begin{array}{c} 4a \\ 4b \end{array} \right] +    \left[ \begin{array}{c} 7c \\ 7d \end{array} \right] =    \left[ \begin{array}{c} 4a + 7c \\ 4b + 7d \end{array} \right]

There’s a much more convenient way to write all this, which is in the form of a 2×2 matrix. \left[ \begin{array}{c} a \\ b \end{array} \right], which is the blue part of our parallelogram, becomes the first column of the matrix. \left[ \begin{array}{c} c \\ d \end{array} \right] is the second column.

We can view matrix multiplication as

\left[ \begin{array}{cc} a & c \\ b & d \end{array} \right]   \left[ \begin{array}{c} e \\ f \end{array} \right] = e   \left[ \begin{array}{c} a \\ b \end{array} \right] +    f \left[ \begin{array}{c} c \\ d \end{array} \right] =    \left[ \begin{array}{c} ea + fc \\ eb + fd \end{array} \right]

Check that this works for the example of \left[ \begin{array}{c} 4 \\ 7 \end{array} \right].

You may have learned to do this multiplication one row at a time rather than one column at a time. The result is the same.
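Here is a minimal Python sketch of the two ways to do this multiplication. The function names `apply_by_columns` and `apply_by_rows` are made up for illustration, and the numbers chosen for a, b, c, d are arbitrary. The matrix is passed in as its two columns, (a, b) and (c, d), to match the convention above.

```python
def apply_by_columns(col1, col2, point):
    """Scale each column by the matching coordinate of the point, then add."""
    a, b = col1
    c, d = col2
    e, f = point
    return (e * a + f * c, e * b + f * d)


def apply_by_rows(col1, col2, point):
    """Take the dot product of each row of the matrix with the point."""
    a, b = col1
    c, d = col2
    e, f = point
    row1 = (a, c)
    row2 = (b, d)
    return (row1[0] * e + row1[1] * f, row2[0] * e + row2[1] * f)


# Arbitrary example values: a, b, c, d = 2, 1, -1, 3.
blue = (2, 1)    # where (1, 0) lands
red = (-1, 3)    # where (0, 1) lands

print(apply_by_columns(blue, red, (4, 7)))  # (4a + 7c, 4b + 7d) = (1, 25)
print(apply_by_rows(blue, red, (4, 7)))     # same point, computed row by row
```

Both functions do the same arithmetic in a different order, which is why the row-at-a-time and column-at-a-time methods always agree.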

This shows how matrices describe linear transformations. All that remains is to tie in the concept of a determinant.

Remembering that a determinant is the area of the transformed box, we can find a formula for the determinant by looking at some properties of area.

The area of the original 1×1 box is 1. That means

\left| \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right| = 1

because that’s the identity matrix. It’s the linear transformation that does nothing. (The vertical lines around the matrix indicate that we’re taking a determinant.)

When we switch the blue and red sides of the box, the determinant is -1. The matrix that does this is

\left| \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array}\right| = -1

When we multiply the blue side by two, the determinant gets multiplied by that same factor. Since this is represented in the matrix by multiplying the first column by two, we have

\left| \begin{array}{cc} 2 & 0 \\ 0 & 1 \end{array}\right| = 2

and similarly

\left| \begin{array}{cc} 2 & 0 \\ 0 & 3 \end{array}\right| = 6

How about

\left| \begin{array}{cc} 1 & 1 \\ 0 & 0 \end{array}\right| = ?

This matrix is not invertible. It collapses everything onto the x-axis, making a “box” of zero area, so its determinant is zero. Similarly,

\left| \begin{array}{cc} 0 & 0 \\ 1 & 1 \end{array}\right| = 0
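To see the collapse concretely, here is a small Python sketch that pushes the four corners of the 1×1 box through each of these two matrices. The helper `apply` is a hypothetical name, and here the matrix is written as its two rows, the way it appears on the page.

```python
def apply(matrix, point):
    """Apply a 2x2 matrix, given as two rows, to a point (x, y)."""
    (p, q), (r, s) = matrix
    x, y = point
    return (p * x + q * y, r * x + s * y)


collapse_to_x = ((1, 1), (0, 0))
collapse_to_y = ((0, 0), (1, 1))

corners = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Every corner lands on the x-axis (second coordinate is zero).
print([apply(collapse_to_x, p) for p in corners])

# Every corner lands on the y-axis (first coordinate is zero).
print([apply(collapse_to_y, p) for p in corners])
```

Since all four corners end up on a single line, the "box" they outline has no area.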

The final property we need of determinants/areas is linearity. Check out this picture:

It requires a little explanation. There are three linear transformations here, all sharing the same red side. The first two have the blue and purple sides. These are smaller. When we add them up, we get the third one with the gray side, so this picture represents adding linear transformations (which is different from multiplying them). The green area is the area of the big transformation with the gray side.

The two smaller ones, with the blue and purple sides, have a total area equal to the green area. We can see this because there is a triangle of stuff that’s outside the green area, and therefore not counted. However, there’s also a triangle of extra stuff in the green area that’s not part of the two smaller parallelograms. These two triangles have the same area and cancel each other out, so that the small parallelograms have the same total area as the single big one.

Translating this into matrices means we can add determinants when one column is shared. This is called linearity in a column. For example

\left| \begin{array}{cc} a & 0 \\ b & 1 \end{array}\right| +   \left| \begin{array}{cc} c & 0 \\ d & 1 \end{array}\right| =   \left| \begin{array}{cc} a+c & 0 \\ b + d & 1 \end{array}\right|
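A numerical check of this example, as a rough sketch. To avoid leaning on a determinant formula we haven't derived yet, it measures the parallelograms directly with the shoelace formula for polygon area. That formula is my choice of tool here, not something from the argument above, and the values for a, b, c, d are arbitrary.

```python
def signed_area(vertices):
    """Shoelace formula: signed area of a polygon, positive when the
    vertices are listed counterclockwise."""
    total = 0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2


def parallelogram_area(side1, side2):
    """Signed area of the parallelogram with two sides starting at the origin."""
    far_corner = (side1[0] + side2[0], side1[1] + side2[1])
    return signed_area([(0, 0), side1, far_corner, side2])


red = (0, 1)       # the shared second column
blue = (3, 1)      # arbitrary values for (a, b)
purple = (2, -4)   # arbitrary values for (c, d)
gray = (blue[0] + purple[0], blue[1] + purple[1])  # (a + c, b + d)

print(parallelogram_area(blue, red) + parallelogram_area(purple, red))
print(parallelogram_area(gray, red))
```

The two printed areas match, which is what linearity in the first column predicts.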

So the properties we found are

  • The determinant of the identity is one.
  • The determinant of the matrix that switches horizontal and vertical is -1.
  • Multiplying a column by a number multiplies the determinant by that number.
  • The determinant is linear in a column.

These properties combined let us find the determinant of any matrix. Start with

\left| \begin{array}{cc} a & c \\ b & d \end{array}\right|

Use linearity in the first column to write this as

\left| \begin{array}{cc} a & c \\ 0 & d \end{array}\right| +   \left| \begin{array}{cc} 0 & c \\ b & d \end{array}\right|

Now use linearity in the second column to make it

\left| \begin{array}{cc} a & c \\ 0 & 0 \end{array}\right| +   \left| \begin{array}{cc} a & 0 \\ 0 & d \end{array}\right|    +   \left| \begin{array}{cc} 0 & c \\ b & 0 \end{array}\right| +   \left| \begin{array}{cc} 0 & 0 \\ b & d \end{array}\right|

We have already set up the tools to evaluate each of these individually. The determinant is

\left| \begin{array}{cc} a & c \\ b & d \end{array}\right| = 0 + ad - cb + 0

That’s the area of the parallelogram. You could find it by other geometrical means, too, but knowing the formula for the determinant makes it easy.
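With the formula in hand, the earlier claims about multiplying and inverting transformations can be checked numerically. This is only a sketch: `det`, `matmul`, and `rotate_and_scale` are names I made up, matrices are written as their two rows, and the example matrices with determinants 3 and 5 are arbitrary.

```python
import math


def det(m):
    """Determinant of a 2x2 matrix written as rows ((a, c), (b, d))."""
    (a, c), (b, d) = m
    return a * d - c * b


def matmul(m, n):
    """Product of two 2x2 matrices: apply n first, then m."""
    (a, c), (b, d) = m
    (e, g), (f, h) = n
    return ((a * e + c * f, a * g + c * h),
            (b * e + d * f, b * g + d * h))


def rotate_and_scale(degrees, factor):
    """Rotate counterclockwise by `degrees` and multiply everything by `factor`."""
    t = math.radians(degrees)
    return ((factor * math.cos(t), -factor * math.sin(t)),
            (factor * math.sin(t), factor * math.cos(t)))


# Multiplying transformations multiplies determinants: 3 times 5 is 15.
first = ((3, 0), (0, 1))    # determinant 3
second = ((1, 2), (0, 5))   # determinant 5
print(det(first), det(second), det(matmul(first, second)))

# Inverse transformations have determinants that multiply to 1.
forward = rotate_and_scale(-45, 2)     # 45 degrees clockwise, doubled
backward = rotate_and_scale(45, 0.5)   # 45 degrees counterclockwise, halved
print(det(forward), det(backward), det(matmul(backward, forward)))
```

The second line of output is only approximately 4, 1/4, and 1, because the sines and cosines are floating-point numbers.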

A Cute Hat Problem

December 31, 2011

I’ve seen a number of “hat problem” logic puzzles, but this one I found the other day was new to me nonetheless.  I’m stealing from http://www.relisoft.com/science/hats.html, where you can find a beautiful description of the answer.


Three people enter the room, each with a hat on their head. There are two colors of hats: red and blue; they are assigned randomly. Each person can see the hats of the two other people, but they can’t see their own hats. Each person can either try to guess the color of their own hat or pass. All three do it simultaneously, so there is no way to base their guesses on the guesses of others. If nobody guesses incorrectly and at least one person guesses correctly, they all share a big prize. Otherwise they all lose.

One more thing: before the contest, the three people have a meeting during which they decide their strategy. What is the best strategy?
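If you'd like to test your own ideas before looking up the answer, here is a small Python sketch that scores any strategy against all eight hat assignments. It assumes that "assigned randomly" means each hat is independently red or blue with equal odds. The function names and the baseline strategy are mine, and the baseline is deliberately not the best one.

```python
from itertools import product


def wins(hats, guesses):
    """Apply the rules: nobody guesses wrong, and at least one person guesses.

    `guesses[i]` is "red", "blue", or None for a pass.
    """
    made = [(g, h) for g, h in zip(guesses, hats) if g is not None]
    return len(made) > 0 and all(g == h for g, h in made)


def win_rate(strategy):
    """Fraction of the 8 equally likely hat assignments that a strategy wins.

    `strategy(player, seen)` returns that player's guess, given the two
    hats they can see. Their own hat is never passed in.
    """
    outcomes = list(product(["red", "blue"], repeat=3))
    won = 0
    for hats in outcomes:
        guesses = []
        for player in range(3):
            seen = tuple(h for i, h in enumerate(hats) if i != player)
            guesses.append(strategy(player, seen))
        won += wins(hats, guesses)
    return won / len(outcomes)


def one_guesser(player, seen):
    """Baseline: player 0 always says "red" and the other two pass."""
    return "red" if player == 0 else None


print(win_rate(one_guesser))  # 0.5
```

The baseline wins half the time, so the puzzle is really asking whether a strategy that uses what each person sees can beat 0.5.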