What is a determinant?

January 24, 2012

A simple introduction to a determinant is that it’s the area of a box.

Working in two dimensions, I’ll outline

  • the geometric picture of a linear transformation
  • the geometric picture of a determinant as an area
  • how the geometric picture leads to a few important properties of determinants
  • how the geometric picture of linear transformations can be expressed with matrices
  • how the geometric picture leads us to a formula for the determinant of a matrix

This post is long already. To keep it from becoming even longer, in some places I have had to leave out certain steps in the logic.

We’ll start with the coordinate plane. It’s a grid of points.

grid1

Then we scissor it or blow it up or shrink it down. Here are some examples:

To make them, I took the original image and applied the “shear”, “rotate”, and “scale” tools in GIMP (an open-source PhotoShop equivalent). You can try it yourself on any image just by using those tools.

These are called “linear transformations”. To simplify the way we picture them, we can just look at what they do to a box at the origin. I’ll make the box 3×3 so it’s visible, but imagine that each line represents a distance 1/3, so the sides of the box are length 1.

If we wanted, we could use something more complicated:

But since the apple is made from little boxes and all little boxes get treated the same way, we might as well focus on what happens to just one box.

Under any linear transformation, the box turns into a parallelogram.

The area of that parallelogram is called the determinant of the linear transformation.

There’s one extra rule. If the red and blue sides switch (as they would if I used the “flip” tool in GIMP), the determinant is negative. Here’s an example:

Since any area is made from little boxes and each little box’s area gets multiplied by the determinant, the area of any shape at all gets multiplied by the determinant. So for the apple, the determinant is the area of the apple on the right divided by the area of the apple on the left.

So that’s what a determinant is. What remains is to show what it’s about and what it has to do with the matrices you were wondering about.

Let’s look at some properties first. Imagine doing two transformations in a row. We’ll call this “multiplying the transformations”. The result is just another linear transformation.

When we do these sequential transformations, the area of our box gets multiplied by the determinant each time. If the determinant of the first transformation is 3 and the determinant of the second transformation is 5, the area gets multiplied by 15 overall, so the determinant of the combined transformation is 15. Multiplying transformations means multiplying determinants.

Next we’ll think about inverses. An inverse is a transformation that takes you back to where you started. The inverse of a transformation that rotates 45 degrees clockwise and multiplies everything by 2 is a transformation that rotates 45 degrees counterclockwise and cuts everything in half.

The determinant of the first transformation is 4 because each side of the box is doubled. The determinant of the second transformation is 1/4.

This is a general rule. Suppose two transformations are inverses. Then their determinants must multiply to 1, because the area of the box doesn’t change overall.

Next suppose a transformation’s determinant is zero. Then it doesn’t have an inverse because any number times zero is still zero, so there’s no transformation that takes the determinant back to one.

Geometrically, a transformation with zero determinant collapses everything to a line.

The line doesn’t have to be flat like this. It could be at any angle. Also, I didn’t collapse this completely to a line, since then you couldn’t see it. Transformations with zero determinant are bad news.

To review

  • Linear transformations are some combination of the “scale”, “rotate”, “shear”, and “flip” tools in Photoshop.
  • The determinant of a linear transformation is the factor by which the transformation changes the area.
  • The determinants of inverse transformations multiply to 1.
  • If the determinant is zero, the transformation doesn’t have an inverse. (The converse of this also holds, although we didn’t discuss it.)

Let’s move on to matrices. Take a linear transformation like this:

If we superimpose the original onto the final, we can see the coordinates of the new parallelogram in terms of the original grid.

We can describe the transformation completely using four numbers, two for the coordinates of the blue side and two for the coordinates of the red side. We’ll call those numbers a, b, c, d.

We’ll represent points with column matrices. So the point (a,b) will be represented by the matrix \left[ \begin{array}{c} a \\ b \end{array} \right]. (A matrix doesn’t have to be square. This is a 2×1 matrix.)

With this notation, we can represent our linear transformation by

\left[ \begin{array}{c} 1 \\ 0 \end{array} \right] \to   \left[ \begin{array}{c} a \\ b \end{array} \right]

\left[ \begin{array}{c} 0 \\ 1 \end{array} \right] \to  \left[ \begin{array}{c} c \\ d \end{array} \right]

This actually represents the entire transformation, even though it looks like we’ve only looked at two points. The reason is that any other point is made up out of the two we’ve already examined. For example

\left[ \begin{array}{c} 4 \\ 7 \end{array} \right] =    \left[ \begin{array}{c} 4 \\ 0 \end{array} \right] +    \left[ \begin{array}{c} 0 \\ 7 \end{array} \right] \to    \left[ \begin{array}{c} 4a \\ 4b \end{array} \right] +    \left[ \begin{array}{c} 7c \\ 7d \end{array} \right] =    \left[ \begin{array}{c} 4a + 7c \\ 4b + 7d \end{array} \right]

There’s a much more convenient way to write all this, which is in the form of a 2×2 matrix. \left[ \begin{array}{c} a \\ b \end{array} \right], which is the blue part of our parallelogram, becomes the first column of the matrix. \left[ \begin{array}{c} c \\ d \end{array} \right] is the second column.

We can view matrix multiplication as

\left[ \begin{array}{cc} a & c \\ b & d \end{array} \right]   \left[ \begin{array}{c} e \\ f \end{array} \right] = e   \left[ \begin{array}{c} a \\ b \end{array} \right] +    f \left[ \begin{array}{c} c \\ d \end{array} \right] =    \left[ \begin{array}{c} ea + fc \\ eb + fd \end{array} \right]

Check that this works for the example of \left[ \begin{array}{c} 4 \\ 7 \end{array} \right].

You may have learned to do this multiplication one row at a time rather than one column at a time. The result is the same.
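Here is a minimal Python sketch of that claim, if you would like to check it by machine. The function names `apply_by_columns` and `apply_by_rows` are made up for illustration, and the values chosen for a, b, c, d are arbitrary.

```python
def apply_by_columns(a, b, c, d, e, f):
    """Apply the matrix with columns (a, b) and (c, d) to the point (e, f),
    as e times the first column plus f times the second column."""
    return (e * a + f * c, e * b + f * d)


def apply_by_rows(a, b, c, d, e, f):
    """The same product, computed one row at a time."""
    top = a * e + c * f      # first row (a, c) dotted with (e, f)
    bottom = b * e + d * f   # second row (b, d) dotted with (e, f)
    return (top, bottom)


# Arbitrary example: blue side (a, b) = (2, 1), red side (c, d) = (-1, 3).
a, b, c, d = 2, 1, -1, 3
print(apply_by_columns(a, b, c, d, 4, 7))  # (4a + 7c, 4b + 7d) = (1, 25)
print(apply_by_rows(a, b, c, d, 4, 7))     # same point
```

Both calls send the point (4, 7) to the same place, which is the example worked out above with these particular numbers plugged in.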

This shows how matrices describe linear transformations. All that remains is to tie in the concept of a determinant.

Remembering that a determinant is the area of a box, we can find a formula for the determinant by looking at some properties of area.

The area of the original 1×1 box is 1. That means

\left| \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right| = 1

because that’s the identity matrix. It’s the linear transformation that does nothing. (The vertical lines around the matrix indicate that we’re taking a determinant.)

When we switch the blue and red sides of the box, the determinant is -1. The matrix that does this is

\left| \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array}\right| = -1

When we multiply the blue side by two, the determinant gets multiplied by that same factor. Since this is represented in the matrix by multiplying the first column by two, we have

\left| \begin{array}{cc} 2 & 0 \\ 0 & 1 \end{array}\right| = 2

and similarly

\left| \begin{array}{cc} 2 & 0 \\ 0 & 3 \end{array}\right| = 6

How about

\left| \begin{array}{cc} 1 & 1 \\ 0 & 0 \end{array}\right| = ?

This matrix is not invertible. It collapses everything onto the x-axis, making a “box” of zero area, so its determinant is zero. Similarly,

\left| \begin{array}{cc} 0 & 0 \\ 1 & 1 \end{array}\right| = 0

The final property we need of determinants/areas is linearity. Check out this picture:

It requires a little explanation. There are three linear transformations here, all sharing the same red side. The first two have the blue and purple sides. These are smaller. When we add them up, we get the third one with the gray side, so this picture represents adding linear transformations (which is different from multiplying them). The green area is the area of the big transformation with the gray side.

The two smaller ones, with the blue and purple sides, have a total area equal to the green area. We can see this because there is a triangle of stuff that’s outside the green area, and therefore not counted. However, there’s also a triangle of extra stuff in the green area that’s not part of the two smaller parallelograms. These two triangles have the same area and cancel each other out, so that the small parallelograms have the same total area as the single big one.

Translating this into matrices means we can add determinants when one column is shared. This is called linearity in a column. For example

\left| \begin{array}{cc} a & 0 \\ b & 1 \end{array}\right| +   \left| \begin{array}{cc} c & 0 \\ d & 1 \end{array}\right| =   \left| \begin{array}{cc} a+c & 0 \\ b + d & 1 \end{array}\right|

So the properties we found are

  • The determinant of the identity is one.
  • The determinant of the matrix that switches horizontal and vertical is -1.
  • Multiplying a column by a number multiplies the determinant by that number.
  • The determinant is linear in a column.

These properties combined let us find the determinant of any matrix. Start with

\left| \begin{array}{cc} a & c \\ b & d \end{array}\right|

use linearity in the first column to write this as

\left| \begin{array}{cc} a & c \\ 0 & d \end{array}\right| +   \left| \begin{array}{cc} 0 & c \\ b & d \end{array}\right|

now use linearity in the second column to make it

\left| \begin{array}{cc} a & c \\ 0 & 0 \end{array}\right| +   \left| \begin{array}{cc} a & 0 \\ 0 & d \end{array}\right|    +   \left| \begin{array}{cc} 0 & c \\ b & 0 \end{array}\right| +   \left| \begin{array}{cc} 0 & 0 \\ b & d \end{array}\right|

We have already set up the tools to evaluate each of these individually. The determinant is

\left| \begin{array}{cc} a & c \\ b & d \end{array}\right| = 0 + ad - cb + 0 = ad - bc

That’s the area of the parallelogram. You could find it by other geometrical means, too, but knowing the formula for the determinant makes it easy.
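If you want to play with the formula, here is a small Python sketch. The helper names `det` and `multiply` are my own, matrices are stored as (a, b, c, d) in column order to match the notation above, and the test matrices are arbitrary. It spot-checks the properties listed earlier rather than proving them.

```python
def det(a, b, c, d):
    """Determinant of the matrix with first column (a, b) and second column (c, d)."""
    return a * d - c * b


def multiply(m, n):
    """Product of two 2x2 matrices, each given as (a, b, c, d) in column order.
    The result is the transformation "do n first, then m"."""
    a, b, c, d = m
    e, f, g, h = n
    # Each column of the product is m applied to the matching column of n.
    return (e * a + f * c, e * b + f * d, g * a + h * c, g * b + h * d)


print(det(1, 0, 0, 1))  # identity: 1
print(det(0, 1, 1, 0))  # switch horizontal and vertical: -1
print(det(2, 0, 0, 3))  # scale the sides by 2 and 3: 6

# Linearity in the first column, with the second column (5, 7) shared.
print(det(1, 2, 5, 7) + det(3, 4, 5, 7) == det(1 + 3, 2 + 4, 5, 7))  # True

# Multiplying transformations multiplies determinants.
m, n = (2, 1, -1, 3), (0, 4, 5, 1)
print(det(*multiply(m, n)) == det(*m) * det(*n))  # True
```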

A Cute Hat Problem

December 31, 2011

I’ve seen a number of “hat problem” logic puzzles, but this one I found the other day was new to me nonetheless.  I’m stealing from http://www.relisoft.com/science/hats.html, where you can find a beautiful description of the answer.

 

Three people enter the room, each with a hat on their head. There are two colors of hats: red and blue; they are assigned randomly. Each person can see the hats of the two other people, but they can’t see their own hats. Each person can either try to guess the color of their own hat or pass. All three do it simultaneously, so there is no way to base their guesses on the guesses of others. If nobody guesses incorrectly and at least one person guesses correctly, they all share a big prize. Otherwise they all lose.

One more thing: before the contest, the three people have a meeting during which they decide their strategy. What is the best strategy?

 

 

 

Food for the brain

December 17, 2011

I’m not sure why, but someone on Quora wanted to know what you can learn from eating a book. So here you go:

If you eat a book, you might learn

  • whether you can stomach Chuck Palahniuk
  • how to Chew Your Own Adventure
  • that you need a cookbook cookbook
  • that you don’t have to finish a book just because you start it
  • that Finnegans Wake is indigestible
  • you’d rather rush to eat salmon than eat Salman Rushdie
  • that some things are worse than airline food
  • that “eat three squares a day” isn’t to be taken literally
  • that you can get away with murder by feigning insanity
  • what people mean when they say, “His words are coming out his ass!”

Why is the integral of 1/x equal to the natural logarithm of x?

December 17, 2011

The title of this post asks a question that many calculus students find befuddling. Here I’ll give some geometric intuition behind it. I leave small logical gaps to avoid cheating the reader of the pleasure of their discovery.

One essential feature of logarithms is that they make a multiplication problem equivalent to an addition problem, by which I mean

\ln(ab) = \ln(a) + \ln(b)

Meanwhile, \int\frac{1}{x}\mathrm{d}x is usually thought of geometrically as the area underneath a curve. The problem, then, is to try to see visually what an area under a curve has to do with turning multiplication into addition.

Here’s a graph of 1/x, and we’re finding, as an example, the area under it from 1 to 2.

integral

Let’s say now that we multiply the limits of integration by two, so we’re now finding the area from 2 to 4. Here’s what that looks like.

second integral

The two portions are actually very similar to each other in their overall shape. The orange one is twice as wide as the green one, but also half as tall. Here they are overlaid.

overlaid integrals

If you take the green shape and first squash it down vertically by a factor of two then stretch it out horizontally by a factor of two, you get the orange shape exactly. (If you don’t believe this, convince yourself it works!) This means the areas of these shapes are exactly the same, even though we don’t know what that area is.

Show for yourself that this result is general. The area under 1/x from a to b is the same as that from ac to bc.
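A numerical check is no substitute for the geometric argument, but it is reassuring. Below is a rough Python sketch that estimates the area with the midpoint rule. The function name `area_under_reciprocal`, the step count, and the sample values of a, b, and c are all my own choices.

```python
def area_under_reciprocal(lo, hi, steps=100_000):
    """Midpoint-rule estimate of the area under 1/x from lo to hi."""
    width = (hi - lo) / steps
    return sum(width / (lo + (i + 0.5) * width) for i in range(steps))


# The green region (1 to 2) and the orange region (2 to 4).
print(area_under_reciprocal(1, 2), area_under_reciprocal(2, 4))

# The general statement: the area from a to b matches the area from ac to bc.
a, b, c = 1.5, 4.0, 3.0
print(area_under_reciprocal(a, b), area_under_reciprocal(a * c, b * c))
```

Each pair of numbers agrees, because every thin rectangle in the stretched region is c times as wide and 1/c times as tall as its partner in the original.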

What, then, is the area from 1 to 6? We can break it into two parts – the area from 1 to 2 and the area from 2 to 6. But the area from 2 to 6 is the same as the area from 1 to 3, by the above reasoning.

Thus, the area from 1 to 6 is the same as the sum of the areas from 1 to 2 and from 1 to 3. Note that 6 = 3*2. Again, this is general. The area under 1/x from 1 to ab is the same as the sum of the areas from 1 to a and from 1 to b.

That’s pretty good motivation for the definition

\ln(x) = \int_1^x\frac{1}{t}\mathrm{d}t

Note that this is being taken as a definition of the natural logarithm, not a proof of the relationship. Our argument about the integral of 1/x now translates to the statement

\ln(ab) = \ln(a) + \ln(b)
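To see the definition at work, here is a minimal Python sketch that computes the logarithm as an area and compares it with the library function. The name `ln_area` and the step count are assumptions of mine, and the midpoint rule is just one convenient way to estimate the integral.

```python
import math


def ln_area(x, steps=100_000):
    """Signed area under 1/t from t = 1 to t = x, estimated by the midpoint rule.
    For x < 1 the width is negative, so the area comes out negative."""
    width = (x - 1) / steps
    return sum(width / (1 + (i + 0.5) * width) for i in range(steps))


a, b = 2.0, 3.0
print(ln_area(a * b), ln_area(a) + ln_area(b))  # the two should agree
print(ln_area(6.0), math.log(6.0))              # compare with the built-in log
print(ln_area(0.5), -ln_area(2.0))              # areas for 1/x and x are opposite
```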

Now, step by step, we will show that all the other properties you expect of the natural logarithm follow from this definition.

It is evident that

\ln(1) = 0

Our definition implies that the logarithm grows without bound because if we continually multiply the argument of the logarithm by two, we continually add \ln(2) to the value. (i.e. \ln(2x) = \ln(x) + \ln(2)). Since we can multiply any number by two over and over, we can add \ln(2) to the logarithm as many times as we want. That means we can make the logarithm arbitrarily big.

This also means that starting the integral from 1 rather than from zero was a good idea. If we start from zero, the integral is infinite. We can see this because 1/x is symmetric about the line y = x.

symmetric

This implies that the area to the left of the curve is the same as the area under the curve, like this.

areas

We just showed that the area under the curve diverges as we move the right hand side of the integral out to infinity, so the area to the left of the curve diverges, too. If we started the integral at zero, it would be infinite.

What about taking the logarithm of numbers less than one? A good check of whether everything makes sense so far is to work out that \ln(1/x) = - \ln(x).

Since the area under 1/x starts at zero when x=1 and goes up infinitely, it is clear that there must be some number x such that \ln(x) = 1. Let’s choose to call that number e. We don’t know what it is yet, but it certainly exists. Thus

\ln(e) = 1

Again, this is definition, not proof.

It is immediately apparent that, for example, \ln(e^5) = \ln(e*e*e*e*e) = 5\ln(e) = 5. That makes e a pretty handy number. It shows us that the logarithm of a number x is how many factors of e you need to multiply together in order to get x.

How about \ln(e^{3/2})? That is \ln\left([e^{1/2}]^3\right) = 3\ln(e^{1/2}). So in order to understand logarithms of rational numbers, we need to understand roots of e.

That’s not so hard, though.

\ln(e^{1/2}*e^{1/2}) = \ln(e) = 1.

On the other hand,

\ln(e^{1/2}*e^{1/2}) = \ln(e^{1/2}) + \ln(e^{1/2}) = 2 \ln(e^{1/2})

From this we deduce \ln(e^{1/2}) = 1/2. Returning to the unfinished example, \ln(e^{3/2}) = 3*(1/2) = 3/2. It is not a great leap to say that for any rational number x, we have

\ln(e^x) = x

This is an important result; it is probably the definition of \ln(x) that you’re used to. The pieces are falling into place. The main remaining hurdle is to find the value of e and show it comes to what we expect.

Before that, we should mention how the above relation works for irrational numbers. Irrational numbers are squeezed in between the rational ones, and since the definition of the logarithm as the area under a curve is evidently smooth, the logarithm of an irrational number is squeezed in tightly as well. Ultimately, the above relation holds for all positive numbers. However, the fine details of real numbers are more involved than I would like to address here. (The logarithm of a negative number or of zero isn’t defined, at least not in the real numbers. What is a difficulty with doing so?)

Finally, we would like some way of determining what e is. Here is one way to do it. For small values of x, we can see that

\ln(1+x) \approx x

This follows from the extremely simple approximation below.

approximation

The red box is an approximation to the area of the green integral. The red box clearly has area x while the green integral is \ln(1+x). Thus

\ln(1+x) \approx x

It’s crude, but it works better and better as x becomes tiny. Multiplying both sides of the approximation by 1/x, we get

\frac{1}{x}\ln(1+x) \approx 1

We know how to rewrite the left hand side. It gives

\ln\left([1+x]^{1/x}\right) \approx 1

Since we have defined e by \ln(e) = 1, we finally see

e = \lim_{x\to 0} (1+x)^{1/x}

This is the common definition of e. At last we see that the reason that the integral of 1/x is \ln(x) is that all the properties of the two functions are exactly the same, and so they must be the same function.
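If you would like to watch the limit for e settle down, here is a short Python sketch. The function name `approx_e` and the sample values of x are my own choices.

```python
import math


def approx_e(x):
    """The quantity (1 + x)^(1/x), which should approach e as x shrinks to zero."""
    return (1 + x) ** (1 / x)


for x in (0.1, 0.01, 0.001, 0.0001):
    print(x, approx_e(x))

print("math.e =", math.e)
```

The printed values creep up toward math.e as x gets smaller, just as the crude box approximation gets better and better.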

I was looking up “serendipity” in the dictionary when I unexpectedly discovered serenity.

December 14, 2011

The blog has been dormant recently. I’ve put more effort into physics.stackexchange.com and Quora. (You can see my profiles on those sites here: Phys.SE, Quora)

I figure I can cross-post content that I like, though. Here’s my answer on Quora to “What is serendipity?”


Serendipity is when

  • your alphabet soup spells out “Eat me.”
  • there’s a rock in your shoe, and it completes your collection.
  • you take a summer job herding sheep and wind up falling in love (consequences notwithstanding).
  • you buy an old used book, and the margin contains a proof of Fermat’s last theorem.
  • you take your dog to the park and there’s an Ultimate tournament going on.
  • you’re on a camping trip and it starts raining. A moment before, you had been thinking your oatmeal was too dry.
  • you decide to try to hold out a couple extra days before doing laundry, and then a dread virus triggers the zombie apocalypse.
  • you get in an accident with a truck carrying avocados, and you just bought a huge bag of tortilla chips.
  • right before you get up for a bathroom break, the FedEx guy arrives. Your cheating ex sent you more flowers.
  • you write a tongue-in-cheek Quora answer just as a lark, and hundreds of people upvote it, and then some guy gives you a boat.

Tricky Calculus Problem

July 28, 2011

Here’s a cute problem I heard from Moor Xu:

Let f(x) = x^6 - 9x^2 - 6x. f has exactly three critical points. Find the parabola that passes through these critical points.

I’ve been doing a problem of the day for my physics camp students. Questions and answers are posted here the day after we give them to the students. I haven’t been copying them over to this blog because many are repeats, and none are original. They might still be entertaining if you’re into that sort of thing.

My Peers’ Birthdays

May 18, 2011

follow-up to My Friends’ Birthdays

The main conclusion I drew from examining my Facebook friends’ birthdays is that I didn’t have enough data to see the birth month effect – when your month of birth influences your success in a field because it decides your relative age to your peers early on in sports or school.

The birth month effect is real in some circumstances. Just now, I searched for “US junior baseball team” and found this roster.

In Outliers, Malcolm Gladwell explained that the cutoff date for youth baseball leagues in the US is July 31. (It’s now changed to May 1, so in ten years we can do this experiment over and see the effect.) Thirteen players on the roster were born in the half of the year directly following July 31 (August through January), and only five were born in the next half (February through July). With data like that, even a sample of eighteen people is enough to see the strong effects that birth month has on athletic success. The odds of such lopsidedness occurring by random chance are about 5%.
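One way to arrive at a figure like that 5% is a one-sided coin-flip calculation: assume each of the 18 players is equally likely to land in either half of the year, and ask how often 13 or more land in the same chosen half. That modeling choice is an assumption on my part, and the function name `tail_probability` is made up for this sketch.

```python
from math import comb


def tail_probability(n, k):
    """Chance of k or more heads in n fair coin flips."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


print(tail_probability(18, 13))  # roughly 0.048
```

If you instead count lopsidedness in either direction, the probability doubles, so the exact figure depends on which question you ask.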

If 18 baseball players is enough to see a significant birth month effect in sports, then shouldn’t more than 100 Facebook friends have been plenty to see it in education?

In American education, there is no firm, uniform cut-off date like there is with baseball. Different states have different dates. Also, parents may have a choice about when to send their child to kindergarten if the child is born in a certain window. I was born in December in Maryland, where entering kindergarteners must be five years old by December 31. I could have been one of the youngest students in my grade, but my parents held me back, making me one of the oldest. Their stated reason was that they thought I’d appreciate being one of the first kids with a driver’s license come high school.

Mixed-up birth months, along with other obfuscating factors the reader may imagine, could easily make a real signal difficult to pick up, so I asked the Caltech registrar’s office for data on all the domestic Caltech students. They kindly obliged, with birth months tallied for the 5083 students enrolled since 1985. I was asked not to release the data directly, but I can report on its statistics.

Since September to December babies can be either old or young when entering kindergarten, let’s leave them out. The hypothesis is that entering Caltech students are more likely to be born in the January to April time frame than May to August. (If you want to be a stickler for experimental design, we could say that the null hypothesis is that students are equally likely to be in those categories.)

There were 3399 students whose birth months fell into one of these two ranges. If each student were a simple binomial variable with even probability we’d expect a standard deviation of 29 in the number of students in each range. We should also take into account that these periods aren’t perfectly equal in numbers of births. According to a Google result, a baby born anywhere from January to August has a 51.85% chance of being born in the May-August window, due partially to the three extra days and partially to higher birth rates. Thus, we expect that if domestic Caltech students have birth month patterns that mirror the American population at large, there should be 1762 +/- 30 students born in the May-August window. If there are fewer than 1700, we have evidence that Caltech students are less likely to be born in the summer.

The statistic is 1713 born in those months, compared to 1686 in January – April. The discarded period, from September to December, has 1684. There is no significant evidence to suggest that Caltech students are more likely to be born in any particular month.
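The arithmetic behind those numbers can be reproduced with a few lines of Python. This is only a sketch of the binomial model described above, using the 51.85% figure and the counts quoted in the post; the variable names are mine.

```python
from math import sqrt

n = 3399            # students born January through August
p_summer = 0.5185   # chance such a birth falls in May through August
observed = 1713     # students actually born May through August

expected = n * p_summer
sd = sqrt(n * p_summer * (1 - p_summer))
z = (observed - expected) / sd  # distance from expectation, in standard deviations

print(round(expected), round(sd, 1), round(z, 2))
```

The observed count sits a bit under two standard deviations below the expected one, which is why it falls short of the "fewer than 1700" threshold.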

This certainly doesn’t disprove the idea that your month of birth impacts your success in school, but the effect, if present, is not as powerful in education as it is in organized sports.

My Friends’ Birthdays

May 2, 2011

Malcolm Gladwell’s Outliers describes how elite hockey players in Canada are far more likely to be born in the first half of the year than the second. There’s a simple explanation – Canadian youth hockey leagues bin age group teams according to the calendar year of birth. Two young players, one born January 1, 2003 and the other December 31, 2003 are considered the same age and play in the same league.

Being born at the beginning of the year makes you a few months older than most of your peers. When you’re eight years old, those few months equate to a big advantage in physical maturity. Being more mature, you perform better, get selected for elite teams, and receive better training. You get better and better while your peers born near the end of the year are left behind.

The data shown in the book are convincing. The phenomenon is seen not just in Canadian hockey, but in a host of other sports where a similar age cutoff exists, and when the cutoff date changes from January 1, the distribution of birth months changes, too. (Basketball in the US is one exception, presumably because kids learn on the streets regardless of their birth month, and don’t need to be selected for elite training until later on.)

Then Gladwell goes on to suggest that the same effect dominates academic achievement in the US.

Parents with a child born at the end of the calendar year often think about holding their child back before the start of kindergarten: it’s hard for a five-year-old to keep up with a child born many months earlier. But most parents, one suspects, think that whatever disadvantage a younger child faces in kindergarten eventually goes away. But it doesn’t. It’s just like hockey. The small initial advantage that the child born in the early part of the year has over the child born at the end of the year persists. It locks children into patterns of achievement and underachievement, encouragement and discouragement, that stretch on and on for years.

Recently, two economists – Kelly Bedard and Elizabeth Dhuey – looked at the relationship between scores on what is called the Trends in International Mathematics and Science Study, or TIMSS (math and science tests given every four years to children in many countries around the world), and month of birth. They found that among fourth graders, the oldest children scored somewhere between four and twelve percentile points better than the youngest children. That, as Dhuey explains, is a “huge effect.” It means that if you take two intellectually equivalent fourth graders with birthdays at opposite ends of the cutoff date, the older student could score in the eightieth percentile, while the younger one could score in the sixty-eighth percentile. That’s the difference between qualifying for a gifted program and not.
p. 28

The first paragraph seems like a rather wild extrapolation, based solely on the second.

I wanted to know if I could see this birthday effect in some data I had readily available – that generated by my Facebook friends.

I have about 700 Facebook friends, many of whom were Caltech students. These people represent an academic elite, so if the birthday effect is extraordinarily strong, I ought to have friends whose birthdays come in a clump, assuming they are educated in the US.

I tallied the birth months of all my Facebook friends who are or were students at Caltech and who listed themselves as being from somewhere in the US. (I wound up throwing out a lot of people from the US because they didn’t list a home town, but I thought it was better to have a uniform data collection policy than to guess.) 110 people made the cut.

I made a plot of their birth months, and it looked like maybe there was some sort of signal in there. So then I made seven fake plots by randomly generating birth months from a flat distribution. Here are the eight plots. Can you tell which one is the real data?

One of these plots is real data from the birth months of students at one of the world’s top universities. The other seven plots are as random as Python can make them. I challenge Malcolm Gladwell to tell me which one is which.

This challenge is a bit unfair. What I really ought to plot is not birth month, but age when starting kindergarten. These aren’t the same, largely because people born near the end of the year (like me) can wind up either old for their grade or young for it. Still, January babies are almost uniformly old for their grade in the US, and August babies are almost uniformly young. If the effect is as powerful as Gladwell suggests, we ought to see it at play here.

If birth months were evenly distributed and I took 110 data points, the expectation value for each month is 9.2 and the standard deviation is 2.9. Since the standard deviation is pretty big compared to the expectation value, we would need a large signal in order to see an obvious effect in the data. So to make a strong case, I really ought to have more data.
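Those two numbers come from treating each month's count as a binomial variable with probability 1/12. Here is a minimal Python sketch of that calculation, along with one way to generate a flat fake data set. It is not the code used for the plots above; the function names `month_stats` and `fake_birth_months` are hypothetical.

```python
import random
from collections import Counter
from math import sqrt


def month_stats(n, months=12):
    """Expected count and standard deviation for one month,
    if n birthdays are spread evenly over the months."""
    p = 1 / months
    return n * p, sqrt(n * p * (1 - p))


def fake_birth_months(n, seed=None):
    """Counts per month (January first) for n birthdays drawn from a flat distribution."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 12) for _ in range(n))
    return [counts[m] for m in range(1, 13)]


mean, sd = month_stats(110)
print(round(mean, 1), round(sd, 1))  # 9.2 2.9
print(fake_birth_months(110, seed=1))
```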

Still, we actually expect whatever effect there is to be magnified when looking at this data. The reason is that, with Caltech students, we’re looking far out on the tail of the distribution of academic ability.

Here are two normal distributions that are the same except that one is shifted to the right.

The original gaussian, centered on zero, represents students born late in the year, and the shifted one represents students born near the beginning. (This is only supposed to be a heuristic, of course.)

I’ve added two vertical lines. The first vertical line shows a cut off for students who are “good at school”. There are about three times as many students from the shifted distribution that make it beyond this cut off.

The second line shows students who are “very good at school”. There are about ten times as many students from the shifted distribution that make it beyond this tougher cutoff.

Even though I don’t expect the age-selection effect to work in such a simple way, the main idea is simply that if you give one population a small advantage over the other, the effect becomes magnified when you look at the frequency of outliers. So, in the birthdays of my Caltech friends, I ought to see a pretty strong signal, if the basic effect exists.
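Here is a rough Python sketch of that magnification. The shift of half a standard deviation and the cutoffs are hypothetical values I picked for illustration, not the ones used in the plot above, and the function names are my own.

```python
from math import erfc, sqrt


def upper_tail(cutoff, mean=0.0):
    """Fraction of a unit-variance normal distribution that lies above cutoff."""
    return 0.5 * erfc((cutoff - mean) / sqrt(2))


def advantage(cutoff, shift):
    """How many times more often the shifted population clears the cutoff."""
    return upper_tail(cutoff, mean=shift) / upper_tail(cutoff, mean=0.0)


shift = 0.5  # hypothetical head start, in standard deviations
for cutoff in (1.0, 2.0, 3.0):
    print(cutoff, round(advantage(cutoff, shift), 2))
```

The ratio keeps growing as the cutoff moves further out on the tail, even though the shift itself never changes.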

So, for now I’d say that, lacking further data, either the effect is not very large, or it is not very simple, so that somehow it allows Caltech students to form an exceptional bunch.

Admissions

April 29, 2011

Some meditations I wrote a while ago, at the end of my grad school admissions process. I’ll begin graduate studies in physics at Johns Hopkins this fall.


Dear Admissions Committees,

It’s hard to make admissions. Congratulations! Don’t you feel better now that you’re done for the year? That must be nice. All the possible admissions are lined up and you just choose the ones you want.

Well, now that you have admitted what you are going to admit, I have some things to admit, too. I didn’t admit them before because I wanted you to admit me. But now that we’re done with all that, I think we can just be honest with each other, at least a little.

I admit that I read Brian Greene when I was a kid. This is the first thing that you’re not supposed to admit. I’ve heard it many places – “Don’t tell them you were inspired by The Elegant Universe! Everyone tells that story. No one wants to hear about that.”

And I will admit it. No one wants to hear about that. No one wants to hear how I confronted those passages in the book that proclaim, with unapologetic authority, a string of certifiably insane conclusions about some twins who get a kick out of the improbable combination of spaceships, clocks, and meter sticks. No one wants to hear how, as Greene assaulted me with each new claim, I squirmed back and forth in my father’s armchair for hours, alternately flipping the book down to meditate on the dimples of our ceiling tiles and snatching back the pages for another greedy interrogation. I was an absurd blending of agitation, cogitation, and meaningful hand gestures as I attempted to conduct into harmony this mistuned orchestra of ideas. It was awe, I guess, but not the Grand Canyon, photos-from-Hubble, double-rainbow-all-the-way sort. It was the sort of awe Elmo would feel if you told him he’s actually a puppet, and some guy named Alonso had been controlling him by jamming a hand up his butt the last thirty years.

And man, you sure don’t want to hear how I sat on the edge of my chair in class the next day, bouncing my leg as if I had to pee, just because I wanted to get back home and see what was in the next chapter. You’d grimace, (really you would) if you were forced to stand by and listen to how I tried to relate everything in that book to the kids at school. Have you ever watched a skinny, pimply-faced nerd talking three times too fast as he tries to explain the diffraction limit to a running back? But listen – it didn’t matter. I was going to solve the MYSTERIES OF THE GOD DAMN UNIVERSE. You missed out, man. If you had only known to ask me, you could have learned all about how smart I was, when I was sixteen.

I admit that I am older now, and I’ve learned a lot. For example, blocks sliding down inclined planes. Dude, I am a total badass at that. The learned professors have taught me all sorts of stuff about it, and I can solve those block problems a lot of different ways. But here’s something not too many people admit – I think they’re fun.

Would you believe it? Blocks sliding down inclined planes are absolutely fascinating! Like, take this, for instance: you have a lumpy, three-dimensional hill. You take a ball and slide it down the hill, frictionlessly. Now, you go back up to the top and do the same thing over again – same ball and same hill – but this time you roll the ball down without slipping. And the question is: does the ball take the same path? Do you slide down a hill over the same route you roll down the hill? I admit it took me three months to figure this out. Three months! Not of constant work, but three months of germination before the answer sprouted. And when I gave a presentation on it to the physics club, I filled up three blackboards with math. All for a ball rolling down a hill.

Or take the brachistochrone. You remember that, of course. The brachistochrone is the hill you slide down to get from point A to point B as fast as possible. Finding it is a basic problem in the calculus of variations. But did you know it was solved by the Bernoullis in the seventeenth century, long before the calculus of variations existed?

If you take Snell’s law, it tells you how light bends when it goes from one medium to another – entering the water from the air, say. And by Fermat’s principle, it’s taking the fastest path. So you see, we can make a perfect analogy between an optical system and a mechanical one, both minimizing travel time, and Snell’s law applies equally well to both. If you start by understanding light, all of a sudden you understand balls sliding down hills. And that’s how Bernoulli found the answer long before the techniques we now use even existed.

Did you ever realize that the brachistochrone shape changes depending on the size of the ball, or wonder what the answer is for tunnels going through the Earth? I admit an unhealthy fondness for such useless considerations as all that. And I admit that obsession with this trivia will probably compromise my usefulness as a single-minded researcher. I admit that I’m not ashamed of that, because blocks sliding down inclined planes are cooler than you ever imagined. That’s something I admit to thinking is worthwhile.

I admit that I don’t like lectures because ninety minutes is way too long to listen to you talking, and that I hate problem sets because, if you already know the answer, why don’t you just tell me already?
I admit that I am, paradoxically, both egotistical and insecure about my own ability. I admit that I have no idea what it actually means to do some meaningful work, and you’ve got a long road ahead of you if you’re going to try to turn me into a productive researcher. I admit that I’m practically computer illiterate, that I still don’t get what partition functions are for, and that I will probably daydream about quantum-mechanical midget porn during your colloquium.

Here. I will make my worst admission of all. I admit that I am not really a physicist; I’m just curious. I want to learn about the spin-statistics theorem, yeah. That sounds really interesting to me. But I also want to know why a hard-boiled egg stands up when you spin it, how a plant seed knows which way to grow, and why a circle saturates the isoperimetric inequality.

I admit that I don’t care if I’m the first to figure something out. I admit I don’t need to be a part of a great scientific establishment or to feel like I’m on a search for truth. I admit it’s enough, for me, just to be puzzled about something, to be squirming back and forth in my father’s armchair.

You wanna hear the next half of the Brian Greene story? Because it gets better. Six or seven years after my first encounter with that book, I’m at Caltech, learning such intricacies as Ehrenfest’s theorem or retarded potentials (that’s a real thing!). And around this time, Brian Greene writes a new book, a kid’s book, about a teenager who goes flying around a black hole in a spaceship and gets sent a thousand years into the future.

Greene goes on a book tour, and his next stop is Caltech. I’m on the staff of the newspaper, so I send an email asking for an interview. An hour before the public talk begins, I’m finally seated in a little room backstage in Beckman Auditorium, feeling my stomach knot a little as I start my conversation with Brian Greene.

“It’s actually a cautionary tale that’s closer to what actually happens in science,” Greene tells me about Icarus. To be honest, his new book kind of sucks. He’s out of his depth, and I don’t care about the book that much. But I thought I should ask anyway. “What happens in science?” he continues, “Well, we go forward into the unknown, we learn new things, and sometimes that drags us, sometimes kicking and screaming, into a new reality.”

His response to my question about Icarus feels a little bit canned. Greene says,
“I have a strong sense from talking to people that for many, science is this abstract, cold, aloof body of knowledge that sometimes may make a difference in their lives if it yields a new medical technique or yields a new piece of technological gadgetry. But otherwise, science is something that people stay away from.

“My point is to try to convince people and open up to people the fact that science goes far beyond that. It can touch you in an emotional way. It can help you have a different connection to the world and the universe around you, whether you’re the next Einstein or whether you’re just someone who wants this to be a part of your life. As I’ll discuss tonight, I think that science is as important to a full life as music and art and literature and theater, and I think books like this can help people begin to recognize that.”

No! How can this be? Brian Greene doesn’t understand his own book. Let me explain. When I was sixteen, I didn’t just discover Brian Greene. I discovered Sibelius, too. And I used to lie flat on the carpet downstairs, listening to the second symphony over and over and feeling the blood in my veins. It was nothing like reading physics! And it shouldn’t be. It’s its own thing – a completely separate vital experience. The pursuit I learned about in Dad’s armchair should not be latched on to the back of “music and art and literature and theater”. Greene is talking about physics as Grand-Canyon-awe. I never even liked the Grand Canyon that much. I’m still Elmo. And now I’m forced to realize that the guy who showed me all this stuff in the first place is totally confused about what it is.

All right, fine. I admit I still like Brian Greene. I haven’t read his new book or anything, but honestly it was a pretty good interview, and I like the guy. But I admit I had to go back after that, and re-examine some stuff. And what I learned is that Brian Greene was immaterial. The Elegant Universe was immaterial. It was special relativity itself that had mattered. I admit that was a revelation to me.

Let’s get down to it. I admit that I hate you for judging me, you admissions committees. Because how could you know? How could you decide what you do and don’t want to admit, just from that little admissions material I sent you? That’s none of it – nothing at all. It didn’t have me in it because it wasn’t supposed to. I wasn’t supposed to admit myself, and so how can you choose to admit me? That’s the thing when you’re in our position. Between you and me, even with all our admissions, there is really no way to know.

Visualizing Elementary Calculus: Statics

April 29, 2011

This post assumes a little physics, specifically the relationship between work and energy.

This series:

I – Introduction
II – Trigonometry
III – Differentiation Rules 1
IV – Graphs, Tangents, Derivatives
V – Optimization
VI – Statics

Equilibrium

In physics, we sometimes like to look at stuff that isn’t doing anything. This is called “statics”. It’s kind of boring after a while, which is why you would only take an entire course on it if you’re an engineer.

If something is stationary, there must be no net force on it. That means that if you move it around a little bit, the work done is zero and its energy doesn’t change. This is called the principle of virtual work. The result is that when things aren’t moving, they are generally at a minimum of potential energy. In theory they could be at a maximum or other stationary point, but these equilibrium states are unstable – the difference between a ball nestled securely at the bottom of a valley and perched precariously at the top of a mountain.

Tubes

Here’s a picture of a disappointing jug of milk.

The water rises to the same height in the handle and in the main body. It’ll do this even if you make a hundred little tubes, all with different shapes, even if they’re curved around in strange ways. How does the water in one tube know how high the water is in all the others?

The water must be at a height such that a small movement of water from the handle to the body would cause no change in potential energy. Since the potential energy is a function of height, the water must be the same height everywhere, else we would be able to release energy by moving some water from high to low.
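
To see the energy argument in numbers, here is a minimal sketch in Python. It assumes the simplest possible jug: two straight vertical tubes joined at the bottom, with cross-sections area1 and area2, sharing a fixed volume of water. Those names, the units, and the brute-force search are my own choices for illustration. A column of height h has potential energy proportional to its area times h²/2, and scanning over the ways to split the water shows that the minimum sits where the two columns are level.

```python
def other_height(h1, area1, area2, volume):
    # Water that is not in tube 1 must be in tube 2 (volume is conserved).
    return (volume - area1 * h1) / area2

def water_energy(h1, area1, area2, volume, rho_g=1.0):
    # A straight column of height h has energy rho*g*area*h^2/2.
    # rho_g = 1 is an arbitrary choice of units.
    h2 = other_height(h1, area1, area2, volume)
    return rho_g * (area1 * h1 ** 2 + area2 * h2 ** 2) / 2

def best_height(area1, area2, volume, steps=100000):
    # Brute-force scan of every way to split the water between the tubes.
    h_max = volume / area1  # all of the water in tube 1
    best = min(range(steps + 1),
               key=lambda i: water_energy(h_max * i / steps, area1, area2, volume))
    return h_max * best / steps

h1 = best_height(area1=1.0, area2=5.0, volume=12.0)
h2 = other_height(h1, 1.0, 5.0, 12.0)
print(h1, h2)  # the two heights agree to within the grid spacing
```

The narrow tube and the wide tube end up at the same height even though they hold very different amounts of water, which is what the milk jug does.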

Hill and Chain

There’s a lumpy hill with a chain lying on it. When will the chain be stationary, assuming the hill is frictionless?

The condition is that the potential energy of the chain shouldn’t change if you move it a little left or right. Assuming the chain is uniform, moving it a little bit to the right is identical to chopping a little bit off the left hand side and moving it all the way over to the right.

This chopping operation isn’t supposed to change the potential energy, so the left and right hand sides of the chain must be at the same height. That’s the condition for equilibrium.

Unlike the water in the jug, this is an unstable equilibrium because if the chain is just a little bit off, it will fall towards the side that’s already further down, making that side drop down even further, and the chain will get further out of equilibrium.
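
The chopping argument can be checked numerically. The sketch below is only an illustration: the hill profile is made up, and I describe the hill by its height as a function of distance along the surface so that a uniform chain covers a fixed interval of that distance. It compares the rate at which the chain’s energy changes as you slide it against the height difference between its two ends.

```python
import math

def height(s):
    # Made-up lumpy hill: height as a function of distance s along the surface.
    return math.cos(s) + 0.3 * math.sin(2 * s)

def chain_energy(s_left, length, weight_per_length=1.0, n=2000):
    # Energy = weight per length times the integral of height over the chain,
    # approximated with the midpoint rule.
    ds = length / n
    total = sum(height(s_left + (i + 0.5) * ds) for i in range(n))
    return weight_per_length * total * ds

def energy_slope(s_left, length, eps=1e-4):
    # Change in energy per unit distance the chain is slid to the right.
    return (chain_energy(s_left + eps, length)
            - chain_energy(s_left - eps, length)) / (2 * eps)

def end_height_difference(s_left, length):
    # Height of the right end minus height of the left end.
    return height(s_left + length) - height(s_left)

for s in (-1.5, -0.4, 0.2):
    print(s, energy_slope(s, 2.0), end_height_difference(s, 2.0))
```

The two printed columns match, so the energy is stationary exactly when the ends are level.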

Push Ups

I have often been asked why it is much easier to do a push up than to bench press your body weight. The motion of your arms is essentially the same, so shouldn’t the tasks be about equally difficult?

A push up, unlike a bench press, is a sort of lever. We can model it like this:

The horizontal brown stick is a board – your body as you do the push up. The triangle is the fulcrum – your feet. The gray ball represents your body weight. Your true weight is actually distributed from your feet to your head, but the ball represents an average, called your center of mass. The green arrow represents the force from your arms on your body.

A force is just a force, regardless of where it comes from, so instead of your arms, we’ll imagine the same force is generated by a mass on a pulley.

The blue circle is a pulley. There’s a rope tied to the end of the board that goes around the pulley and attaches to a platform with a weight on it.

In order for the system to be in equilibrium, its potential energy must be at a stationary point. We would like \textrm{d}U/\textrm{d}\theta = 0, with U the potential energy of the system and \theta the angle of the board with the horizontal.

U changes when the weights change height. If we slant the board up by a small angle, the weight on the board will go up and the weight on the platform will go down. We need the resulting changes in potential energy to cancel each other.

Let the distance from the fulcrum to the weight on the board (the body weight) be L_B, and the body weight itself w_B. Its potential energy is U_B = w_B h_B, with h_B the height of the weight. Then \textrm{d}U_B = L_B w_B \textrm{d}\theta. (Draw a picture to see this. Also, try finding some of the assumptions being used.) Similarly, with L_w the distance from the fulcrum to the rope and w_w the weight on the platform, the platform’s energy drops by \textrm{d}U_w = L_w w_w \textrm{d}\theta. The rise and the drop need to be equal, so

L_B w_B = L_w w_w

w_w = w_B \frac{L_B}{L_w}

Your center of mass is maybe 70% of the way to your shoulders, so doing a push up requires a force about 70% of your body weight.
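
Here is the same calculation as a short Python sketch, with numbers I picked only for illustration: a 700 N body weight, arms 1.5 m from the feet, and a center of mass 70% of the way there. The energy function assumes the rope stays vertical, so the platform drops exactly as far as the end of the board rises.

```python
import math

def balancing_weight(body_weight, l_body, l_rope):
    # w_w = w_B * L_B / L_w, from the lever condition above.
    return body_weight * l_body / l_rope

def total_energy(theta, body_weight, l_body, platform_weight, l_rope):
    # The body weight rises by l_body*sin(theta); the platform falls by
    # l_rope*sin(theta) (assumption: the rope stays vertical).
    return (body_weight * l_body - platform_weight * l_rope) * math.sin(theta)

def energy_slope(body_weight, l_body, platform_weight, l_rope, eps=1e-6):
    # Numerical dU/dtheta at theta = 0.
    up = total_energy(eps, body_weight, l_body, platform_weight, l_rope)
    down = total_energy(-eps, body_weight, l_body, platform_weight, l_rope)
    return (up - down) / (2 * eps)

w_w = balancing_weight(700.0, 0.7 * 1.5, 1.5)
print(w_w)                                   # roughly 490 N, i.e. 70% of 700 N
print(energy_slope(700.0, 0.7 * 1.5, w_w, 1.5))    # essentially zero: equilibrium
print(energy_slope(700.0, 0.7 * 1.5, 400.0, 1.5))  # positive: too little force
```

With too light a platform weight the slope is positive, meaning the energy goes down as the board tilts down, so the board falls.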

Bridge

Sometimes it’s easy to work with energy, but other times it’s easier to work with forces. If we hang a chain between two points, we could find its shape by minimizing its energy, subject to the constraint that the length is constant. This requires the more-advanced calculus of variations.

On the other hand, we can still analyze the hanging chain in terms of force with elementary calculus.

That’s the chain, hanging between two posts. We’ll zoom in on just the red part.

\textrm{d}x and \textrm{d}y show the length and width of the entire segment. There are two tension forces, T_1 and T_2, acting on the segment, and their components are shown in the picture.

The segment doesn’t go left or right, so T_{1x} = T_{2x}; call this constant horizontal tension T_x. The segment doesn’t go up or down, either, so the vertical components of the tension must balance gravity. That means \textrm{d}T_y = \sqrt{\textrm{d}x^2 + \textrm{d}y^2}\lambda g, with g the acceleration due to gravity and \lambda the mass per unit length of the segment.

Tension must also point along the direction of the rope, so T_y/T_x = \textrm{d}y/\textrm{d}x.

Dividing the vertical balance by \textrm{d}x and substituting \textrm{d}y/\textrm{d}x = T_y/T_x gives

\frac{\textrm{d}T_y}{\textrm{d}x} = \sqrt{1 + (T_y/T_x)^2} g \lambda

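The trick is then to figure out what function satisfies this equation. A trig function looks good because of the relation \sin(\theta) = \sqrt{1 - \cos^2\theta}, but the sign under the root is wrong. The hyperbolic functions have the right sign: \cosh(u) = \sqrt{1 + \sinh^2 u}. In fact, the solution is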

\frac{T_y}{T_x} = \frac{\textrm{d}y}{\textrm{d}x}  = \sinh \left( \frac{g\lambda}{T_x}x + C_1 \right)

y = \frac{T_x}{g\lambda} \cosh\left( \frac{g\lambda}{T_x}x + C_1 \right) + C_2

This is called a catenary curve.
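
If you’d rather not take the solution on faith, a quick numerical check is easy. This is a sketch under simplifying assumptions: I set both integration constants to zero (the lowest point of the chain sits at x = 0) and lump g\lambda/T_x into a single made-up number a. It checks that the slope of the cosh curve is the sinh, and that the sinh satisfies the differential equation above.

```python
import math

def shape(x, a):
    # y = cosh(a x) / a, the catenary with both constants set to zero.
    return math.cosh(a * x) / a

def slope(x, a):
    # dy/dx = T_y / T_x = sinh(a x)
    return math.sinh(a * x)

def derivative(f, x, a, eps=1e-5):
    # Central finite difference in x.
    return (f(x + eps, a) - f(x - eps, a)) / (2 * eps)

def ode_residual(x, a):
    # d(slope)/dx - a * sqrt(1 + slope^2); zero if the equation is satisfied.
    return derivative(slope, x, a) - a * math.sqrt(1 + slope(x, a) ** 2)

a = 0.8  # stands in for g*lambda/T_x; the value is arbitrary
for x in (-2.0, -0.5, 0.0, 1.0, 2.0):
    print(x, derivative(shape, x, a) - slope(x, a), ode_residual(x, a))
```

Both differences come out at the level of the finite-difference error, far smaller than the quantities being compared.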

Exercises

  1. Rework the hanging chain problem, where the chain is a suspension bridge. Imagine the chain itself with no weight, and the bridge having constant density. This is equivalent to taking the weight of the segment to be simply \lambda g \textrm{d}x rather than \lambda g \sqrt{\textrm{d}x^2 + \textrm{d}y^2}. (Answer: a parabola)
  2. Find the shape of a hanging spring of zero rest length. In this case, the total tension is inversely proportional to the mass density. (Answer: also a parabola)
