Archive for the ‘teaching’ Category

Unwinding: Physics of a spool of string

June 1, 2012

It’s been a long day. Let’s unwind with a physics problem.

This problem was on the pre-entrance exam I took before arriving at Caltech for my freshman year. I’ve seen it from time to time since, and here I hope to find an intuitive solution.

You have a spool of thread, already partially unwound. You pull on the thread. What happens?

Here it is in side view. The dashed circle is the inside of the spool and the green line is the thread. Take a minute to see if you can tell how it works. Does the spool go right or left?


The usual method is to work it out with torques. The forces you must account for are the force of tension from the string and the force of friction from the table.

Torques are actually a pretty easy way to solve this problem, especially if you calculate the torque around the point of contact between the spool and the table (since in that case friction has no moment arm and exerts no torque).

This method is direct, but it’s useful to find another viewpoint if you can.

Let’s first examine a different case where the string is pulled up rather than sideways.


In this case, even if the first situation was unclear, you probably know that the spool will roll off to the left. To see why, let’s imagine that the thread isn’t being pulled by your hand, but by a weight connected to a pulley.


I put a red dot on the string to help visualize its motion.

The physics idea is simply that the weight must fall, so the red dot must come closer to the pulley. Which way can the spool roll so the red dot moves upward?

When the spool rolls (we assume without slipping), the point at the very bottom, where it touches the table, is stationary. The spool’s motion can be described, at least instantaneously, as rotation around that contact point.

Googling, I found a nice description of this by Sunil Kumar Singh at Connexions. This image summarizes the point:

 Rolling motion  (rm1a.gif)

If the spool rolls to the right, as above, the point where the string leaves the spool (near point B), will have a somewhat downward motion. This will pull the red dot down and raise the weight. That’s the opposite of what we want, so what really happens is that the spool rolls to the left, the string rises, and the weight falls.

With this scenario wrapped-up (or unwrapped, I suppose), let’s return to the horizontal string segment.


Again, the weight must fall and so the red dot must go towards the pulley.

If we check out Mr. Singh’s graphic, we’re now concerned with the motion of a point somewhere near the bottom-middle, between points A and C. As the spool rolls to the right, this point also moves to the right. This is indeed what happens as the weight falls.

Notice that the red point actually moves more slowly than the spool as a whole. This means the spool catches up to the string as we move along – the spool winds itself up. If the inside of the spool is 3/4 as large as the outside (like it is in my picture), the spool rolls 4 times as fast as the string moves, and so for every centimeter the weight falls, the spool rolls four centimeters.
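The factor of four can be checked with a few lines of arithmetic. This is only a minimal sketch: it assumes rolling without slipping and that the string leaves the bottom of the inner cylinder horizontally (as the figure suggests), and the function name is made up for illustration.

```python
def spool_to_string_speed_ratio(outer_radius, inner_radius):
    """Ratio of the spool's center speed to the string's speed.

    Assumes rolling without slipping, with the string leaving the bottom
    of the inner cylinder horizontally (an assumption read off the figure).
    """
    # Points on the vertical line through the contact point move
    # horizontally with speed proportional to their height above the table.
    center_height = outer_radius
    string_height = outer_radius - inner_radius
    return center_height / string_height


ratio = spool_to_string_speed_ratio(outer_radius=1.0, inner_radius=0.75)
print(ratio)  # 4.0
```

A fatter inner cylinder puts the string closer to the table, so the string moves more slowly and the ratio grows.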

Here’s a short video demonstration:

Giving Grades

February 10, 2012

Last semester, I was the TA for a small course on the physics of waves aimed at biophysics majors. Since I was the only TA, I was completely in charge of grading homework and writing homework solutions.

I don’t like the idea of letter grades. They’re a pretty clear example of Goodhart’s law. As such, I especially don’t like arguing over grades, something pre-meds are apt to do.

So I tried out a strategy that worked pretty well. I announced at my first recitation section that due to the inevitable errors I would make while grading homework, I was adding 5% to each homework grade (except where that would take the score over 100%). This would hopefully even out any errors I made over the course of the semester.

However, if the students wanted to find all the grading errors I made and point them out to me, I would still be happy to add back in those missing points. Of course, if that were the case, it would be clear I wouldn’t need to give that student the extra 5%, since it was only there to compensate for errors, and the student planned on catching those errors themself.

By essentially bribing students with a small bonus, I managed to go the entire semester without playing picky-points with anyone. (Although I can’t say for certain that I would have had to play it without the policy, since this was my first course here.)

This semester I’m in a 200-student introductory course and the homework is graded automatically by computer. I do have to grade lab reports, though.

At the TA organizational meeting today, they asked whether we TAs prefer labs to be graded out of 30 points or 100 points. 30 points is the clear winner. The problem with a 100-point scale is that you’re so used to it, the number is instantaneously and unconsciously compared against your expectations, but a less-common scale throws that off, as if you are hearing the temperature in Celsius or the price in yen. (For someone with my US biases, of course.)

26/30 and 87/100 are the same grade, but when you see 26/30, you say, “oh, I lost four points.” When you see 87/100 you say, “a B, what the hell! Where’s that TA?”

That’s the theory, at least. It remains to be seen if this semester will go as smoothly as the last.

Visualizing Elementary Calculus: Differentiation Rules 1

March 27, 2011

The basic rules of differentiation are linearity, the product rule, and the chain rule. Once we start graphing functions, we’ll revisit these rules.

This Series
I – Introduction
II – Trigonometry
III – Differentiation Rules


The linearity of differentials means

\textrm{d}(\alpha u + \beta v) = \alpha \textrm{d}u + \beta \textrm{d}v

\alpha and \beta are constants, while u and v might change.

This looks obvious, but here’s a quick sketch.

First we’ll look at \textrm{d}(\alpha u). Construct a right triangle with base 1 and hypotenuse \alpha. Then extend the base by length u. This creates a larger, similar triangle. The hypotenuse must be \alpha times the base, so the hypotenuse is extended by \alpha u.

Then increase u by \textrm{d}u. This induces an increase \textrm{d}(\alpha u) in the hypotenuse.

We draw an original blue triangle with base 1 and hypotenuse alpha. Then it's extended to the dark green triangle, adding u to the base and alpha*u to the hypotenuse. Finally, we increment u by du and observe the effect.

The little right triangle made by \textrm{d}u and \textrm{d}(\alpha u) is similar to the original, so

\frac{\textrm{d}(\alpha u)}{\textrm{d}u} = \frac{\alpha}{1}


\textrm{d}(\alpha u) = \alpha \textrm{d}u

Next look at \textrm{d}(u + v). u+v is just two line segments laid one after the other. We increase the lengths by \textrm{d}u and \textrm{d}v and see what the change in the total length \textrm{d}(u+v) is.

The total change is equal to the sum of the changes.

\textrm{d}(u + v) = \textrm{d}u + \textrm{d}v

These rules combine to give the rule for linearity

\textrm{d}(\alpha u + \beta v) = \alpha \textrm{d}u + \beta \textrm{d}v

The Product Rule

The product rule is

\textrm{d}(uv) = u\textrm{d}v + v\textrm{d}u

To show this, we need a line segment with length uv.

Start by drawing u, then drawing a segment of length 1 starting at the same place as u and going an arbitrary direction.

Close the triangle. Extend the segment of length 1 by v, and close the new triangle. We’ve now extended the base by uv.

Construction of length u*v, by similar triangles.

Increase u by \textrm{d}u and v by \textrm{d}v. This results in several changes to uv.

The segment uv has a little bit chopped off on the left, since \textrm{d}u cuts into the place where it used to be.

uv is also extended twice on the right. The first extension is the projection of \textrm{d}v down onto the base. All such projections multiply the length by u, so the piece added is u\textrm{d}v.

Finally there is a piece added from the very skinny tall triangle. It is similar to the skinny, short triangle created by adding \textrm{d}u to u. The tall triangle is (1+v) times as far from the bottom left corner as the short one, so it is (1+v) times as big. Since the base of the short one is \textrm{d}u, the base of the tall one is (1+v)\textrm{d}u.

Combining all three changes to uv, one subtracting from the left and two adding to the right, we get

\textrm{d}(uv) = -\textrm{d}u + u\textrm{d}v + (1+v)\textrm{d}u = u\textrm{d}v + v\textrm{d}u

This is the product rule. We’ll give another visual proof in the exercises.
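For readers who like to see numbers, here is a quick check of the product rule. It is a sketch with arbitrarily chosen values of u and v, not part of the geometric argument: it compares the exact change in uv with the first-order estimate and shows the leftover shrinking.

```python
def product_change(u, v, du, dv):
    """Exact change in u*v when u grows by du and v grows by dv."""
    return (u + du) * (v + dv) - u * v


def product_rule_estimate(u, v, du, dv):
    """First-order change predicted by d(uv) = u dv + v du."""
    return u * dv + v * du


u, v = 3.0, 5.0  # arbitrary values, chosen for illustration
for step in (1e-1, 1e-2, 1e-3):
    exact = product_change(u, v, step, step)
    estimate = product_rule_estimate(u, v, step, step)
    # The leftover is du*dv, which shrinks faster than either term.
    print(step, exact, estimate, exact - estimate)
```

The last column is the product of the two small changes, which is why it can be ignored once the changes are small enough.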

The Chain Rule

Suppose we want \textrm{d}\sin x^2. (There’s no particular reason I can think of to want that, but we have a limited milieu of functions at hand right now.)

We know \textrm{d}(\sin\theta) = \cos\theta\textrm{d}{\theta}. Let \theta = x^2.

\textrm{d}(\sin x^2) = \cos(x^2)\textrm{d}(x^2)

But we already know that \textrm{d}(x^2) = 2x\textrm{d}x, so substitute that in to get

\textrm{d}(\sin x^2) = \cos(x^2)2x\textrm{d}x

This is called the chain rule. A symbolic way to write it is

\frac{\textrm{d}f}{\textrm{d}t} = \frac{\textrm{d}f}{\textrm{d}x}\frac{\textrm{d}x}{\textrm{d}t}

Suppose you are hiking up a mountain trail. f is your height above sea level. x is the distance you’ve gone along the trail. t is the time you’ve been hiking.

\textrm{d}f/\textrm{d}t is the rate you are gaining height. According to the chain rule, you can calculate this rate by multiplying the slope of the trail \textrm{d}f/\textrm{d}x by your speed \textrm{d}x/\textrm{d}t.
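The \sin x^2 example can also be checked numerically. The sketch below compares a symmetric finite difference against the chain-rule result; the helper names and the sample points are assumptions made for illustration.

```python
import math


def central_difference(f, x, h=1e-6):
    """Numerical derivative of f at x using a symmetric difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def sin_of_square(x):
    return math.sin(x ** 2)


def chain_rule_derivative(x):
    """The chain-rule result from the text: cos(x^2) * 2x."""
    return math.cos(x ** 2) * 2 * x


for x in (0.5, 1.0, 2.0):
    print(x, central_difference(sin_of_square, x), chain_rule_derivative(x))
```

The two columns should agree to many decimal places at each sample point.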


  • Show that the linearity rule \textrm{d}(\alpha u) = \alpha \textrm{d}u is a special case of the product rule.
  • What is the derivative of A\sin\theta + C\cos\theta with respect to \theta? Take the derivative with respect to \theta of that. (This is called a “second derivative”.) What do you get? (Answer: -1 times the original function)
  • Use the product rule to prove by induction that the derivative of x^n is n x^{n-1} for all positive integers n.
  • Apply the product rule to x^nx^{-n} = 1 to prove that the “power rule” from the previous question holds for all integers n.
  • Look back at the arguments from the introduction. Draw a rectangle with one side length u and one side length v. Its area is uv. Use this to prove the product rule.
  • Apply the chain rule to (x^{1/n})^{n} = x to find the derivative of x^{1/n} with respect to x for all nonzero integers n (Answer: \frac{1}{n} x^{1/n -1})
  • Argue that the derivative of x^{p/q} is \frac{p}{q}x^{p/q - 1} for all rational numbers p/q.
  • Show that the derivative of a polynomial is always another polynomial. Is there any polynomial that is its own derivative? (Answer: no, except zero)
  • Combine the product rule with the chain rule to prove the quotient rule \textrm{d}\frac{u}{v} = \frac{v\textrm{d}u - u\textrm{d}v}{v^2}

A Non-mathematician’s Non-apology

March 26, 2011

After finishing this post about the derivative of the sine function, I decided to hunt around online to see how common its approach is.

It’s not common. Most sites take the derivative of sine by considering

\frac{\textrm{d}(\sin\theta)}{\textrm{d}\theta} = \lim_{\Delta\theta \to 0}\frac{\sin(\theta + \Delta \theta) - \sin(\theta)}{\Delta \theta}

and working from there.

Eventually, after wading through three pages of results, I found another write-up of the geometric argument from, of all places, a site called Biblical Christian World View. It is apparently the personal site of a guy who’s good at math and also thinks it makes sense to write things like,

I illustrated Biblical truths with mathematical expressions. For an example, I illustrated the Biblical truth, “With God, nothing is impossible” as “two negatives equate to a ringing positive.” In the arithmetic of negative numbers -(-7) = +7! Two negatives equal a positive.

So. There’s that.

But just a little further along the Google results I found one more presentation of the same idea. This one is from Victor J. Katz, a mathematician who wrote a book about the history of math, and was writing from the historical point of view.

His article is much better than mine. The proof is clearer and surrounded with tons of other insight.

Katz delightfully points out how great a term “arcsine” is – it’s the length of the arc associated with that value of the sine function. Then, at the end, he gives Leibniz’ original argument that y = \sin\theta satisfies \frac{\textrm{d}^2 y}{(\textrm{d}\theta)^2} = -y, and it’s crazy! Differentials are applied willy-nilly and manipulated algebraically in ways nobody does any more. I felt disoriented at first, adapting to this new way of thinking about calculus, and then wondered why I’d never seen it until now.

It’s true that there are a lot of old techniques no one uses, and that’s because now we have better ones. Indeed, modern analysis, with its deltas and epsilons, is much better, mathematically, than manipulating differentials in dubious ways. It’s rigorous and logical.

It’s also hard. I’ve been asked to teach delta-epsilon proofs to quite a few people, and I’ve never been able to get it across. I’m giving up on that for beginners. I am going to teach the geometry stuff, and I’m not going to feel guilty about it.

It is okay to learn a thing the wrong way the first time. That first pass is only there to get you used to the main ideas, and the main idea of calculus is applying derivatives, integrals, and series. It is not the mean value theorem.

Once you learn a rough version, you practice it in the field until you’re comfortable. Do some physics. Learn some differential equations. After all that, it’s nice to come back, study calculus again, and finally understand all that’s really going on.

Actually, I like it better that way. Lots of my college classes made me think, “Oh, wow – so that’s what was behind the curtain!” But if you had shown me all the wheels and gears up front, I’d have been too busy checking how each one fit into the next to see what they accomplished.

A case-in-point is linear algebra. I remember almost nothing from my freshman linear algebra course. It wasn’t a bad course, but it was rigorous, proving theorems from the axioms of vector spaces, and it was beyond the level I was ready for at the time.

A couple years later, I found I really did need to know linear algebra to get through quantum mechanics, so I watched Gilbert Strang’s video lectures, which are far more concrete.

They were wonderful. I understood what was happening. I could do all the calculations and answer all the conceptual questions.

Then, finally, I went back to read Sheldon Axler’s Linear Algebra Done Right, a book that goes back again to the axioms-of-a-vector-space point of view, and thought it was wonderful.

Keith Devlin disagrees. Devlin takes up multiplication, claiming one should not tell young children that multiplication is repeated addition. Multiplication is its own fundamental operation. (The field axioms treat multiplication and addition independently.)

I was taught multiplication as repeated addition as a child, and then retaught multiplication as a fundamental operation in college. Do you know how confused I was by that? None. Zero confusion ever. In fact I never even noticed the discrepancy until Devlin pointed it out. I thought about multiplication as repeated addition when it was convenient, and thought about it as multiplication when that was convenient, and never realized I was switching.

I do the same for the geometric and analytic modes of thinking about calculus now. When I’m solving a physics problem, I don’t even notice whether I’m doing calculus or algebra at a given moment – it’s all just problem solving.

Why, then, do introductory calculus classes spend a month learning limits? Better just to ignore them and press on to the good stuff. There will be time later for learning what the difference between “continuous”, “differentiable”, and “smooth” is – modern medical science is working new miracles all the time.

Visualizing Elementary Calculus: Trigonometry

March 26, 2011

Here we’ll find the derivatives of trigonometric functions. The goal is to reinforce the idea of \textrm{d} as a thing that means “a little bit of” and grant some new insight into why these derivatives are what they are. The first argument is based on the preface of Tristan Needham’s Visual Complex Analysis. I haven’t read the bulk of it, but the preface is good.

This series
I – Introduction
II – Trigonometry

The Sine Function

Let’s find \textrm{d}(\sin\theta) / \textrm{d}\theta. The sine function is the height of a right triangle in the unit circle. We’ll draw it, and add a little change in \theta. This induces a change in \sin\theta. The change in \theta is called \textrm{d}\theta and the change in \sin\theta is called \textrm{d}(\sin\theta).

We show the sine of an angle as the dark blue line. The change in the sine when we change the angle slightly is the light blue line.

The interesting part is \textrm{d}\sin\theta, so we’ll zoom in there in the next picture. Before we do, remember that the arc length along a piece of the unit circle is equal to the angle it subtends. This will tell us the length of the little piece of the circumference near \textrm{d}\sin\theta. Also remember that we’re imagining \textrm{d}\theta to get smaller and smaller, until the two radii in the picture are parallel. We get this:

The interesting region is blown up to large size. The black line d(theta) is part of the edge of the circle. The angles marked are congruent to theta.

The section of the circle is \textrm{d}\theta long. It looks like a straight line because we are zoomed in close, like the horizon at the beach. You can use some geometry to show that the angles marked are congruent to \theta.

Looking at the right triangle formed, we can use the definition of the cosine function to read off

\frac{\textrm{d}(\sin\theta)}{\textrm{d}\theta} = \cos\theta

which is the derivative of the sine function.
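The shrinking-triangle picture can be mimicked numerically. This sketch picks an arbitrary angle (0.7 radians, an assumption for illustration) and shows the ratio of the two small changes settling toward the cosine as \textrm{d}\theta shrinks.

```python
import math


def sine_slope(theta, d_theta):
    """Change in sin(theta) divided by the small change d_theta in the angle."""
    return (math.sin(theta + d_theta) - math.sin(theta)) / d_theta


theta = 0.7  # an arbitrary angle in radians, chosen for illustration
for d_theta in (0.1, 0.01, 0.001):
    # The ratio approaches cos(theta) as the little triangle shrinks.
    print(d_theta, sine_slope(theta, d_theta), math.cos(theta))
```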

Motion on the Unit Circle

Another way to view these derivatives is to imagine a point moving around the outside edge of the unit circle with speed one. Its location as a function of time is (\cos t, \sin t).

Its velocity is tangent to the circle and length one. Let’s draw the velocity vector right at the point, and then also translate it to the origin.

The position of the point is the red vector r. Its velocity is the green tangent v, which has also been copied to the origin.

We want to know the coordinates of \vec{v}. That’s not too hard; \vec{v} is a quarter-circle rotation of \vec{r}. Draw in the components of \vec{r}, and rotate those components to get \vec{v}. The x-component of the position becomes the y-component of the velocity, and the y-component of the position becomes minus one times the x-component of the velocity.

The components of the position get rotated a quarter turn to make the components of the velocity.

The derivative of position is velocity, and so comparing components between the position and velocity vectors, we get

\frac{\textrm{d}(\cos\theta)}{\textrm{d}\theta} = -\sin\theta

\frac{\textrm{d}(\sin\theta)}{\textrm{d}\theta} = \cos\theta
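The quarter-turn claim is easy to test. The sketch below estimates the velocity from two nearby positions and compares it with the position vector rotated by 90 degrees; the function names and the sample time are assumptions made for illustration.

```python
import math


def position(t):
    """Point moving around the unit circle at speed one."""
    return (math.cos(t), math.sin(t))


def numerical_velocity(t, h=1e-6):
    """Velocity estimated from positions a short time apart."""
    x0, y0 = position(t - h)
    x1, y1 = position(t + h)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))


def quarter_turn(vector):
    """Rotate a 2D vector counterclockwise by 90 degrees."""
    x, y = vector
    return (-y, x)


t = 1.2  # arbitrary time, chosen for illustration
print(numerical_velocity(t))
print(quarter_turn(position(t)))
```

The two printed vectors should match closely, which is the component-by-component statement of the two derivatives above.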


  • Look back at the first derivation we gave that \textrm{d}(\sin\theta)/\textrm{d}\theta = \cos\theta. Rework it to find derivatives of the other five trig functions. You might want to note that one way to interpret \tan\theta and \sec\theta is

The tangent and secant of an angle are side lengths of a right triangle with "adjacent" side length one.


  • Look back at the argument about a dot moving around a circle. Consider a larger circle to find the derivative of 5\sin\theta with respect to \theta. (Answer: 5\cos\theta)
  • Suppose the dot moving around the edge of the circle is going three times as fast. What does this mean for the derivative of \sin(3 t) and \cos(3 t) with respect to t? Remember that the velocity must still be perpendicular to the position, but not necessarily unit length any more. (Answer: the derivative of \sin(3 t) with respect to t is 3\cos(3 t).)
  • Suppose the dot is moving at a variable speed v(t) = t, so that it keeps getting faster. Then the y-coordinate of the position is \sin(\frac{1}{2}t^2). Again, the velocity is perpendicular to position, but its length is changing. What is the derivative of \sin(\frac{1}{2}t^2) with respect to t? (Answer: t\cos(\frac{1}{2}t^2))

Visualizing Elementary Calculus: Introduction

March 25, 2011

Recently I’ve been trying to be more geometrical when discussing elementary calculus with high school students. I don’t want to write an entire introduction to calculus, but the next few posts will outline some ways I think the geometric view can be helpful.

This series
I – Introduction
II – Trigonometry

You know about \Delta, which means “the change in”. For example, if w represents my weight, then -\Delta w represents the weight of the poop I just took.

Let’s say h is your height above sea level. \Delta h is the change in that height, but what change? The change when you climb the stairs? When you jump out of a plane? When you step on a banana peel?

When we think about change, we usually think about two things changing together. You get higher when you climb another stair on the staircase. h is changing, and so is s, the number of stairs climbed.

These two changes are related to each other. Say the stairs are 10 cm high. Then you gain 10 cm of height for each stair. We can write that as \Delta h = 10 {\rm cm} \hspace{.5em} \Delta s. We can also write it \Delta h / \Delta s = 10 \hspace{.5em}{\rm cm}. This says, “the height per stair is ten centimeters.”

This is the goal of calculus – to study the relationships between changing quantities. Let’s do a real example.

The Area of a Square

Let’s say we have a square whose side lengths are x. Its area is x^2. What is the relationship between changes in its area and changes in the length of a side? Draw the square, then expand the sides some. The amount the sides have expanded is \Delta x. The new area that’s been added is \Delta (x^2).

We begin with the red square on the left, whose area is x^2. We add an extra amount Delta(x) to the sides, creating all the new green area.

From the picture we see

\Delta(x^2) = 2x\Delta x + (\Delta x)^2

This formula relates \Delta (x^2), the change in the area, to \Delta x, the change in the length of a side.

The Derivative of x^2

In the picture of the square, there is a little piece in the upper-right corner whose area is (\Delta x)^2. It is the smallest bit of area in the whole picture.

Look what happens when we make \Delta x even smaller.

We shrink Delta(x) and observe what happens to the different areas being added on.

In the first picture, \Delta x (no longer marked) is a quarter of x. (\Delta x)^2 is the dark green area, and it is one quarter as large as x \Delta x, the light green area. We see this because the dark patch fits inside the light one four times.

In the second picture, we shrink \Delta x to one eighth of x. All the green areas shrink, but the dark patch shrinks on two sides while the light patches shrink on only one. As a result, the dark (\Delta x)^2 is now only one eighth the size of the light x \Delta x.

If we continued to shrink \Delta x, this ratio would continue to decrease. Eventually we could tile the dark patch a million times into the light one. So, as long as \Delta x is very small, we can get a good estimate of the entire green area by ignoring the dark part (\Delta x)^2. Thus

\Delta(x^2) \approx 2x\Delta x

This approximation becomes better and better as \Delta x shrinks, becoming perfect as \Delta x becomes infinitesimally small.

When we want to indicate these infinitely small changes, we trade in the \Delta for a {\rm d} and write

\textrm{d}(x^2) = 2x \textrm{d}x

The terms \textrm{d}(x^2) and \textrm{d}x are called “differentials”. The equation expresses the relationship between two infinitely-small changes, one in x and one in x^2.

Frequently, we divide by \textrm{d}x on both sides to get

\frac{\textrm{d}(x^2)}{\textrm{d}x} = 2x

This is called “the derivative of x^2 with respect to x”.
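The shrinking-corner argument can be played out in numbers. This is a minimal sketch with x set to 1 for convenience; it prints how the dark corner compares to a light strip, and how much of the true change the estimate 2x\Delta x misses.

```python
def corner_to_strip_ratio(x, delta_x):
    """Ratio of the dark corner (delta_x^2) to one light strip (x * delta_x)."""
    return delta_x ** 2 / (x * delta_x)


def relative_error_of_estimate(x, delta_x):
    """Fraction of the true change in x^2 missed by the estimate 2*x*delta_x."""
    true_change = (x + delta_x) ** 2 - x ** 2
    return (true_change - 2 * x * delta_x) / true_change


x = 1.0  # side length, chosen for illustration
for fraction in (1 / 4, 1 / 8, 1 / 1000):
    delta_x = fraction * x
    print(fraction, corner_to_strip_ratio(x, delta_x),
          relative_error_of_estimate(x, delta_x))
```

Both columns fall toward zero together, which is the sense in which the approximation becomes perfect.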

Example 1: Estimating Squares

20^2 = 400. What is 21^2?

Here x = 20, and we’re looking at x^2. When x goes from 20 to 21, it changes by 1, so \textrm{d}x = 1. Our formula tells us

\textrm{d}(x^2) = 2x \textrm{d}x = 2*20*(1) = 40

Hence, x^2 increases by about 40, from 400 to 440.

The real value is 441. We got the change in x^2 wrong by about 2%. That’s because \textrm{d}x wasn’t infinitely small.

Let’s try again, this time estimating the square of 20.00458. Now \textrm{d}x = .00458, so

\textrm{d}(x^2) = 2 x \textrm{d}x = 2*20*.00458 = .1832

The estimate is 400.1832. The real value is 400.183221. We did much better, under-estimating the change by only 0.01% this time. Also, it was not much harder to do this problem than the last, but squaring out 20.00458 by hand would be a pain. We saved some work.
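Both estimates can be reproduced in a few lines. The sketch below is only a check of the arithmetic above; the function names are made up for illustration.

```python
def estimate_square(x, dx):
    """Estimate (x + dx)^2 from x^2 using d(x^2) = 2 x dx."""
    return x ** 2 + 2 * x * dx


def relative_error_in_change(x, dx):
    """How far off the estimated change is, as a fraction of the true change."""
    true_change = (x + dx) ** 2 - x ** 2
    estimated_change = 2 * x * dx
    return (true_change - estimated_change) / true_change


for dx in (1, 0.00458):
    print(dx, estimate_square(20, dx), (20 + dx) ** 2,
          relative_error_in_change(20, dx))
```

The error fraction works out to dx / (2x + dx), so making the step smaller improves the estimate in direct proportion.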

Example 2: How Far Is the Horizon?

The beach is a good place to think about calculus. If you look out at the ocean, the horizon appears perfectly flat. Nonetheless, we know the Earth is really curved. In fact, we can deduce the curvature of the Earth by standing on the beach and enlisting the help of a friend in a boat.

It works like this: You stand on the beach with your head two meters above the water. Your friend sails away until the boat begins to disappear from sight. The reason the bottom of the boat is disappearing is that it is hidden behind the curvature of Earth.

When the bottom of the boat disappears, measure the distance to some part of the boat you can still see. What’s the relationship between your height, the distance to the boat, and the radius of Earth?

A picture will help. We’ll call your height h and the distance to the horizon z.

You are the vertical stick on top, height h. The boat is the brown circle. It's at the horizon, a distance z away. The dotted line shows your line of sight. When the bottom of the boat begins disappearing, a right triangle forms.

Your height, the radius of Earth, and the distance to the horizon are related by the Pythagorean theorem to give

R^2 + z^2 = (R+h)^2

this is equivalent to

z^2 = 2Rh + h^2

As we have seen, if your height h is small compared to the size of the Earth (and it is), the term h^2 drops away and the distance to the horizon is

z = \sqrt{2Rh}

You can see about 5 {\rm km} at the beach, making the radius of Earth about 6,000 {\rm km}. (It’s actually 6378.1 {\rm km}).
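Here is that estimate as a short calculation. It is a sketch using the round numbers from the text (a 5 km horizon and a 2 m eye height), and it runs the formula backwards as well, using the real radius to predict the horizon distance.

```python
import math


def earth_radius_from_horizon(horizon_distance_m, eye_height_m):
    """Solve z = sqrt(2 R h) for R, ignoring the small h^2 term."""
    return horizon_distance_m ** 2 / (2 * eye_height_m)


def horizon_distance(radius_m, eye_height_m):
    """Distance to the horizon, z = sqrt(2 R h)."""
    return math.sqrt(2 * radius_m * eye_height_m)


radius = earth_radius_from_horizon(horizon_distance_m=5_000, eye_height_m=2)
print(radius / 1000)  # radius in kilometers
print(horizon_distance(6_378_100, 2) / 1000)  # horizon in km for the real radius
```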

Next we want to know how much further you can see if you stand on your tiptoes. That would be a small change \textrm{d}h to your height. It would let you see a small amount \textrm{d}z further. How is \textrm{d}h related to \textrm{d}z?

We already know

\textrm{d}(x^2) = 2x\textrm{d}x

So let x^2 = h, or x = \sqrt{h}, and we have

\textrm{d}h = 2\sqrt{h}\hspace{.3em}\textrm{d}(\sqrt{h})

But we also know

\sqrt{h} = \frac{z}{\sqrt{2R}}

so we can substitute that in to \textrm{d}(\sqrt{h}) and get

\textrm{d}h = 2\sqrt{h}\hspace{.3em}\textrm{d}\left(\frac{z}{\sqrt{2R}}\right)


\frac{\textrm{d}z}{\textrm{d}h} = \sqrt{\frac{R}{2h}}
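The step between the last two equations is short but easy to miss. Since R is constant, the factor 1/\sqrt{2R} pops outside the differential, and the rest is algebra:

```latex
\textrm{d}h = \frac{2\sqrt{h}}{\sqrt{2R}}\,\textrm{d}z
\quad\Longrightarrow\quad
\frac{\textrm{d}z}{\textrm{d}h} = \frac{\sqrt{2R}}{2\sqrt{h}} = \sqrt{\frac{2R}{4h}} = \sqrt{\frac{R}{2h}}
```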

This tells us how much further you can see if you get a little higher up. The interesting thing is that it depends on h. The higher you go, the smaller \textrm{d}z. When you’re only two meters up, you get to see about 13 meters further out for every centimeter higher you go. However, if you’re 100 m up on top of a carousel, you get less than 2 meters for each centimeter you rise.

It makes sense that the extra distance you see gets smaller and smaller the higher you go, and eventually shrinks down to zero. No matter how high you go, you can never see more than a quarter way around the globe.

(In reality, light bends due to refraction in the atmosphere, so you can sometimes see a bit further.)


Suppose we have a circle with radius R. It has a certain area (you undoubtedly know the formula already, but play along). Suppose we increase R by a small amount \textrm{d}R. What is the change \textrm{d}A in the area?

The original circle is dark blue with area A and radius R. The radius increases an amount dR, increasing the area by the light blue ring with area dA.

\textrm{d}A is the thin, light-blue ring. Imagine taking that ring and peeling it off the edge of the circle and laying it flat. We’d have a rectangle with width \textrm{d}R. Its length comes from the outside edge of the entire circle – the circumference. The circumference is 2 \pi R, so

\textrm{d}A = 2\pi R \textrm{d}R

We saw earlier that \textrm{d}(x^2) = 2x\textrm{d}x, so let x = R and we have

\textrm{d}A = \pi \textrm{d}(R^2)

Thus the quantities A and \pi R^2 change in exactly the same way. Since they also start out the same (both zero when R is zero), we have

A = \pi R^2
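The same idea works as a sum: build the circle out of many thin rings, each with area 2\pi r\,\textrm{d}r, and add them up. The sketch below does this with a finite number of rings; the ring count and the sample radius are arbitrary choices for illustration.

```python
import math


def area_from_rings(radius, ring_count=10_000):
    """Add up thin rings of area 2*pi*r*dr from the center out to radius."""
    dr = radius / ring_count
    total = 0.0
    for i in range(ring_count):
        r = (i + 0.5) * dr  # radius at the middle of this ring
        total += 2 * math.pi * r * dr
    return total


print(area_from_rings(3.0), math.pi * 3.0 ** 2)
```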

Next Post

We’ll look at trigonometry. Geometric arguments about the derivatives of trig functions are very simple ways of visualizing what’s going on, and are usually not introduced in a basic calculus course.


  • Draw a cube with sides x and show that \textrm{d}(x^3) = 3x^2\textrm{d}x. Thus the derivative of x^3 with respect to x is 3x^2.
  • Draw a line with length x and show that \textrm{d}(x) = \textrm{d}x, which is of course algebraically obvious. Thus the derivative of x with respect to itself is 1.
  • Draw a rectangle with width w and length c*w and show that \textrm{d}(c*w^2) = 2cw\textrm{d}w = c\textrm{d}(w^2). Thus, whenever you have the differential of a variable multiplied by a constant, the constant can pop outside. Where was this property used implicitly in this post?
  • Now that you know \textrm{d}(x^3) = 3x^2\textrm{d}x, let x^3 = u and find the derivative of u^{1/3} with respect to u. (Answer: \frac{1}{3} u^{-2/3})
  • What is \textrm{d}(x^3)/\textrm{d}(x^2)? Let u = x^2 and find the derivative of u^{3/2} with respect to u. (Answer: \frac{3}{2}u^{1/2}).
  • Examine \textrm{d}(x^4) by letting u = x^2, so we’re looking at \textrm{d}(u^2). Find the derivative of x^4 with respect to x. (Answer: 4x^3)
  • Draw an equilateral triangle with sides of length s. Increase the sides a small amount \textrm{d}s and relate this to the change in area \textrm{d}A. Does this agree with our previous findings?
  • Draw an ellipse with semi-major axis a and semi-minor axis b. Starting with a unit circle, argue by thinking about stretching that the area of the ellipse is \pi ab. Increase a by a small amount \textrm{d}a and increase b proportionately. This adds a small area \textrm{d}A to the ellipse. Show that this area is 2\pi b\hspace{.3em}\textrm{d}a. Does this let us find the circumference of the ellipse by the same thought process as we used for the circle? (Answer: no). Why not?
  • Draw a sphere with radius R. Use the relationship between \textrm{d}R and \textrm{d}V to find the volume of a sphere, given its surface area is 4\pi R^2. Check your answer against this post.

My Brown Big Spiders

March 21, 2011

Professor: You have to learn to be able to play it blindfolded. The page, for God’s sake! The notes!

David: I’m sorry I was, uh, forgetting them, Professor.

Professor: Would it be asking too much to learn them first?

David: And-And then forget them?

Professor: Precisely.

from the movie Shine

If I want to find the volume and surface area of a sphere, I do it with calculus:

V = \int_{r = 0}^R\int^{2\pi}_{\phi = 0}\int_{\theta = 0}^\pi r^2\sin\theta \textrm{d}\theta \textrm{d}\phi \textrm{d}r  =  \frac{4}{3}\pi R^3


S = \int_{\theta = 0}^\pi\int_{\phi = 0}^{2\pi} R^2 \sin\theta\textrm{d}\theta\textrm{d}\phi = 4\pi R^2

This is correct, but I can’t use it with high school geometry students because they don’t know what an integral is, much less a Jacobian.

However, Archimedes came up with a beautiful way of discovering the volume and surface area of a sphere. He did it by relating the sphere to a known shape – a cylinder with a cone cut out of it.

He drew a picture like this:

Archimedes' illustration of the geometry of a sphere

On the left there’s a hemisphere with radius R. On the right, there’s a cylinder with radius and height both also R, so that the hemisphere would fit perfectly inside the cylinder. The cylinder has had a cone cut out of it, with the cone’s base at the top of the cylinder and its tip at the center of the bottom. First, we’ll show that these two shapes have the same volume.

We imagine slicing the hemisphere horizontally at some height h. This would reveal a circle as seen in the picture. Call its radius r.

At the same height, we also slice the cylinder, leaving us with a disk. We’ll find the areas of this circle and disk.

The area of the circle is \pi r^2, which by the Pythagorean theorem is also \pi (R^2 - h^2).

Looking at the cylinder, the outer edge of the disk has radius R and the inner edge has radius h, so the area of the disk is also \pi (R^2 - h^2).

Because every horizontal slice of the hemisphere has the same area as the corresponding horizontal slice of the drilled-out cylinder, they must have the same volume. The volume of the cylinder is its original volume minus the volume of the cone, or \pi R^3 - 1/3 \pi R^3 = 2/3 \pi R^3. Hence, the volume of a full sphere is

V = 4/3 \pi R^3
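The slicing argument can also be checked numerically. This is only a sketch, not part of Archimedes' reasoning: the function names, the radius, and the number of slices are illustrative assumptions. It compares the two cross-sections at a given height, then stacks thin slices of the hemisphere and compares the total with 2/3 \pi R^3.

```python
import math

def slice_areas(R, h):
    """Cross-sectional areas at height h: (hemisphere, cylinder with the cone removed)."""
    hemisphere = math.pi * (R**2 - h**2)         # circle of radius sqrt(R^2 - h^2)
    drilled = math.pi * R**2 - math.pi * h**2    # disk of radius R minus a hole of radius h
    return hemisphere, drilled

def stacked_volume(R, n=100_000):
    """Approximate the hemisphere's volume by stacking n thin slices (midpoint rule)."""
    dh = R / n
    return sum(slice_areas(R, (i + 0.5) * dh)[0] * dh for i in range(n))

R = 2.0  # arbitrary radius chosen for the check
print(slice_areas(R, 0.7))                         # the two areas should agree
print(stacked_volume(R), 2 / 3 * math.pi * R**3)   # the two volumes should be close
```

Because the slice areas match at every height, the same stacking applied to the drilled cylinder gives the same total.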

Next, we’ll show that the hemisphere has the same surface area as the outside of the cylinder (the cone is now unimportant).

Take a slice of the outside of the cylinder at height h and of thickness \textrm{d}h. This forms a band around the cylinder whose area is

\textrm{d}S = 2 \pi R \textrm{d}h

Now slice the sphere at the same height with the same \textrm{d}h. This also forms a band. The band is a shorter distance around, but due to the slant of the edge of the circle, it’s also thicker. Let’s call the thickness of this band \textrm{d}x.

Slices of equal thickness dh at equal heights h on a cylinder and sphere.

The area of the band around the hemisphere is the circumference at height h multiplied by the thickness \textrm{d}x.

\textrm{d}S = 2\pi\sqrt{R^2 - h^2}\textrm{d}x

If we draw a tangent line on the sphere, it’s perpendicular to the radius. This gives us similar triangles.


\frac{\textrm{d}x}{\textrm{d}h} = \frac{R}{\sqrt{R^2 - h^2}}

Plugging back into the previous expression,

\textrm{d}S = 2\pi\sqrt{R^2 - h^2}*\textrm{d}h * \frac{R}{\sqrt{R^2 - h^2}}  = 2\pi R \textrm{d}h

So the bands around the outside of the cylinder and around the hemisphere have the same area at every height, which means the hemisphere has the same surface area as the side of the cylinder, 2\pi R \cdot R = 2\pi R^2. Doubling this gives the surface area of a full sphere:

S = 4 \pi R^2
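A rough numerical check of the band argument, under stated assumptions: the helper names and the values of R, h, and \textrm{d}h below are made up for illustration. Instead of trusting the similar triangles, it measures the slanted thickness \textrm{d}x directly as an arc length on the sphere, compares it with the similar-triangles prediction, and then adds up the bands over the hemisphere.

```python
import math

def arc_thickness(R, h, dh):
    """Slanted thickness dx: arc length along the sphere between heights h and h + dh."""
    top = min(1.0, (h + dh) / R)   # clamp to guard against rounding just above 1
    return R * (math.asin(top) - math.asin(h / R))

def similar_triangles_thickness(R, h, dh):
    """dx predicted by the similar triangles, evaluated at the band's mid-height."""
    mid = h + dh / 2
    return dh * R / math.sqrt(R**2 - mid**2)

def hemisphere_area(R, n=100_000):
    """Sum the areas of n bands: circumference at mid-height times slanted thickness."""
    dh = R / n
    total = 0.0
    for i in range(n):
        h = i * dh
        r_mid = math.sqrt(R**2 - (h + dh / 2) ** 2)
        total += 2 * math.pi * r_mid * arc_thickness(R, h, dh)
    return total

R = 1.0
print(arc_thickness(R, 0.3, 1e-4), similar_triangles_thickness(R, 0.3, 1e-4))
print(hemisphere_area(R), 2 * math.pi * R**2)   # both should be close to 2*pi*R^2
```

The sum is only approximate because the bands nearest the top are not thin compared with their own radius, but the discrepancy shrinks as the number of bands grows.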

This is a really lovely argument. The problem is pretty hard, but the solution is simple. (I’m not sure if this is quite how Archimedes did it. To be honest I never even met the guy. I learned the idea from this animation).

I was reviewing solid geometry with a high school junior the other day, so I showed her this argument (but only the volume part). I was proud of myself for offering this little example of how interesting mathematical ideas can be. At least, I was as we began.

“It’s all so complicated!” she moaned a few minutes later when I asked her to identify a certain quantity in our sketch.

Complicated? I had thought the argument was remarkably simple – just draw a sphere and a cylinder next to each other and you’re practically done. What could be simpler? Somehow my student was getting entangled in brambles I couldn’t even see.

I did not draw quite the same picture for her that I drew earlier in this post. I didn’t want to give it all away, so I drew something more like this and asked for r:

Finding r is a simple application of something she knew well – the Pythagorean theorem. She didn’t see it, though, so I showed her this right triangle:

But then she didn’t see how long the new line I just drew was. It’s just R because it’s a radius of the sphere, but although she knew that all radii of a sphere have the same length, she couldn’t easily identify the two lines as radii and call up the relevant information. So I showed her that step, too.

After a bit more prodding, she wrote down r = \sqrt{R^2 + h^2}, a mistake that comes from applying the Pythagorean theorem incorrectly. She knows better, and should have found r^2 = R^2 - h^2, but by this point she was already flustered from her earlier mistakes, confused about what we were trying to do, self-conscious, and generally unable to approach the problem equanimously.

When she realized she had applied the Pythagorean theorem wrong, her frustration mounted, and moments later, at my next question, I was shocked with, “It’s all so complicated!”

Why did this happen? Why did I so horribly misjudge the difficulty of the exercise?

The other day I read this comment on an essay on teaching

I used to teach English as a second language. It was a mind trip.

I remember one of my students saying something like “I saw a brown big spider”. I responded “No, it should be ‘big brown spider'”. He asked why. Not only did I not know the rule involved, I had never even imagined that anyone would ever say it the other way until that moment.

Tutoring has been exposing my own brown big spiders – the little steps and bits of knowledge that I take for granted – for years. I’ve rarely stopped to notice it.

Just to follow each step in the Archimedes argument, you must make an enormous number of mathematical connections behind the scenes in your mind. Here’s a partial list:

  • A “sphere” is a round three-dimensional object, like a ball
  • Every point on the surface of a sphere is the same distance from the center
  • The “surface” of the sphere means its outside edge, or skin
  • A “point” is a little dot with no size at all. It simply marks a place.
  • You can represent three-dimensional figures in two dimensions with certain types of drawing.
  • The point of doing this drawing is to make things easier to visualize.
  • A “hemisphere” is half a sphere – the top half in this case
  • A “cylinder” is basically a tube with constant width.
  • The center of the bottom of the hemisphere is the same point as the center of the sphere it came from.
  • The height of the hemisphere is the same as the distance from the center to the edge horizontally.
  • This means that the cylinder drawn is twice as wide as it is tall.
  • The volume of a cone is one third the area of its base times its height.
  • The volume of a cylinder is the area of its base times its height
  • The area of a circle is \pi times the square of its radius

And so on. I only stopped writing so that I’d eventually finish the rest of this post. Each item I added to that list sparked off several new ones I hadn’t considered.

Try writing your own list and you’ll quickly be overwhelmed by the exponentially-proliferating leaves on your conceptual tree. We didn’t even get close to things like Cavalieri’s principle.

The items on my brown big spider list are not supposed to be mathematical facts so much as cognitive patterns the reader is required to have. For example, mathematically a point is not, “a little dot with no size at all,” as I called it. It’s a primitive notion and has no definition. The list still calls a point a dot, though, because the mathematically-accurate description isn’t helpful to a student, and isn’t the way most people think of it even when they’ve already learned geometry well.

When I started writing the list, I found myself wanting to say, “A sphere is a set of all points equidistant…”, but that’s no good. It uses the significant brown big spiders of “set” and “equidistant”, as well as the general idea of giving mathematical definitions, something most high schoolers don’t yet understand well. Then I wanted to say, “A sphere is a shape that’s symmetric with respect to rotations about any axis…” but this has all the same problems.

Ultimately, I chose “a sphere is a ball.” It’s imprecise, but it’s the way you think about a sphere before you’ve packaged the concept away so tightly you don’t need to think about it any more. Anyone who tells you a sphere is the two-dimensional manifold S^2 is someone who has forgotten how much they actually know about spheres. They’ve forgotten it in the good way, of course – the way David was supposed to forget the notes to Rachmaninoff. Unfortunately, I experience a crippling side effect when I forget things this way. I forget that other people haven’t yet forgotten them.

This forgetting is the psychological phenomenon of “chunking”. The most famous example involves chess players. Give expert chess players a position from a game between grandmasters and they can easily memorize the positions of thirty pieces. Give them pieces strewn randomly about the board and they’ll remember just a few – no more, in fact, than your average Joe who knows little more about chess than what the real name of the horsey is.

A position from a real game has lots of meaning, if you’re an expert. If you’re an expert you extract order from the position automatically, without consciously processing every detail. The entire task must seem quite simple to a grandmaster. Similarly, the experienced mathematician sees all the important properties of the sphere and the cylinder and the cone without having to list them out one by one, and the process is so automatic they don’t even realize they’re doing it.

In “Simple” Isn’t “Easy”, I learned not to judge the difficulty of new ideas by how simple they are, but by how familiar to the student. Despite this, I have continued to make a similar mistake when dealing with ideas the students have already learned.

“Learned” isn’t “chunked”. My student understood the meaning of “hemisphere” and the formula for the volume of a cone, but she still needed conscious effort to recall and wield those bits of knowledge. Each sat in its own corner in her mind, accessible only by dint of concerted effort, and certainly not ready to flow into a flood of beautiful ideas.

I was trying to dictate a soliloquy for her to transcribe, but I was assuming that because she could see the letters on her keyboard, she could touch-type. It turned out that the effort to hunt-and-peck was so great, all the artistry of the speech was lost.

I want to watch out for my brown big spiders in the future. I want to be more patient when they are discovered and more studious in cataloging, remembering, and working with them. Most of all, I want to look back later, and remember my students forgetting them.

‘Simple’ Isn’t ‘Easy’

November 7, 2010

You are probably aware that 3^{1/2} = \sqrt{3}. Sometimes when I’m tutoring I wind up teaching this to young students. Here is the story I use:

You already know that 3^4*3^2 = 3^6 for a very simple reason.

Forget the reason for a moment, and just focus on the rule. When you multiply powers with the same base, you can add the exponents.

That means

3^{1/2}*3^{1/2} = 3^1 = 3

Evidently, 3^{1/2} is a number such that if you multiply it by itself, you get three. But that is exactly the meaning of the square root! Hence 3^{1/2} = \sqrt{3}.

This is a very simple idea, but when I try it on students, it usually fails.

After going through the story, I ask what 16^{1/2} is. I’m hoping to hear “four”, but that’s not what happens. Sometimes they say it’s eight. Sometimes they say they don’t know. But the most common response is to go through the whole thing again. The student writes down

16^{1/2}*16^{1/2} = 16^1 = 16.

They stare at it for a while. Then they look up at me and say, “Is that right?” We discuss it a bit further to clarify. Circuitously, we stumble upon 16^{1/2}=4. After that we do a few more half-powers and they get it right. Then I ask what 8^{1/3} is. The student will write down

8^{1/3}*8^{1/3} = 8^{2/3}.

“It doesn’t work for that one,” they say. “You just get a 2/3 power, and we can’t do that.” So we talk about it some more, until after some time the student can go between roots and exponents.

Then I ask what 4^{3/2} is, but they struggle with this, too. Once that’s down we try for 6^{-1}, but that is also impenetrable (I usually hear that it’s -6). When I suggest trying to figure it out based on the rule of exponent addition, the student feels frustrated and defeated.
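For reference, here is how the exponent-addition rule settles each of those three questions. This is a small sketch using floating-point arithmetic, so the results are approximate rather than exact. The variable names are only for illustration.

```python
import math

# 8^(1/3) * 8^(1/3) * 8^(1/3) = 8^(1/3 + 1/3 + 1/3) = 8^1,
# so 8^(1/3) is the number whose cube is 8.
cube_root = 8 ** (1 / 3)
print(cube_root, cube_root * cube_root * cube_root)

# 4^(3/2) = 4^(1/2 + 1/2 + 1/2) = 4^(1/2) * 4^(1/2) * 4^(1/2).
half_power = 4 ** (1 / 2)
print(half_power ** 3, 4 ** (3 / 2))

# 6^(-1) * 6^1 = 6^(-1 + 1) = 6^0, and 6^0 must be 1 because 6^0 * 6^1 = 6^1.
# So 6^(-1) is the number that gives 1 when multiplied by 6.
inverse = 6 ** -1
print(inverse * 6, inverse)

# Floating-point results are compared with a tolerance rather than with ==.
print(math.isclose(cube_root, 2), math.isclose(half_power ** 3, 8), math.isclose(inverse, 1 / 6))
```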

It’s curious that I have such difficulty teaching this idea. It is not too complicated or too difficult, even for a young child. It is far simpler than long division and far less abstract than “set the unknown variable equal to x”. The problem is not the sophistication of the idea, but a more fundamental error in communication. When I give my little presentation, the students simply have no idea what I’m doing.

An analogy: I’m teaching someone how to lift weights (this is very hypothetical). I take a dumbbell and I start doing some bicep curls. It’s only a 5-lb dumbbell, and the motion is very simple, so I figure the guy I’m teaching will get it for sure. I hand him the weight and say, “You try.”

I hand over the weight and the student starts yanking it up and down. He purposely mimics the way I grunt in exertion and copies my facial expressions. He remembers how I looked over my shoulder to talk to him while I demonstrated the exercise, so he looks over his shoulder when trying it out. The weight ultimately does go up and down, but only with a great deal of extraneous commotion. I straighten him out with some effort, but when we move over to the bench press we’ll repeat the whole confused process.

The problem is that before we began, my student didn’t know what weight-lifting is. He didn’t know the point is to make your muscles stronger, or the counter-intuitive idea that to make your muscles stronger, you first have to tire them out by working them hard.

Similarly, my math students watch me do this strange algebraic exercise with exponents not knowing that the goal is to discover new things. They think, instead, that I was simply teaching a new procedure, as in, “This is how you solve problems where the exponent is one half.”

This is not really a big problem. Students can learn new things; that’s what being a student is about. The problem is that students’ ineptitude at this task frustrates me. At times, when watching a student struggle with a problem, I’ve felt ironic wonder at the student’s remarkable creativity – how do they find so many unexpected ways to get everything totally wrong? I wind up concluding that the student is “stupid”, and the student leaves the lesson with only the impression that they have somehow failed at a task they never even understood.

I make these grievous errors in judgment because I assume that since I’ve seen the student handle far more complicated tasks, they should master this one right away. That is not so. ‘Simple’ isn’t ‘easy’. Computing a determinant of a 4×4 matrix isn’t simple, but my students can blaze through it. Showing that the determinant will be zero by noticing that the last row is equal to the first row is very simple, but I’ve never had a student use that method.
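To make the determinant example concrete, here is a minimal sketch. The matrix entries are invented for illustration, and the cofactor expansion is just one of several ways to grind through the computation. The long route and the one-line observation about repeated rows give the same answer.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (the long way)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor: drop the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Hypothetical 4x4 matrix whose last row repeats the first row.
matrix = [
    [2, -1, 3, 5],
    [0, 4, 1, -2],
    [7, 3, -6, 1],
    [2, -1, 3, 5],
]

print(matrix[0] == matrix[-1])   # the simple observation: two rows are equal
print(det(matrix))               # the long computation agrees: the determinant is zero
```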

The things we’re good at are not what’s simplest, but what’s most familiar. The converse also holds: things that are unfamiliar are difficult, even if they’re simple. I personally find it much easier to solve geometry problems using coordinates, algebra, and calculus than using Euclidean geometry, even when the Euclidean approach may be just a few lines of sketching and finding a similar triangle.

When I first noticed that students were having a hard time with problems because they required unfamiliar thinking, and not because they were too hard or because the students were bad, I tried to remedy the situation with speeches. I would talk about how interesting it is to figure out where a formula comes from. I would say over and over that no, I don’t have all the formulas memorized, because as long as I know most of it, I can figure the rest out. I would prove my point by waiting until they embarked on a difficult calculation, and then solving it quickly in my head using some trick or other, supposedly demonstrating how useful it is to be able to approach a problem many different ways. Then I would describe how it’s done. “You’ll like this thing I’m about to show you,” I would say. “It’ll make your life easier.”

This backfired. It mostly led the students to believe that I either gained some ineffable voodoo skills in college or that I am in possession of an extraordinary native intellect that they could never hope to emulate.

I still don’t know quite how to handle the “simple isn’t easy” problem. I have become far more patient when trying to push students’ boundaries, and far less ambitious. I regret the many times I compromised a student’s chance at learning and my own at equanimity by failing to recognize “simple isn’t easy” in practice. I continue to search for simpler and simpler teaching stories, but I don’t spend enough time searching for ways to make the unfamiliar territory easier to navigate. I don’t know how complicated a task that is – to figure out how to build a stepladder to a new level of cognition – but I know it isn’t yet easy.

The Role of Technology in Teaching

September 15, 2010

I occasionally use my iPhone while tutoring if I want to show someone a particular graphic – a photo of the Bay of Fundy or a map of Alexandria and Syene, for example.

Today I found a new way to use it during a math lesson. I asked my student whether the sine function is even or odd. He didn’t know what that meant, so I told him the algebraic rule (a function f is odd if f(-x) = -f(x) for all x).

We graphed the sine function on his TI graphing calculator, and I told him, “Okay, now spin the calculator around 180 degrees to hold it upside down.” The graph looks the same. This is true for all odd functions because a 180 degree rotation is the transformation x \to -x, y \to -y.

Next we looked at the graph of cosine, but when we rotated that 180 degrees, it didn’t look the same any more. Cosine isn’t odd. Instead, I got out my phone, handed it to him, and told him to use the surface of the phone as a mirror to look at the calculator screen. The reflection of the graph looks the same as the original, as it does for all even functions, since this is the transformation x \to -x, y \to y.

I guess it shows that using too much technology is a turn-off. I hope this doesn’t reflect on me poorly.
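The rotation test for odd functions and the mirror test for even functions can also be tried numerically. This is only a sketch: the helper names, the sample points, and the tolerance are assumptions, and checking a finite set of points can rule a function out but cannot prove it is odd or even.

```python
import math

def is_odd(f, xs, tol=1e-12):
    """Rotation test: f(-x) = -f(x) at every sample point."""
    return all(abs(f(-x) + f(x)) <= tol for x in xs)

def is_even(f, xs, tol=1e-12):
    """Mirror test: f(-x) = f(x) at every sample point."""
    return all(abs(f(-x) - f(x)) <= tol for x in xs)

xs = [i / 10 for i in range(1, 101)]   # sample points 0.1, 0.2, ..., 10.0

print(is_odd(math.sin, xs), is_even(math.sin, xs))   # sine passes only the rotation test
print(is_odd(math.cos, xs), is_even(math.cos, xs))   # cosine passes only the mirror test
```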