Dual Vector Spaces

December 23, 2008

Dual vector spaces pop up in relativity and quantum mechanics. Each time I’ve read about them, I’ve felt like I “got it” just enough to do the calculations. The details of how you go about constructing these things were a bit muddy, though. So, on the pretext that the best way to understand is to explain, I’ll try to outline the bare bones ideas below. Hopefully, I’ll keep fleshing this out with more interesting stuff as time rolls on.

I’ll use an unconventional notation for this post. That’s because the conventional notation is suggestive of what you know by the end of the argument, but I don’t want to begin at the end! The conventional notation (bra-ket in quantum mechanics or upper and lower indices in relativity) makes a lot of sense to the initiated, but is confusing when laying out the groundwork for the first time. After we get that groundwork down, I’ll backtrack and review the ideas using standard notation.

Disclaimer: I do not vouch for the accuracy of what appears below. This is just me trying to define the things we use in physics in a way that I can understand. I just hope I don’t mutilate it too badly. Also, I’d be grateful for feedback from any knowledgeable readers.


Let \textrm{V} be a vector space.

A one form f is a linear mapping f \colon \textrm{V} \rightarrow \textbf{F},

where \textbf{F} is the field over which \textrm{V} is defined. By “linear” I mean that

f(\alpha_1 \mathbf{v_1} + \alpha_2 \mathbf{v_2}) = \alpha_1 f(\mathbf{v_1}) + \alpha_2 f(\mathbf{v_2})

where \alpha_1, \alpha_2 are arbitrary elements of \textbf{F}

and \mathbf{v_1}, \mathbf{v_2} are arbitrary vectors in \textrm{V}.
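If you like seeing this numerically, here is a minimal sketch in numpy. I’m assuming \textrm{V} = \mathbb{R}^3 over the reals, and representing a one form by a fixed array a with f(\mathbf{v}) = a \cdot \mathbf{v}; that representation is my choice for illustration, not something the abstract definition hands you.

```python
import numpy as np

# A one form on R^3, represented (for illustration) by a fixed array a,
# acting by the dot product: f(v) = a . v.
a = np.array([2.0, -1.0, 0.5])

def f(v):
    return a @ v  # a scalar in the field R

# Check linearity on arbitrary inputs:
v1 = np.array([1.0, 0.0, 3.0])
v2 = np.array([-2.0, 5.0, 1.0])
alpha1, alpha2 = 3.0, -0.25
assert np.isclose(f(alpha1 * v1 + alpha2 * v2),
                  alpha1 * f(v1) + alpha2 * f(v2))
```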

Define addition of one forms by

\left[ f+g \right] \left( \mathbf{v} \right) = f \left( \mathbf{v} \right) + g \left( \mathbf{v} \right).

Addition of one forms is then commutative and associative, because addition in the field \textbf{F} has those properties.

Define scalar multiplication of one forms in the obvious way:

\left[ \alpha f \right] \left( \mathbf{v} \right) = \alpha \, f \left( \mathbf{v} \right).

Now note that

I \left( \mathbf{v} \right) = 0

is linear, and therefore a one form. Since \left[ f + I \right](\mathbf{v}) = f(\mathbf{v}) + 0 = f(\mathbf{v}) for every f and \mathbf{v}, I is the additive identity for the one forms.

Each one form also has a unique additive inverse, -f, defined by

\left[ -f \right] \left( \mathbf{v} \right) = -f \left( \mathbf{v} \right),

since \left[ f + (-f) \right] \left( \mathbf{v} \right) = 0 = I \left( \mathbf{v} \right).

All this means that the one forms themselves form a vector space. (If you’re worried about some vector space prerequisite I didn’t mention, such as existence of a multiplicative identity, you can check for yourself that it’s in there. I’m not tricking you.)
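Continuing the numerical sketch: if one forms on \mathbb{R}^3 are represented by arrays, all of the structure above (addition, scalar multiplication, the identity, inverses) is just componentwise arithmetic. Again, this is only a sanity check under that chosen representation.

```python
import numpy as np

# One forms on R^3 as arrays; the vector space operations defined
# above become componentwise arithmetic on those arrays.
a = np.array([2.0, -1.0, 0.5])
b = np.array([0.0, 4.0, 1.0])
v = np.array([1.0, 2.0, 3.0])

assert np.isclose((a + b) @ v, a @ v + b @ v)    # [f+g](v) = f(v) + g(v)
assert np.isclose((3.0 * a) @ v, 3.0 * (a @ v))  # [alpha f](v) = alpha f(v)

I = np.zeros(3)                           # the identity one form
assert I @ v == 0.0
assert np.isclose((a + (-a)) @ v, I @ v)  # -a is the additive inverse of a
```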

Now we want to know the dimension of the space of one forms. We suspect that it’s the same as the dimension of \textrm{V}. To prove it, we’ll explicitly find a basis.

Due to linearity, the action of a one form on any vector in \textrm{V} is completely determined by its action on the basis vectors \hat{\mathbf{v}}_i of \textrm{V} (take \textrm{V} to be n-dimensional). This is true because

f(\mathbf{v}) = f \left( \sum_{i=1}^n \alpha_i \hat{\mathbf{v}}_i \right) = \sum_{i=1}^n \alpha_i f(\hat{\mathbf{v}}_i).
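Here’s the same fact as a sketch in numpy: if we record only the values of a one form on a basis (the columns of a matrix B, which I’m choosing arbitrarily), we can still evaluate it on any vector at all.

```python
import numpy as np

# Columns of B are a (non-orthogonal) basis of R^3; f_on_basis holds
# only the three numbers f(v_1), f(v_2), f(v_3).
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
f_on_basis = np.array([2.0, -1.0, 0.5])

def f(v):
    # Expand v in the basis (v = sum_i alpha_i v_i), then use linearity.
    alpha = np.linalg.solve(B, v)
    return alpha @ f_on_basis

print(f(np.array([3.0, 1.0, -2.0])))  # completely determined by f_on_basis
```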

For each basis vector, \hat{\mathbf{v}}_i of \textrm{V}, define a one form by

f_i \left( \hat{\mathbf{v}}_j \right) = \delta_i^j.

The above equation is defining the action of the one form f_i on each of the basis vectors \hat{\mathbf{v}}_j, which we already said is sufficient to completely specify the one form.
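In coordinates, this construction is one matrix inversion away. A sketch, again assuming \mathbb{R}^3: if the basis vectors of \textrm{V} sit in the columns of a matrix B, then the rows of B^{-1} represent the one forms f_i, because B^{-1}B = \mathbb{1} is exactly the statement f_i(\hat{\mathbf{v}}_j) = \delta_i^j.

```python
import numpy as np

# Columns of B are the basis vectors of V; rows of F = inv(B) then
# represent the dual one forms, since (F B)[i, j] = delta_ij.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
F = np.linalg.inv(B)

for i in range(3):
    for j in range(3):
        assert np.isclose(F[i] @ B[:, j], 1.0 if i == j else 0.0)
```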

I claim that these f_i form a basis for the space of one forms.

Clearly they are independent, because assuming they are dependent gives, for some set of not-all-zero coefficients a_i,

\sum_{i=1}^n a_i f_i = I \Rightarrow \sum_{i=1}^n a_i f_i(\hat{\mathbf{v}}_j) = \sum_{i=1}^n a_i \delta_i^j = a_j = 0 \quad \textrm{for every } j,

which is a contradiction, because it was assumed that not all the coefficients were zero. The last equality in the above equation comes from I(\hat{\mathbf{v}}_j) = 0.
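The numerical version of this argument, with B and F as in the sketch above: applying the combination \sum_i a_i f_i to \hat{\mathbf{v}}_j just reads off a_j, so the only combination that gives the identity one form is the all-zero one.

```python
import numpy as np

B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
F = np.linalg.inv(B)

a = np.array([4.0, 0.0, -2.0])
combo = a @ F  # row representing sum_i a_i f_i
for j in range(3):
    assert np.isclose(combo @ B[:, j], a[j])  # = a_j, as in the proof
```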

The f_i also span the space of one forms, because an arbitrary one form g can be written as

\begin{array}{rcl} g(\mathbf{v}) &=& g \left( \sum_{i=1}^n \alpha_i \hat{\mathbf{v}}_i \right) \\ { } &=& \sum_{i=1}^n \alpha_i g(\hat{\mathbf{v}}_i) \\ { } &=& \sum_{i=1}^n g(\hat{\mathbf{v}}_i) f_i(\mathbf{v}), \end{array}

where the last step uses f_i(\mathbf{v}) = f_i \left( \sum_{j=1}^n \alpha_j \hat{\mathbf{v}}_j \right) = \alpha_i. This implies

g = \sum_{i=1}^n g(\hat{\mathbf{v}}_i) f_i.

This means the f_i span the space, and are a basis. Because there is one f_i for each \hat{\mathbf{v}}_i, the vector space \textrm{V} and the space of one forms have the same dimension.
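And the expansion itself checks out numerically (same B and F as before, with g an arbitrary one form represented as an array):

```python
import numpy as np

B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
F = np.linalg.inv(B)
g = np.array([1.0, -3.0, 2.0])  # an arbitrary one form

# The components of g in the dual basis are its values on the basis of V:
coeffs = np.array([g @ B[:, i] for i in range(3)])
assert np.allclose(coeffs @ F, g)  # g = sum_i g(v_i) f_i
```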

Here comes the part that tricked me before. We’re going to change our definition of \textrm{V}. We invented it in the first place, so I suppose it’s ours to mess around with. We’ll describe the change in \textrm{V} by the change in the basis vectors \hat{\mathbf{v}}_i.

The change is that before, the basis vector \hat{\mathbf{v}}_i was just an element of an abstract vector space. Now we’re going to change that, and define \hat{\mathbf{v}}_i to be a linear, scalar-valued function on the space of one forms, whose action is \hat{\mathbf{v}}_i(f_j) = \delta_i^j. An arbitrary vector \mathbf{v} in \textrm{V} changes only in that it is the same linear combination of these new basis vectors.

This is a bit subtle. We’re changing the original vector space from just a nice little vector space sitting all alone by its logical self, into being the space of all linear, scalar-valued functions of the one forms. We already know that this set of functions is indeed a vector space, because all the arguments we used before when showing that the one forms form a vector space carry over directly to this redefinition of \textrm{V}. (In finite dimensions, identifying a vector with its action on the one forms is exactly the standard isomorphism between a vector space and its double dual.)
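One more numpy sketch, to show the redefinition costs us nothing: letting \mathbf{v} act on a one form g through its expansion coefficients gives back exactly the number g(\mathbf{v}). This is the symmetry the formal definition below will demand.

```python
import numpy as np

B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
g = np.array([1.0, -3.0, 2.0])
v = np.array([0.5, 2.0, -1.0])

alpha = np.linalg.solve(B, v)  # components of v in the basis B
# v acting on g: v(g) = sum_i alpha_i g(v_i) ...
v_of_g = alpha @ np.array([g @ B[:, i] for i in range(3)])
assert np.isclose(v_of_g, g @ v)  # ... which equals g(v)
```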

The problem is that we’re redefining \textrm{V} in terms of the one forms, but the one forms themselves are being defined in terms of the old \textrm{V}. If we just kill off the old \textrm{V}, we no longer have one forms. And if we no longer have one forms, we no longer have our new \textrm{V}.

The solution to this cute little quandary is that really what we want to do is redefine both vector spaces at the same time. We want to define them so that each one is the set of linear, scalar-valued functions of the other. We began this discourse with an asymmetry between the vectors and the one forms. Now I’m saying we should abolish that asymmetry. Really, we should have skipped all the business with the original vector space from the beginning. We should have started off by defining two vector spaces, \textrm{U} and \textrm{V} to be dual spaces – each the set of linear scalar-valued functions of the other.

The reason I didn’t start out that way was that I thought it would be too confusing. I thought the definition would sound circular. In fact it’s not. It’s completely reasonable, but only once you know the sort of facts we worked so hard to find out. It can only work because the one forms make a vector space of the same dimension as the original vector space.

So let’s do it properly, from the beginning, in a different notation.


Let \textrm{U} and \textrm{V} denote vector spaces of the same dimension, where each space consists of the set of all linear functions that map elements of the other vector space to scalars, in such a way that \mathbf{u}(\mathbf{v}) = \mathbf{v}(\mathbf{u}) for arbitrary \mathbf{u} \in \textrm{U}, \mathbf{v} \in \textrm{V}. Then \textrm{U} and \textrm{V} are dual vector spaces.

The trick is in seeing that this makes any sense – that such a thing is possible. That’s what all the nonsense above was about. It says that by starting with a vector space, either \textrm{U} or \textrm{V} in this case, you can build up its dual space by the procedure I described above.


Finally, we’ll discuss notation. In quantum mechanics, elements of \textrm{U} are called “bra vectors” and written as

\langle u \mid,

while the elements of \textrm{V} are called “ket vectors” and written as

\mid v \rangle.

The notation for one acting on the other (it doesn’t matter which acts on which) is written

\langle u \mid v \rangle.
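In the usual finite-dimensional matrix representation of quantum mechanics, a ket is a complex column vector and its bra is the conjugate transpose. A quick sketch; note that the complex conjugation comes from the inner product convention of quantum mechanics, which is a little more structure than the bare duality above.

```python
import numpy as np

v = np.array([1.0 + 2.0j, 0.0, 3.0j])  # the ket |v>
u = np.array([2.0, 1.0 - 1.0j, 0.5])   # the vector behind the bra <u|

bra_u = u.conj()  # <u| as a conjugated row vector
print(bra_u @ v)  # <u|v>, a single complex number
```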

In relativity, the elements of \textrm{U} are called “covectors” and are written as

\mathbf{u} = u_\alpha \mathbf{e}^\alpha,

where \mathbf{e}^\alpha is a basis covector and u_\alpha is the corresponding component. We are writing the expansion of the covector in the basis we chose, and we are implicitly summing over all \alpha (the Einstein summation convention).

The elements of \textrm{V} are called “vectors” and written

\mathbf{v} = v^\beta \mathbf{e}_\beta.

One vector acting on a covector (or covector acting on a vector) is written

\mathbf{u} \left( \mathbf{v} \right) = u_\alpha \mathbf{e}^\alpha \left( v^\beta \mathbf{e}_\beta \right) = u_\alpha v^\beta \, \mathbf{e}^\alpha \left( \mathbf{e}_\beta \right) = u_\alpha v^\beta \delta_\beta^\alpha = u_\alpha v^\alpha.

The middle steps use linearity and the definition of the basis vectors acting on each other, \mathbf{e}^\alpha(\mathbf{e}_\beta) = \delta_\beta^\alpha. The last step is a sum of the \mathbf{v} components over the Kronecker delta.
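In components this is a single contraction over the repeated index, e.g. with numpy (storing u_\alpha and v^\beta as plain arrays):

```python
import numpy as np

u = np.array([1.0, -2.0, 0.0, 3.0])  # covector components u_alpha
v = np.array([4.0, 1.0, 2.0, -1.0])  # vector components v^beta

result = np.einsum('a,a->', u, v)    # u_alpha v^alpha, summed over alpha
assert np.isclose(result, u @ v)
```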

Finally, you may also be familiar with the notation in which elements of \textrm{U} are written as row vectors and elements of \textrm{V} as column vectors, which is the way it is usually done in a first course on linear algebra. However, the subtleties of dual spaces are rarely explored in such classes!