Dual vector spaces pop up in relativity and quantum mechanics. Each time I’ve read about them, I’ve felt like I “got it” just enough to do the calculations. The details of how you go about constructing these things were a bit muddy, though. So, on the pretext that the best way to understand is to explain, I’ll try to outline the bare bones ideas below. Hopefully, I’ll keep fleshing this out with more interesting stuff as time rolls on.
I’ll use an unconventional notation for this post. That’s because the conventional notation is suggestive of what you know by the end of the argument, but I don’t want to begin at the end! The conventional notation (bra-ket in quantum mechanics or upper and lower indices in relativity) makes a lot of sense to the initiated, but is confusing when laying out the groundwork for the first time. After we get that groundwork down, I’ll backtrack and review the ideas using standard notation.
Disclaimer: I do not vouch for the accuracy of what appears below. This is just me, trying to define the things we use in physics in a way that I can understand. I just hope I don’t mutilate it too badly. Also, to any knowledgeable readers who wish to leave some feedback, I’d be grateful.
Let $V$ be a vector space.
A one form is a linear mapping $\omega: V \to \mathbb{F}$,
where $\mathbb{F}$ is the field over which $V$ is defined. By “linear” I mean that
$\omega(a\vec{u} + b\vec{v}) = a\,\omega(\vec{u}) + b\,\omega(\vec{v})$,
where $a$ and $b$ are arbitrary elements of $\mathbb{F}$
and $\vec{u}$ and $\vec{v}$ are arbitrary vectors in $V$.
Define addition of one forms by
$(\omega + \sigma)(\vec{v}) = \omega(\vec{v}) + \sigma(\vec{v})$.
Addition of one forms is then commutative and associative, because addition in the field $\mathbb{F}$ has those properties.
Define scalar multiplication of one forms in the obvious way:
$(a\omega)(\vec{v}) = a\,\omega(\vec{v})$.
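To make these definitions concrete, here’s a minimal numpy sketch over $\mathbb{R}^3$ (the particular numbers and names are my own illustration, not from the text), modeling one forms as row vectors and checking linearity, addition, and scalar multiplication:

```python
import numpy as np

# Model one forms on R^3 as row vectors: a one form eats a vector, returns a scalar.
omega = np.array([1.0, 2.0, 3.0])
sigma = np.array([0.0, -1.0, 5.0])

def apply(form, vec):
    """Action of a one form on a vector: a linear map V -> R."""
    return form @ vec

u = np.array([1.0, 0.0, 2.0])
v = np.array([3.0, 1.0, -1.0])
a, b = 2.0, -4.0

# Linearity: omega(a u + b v) = a omega(u) + b omega(v)
assert np.isclose(apply(omega, a*u + b*v), a*apply(omega, u) + b*apply(omega, v))

# Pointwise addition and scalar multiplication of one forms, as defined above
assert np.isclose(apply(omega + sigma, v), apply(omega, v) + apply(sigma, v))
assert np.isclose(apply(a * omega, v), a * apply(omega, v))
```

The row-vector picture is exactly the “one forms are linear scalar-valued functions” definition, specialized to $\mathbb{R}^n$ with the dot product as the action.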
Now note that the map $\omega_0$ defined by
$\omega_0(\vec{v}) = 0$ for all $\vec{v} \in V$
is linear, and therefore a one form. Zero is the additive identity for the field, so $\omega_0$ is the additive identity one form.
Each one form $\omega$ has a unique additive inverse $-\omega$, defined by $(-\omega)(\vec{v}) = -\,\omega(\vec{v})$.
All this means that the one forms themselves form a vector space. (If you’re worried about some vector space prerequisite I didn’t mention, such as existence of a multiplicative identity, you can check for yourself that it’s in there. I’m not tricking you.)
Now we want to know the dimension of the space of one forms. We suspect that it’s the same as the dimension of $V$ (which I’ll take to be finite). To prove it, we’ll explicitly find a basis.
Due to linearity, the action of a one form on any vector in $V$ is completely determined by the action of the one form on the basis vectors, $\vec{e}_i$, of $V$. This is true because, writing $\vec{v} = \sum_i v^i \vec{e}_i$,
$\omega(\vec{v}) = \omega\!\left(\sum_i v^i \vec{e}_i\right) = \sum_i v^i\,\omega(\vec{e}_i)$.
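Numerically, this means the values $\omega(\vec{e}_i)$ alone reproduce $\omega(\vec{v})$ for any $\vec{v}$; a quick check with the standard basis of $\mathbb{R}^3$ (the example numbers are mine):

```python
import numpy as np

omega = np.array([4.0, -1.0, 0.5])  # a one form on R^3, as a row vector
basis = np.eye(3)                   # standard basis vectors e_0, e_1, e_2
v = np.array([2.0, 3.0, -4.0])      # components v^i in that basis

# omega acting on each basis vector determines omega completely:
omega_on_basis = np.array([omega @ e for e in basis])

# Reconstruct omega(v) from those values alone: sum_i v^i * omega(e_i)
assert np.isclose(v @ omega_on_basis, omega @ v)
```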
For each basis vector, $\vec{e}_i$, of $V$, define a one form $\omega^i$ by
$\omega^i(\vec{e}_j) = \delta^i_j$.
The above equation defines the action of the one form $\omega^i$ on each of the basis vectors $\vec{e}_j$, which we already said is sufficient to completely specify the one form.
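Concretely, if the basis vectors are assembled as the columns of a matrix, the dual one forms are the rows of its inverse, since row $i$ of the inverse times column $j$ of the matrix is exactly $\delta^i_j$. A small numpy sketch with an arbitrary invertible matrix of my own choosing:

```python
import numpy as np

# A (non-orthogonal) basis of R^3: the columns of E are the basis vectors e_j.
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])

# The dual basis one forms omega^i are the rows of E^{-1}.
W = np.linalg.inv(E)

# Verify omega^i(e_j) = delta^i_j: each row of W applied to each column of E.
assert np.allclose(W @ E, np.eye(3))
```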
I claim that these $\omega^i$ form a basis for the space of one forms.
Clearly they are independent, because assuming they are dependent gives, for some set of not-all-zero coefficients $c_i$,
$\sum_i c_i\,\omega^i = \omega_0$.
Applying both sides to each basis vector $\vec{e}_j$ gives
$0 = \sum_i c_i\,\omega^i(\vec{e}_j) = c_j$ for every $j$,
which is a contradiction, because it was assumed that not all the coefficients were zero. The last equality in the above equation comes from $\omega^i(\vec{e}_j) = \delta^i_j$.
The $\omega^i$ also span the space of one forms, because an arbitrary one form $\sigma$ can be written as
$\sigma(\vec{v}) = \sum_i \sigma(\vec{e}_i)\,v^i = \sum_i \sigma(\vec{e}_i)\,\omega^i(\vec{v})$,
which implies
$\sigma = \sum_i \sigma(\vec{e}_i)\,\omega^i$.
This means the $\omega^i$ span the space, and are a basis. Because there is one $\omega^i$ for each $\vec{e}_i$, the vector space $V$ and the space of one forms have the same dimension.
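To see the spanning claim in action, here’s a numpy sketch (my own example matrix and components) that reconstructs an arbitrary one form from its values on the basis vectors:

```python
import numpy as np

E = np.array([[2.0, 0.0, 1.0],    # columns are basis vectors e_j of R^3
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
W = np.linalg.inv(E)              # rows are the dual basis one forms omega^i

sigma = np.array([3.0, -2.0, 5.0])                       # an arbitrary one form
coeffs = np.array([sigma @ E[:, i] for i in range(3)])   # the values sigma(e_i)

# sigma = sum_i sigma(e_i) * omega^i : the dual basis spans the one forms.
reconstructed = coeffs @ W
assert np.allclose(reconstructed, sigma)
```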
Here comes the part that tricked me before. We’re going to change our definition of $V$. We invented it in the first place, so I suppose it’s ours to mess around with. We’ll describe the change in $V$ by the change in the basis vectors $\vec{e}_i$.
The change is that before, the basis vector $\vec{e}_i$ was just an element of an abstract vector space. Now, we’re going to change that, and define $\vec{e}_i$ to be a linear operator on the space of one forms, whose action is
$\vec{e}_i(\omega^j) = \delta^j_i$.
An arbitrary vector in $V$ changes only in that it is the same linear combination of these new basis vectors.
This is a bit subtle. We’re changing the original vector space $V$ from just a nice little vector space sitting all alone by its logical self, into being the space of all linear, scalar-valued functions on the one forms. We already know that this set of functions is indeed a vector space, because all the arguments we used before when showing that the one forms are a vector space trivially cascade down to this redefinition of $V$.
The problem is that we’re redefining $V$ in terms of the one forms, but the one forms themselves are being defined in terms of the old $V$. If we just kill off the old $V$, we no longer have one forms. And if we no longer have one forms, we no longer have our new $V$.
The solution to this cute little quandary is that really what we want to do is redefine both vector spaces at the same time. We want to define them so that each one is the set of linear, scalar-valued functions of the other. We began this discourse with an asymmetry between the vectors and the one forms. Now I’m saying we should abolish that asymmetry. Really, we should have skipped all the business with the original vector space from the beginning. We should have started off by defining two vector spaces to be dual spaces – each the set of linear, scalar-valued functions of the other.
The reason I didn’t start out that way was that I thought it would be too confusing. I thought the definition would sound circular. In fact it’s not. It’s completely reasonable, but only once you know the sort of facts we worked so hard to find out. It can only work because the one forms make a vector space of the same dimension as the original vector space.
So let’s do it properly, from the beginning, in a different notation.
Let $V$ and $V^*$ denote vector spaces of the same (finite) dimension, where each space consists of the set of all functions $f$ that map elements of the other vector space to scalars in such a way that $f(a x + b y) = a\,f(x) + b\,f(y)$ for arbitrary scalars $a, b$ and elements $x, y$. Then $V$ and $V^*$ are dual vector spaces.
The trick is in seeing that this makes any sense – that such a thing is possible. That’s what all the nonsense above was about. It says that by starting with a vector space, either $V$ or $V^*$ in this case, you can build up its dual space by the procedure I described above.
Finally, we’ll discuss notation. In quantum mechanics, elements of $V^*$ are called “bra vectors” and written as
$\langle \phi |$,
while the elements of $V$ are called “ket vectors” and written as
$| \psi \rangle$.
The notation for one acting on the other (it doesn’t matter which acts on which) is written
$\langle \phi | \psi \rangle$.
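As a concrete sketch in numpy (with the caveat, beyond the real-linear story above, that in quantum mechanics the bra is the conjugate transpose of the corresponding ket), the bracket is just a conjugated dot product; the state components are my own example:

```python
import numpy as np

# Ket |psi> as a complex column vector; the bra <phi| is the conjugate
# transpose of the ket |phi> (the standard quantum-mechanics convention).
psi = np.array([1.0 + 1.0j, 0.0 + 2.0j])
phi = np.array([2.0 + 0.0j, 1.0 - 1.0j])

# <phi|psi> = sum_i conj(phi_i) * psi_i; np.vdot conjugates its first argument.
bracket = np.vdot(phi, psi)
assert np.isclose(bracket, np.conj(phi) @ psi)  # same thing, written out
```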
In relativity, the elements of $V^*$ are called “covectors” and are written as
$\sigma = \sigma_\mu \omega^\mu$,
where $\omega^\mu$ is a basis covector and $\sigma_\mu$ is its component. We are writing the expansion of the covector in the basis we chose, and we are implicitly summing over the repeated index $\mu$.
The elements of $V$ are called “vectors” and written
$\vec{v} = v^\mu \vec{e}_\mu$.
One vector acting on a covector (or covector acting on a vector) is written
$\sigma(\vec{v}) = \sigma_\mu v^\nu\,\omega^\mu(\vec{e}_\nu) = \sigma_\mu v^\nu\,\delta^\mu_\nu = \sigma_\mu v^\mu$.
You can take the first step in that equation by the definition of the basis covectors acting on the basis vectors. The second step is a sum of the components over the Kronecker delta.
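The same contraction, with the implicit sum over $\mu$ made explicit via numpy’s einsum (the example components are mine):

```python
import numpy as np

sigma = np.array([1.0, 0.0, -2.0, 3.0])  # covector components sigma_mu
v = np.array([2.0, 5.0, 1.0, 1.0])       # vector components v^mu

# The contraction sigma_mu v^mu, with the repeated index summed over.
scalar = np.einsum('m,m->', sigma, v)
assert np.isclose(scalar, sigma @ v)  # identical to an ordinary dot product
```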
Finally, you may also be familiar with the notation in which elements of are written as row vectors and elements of as column vectors, which is the way it is usually done in a first course on linear algebra. However, the subtleties of dual spaces are rarely explored in such classes!