August 8, 2019

Dual Spaces

  1. Introduction
  2. The Dual Space
  3. The Double Dual Space

Introduction

Since I haven't posted for a while, I decided to break up my rants about homology with some posts on linear (and multilinear) algebra. In this post, we will (as usual) deal only with finite dimensional vector spaces. Since we care only about abstract properties of vector spaces and not about any specific vector space, I will talk generally about a vector space $V$ of dimension $n$ over a field $\F$ for the remainder of this post.

As we discovered previously, every finite dimensional vector space has a basis. That is, there exists a linearly independent collection $(e_1,e_2,\ldots,e_n)$ of vectors in $V$ for which any vector $v$ in $V$ can be expressed as a linear combination of these basis vectors. In other words, for any $v\in V$ there exist scalars $(v^i)_{i=1}^n$ in $\F$ for which

$$v=\sum_{i=1}^n v^i e_i.$$

Note that in the expression above, I have moved the index on $v^i$ into the upper position, whereas in previous posts I would have written the same scalars as $v_i$. There is a good reason for this, and it is commonly seen in physics and differential geometry. The reason will become apparent shortly, but for now just realize that using superscripts for index placement is really no different than using subscripts, and it does not represent exponentiation. For instance, $v^2$ represents the second scalar in a list and not $v\cdot v$.
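To make this concrete, take $V=\R^3$ with the standard basis $e_1=(1,0,0)$, $e_2=(0,1,0)$, $e_3=(0,0,1)$. The vector $v=(4,-1,7)$ then has components $v^1=4$, $v^2=-1$, and $v^3=7$, since

$$v = 4e_1 - e_2 + 7e_3 = \sum_{i=1}^3 v^i e_i.$$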

Having a basis for our vector space is nice for two main reasons:

  1. Any vector can be expressed in terms of the basis because the basis vectors span our vector space.
  2. There is no redundancy in this expression because the basis vectors are linearly independent.

Recall also that a linear map $T$ between vector spaces $U$ and $V$ is a function $T:U\to V$ for which

  1. $T(u_1+u_2) = T(u_1) + T(u_2)$ for any $u_1, u_2 \in U$.
  2. $T(au) = aT(u)$ for any $a\in\F$ and any $u\in U$.

We learned that linear maps are completely determined by the way that they act on basis vectors. In fact, we can specify the images of the basis vectors and extend by linearity to obtain a linear map on the whole vector space.
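For example, take $U=V=\R^2$ with the standard basis and declare $T(e_1)=(2,1)$ and $T(e_2)=(0,-3)$. Extending by linearity forces

$$T(v) = T(v^1 e_1 + v^2 e_2) = v^1 T(e_1) + v^2 T(e_2) = (2v^1,\; v^1 - 3v^2)$$

for every $v\in\R^2$, so those two assigned values completely determine $T$.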

Now let's turn everything on its head.

The Dual Space

Let's define the most important concept of this post:

Definition. Given a vector space $V$ over a field $\F$, its dual space, written $V^*$, is the set of all linear maps from $V$ to $\F$.

Of course, we are now talking about $\F$ as a vector space over itself, or else the idea of a linear map would make no sense.

This definition may seem intimidating at first, but it's really not that complicated. An element of the dual space is just a linear function which eats a vector and returns a scalar. Elements of the dual space are often called covectors or linear functionals.
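For example, if $V=\R^3$ (with its standard basis), then the map $s:\R^3\to\R$ given by

$$s(x,y,z) = 2x - y + 4z$$

is a covector: it is linear, it eats a vector, and it returns a scalar. Every covector on $\R^3$ has this form for some choice of three coefficients, which is a preview of the dual basis result below.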

Now, the fact that the dual space literally has the word "space" in its name is hopefully suggestive that it is itself a vector space. I suppose there might technically be multiple ways to turn this set into a vector space, but the canonical way is as follows:

  • The zero vector ${\bf 0}\in V^*$ is the zero map ${\bf 0}:V\to\F$ which maps every vector to the zero element of $\F$. That is, ${\bf 0}(v)=0$ for every $v\in V$.
  • Vector addition is just function addition. That is, if $s$ and $t$ are maps in the dual space, then $s+t$ is another map defined by $(s+t)(v) = s(v) + t(v)$.
  • Scalar multiplication is inherited directly. That is, if $a$ is a scalar in $\F$ and $t$ is a map in the dual space, then $at$ is another map defined by $(at)(v) = a\cdot t(v)$.
  • Additive inverses are given by scalar multiplication by $-1$. That is, if $t$ is a map in the dual space then $-t=(-1)\cdot t$.

It is hopefully evident that all of the above maps are linear, and that with these definitions, the dual space satisfies the axioms of a vector space. I will not check these properties here because it is not difficult or instructive to do so.
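Just to see these operations in action, take $V=\R^2$ and consider the covectors $s(x,y)=2x+y$ and $t(x,y)=x-3y$. Then

$$(s+t)(x,y) = 3x - 2y \qquad\text{and}\qquad (5s)(x,y) = 10x + 5y,$$

both of which are again linear maps from $\R^2$ to $\R$, i.e. elements of $(\R^2)^*$.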

Now, the next natural question to ask is how the dual space $V^*$ is related to the original space $V$. The answer is not immediately obvious. There is no canonical mapping which takes a vector and picks out a specific covector related to it. That's not to say we can't form a bijection from $V$ to $V^*$; it just wouldn't have much meaning, and there is no obvious candidate for a favored bijection.

In fact, writing down a meaningful identification (a linear isomorphism) between $V$ and $V^*$ requires first choosing a basis for $V$. Mathematicians do not consider this to be a natural correspondence, since it relies on picking some arbitrary basis, so we say there is no natural or canonical correspondence between $V$ and $V^*$.

However, if we try to figure out the dimension of the dual space, the picture begins to become a little clearer. Before we proceed, we'll need the following definition:

Definition. Given $n\in\N$, the Kronecker delta is the function $\delta:\{1,2,\ldots,n\}\times \{1,2,\ldots,n\}\to\F$ defined by

$$\delta^i_j =
\begin{cases}
0 & \text{if } i\ne j, \\
1 & \text{if } i = j.
\end{cases}$$

The weird upper and lower index notation in place of a more traditional notation for function arguments, such as $\delta(i, j)$, does have a purpose which will soon become apparent.

Alright, we're now well armed to make a very bold claim:

Theorem. If $(e_i)_{i=1}^n$ is a basis for a finite dimensional vector space $V$, then $(e^i)_{i=1}^n$ is a basis for $V^*$, where each basis covector $e^i:V\to\F$ is defined on basis vectors by $$e^i(e_j)=\delta^i_j$$ and defined on all of $V$ by extending by linearity.

Aside. Before we try to prove this, let's take a look at what these basis covectors really are. Since $(e_i)_{i=1}^n$ is a basis for $V$, we can write any vector $v\in V$ as $v=\sum_{j=1}^n v^j e_j$. Applying the $i$th basis covector to $v$ and using its linearity, we get

$$\begin{align}
e^i(v) &= e^i\left(\sum_{j=1}^n v^j e_j\right) \\
&= \sum_{j=1}^n v^j e^i(e_j) \\
&= \sum_{j=1}^n v^j \delta^i_j \\
&= v^i.
\end{align}$$

That is, $e^i$ is the linear map which picks out only the $i$th component of $v$ and discards the rest. It is in some sense a projection map onto the $i$th coordinate.

In order to understand why the complicated-looking sum above broke down into such a simple expression, recall the defining property of the Kronecker delta function. For almost every $j$, the Kronecker delta was identically zero and so those $v^j$ terms do not contribute to the sum. The only term for which it wasn't zero was the $i$th term, and so we were left with only $v^i$. This ability of the Kronecker delta function to simplify ugly sums is almost magical, and we will see it over and over again.
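To see this collapse explicitly in a small case, take $n=3$ and $i=2$:

$$\sum_{j=1}^3 v^j \delta^2_j = v^1\cdot 0 + v^2\cdot 1 + v^3\cdot 0 = v^2.$$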

Proof. In order to show that $(e^i)_{i=1}^n$ is a basis for $V^*$, we need to show that it is linearly independent and that it spans $V^*$.

We argue first that it is linearly independent. Suppose $\sum_{i=1}^n a_i e^i=0$ for some scalars $(a_i)_{i=1}^n$. Then for any vector $v=\sum_{j=1}^n v^j e_j$ in $V$,

\begin{align}
\left(\sum_{i=1}^n a_i e^i\right)(v) &= \left(\sum_{i=1}^n a_i e^i\right)\left(\sum_{j=1}^n v^j e_j\right) \\
&= \sum_{i=1}^n\sum_{j=1}^n a_i v^j e^i(e_j) \\
&= \sum_{i=1}^n\sum_{j=1}^n a_i v^j \delta^i_j \\
&= \sum_{i=1}^n a_i v^i \\
&= 0.
\end{align}

Since this holds for every $v\in V$, we are free to choose the components $v^i$ however we like. In particular, taking $v=e_k$ (so that $v^i=\delta^i_k$) yields $a_k=0$, and since $k$ was arbitrary, the scalars $(a_i)_{i=1}^n$ are identically zero. Thus, $(e^i)_{i=1}^n$ are linearly independent.

We argue next that the covectors $(e^i)_{i=1}^n$ span the dual space. To this end, suppose $s$ is any covector in $V^*$. For any vector $v=\sum_{j=1}^n v^j e_j$, we have from the linearity of $s$ that

\begin{align}
s(v) &= s\left(\sum_{j=1}^n v^j e_j\right) \\
&= \sum_{j=1}^n v^j s(e_j).
\end{align}

Define $s_j = s(e_j)$ for each $1\le j \le n$. Then

\begin{align}
\left(\sum_{j=1}^n s_j e^j\right)(v) &= \left(\sum_{j=1}^n s_j e^j\right)\left(\sum_{i=1}^n v^i e_i\right) \\
&= \sum_{j=1}^n\sum_{i=1}^n s_j v^i e^j(e_i) \\
&= \sum_{j=1}^n\sum_{i=1}^n s_j v^i \delta^j_i \\
&= \sum_{j=1}^n s_j v^j \\
&= \sum_{j=1}^n v^j s(e_j) \\
&= s(v).
\end{align}

We've thus shown that any covector can be written as a linear combination of the covectors $(e^i)_{i=1}^n$, and thus they span the dual space.

It follows that $(e^i)_{i=1}^n$ forms a basis for $V^*$, as desired.

We call the basis defined in the proof above the dual basis, and it is the basis we usually work with when talking about the dual space. Note that the dual basis depends on the basis we chose for the original vector space, so every basis for $V$ has a corresponding dual basis for $\dual{V}$. However, it is of course not the only basis we could choose. It is just particularly convenient for our purposes because of the way things tend to simplify via the Kronecker delta.
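As a quick illustration of this dependence, take $V=\R^2$ with the (non-standard) basis $e_1=(1,0)$, $e_2=(1,1)$. Writing an arbitrary vector as $(x,y) = (x-y)e_1 + y e_2$ shows that the corresponding dual basis is

$$e^1(x,y) = x - y, \qquad e^2(x,y) = y,$$

and you can check directly that $e^i(e_j)=\delta^i_j$. Had we chosen the standard basis instead, the dual basis would have been the coordinate projections $e^1(x,y)=x$ and $e^2(x,y)=y$.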

Hidden in the result above is the fact that $\dim V^*=\dim V$. That's because we've exhibited a basis for $V^*$ consisting of $n$ covectors, and the dimension of a vector space is by definition the number of vectors in any basis. So the dual space always has the same dimension as the original vector space (as long as it's finite dimensional)! Which is pretty cool I guess. 😎

The following result will be useful to us later.

Lemma. Suppose $V$ is a finite dimensional vector space over a field $\F$ and let $v\in V$. If $s(v) = 0$ for every covector $s\in\dual{V}$, then $v = 0$.

Proof. We proceed by contraposition, supposing that $v\ne 0$ and arguing that there exists a covector $s\in\dual{V}$ for which $s(v)\ne 0$.

Choose any basis $(e_i)_{i=1}^n$ for $V$. Then we may write $v=\sum_{i=1}^n v^i e_i$ for some scalars $v^1, v^2, \ldots, v^n\in\F$. Since $v\ne 0$, one of its components (with respect to our chosen basis) must be nonzero. That is, there exists $k\in \{1, 2, \ldots, n\}$ for which $v^k\ne 0$.

We choose $s=e^k$, the $k$th dual basis vector. Notice that, by linearity of $s$,

\begin{align}
s(v) &= e^k(v) \\
&= e^k\left( \sum_{i=1}^n v^i e_i \right) \\
&= \sum_{i=1}^n v^i e^k(e_i) \\
&= \sum_{i=1}^n v^i \delta^k_i \\
&= v^k \\[.5em]
&\ne 0.
\end{align}

Thus, we have demonstrated there exists a covector $s$ for which $s(v)\ne 0$, and the result follows by contraposition.
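For instance, if $V=\R^2$ with the standard basis and $v=(0,7)$, then $v^1=0$ but $v^2=7\ne 0$, so the proof above picks $s=e^2$, and indeed $e^2(v)=7\ne 0$.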

The Double Dual Space

The following definition is exactly as you might expect.

Definition. Given a vector space $V$ over a field $\F$ and its dual space $\dual{V}$, its double dual space, written $\ddual{V}$, is the dual space of $\dual{V}$.

By now, things may seem hopelessly abstract. If the dual space was the space of all linear functions from $V$ to $\F$, that would make the double dual the space of all linear functions from the space of linear functions from $V$ to $\F$ into $\F$. As if that wasn't complicated enough, there's no end in sight. Am I ever going to stop? Or am I going to next construct the triple dual space, the quadruple dual space, ad infinitum?

It turns out we don't need to keep going, because as we will soon see, $\ddual{V}$ is essentially just $V$. We will need the above lemma to prove it.

Theorem. Every finite dimensional vector space $V$ over a field $\F$ is canonically isomorphic to its double dual space $\ddual{V}$.

Proof. Recall first that canonically isomorphic means we can find an isomorphism which does not depend on a choice of basis. So proving the "canonical" part consists only of never choosing a basis along the way.

Since the dual space of any finite dimensional vector space shares its dimension, it follows that

$$\dim \ddual{V} = \dim \dual{V} = \dim V.$$

Thus, the rank-nullity theorem tells us that a linear map $T:V\to \ddual{V}$ is an isomorphism if it is injective, which greatly simplifies the proof.

Define a map $T:V\to \ddual{V}$ by

$$\big(T(v)\big)(s) = s(v)$$

for any vector $v\in V$ and any covector $s\in \dual{V}$.

Let's pause to make some sense of this. Since $T$ takes $V$ into its double dual, the image $T(v)$ of any vector $v\in V$ will be a linear map in $\ddual{V}$, which itself takes a covector $s\in\dual{V}$. That's why we're defining $T(v)$ by how it acts on a covector $s$.
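As a quick sanity check, take $V=\R^2$, let $v=(3,5)$, and let $s$ be the covector $s(x,y)=2x-y$. Then $T(v)$ is the element of $\ddual{V}$ which simply evaluates covectors at $v$, so

$$\big(T(v)\big)(s) = s(3,5) = 2\cdot 3 - 5 = 1.$$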

We will argue that $T$ is an isomorphism. Let's show first that $T$ is a linear map. To this end, suppose $v, v_1, v_2 \in V$ and $a \in \F$, and let $s\in\dual{V}$ be an arbitrary covector. Then because of the linearity of $s$,

$$\begin{align}
\big(T(v_1 + v_2)\big)(s) &= s(v_1 + v_2) \\
&= s(v_1) + s(v_2) \\
&= \big(T(v_1)\big)(s) + \big(T(v_2)\big)(s),
\end{align}$$

and

$$\begin{align}
\big(T(av)\big)(s) &= s(av) \\
&= as(v) \\
&= a\big(T(v)\big)(s).
\end{align}$$

We'll show next that $T$ is injective (and thus bijective by our earlier dimension argument). We will do so by showing that its kernel is trivial. So suppose $v\in\ker T$. Then by definition,

$$\begin{align}
\big(T(v)\big)(s) &= s(v) \\
&= 0
\end{align}$$

for all covectors $s\in\dual{V}$. It follows then from the above lemma that $v=0$, since the above holds for any choice of covector $s$. Therefore, $\ker T = \{0\}$ and so $T$ is injective (and thus bijective).

We have shown that $T$ is a bijective linear map, and we have done so without explicitly choosing a basis for any of the vector spaces involved, so it follows that $T$ is a canonical isomorphism, as desired.

So it turns out that $V$ and $\ddual{V}$ can be used almost interchangeably. But using the double dual space gives us a nice kind of duality (pardon the pun), in that we can think of covectors as maps which act on vectors, and we can think of vectors as maps which act on covectors. Physicists often do this without realizing that they are technically working with cocovectors instead of vectors, but that's fine because the isomorphism makes it work.

I'll leave this here for now. Next time I'll talk about multilinear maps and tensor products!