August 13, 2019

Tensor Products and Multilinear Maps

If you're the sort of person who cowers in fear whenever the word "tensor" is mentioned, this post is for you. We'll pick up right where we left off last time in our discussion of the dual space, and discover how tensor products are a natural extension of the ideas developed in that post.

As before, throughout this post we will assume that $V$ is a finite dimensional vector space over a field $\F$. Furthermore, since there are going to be tons of sums in this post, and they will all have $n=\dim{V}$ terms, I will use the shorthand $\displaystyle\sum_i$ to mean $\displaystyle\sum_{i=1}^n$.

We saw last time that the dual space $\dual{V}$ of a vector space $V$ is itself a vector space, consisting of all linear maps from $V$ to $\F$. That is, an element of the dual space (called a covector or linear functional) is a map which takes a vector and outputs a scalar.

We also saw that, thanks to the natural isomorphism between a vector space $V$ and its double dual space $\ddual{V}$, we can view a vector as a map which takes a covector and outputs a scalar.

Tensor products give us a way to combine vectors and covectors and build more interesting maps from them. Let's look first at the simplest type of tensor product and work upward from there.

Definition. Given two covectors $s:V\to\F$ and $t:V\to\F$ in $\dual{V}$, their tensor product is the function $s \otimes t:V\times V\to\F$ defined by

$$(s \otimes t)(u, v) = s(u)t(v)$$

for all vectors $u,v\in V$.

We haven't really done anything groundbreaking here. We took two covectors, $s$ and $t$, and combined them to make a new map $s\otimes t$ which takes two vectors, $u$ and $v$, and outputs the product of $s(u)$ and $t(v)$, which is, of course, a scalar.

Example. Suppose $V=\R^2$ and define $s,t:\R^2\to\R$ by

$$\begin{align}
s(x,y) &= x + y \\
t(x,y) &= 2x
\end{align}$$

for any vector $(x,y)$ in $\R^2$.

It is trivial to check that $s$ and $t$ are linear maps, and so they are both covectors in $\dual{V}$. That means we can construct their tensor product as defined above. This will be a map which takes two vectors in $\R^2$ and combines them to output a single real number. Let's see how this works.

We can construct $s\otimes t:\R^2\times\R^2\to\R$, defined by

$$\begin{align}
(s\otimes t)\big((x_1,y_1),(x_2,y_2)\big) &= s(x_1,y_1)t(x_2,y_2) \\
&= (x_1 + y_1) (2x_2) \\
&= 2x_1x_2 + 2y_1x_2,
\end{align}$$

where $(x_1,y_1)$ and $(x_2,y_2)$ are any vectors in $\R^2$.

For concreteness, let's see how it acts on two specific vectors, $(1,2)$ and $(3,4)$.

$$\begin{align}
(s\otimes t)\big((1,2),(3,4)\big) &= 2(1)(3) + 2(2)(3) \\
&= 6 + 12 \\
&= 18.
\end{align}$$
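
If you like to check this sort of computation by machine, here is a minimal Python sketch (the names `s`, `t`, and `tensor_product` are ad hoc choices for this post, not standard library functions) which builds $s\otimes t$ as a higher-order function and reproduces the result above.

```python
def s(v):
    # s(x, y) = x + y
    x, y = v
    return x + y

def t(v):
    # t(x, y) = 2x
    x, y = v
    return 2 * x

def tensor_product(f, g):
    # (f ⊗ g)(u, v) = f(u) g(v)
    return lambda u, v: f(u) * g(v)

st = tensor_product(s, t)
print(st((1, 2), (3, 4)))  # prints 18, matching the computation above
```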

We'd like to understand the properties of tensor products, so the first question we might ask is whether they are linear maps. Unfortunately, this question doesn't quite make sense as stated. We've defined linear maps as maps from one vector space to another which satisfy the properties of additivity and homogeneity. The tensor product we've defined instead takes the Cartesian product $V\times V$ into $\F$, and although $V\times V$ can itself be made into a vector space, $s\otimes t$ is not linear on it: scaling both arguments by $a$ scales the output by $a^2$, since $(s\otimes t)(au, av) = s(au)t(av) = a^2 s(u)t(v)$.

We can rectify this by extending the idea of a linear map to maps of several vector arguments.

Definition. Given three vector spaces $U,V,W$ over the same field $\F$, a bilinear map is a function $T:U\times V\to W$ which is linear in each argument. That is,

Additivity in the First Argument
For any vectors $u_1,u_2\in U$ and $v\in V$, we have that $$T(u_1+u_2, v) = T(u_1, v) + T(u_2, v).$$

Additivity in the Second Argument
For any vectors $u\in U$ and $v_1, v_2\in V$, we have that $$T(u, v_1 + v_2) = T(u, v_1) + T(u, v_2).$$

Homogeneity in Each Argument
For any vectors $u\in U$, $v\in V$ and any scalar $a\in\F$, we have that $$T(au, v)=T(u, av) = aT(u, v).$$

Just as the set of linear maps from one vector space to another is itself a vector space, it turns out that the set of all bilinear maps also forms a vector space once we make the following natural prescriptions:

  • The zero vector is the zero map ${\bf 0}:U\times V\to W$.
  • Vector addition is just function addition.
  • Scalar multiplication is inherited directly.
  • Additive inverses are given by scalar multiplication by $-1$.

You'll notice that this is exactly what we did to turn the dual space into a vector space, and that's because it's the natural thing to do.

Happily enough, it turns out that the tensor product of two covectors will always be a bilinear map!

Theorem. Given two covectors $s,t\in\dual{V}$, their tensor product $s\otimes t$ is bilinear.

Proof. We need only verify that the properties of a bilinear map hold. To see that $s\otimes t$ is linear in the first argument, note that, from the linearity of $s$,

$$\begin{align}
(s\otimes t)(u_1+u_2, v) &= s(u_1+u_2)t(v) \\
&= \big(s(u_1) + s(u_2)\big)t(v) \\
&= s(u_1)t(v) + s(u_2)t(v) \\
&= (s\otimes t)(u_1, v) + (s\otimes t)(u_2, v)
\end{align}$$

for any vectors $u_1, u_2, v\in V$. Linearity in the second argument holds by a completely analogous argument.

To see that $s\otimes t$ is homogeneous in each argument, we note that for any vectors $u,v\in V$ and any scalar $a\in\F$,

$$\begin{align}
(s\otimes t)(au, v) &= s(au)t(v) \\
&= as(u)t(v) \\
&= a(s\otimes t)(u,v) \\
&= s(u)at(v) \\
&= s(u)t(av) \\
&= (s\otimes t)(u, av).
\end{align}$$
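
As a quick sanity check, here is a numeric verification of these properties in Python, reusing the covectors from the earlier example (a sketch with ad hoc helper names, not a substitute for the proof, of course):

```python
def s(v): return v[0] + v[1]   # s(x, y) = x + y
def t(v): return 2 * v[0]      # t(x, y) = 2x

def tensor_product(f, g):
    # (f ⊗ g)(u, v) = f(u) g(v)
    return lambda u, v: f(u) * g(v)

def add(u, v): return (u[0] + v[0], u[1] + v[1])
def scale(a, u): return (a * u[0], a * u[1])

st = tensor_product(s, t)
u1, u2, v = (1, 2), (3, -1), (5, 4)
a = 7

# additivity in each argument
assert st(add(u1, u2), v) == st(u1, v) + st(u2, v)
assert st(u1, add(u2, v)) == st(u1, u2) + st(u1, v)
# homogeneity in each argument
assert st(scale(a, u1), v) == a * st(u1, v) == st(u1, scale(a, v))
```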

So tensor products are actually very well-behaved maps! Even more is true, though:

Theorem. Let $B$ denote the space of all bilinear maps from $V\times V$ to $\F$. If $(e_i)_{i=1}^n$ is a basis for $V$ with dual basis $(e^i)_{i=1}^n$ for $\dual{V}$, then $(e^i\otimes e^j)_{i,j=1}^n$ is a basis for $B$.

Proof. We will show first that $(e^i\otimes e^j)_{i,j=1}^n$ is linearly independent. Suppose

$$\sum_i\sum_j a_{ij} e^i \otimes e^j = 0$$

for some scalars $(a_{ij})_{i,j=1}^n$ in $\F$. Then for any two vectors $u=\displaystyle\sum_k u^k e_k$ and $v=\displaystyle\sum_l v^l e_l$ in $V$,

$$\begin{align}
\left(\sum_i\sum_j a_{ij} e^i \otimes e^j\right)\left(\sum_k u^k e_k, \sum_l v^l e_l\right) &= \sum_i\sum_j a_{ij} e^i \left(\sum_k u^k e_k\right) e^j \left(\sum_l v^l e_l\right) \\
&= \sum_i\sum_j a_{ij} \sum_k u^k e^i(e_k) \sum_l v^l e^j(e_l) \\
&= \sum_i\sum_j a_{ij} \sum_k u^k \delta^i_k \sum_l v^l \delta^j_l \\
&= \sum_i\sum_j a_{ij} u^i v^j \\
&= 0.
\end{align}$$

But since $u$ and $v$ were arbitrary, the components $u^i$ and $v^j$ are arbitrary, and thus the above can only hold if $a_{ij}=0$ for all $i$ and $j$. (In particular, choosing $u=e_k$ and $v=e_l$ forces $a_{kl}=0$.) Thus, $(e^i\otimes e^j)_{i,j=1}^n$ is linearly independent.

Next we will show that $B = \span (e^i\otimes e^j)_{i,j=1}^n$. Choose any bilinear map $T:V\times V\to\F$ in $B$, and define

$$T_{ij} = T(e_i, e_j)$$

for all $i$ and $j$. Then for any two vectors $u=\displaystyle\sum_k u^k e_k$ and $v=\displaystyle\sum_l v^l e_l$ in $V$,

$$\begin{align}
T(u,v) &= T\left(\sum_i u^i e_i, \sum_j v^j e_j\right) \\
&= \sum_i\sum_j u^i v^j T(e_i, e_j) \\
&= \sum_i\sum_j T_{ij} u^i v^j \\
&= \sum_i\sum_j T_{ij} \sum_k u^k \delta^i_k \sum_l v^l \delta^j_l \\
&= \sum_i\sum_j T_{ij} \sum_k u^k e^i(e_k) \sum_l v^l e^j(e_l) \\
&= \sum_i\sum_j T_{ij} e^i \left(\sum_k u^k e_k\right) e^j \left(\sum_l v^l e_l\right) \\
&= \left(\sum_i\sum_j T_{ij}e^i\otimes e^j\right)\left(\sum_k u^k e_k, \sum_l v^l e_l\right).
\end{align}$$

Since we can write any $T\in B$ as a linear combination of tensor products $e^i\otimes e^j$, it follows that $(e^i\otimes e^j)_{i,j=1}^n$ spans $B$, completing the proof.

So not only is every tensor product bilinear, it also happens that every bilinear map can be written as a linear combination of tensor products! Furthermore, we've exhibited a nice basis for the space of bilinear maps which is inherited from the dual basis. Lastly, the argument above shows that if $n = \dim V$, then the dimension of $B$ is $n^2$.
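
Concretely, this means a bilinear map on an $n$-dimensional space is completely determined by the $n^2$ numbers $T_{ij} = T(e_i, e_j)$. Here is a short NumPy sketch of that reconstruction, using a made-up bilinear map on $\R^2$ for illustration:

```python
import numpy as np

def T(u, v):
    # an arbitrary (made-up) bilinear map on R^2
    return u[0] * v[0] + 3 * u[0] * v[1] - 2 * u[1] * v[0]

n = 2
e = np.eye(n)  # standard basis e_1, ..., e_n as rows

# the coefficients T_ij = T(e_i, e_j)
coeffs = np.array([[T(e[i], e[j]) for j in range(n)] for i in range(n)])

u, v = np.array([1.0, 2.0]), np.array([3.0, 4.0])

# reconstruct T(u, v) as the double sum of T_ij u^i v^j
assert np.isclose(T(u, v), u @ coeffs @ v)
```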

Since the space of bilinear maps is spanned by tensor products of two covectors, we usually write $\dual{V}\otimes\dual{V}$ to denote $B$, and call it a tensor product space.

Now, as ugly as the goop above was, we've still only dealt with the simplest form of tensor product. The following definitions extend what we have done so far.

Definition. Given vector spaces $V_1, V_2, \ldots, V_k, W$ over the same field $\F$, a multilinear map is a function $T:V_1\times V_2\times \cdots \times V_k\to W$ which is linear in each argument.

Definition. Given $j$ vectors $v_1, \ldots, v_j \in V$ and $k$ covectors $s_1, \ldots, s_k \in \dual{V}$, their tensor product is the function

$$\bigotimes_{i=1}^j v_i \bigotimes_{i=1}^k s_i: \prod_{i=1}^j \dual{V} \times \prod_{i=1}^k V\to\F$$

defined by

$$\left(\bigotimes_{i=1}^j v_i \bigotimes_{i=1}^k s_i\right) (t_1, \ldots, t_j, u_1, \ldots, u_k) = \prod_{i=1}^j v_i(t_i) \prod_{i=1}^k s_i(u_i)$$

for any covectors $t_1, \ldots, t_j$ and any vectors $u_1, \ldots, u_k$.

We say that the rank of this tensor is $(j, k)$.

Just like with bilinear maps, multilinear maps from $\prod_{i=1}^j \dual{V} \times \prod_{i=1}^k V$ to $\F$ form a vector space $M$ in the obvious way. Furthermore, this vector space is generated by the obvious choice of basis vectors. That is, if $(e_i)_{i=1}^n$ is a basis for $V$ with dual basis $(e^i)_{i=1}^n$ for $\dual{V}$, then the family of tensor products

$$\bigotimes_{l=1}^j {e_{i_l}} \bigotimes_{l=1}^k {e^{m_l}},$$

where each of the indices $i_1, \ldots, i_j, m_1, \ldots, m_k$ ranges from $1$ to $n$, is a basis for $M$. We thus write

$$M = \bigotimes_{i=1}^j V \bigotimes_{i=1}^k \dual{V}$$

and refer to it as a rank $(j, k)$ tensor product space. I won't prove any of this, though, because the proof for a rank $(0, 2)$ tensor was already ugly enough! The proofs aren't difficult, but they are time- and space-consuming.
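
Still, the coordinate picture is easy to experiment with: a rank $(j, k)$ tensor on an $n$-dimensional space can be stored as an array with $j + k$ indices, and evaluating it contracts each slot against the components of a covector or vector. Here is a sketch for a made-up rank $(1, 1)$ tensor using NumPy's einsum:

```python
import numpy as np

# made-up coefficients A[i][j] of a rank (1, 1) tensor on R^2
A = np.array([[1.0, 2.0],
              [0.0, -1.0]])

def evaluate(A, t, u):
    # feed one covector (components t_i) and one vector (components u^j)
    # into the tensor: the double sum of A[i][j] t_i u^j
    return np.einsum('ij,i,j->', A, t, u)

t = np.array([1.0, 3.0])  # components of a covector in the dual basis
u = np.array([2.0, 5.0])  # components of a vector in the basis
print(evaluate(A, t, u))  # 1*1*2 + 2*1*5 + 0*3*2 + (-1)*3*5 = -3.0
```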

There are three important things to remember:

  1. A rank $(j, k)$ tensor takes $j$ covectors and $k$ vectors and outputs a scalar.
  2. A rank $(j, k)$ tensor product space has dimension $n^{j+k}$, where $n=\dim V$.
  3. The double dual isomorphism allows us to think of a vector as a linear map which takes a covector and outputs a scalar. This is how we interpret $v_i(t_i)$ in the definition above.

I'll end this post here, but we'll be talking more about tensors very shortly!