
I'm reading this introduction to tensors: https://arxiv.org/abs/math/0403252, specifically rules concerning summation convention (ref. page 13):

Rule 1. In correctly written tensorial formulas free indices are written on the same level (upper or lower) in both sides of the equality. Each free index has only one entry in each side of the equality.

Rule 2. In correctly written tensorial formulas each summation index should have exactly two entries: one upper entry and one lower entry.

Rule 3. For any double indexed array with indices on the same level (both upper or both lower) the first index is a row number, while the second index is a column number. If indices are on different levels (one upper and one lower), then the upper index is a row number, while lower one is a column number.

I have a doubt on applying these rules to matrix multiplication though. Let $A$ and $B$ be matrices and let's represent their elements as $A_{ij}$ and $B^{jk}$. If $C=AB$, then

$$C_i^k = A_{ij}B^{jk}$$

where $j$ is summed over. But on the LHS, $k$ clearly represents the column index, and $i$ the row index. You can even check it yourself by considering $A$ and $B$ as $2\times 2$ matrices. According to rule 3 though, it's supposed to be the opposite since $i$ is the subscript and $k$ the superscript.
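For instance, checking the $2\times 2$ case numerically (arbitrary values; NumPy's `einsum` used purely as a convenience, it's not part of the linked notes):

```python
import numpy as np

# Arbitrary 2x2 example: A_{ij}, B^{jk}, with C = A B
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Sum over the repeated index j: C_i^k = A_{ij} B^{jk}
C = np.einsum('ij,jk->ik', A, B)

# i indexes rows and k indexes columns, matching ordinary
# matrix multiplication
assert np.array_equal(C, A @ B)
print(C)
```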

Is there a way to resolve the inconsistency between the $3$ rules here, or am I missing something? Because if this convention doesn't even apply to something as simple as matrix multiplication, then it seems pretty useless.

Qmechanic

2 Answers


Matrices are not tensors. A matrix is just a rectangular block of numbers. By contrast, tensors are geometrical objects; you can specify a tensor by taking a coordinate system and giving its components, but the tensor exists independently of those components. A tensor is to a matrix like a triangle is to a list of the coordinates of its points.

However, for tensors of low rank, it's possible to write tensor manipulations in terms of familiar matrix operations on their components. Because of this, some sources even go so far as to say a tensor is the same thing as a matrix, though I think this is misguided.

Specifically, a tensor of rank $(r, s)$, in components, would have $r$ indices up and $s$ indices down. It can be thought of as a linear map which takes in $s$ vectors and returns $r$ vectors. Therefore,

  • a rank $(1, 0)$ tensor $v^i$ is a vector, since it takes in nothing and returns a vector. Its components can be thought of as forming a column vector.
  • a rank $(0, 1)$ tensor $w_i$ is a covector, since it takes in a vector and returns a number. Its components can be thought of as forming a row vector.
  • a rank $(1, 1)$ tensor $A^i_{\ \ j}$ is a linear map from vectors to vectors, so its components can be thought of as a matrix.

For example, a covector can act on a vector to return a number, and in components that is $$w(v) = w_i v^i$$ which is the familiar multiplication of a row vector and column vector. Also, a rank $(1, 1)$ tensor can act on a vector to return a vector, which in components is $$(A(v))^i = A^i_{\ \ j} v^j.$$ Finally, the composition of two rank $(1, 1)$ tensors is $$C^i_{\ \ k} = A^i_{\ \ j} B^j_{\ \ k}.$$ Thus, for these three types of tensors only, tensor operations can be written in terms of matrix multiplication. The general rule is that a "column" index is associated with an upper tensor index.
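The three contractions above can be checked numerically. This sketch (component values are arbitrary, chosen only for illustration) uses NumPy's `einsum`, whose subscript strings mirror the index notation directly:

```python
import numpy as np

# Illustrative components: a covector w_i, a vector v^i,
# and two rank-(1, 1) tensors A^i_j, B^j_k
w = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])
A = np.array([[1.0, 0.0],
              [2.0, 1.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# w(v) = w_i v^i : row vector times column vector gives a number
num = np.einsum('i,i->', w, v)

# (A(v))^i = A^i_j v^j : matrix times column vector gives a vector
Av = np.einsum('ij,j->i', A, v)

# C^i_k = A^i_j B^j_k : composition is matrix multiplication
C = np.einsum('ij,jk->ik', A, B)

# All three agree with the familiar matrix operations
assert num == w @ v
assert np.array_equal(Av, A @ v)
assert np.array_equal(C, A @ B)
```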

It's useful to be able to use both notations when they both work, but tensor index manipulations are more general, since the matrix picture completely breaks down for higher-rank tensors.
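To see the breakdown concretely: a tensor with three indices has no single-matrix representation, yet index contractions remain perfectly well defined. A minimal sketch (component values arbitrary):

```python
import numpy as np

# A rank-(1, 2) tensor T^i_{jk} has three indices, so its components
# form a 2x2x2 block of numbers, not a matrix
T = np.arange(8.0).reshape(2, 2, 2)   # T[i, j, k]
u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])

# Contract both lower indices with vectors:
# (T(u, v))^i = T^i_{jk} u^j v^k
result = np.einsum('ijk,j,k->i', T, u, v)
print(result)
```

There is no product of ordinary matrices that expresses this contraction in one step, which is why the index notation is the more general tool.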

The other problem with the matrix picture for tensors is that, even when it works, very few sources will line up the indices as I did. For example, if you're reading a computer science or typical intro linear algebra textbook, the indices will not line up. And that's fine, because they are just considering rectangles of numbers; their matrices have no tensorial meaning whatsoever, so there's no reason to adhere to the tensor conventions. Also, if you're working in Euclidean space, you can always raise and lower the indices on tensors without changing the values of the components. Hence books about applied physics (e.g. fluid dynamics, engineering) will not line up the tensor indices because it makes no difference in computing the result.

knzhou
  • You said that the general rule is that a "column" index is associated with an upper tensor index. Does that mean rule $3$ as given in the linked notes is wrong? Because if that's the case, even for normal matrix multiplication, rules 1 and 2, together with the revised rule 3 (superscript column and subscript row), become consistent. – Shirish Kulhari Jun 10 '18 at 09:18
  • @ShirishKulhari I've given a set of rules that is self-consistent, but the issue is that, as I said in my last paragraph, lots of people use the index notation in different ways. I suspect that the "rule 3" in your source is not meant to be a nice, self-consistent rule, but just meant to describe what some people do. You can't go wrong if you follow my rules, which are the standard in theoretical physics, but "rule 3" might help you decipher what other people are trying to say. – knzhou Jun 10 '18 at 09:26
  • @ShirishKulhari Under my conventions, even the first part of rule 3 is not correct. A tensor with both indices up or both indices down should not be thought of as a matrix at all, since it is not a linear transformation acting on vectors. So already at that point your source is being a bit more flexible. – knzhou Jun 10 '18 at 09:30

Ignore rule 3. As you noticed, it creates contradictions, so it has to be replaced by a better rule.

First of all, you shouldn't write $C_i^k$. Here lies the problem. Indices have an order, and it's easy to see it in tensors like $A_{ij}$ or $B^{jk}$. It should be visible in $C$ too. So the correct notation is: $$C_i\,^k = A_{ij} B^{jk}$$ In the matrix representation the first index always labels rows and the second always labels columns, regardless of whether the indices are upper or lower.

This representation works well for tensors of order 1 and 2, but becomes unwieldy for higher orders.

GRB