What is the connection between a mathematician's and a physicist's definition of a tensor?

Question

I study mathematics but I have a deep interest in physics as well. I have taken a course in smooth manifolds where a tensor is defined as an alternating multilinear function. Recently I have learned about electrodynamics and how Maxwell's equations can be written in relativistic form. We introduce the "(anti-symmetric) 2-tensor" $F_{\mu\nu}$, which, from what I have understood so far, has the benefit that it allows us to easily calculate how the fields transform under arbitrary Lorentz transformations. (As an aside, is there really any other benefit?) I've understood how $F_{\mu\nu}$ is derived, but I've been stuck on why/how physicists call this object a tensor.

How can an object such as $F_{\mu\nu}$ be seen as a bilinear function?

This answer of mine on Math SE discusses this connection and might be of interest to you. Have a look also at the other answers therein. — Massimo Ortolano, Feb 13 '23 at 10:15
Heh heh. Some time ago I attended a conference at which a Big Name Physicist and a Big Name Mathematician gave several talks. And they explained it this way. The BNP wanted indices because he thought of a tensor as a matrix. The BNM wanted arrows because he thought of it as a map. "Put some indices on there so I can understand." "Put some arrows on there so I can understand." — Boba Fit, Feb 13 '23 at 19:12
Would it be that physicists don't bother defining it?! :) :) runs away .. — Fattie, Feb 13 '23 at 20:18
@BobaFit - I don't know if when you refer to the second BNP and BNM you mean they were talking about themselves or about history. But I guess if it's an historical account we would need to replace BNM with Grossman and BNP with Einstein... :) — Amit, Feb 13 '23 at 20:22
Do you not care, that's the connection between a mathematician's and physicist's definition?
Either way, can you not first Post your idea of each of those and only then ask about their differences? — Robbie Goodwin, Feb 13 '23 at 21:59

Amit · Accepted Answer · 2023-02-25T19:50:39.853

where a tensor is defined as an alternating multilinear function

I think you may be confusing the general concept of tensors with the specific case of differential forms (volume forms being one example), which are indeed alternating by definition. If you drop the "alternating", the definition is completely correct.

Now, perhaps the confusion arises from the fact that in physics we may be a bit "sloppy" at times and represent something like the electromagnetic Faraday tensor as a matrix:

\begin{equation} \left\{ F^{\mu \nu} \right\} = \begin{pmatrix} 0 & -E^1 & -E^2 & -E^3 \\ E^1 & 0 & -B^3 & B^2 \\ E^2 & B^3 & 0 & -B^1 \\ E^3 & -B^2 & B^1 & 0 \end{pmatrix} \end{equation}

In fact, this isn't a great way to represent a bilinear function. A matrix is a reasonable representation for a $(1,1)$ tensor, since it maps a vector (which is a $(1,0)$ tensor) to another vector. Suppose that $A$ is a $(1,1)$ tensor; then:

$$A^{i}_jV^j = U^i$$

However, a matrix isn't a very good way to represent a bilinear function like $F^{\mu\nu}$. Depending on its index placement, a bilinear object either maps a co-vector to a vector, a vector to a co-vector, or a pair (two vectors or two co-vectors) to a scalar. With two upper indices, $F^{\mu\nu}$ takes a pair of co-vectors to a scalar:

$$F^{\mu\nu}V_{\mu}U_{\nu} = r$$

where for example we may assume $r\in\mathbb{R}$. I apologize for the non-physicality of the example, this is only for illustration purposes :)
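To make the contraction above concrete, here is a minimal numerical sketch. It assumes NumPy, and the field values, the co-vector components and the helper name `bilinear` are all made up for illustration. It builds the components $F^{\mu\nu}$ in the layout of the matrix above and checks that the contraction is linear in a slot and antisymmetric.

```python
import numpy as np

# Hypothetical field values, chosen only for illustration.
E = np.array([1.0, 2.0, 3.0])
B = np.array([0.5, -1.0, 2.0])

# Components F^{mu nu}, laid out as in the matrix above.
F_up = np.array([
    [0.0,  -E[0], -E[1], -E[2]],
    [E[0],  0.0,  -B[2],  B[1]],
    [E[1],  B[2],  0.0,  -B[0]],
    [E[2], -B[1],  B[0],  0.0],
])

def bilinear(F, V, U):
    """Contract F^{mu nu} V_mu U_nu over both indices and return a scalar."""
    return float(np.einsum("mn,m,n->", F, V, U))

# Arbitrary co-vector components (illustrative, not physical).
V = np.array([1.0, 0.0, 2.0, -1.0])
U = np.array([0.0, 3.0, 1.0, 1.0])
W = np.array([2.0, 1.0, 0.0, 4.0])

r = bilinear(F_up, V, U)

# Linearity in the first slot: F(2V + W, U) = 2 F(V, U) + F(W, U).
lhs = bilinear(F_up, 2.0 * V + W, U)
rhs = 2.0 * bilinear(F_up, V, U) + bilinear(F_up, W, U)
```

The same number comes out of the matrix expression `V @ F_up @ U`, which is the $V^T F U$ form discussed in the comments; the index form simply makes explicit which slot each argument is fed into.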

(You can find much more about this index notation if you're interested and how it's related to the more straightforward notation of a multilinear function. Suffice to say that this is just a more economical notation for familiar operations from (multi)linear algebra).

But apart from some differences in notation, tensors in physics are exactly the same objects as they are in math: multilinear maps. Perhaps most importantly: in physics those multilinear maps often depend on physically significant parameters, such as position in spacetime. A good example of that would be the metric tensor in relativity. So while at a point on the spacetime manifold, the metric tensor will indeed act like a multilinear map, we would still identify it as the same tensor at another point on the manifold, despite this dependence. This is related to the fact that the metric tensor is properly defined as a tensor field on the manifold, via the related notion of the fiber bundle, which you may be familiar with.

Why is a matrix a bad representation of a bilinear function? For your example you could write (IMO, quite fruitfully) $V^T F U = r$? — user2617, Feb 13 '23 at 18:00
@user2617 - That's true, but I find that rather unpleasant, almost "broken notation" because the transpose should be reserved for the dual vectors (co-vectors) and here we are just using the transpose on a vector, just to force it into this definition. But all in all, It's just a matter of taste. — Amit, Feb 13 '23 at 18:05
The resistance to a matrix rep is partly the following. Any particular matrix rep assumes a particular basis. The desire is to write things in a way that does not depend on the basis, and so get results that automatically do not depend on the basis. It can be done in a matrix rep, it's just harder in some cases. — Boba Fit, Feb 13 '23 at 19:20
I have been bugged for years about why we should use tensors rather than (what physicists appear to actually do most of the time!) a simple matrix. The idea that we wish to represent things in a basis-independent way really matches with my quantum-mechanics intuition, so this is an excellent and illuminating answer/comment. — Matt Hanson, Nov 19 '23 at 13:43

score 13 · Answer 2 · answered Feb 13 '23 at 13:08

How can an object such as $F_{\mu\nu}$ be seen as a bilinear function?

By realising that $F_{\mu\nu}$ are the components of a two-form $$F=\frac12F_{\mu\nu} \text{d}x^\mu \wedge \text{d} x^\nu\,,$$ which is an element of the antisymmetric tensor product of the cotangent space with itself, and as such is a bilinear function taking two vectors (elements of the tangent space) and returning a number.
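As a short worked step, here is how the two-form acts on a pair of vectors $u = u^\mu \partial_\mu$ and $v = v^\mu \partial_\mu$. This assumes the convention $\text{d}x^\mu \wedge \text{d}x^\nu = \text{d}x^\mu \otimes \text{d}x^\nu - \text{d}x^\nu \otimes \text{d}x^\mu$ (some texts include an extra factor of $\frac12$ in the wedge, which changes the normalisation):

$$\begin{aligned} F(u,v) &= \tfrac12 F_{\mu\nu}\,\big(\text{d}x^\mu \wedge \text{d}x^\nu\big)(u,v) \\ &= \tfrac12 F_{\mu\nu}\,\big(u^\mu v^\nu - u^\nu v^\mu\big) \\ &= F_{\mu\nu}\,u^\mu v^\nu\,, \end{aligned}$$

where the last line uses $F_{\mu\nu} = -F_{\nu\mu}$. The result is linear in $u$ and in $v$ separately and changes sign when they are swapped, which is the bilinear (and alternating) function the question asks about.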

Toffomat

As someone who comes from mathematics, this is a very nice and straightforward answer. It deserves more upvotes. – CBBAM Feb 13 '23 at 22:10
@CBBAM It may deserve more upvotes from mathematicians, but the audience here is mostly physicists. So some explanation of how this might have implications or utility in describing an aspect of reality might have been a better vote-getting approach. I'm mostly a spectator, but while I did understand the accepted answer and its comments, the expression in this answer is pretty much meaningless to my brain. – DWin Mar 09 '23 at 15:12

score 11 · Answer 3 · answered Feb 13 '23 at 13:57

Most physicists working in general relativity call the components of a tensor, a tensor. Then they have requirements for the transformation of these components under change of basis to guarantee that these indeed satisfy the properties of the components of an (abstract) tensor.

Frederic Schuller's lectures on GR give a very nice presentation from a mathematical perspective, but he makes the connections to the physicist's notation/shorthand at a number of places, e.g.,

"and now comes something everybody has been waiting for who knew tensors before, namely components of tensors"

and

"... so this makes GR accessible to the masses"

Ben H

Schuller's lectures are great! He follows the text "Modern Differential Geometry for Physicists" by Chris Isham fairly closely, for those that prefer to read. – Alex Jones Feb 14 '23 at 21:31
Thanks! I didn't know that. – Ben H Feb 15 '23 at 00:12
Thanks for pointing that out @AlexJones. I also found a lot of similarities between the content in some of his lectures and "Analysis, Manifolds and Physics" by DeWitt and Bruhat. Also, at the end of his Lecture notes for the "math anatomy of physics" course, there is an extensive list of reference books. – Amit Feb 15 '23 at 08:55

score 8 · Answer 4 · answered Feb 13 '23 at 10:14

Mathematicians tend to favour the intrinsic definition of a tensor, which defines the tensor as a multilinear function or a member of the tensor product of two or more vector spaces.

Physicists tend to favour the extrinsic definition of a tensor, in which a tensor is an array of components which transform in a specific way under a change of co-ordinate system or reference frame. This is a more concrete definition, whereas the mathematician's definition is more abstract.

Having said that, the mathematician's definition and the physicist's definition are equivalent, and lead to the same type of object with the same properties.
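A minimal numerical sketch of that equivalence, assuming NumPy; the helper names, the random components and the change-of-basis matrix `P` are all hypothetical. A bilinear form is stored as an array of components, the components are transformed in "the specific way" the extrinsic definition demands, and the number the multilinear map returns is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Components of a bilinear form T (two lower indices) and of two vectors
# in some initial basis. The values are arbitrary.
T = rng.normal(size=(4, 4))
v = rng.normal(size=4)
u = rng.normal(size=4)

# Columns of P are the new basis vectors expressed in the old basis.
# It is deliberately not orthogonal; it is upper triangular with det = 4.
P = np.array([
    [2.0, 1.0, 0.0,  0.0],
    [0.0, 1.0, 3.0,  0.0],
    [0.0, 0.0, 1.0, -1.0],
    [0.0, 0.0, 0.0,  2.0],
])

def transform_vector(components, P):
    """Vector components transform with the inverse of P (contravariantly)."""
    return np.linalg.solve(P, components)

def transform_form(components, P):
    """Each lower index of a bilinear form transforms with P (covariantly)."""
    return P.T @ components @ P

scalar_old = v @ T @ u
scalar_new = transform_vector(v, P) @ transform_form(T, P) @ transform_vector(u, P)
```

The arrays `T` and `transform_form(T, P)` look quite different, yet `scalar_old` and `scalar_new` agree. The transformation rule is what guarantees that the array represents one basis-independent map.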

gandalf61

A physicist's tensor is equivalent to a tensor field in mathematics – Noiralef Feb 13 '23 at 13:53
The components of a tensor field. – Ben H Feb 13 '23 at 14:01
@Noiralef Although some physicists may informally use "tensor" as a short-hand for tensor field, this is strictly speaking incorrect. A tensor field assigns a tensor to each point in some spatial domain - it is not itself a tensor, any more than a scalar field is a scalar. – gandalf61 Feb 13 '23 at 15:33
@gandalf61 It is strictly speaking incorrect, but in my experience very common. In any case, the problem I have is that the physicist's analogue of the mathematician's definition of a tensor would be just that "a tensor is an array of components". The part about the behavior of these components under coordinate transformations makes only sense if there are coordinates in the first place, that is, if we are talking about a tensor field (or, at least, a tensor living in some tangent space of a manifold). – Noiralef Feb 14 '23 at 02:19

score 8 · Answer 5 · answered Feb 13 '23 at 12:53

Let me give an example of a rather roundabout derivation which uses this fact.

Suppose that instead of doing the usual theorist’s $c=1$ trick, I wish to fully work out a strange unit system where $$\begin{aligned} \nabla \cdot E&=c\rho,&\nabla\times E &=-\mathring B,\\ \nabla\cdot B&=0,&\nabla\times B&=J + \mathring E, \end{aligned}$$ where $\mathring A = \dot A/c.$ One can check that the units align properly, with $\nabla\cdot J+ c\mathring\rho=0$ being a valid continuity equation. Curling the curl, one finds e.g. $$\overset{\,\scriptsize\circ\circ}B -\nabla^2 B=\square B=\nabla\times J,$$ which has the right d'Alembert operator with the right wave velocity. So it looks very promising; it is just not Gaussian/CGS or SI.

Well, there is a lot of work and cross-checking in rebuilding all of your knowledge from the ground up, and relativity would be a good guide. We start with the standard definition of the vector potential: $\nabla\cdot B=0$ implies $B =\nabla\times A$ for some $A$, which means $\nabla\times(E+\mathring A)=0$, which in turn lets us write $E = -\mathring A - \nabla\varphi$ for some $\varphi$.

Defining $\lambda = \nabla\cdot A + \mathring\varphi$, the other two Maxwell equations say $$\begin{aligned}\square\varphi &= c\rho +\mathring\lambda,\\ \square A&= J -\nabla\lambda,\end{aligned}$$ and our gauge freedom means that mapping $A\mapsto A +\nabla \psi$ while $\varphi\mapsto \varphi-\mathring \psi$ preserves $E, B$ while mapping $\lambda\mapsto \lambda -\square\psi$. We can choose $\psi$ so that $\lambda \mapsto 0$, which is the Lorenz gauge. Then, since $(c\rho, J) = J^\bullet$ is a 4-vector and $\square$ is covariant, we find that the appropriate 4-potential is $A^\bullet=(\varphi, A)$, with no division or multiplication by $c$. Using the $({+}\,{-}\,{-}\,{-})$ metric, so that $A_\bullet = (\varphi, -A),$ we are ready to form the field tensor.

Taking the sign convention $F_{\mu\nu}=\partial_\nu A_\mu -\partial_\mu A_\nu$ (the opposite of the more common $\partial_\mu A_\nu -\partial_\nu A_\mu$, which with $A_\bullet = (\varphi, -A)$ and $E = -\mathring A - \nabla\varphi$ would flip the sign of every entry below), we have $$F_{\bullet\bullet}=\begin{bmatrix}0&-E_x&-E_y&-E_z\\ E_x&0&B_z&-B_y\\ E_y&-B_z&0&B_x\\ E_z&B_y&-B_x&0\end{bmatrix}.$$ You ask in what sense this tensor is a bilinear function. There are two answers to that. The first is simply that it is a matrix, so it obviously defines a bilinear function: if we take two 4-vectors and write them as raw linear algebra column vectors $\mathbf u,\mathbf v$, and regard this matrix as $\mathbf M$, then it implements the bilinear function $$F(u, v) = \mathbf u^T \mathbf M \mathbf v,$$ which is also a Lorentz-invariant scalar. Note that this is a raw transpose of the column; no components are negated here because those sign changes are already absorbed into the above matrix.

The second answer is a bit more physical: the tensor should connect a 4-velocity to a 4-force. The 4-force can be regarded as a covector in the sense that it takes a small 4-displacement and tells you how much work is done over that displacement. This leads to a slightly redundant situation, because the displacement that we would want is in the direction of the 4-velocity of the charged particle that we are tracking. So we find that we actually want $F(v, v)$ for the same 4-velocity when all is said and done, but since $F$ is antisymmetric that will inevitably be zero! In 4D the “length” of the 4-velocity is actually fixed, so no “4-work” is ever truly done. Nevertheless, we might want to take the dual of the 4-force and think of it as a change in 4-momentum per unit proper time. The change has to be “perpendicular” to the 4-momentum per the above, but that doesn't make it zero.
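Both readings can be checked numerically. The following is a small sketch, assuming NumPy, with made-up field values and velocities and with hypothetical helper names `field_matrix` and `four_velocity`. It evaluates $\mathbf u^T \mathbf M \mathbf v$ for the matrix $F_{\bullet\bullet}$ written out earlier and confirms that $F(v, v)$ vanishes by antisymmetry.

```python
import numpy as np

def field_matrix(E, B):
    """Matrix of lower-index components, in the layout used in this answer."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([
        [0.0, -Ex, -Ey, -Ez],
        [Ex,  0.0,  Bz, -By],
        [Ey, -Bz,  0.0,  Bx],
        [Ez,  By, -Bx,  0.0],
    ])

def four_velocity(v3, c=1.0):
    """Upper-index 4-velocity (gamma*c, gamma*v) for a 3-velocity with |v| < c."""
    v3 = np.asarray(v3, dtype=float)
    gamma = 1.0 / np.sqrt(1.0 - (v3 @ v3) / c**2)
    return gamma * np.concatenate(([c], v3))

# Illustrative values only.
M = field_matrix(E=(1.0, 2.0, 3.0), B=(0.5, -1.0, 2.0))
u = four_velocity([0.3, 0.1, -0.2])
w = four_velocity([-0.5, 0.4, 0.0])

F_uw = u @ M @ w   # bilinear function of two different 4-velocities
F_wu = w @ M @ u   # arguments swapped: same magnitude, opposite sign
F_uu = u @ M @ u   # same 4-velocity in both slots: zero up to rounding
```

Here `F_uu` is zero to floating-point precision whatever fields are chosen, which is the "no 4-work is ever done" statement in component form.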

So let's do that. We find that we want a force $${\mathrm dp^\bullet\over\mathrm d\tau} \propto \begin{bmatrix}1&&&\\ &-1&&\\ &&-1&\\ &&&-1\end{bmatrix} \begin{bmatrix}0&-E_x&-E_y&-E_z\\ E_x&0&B_z&-B_y\\ E_y&-B_z&0&B_x\\ E_z&B_y&-B_x&0\end{bmatrix} \begin{bmatrix}\gamma c\\ \gamma v_x\\ \gamma v_y\\ \gamma v_z\end{bmatrix},$$ and we know that we want for example ${\mathrm dp^x\over\mathrm dt}=\gamma^{-1} {\mathrm dp^x\over\mathrm d\tau}=qE_x$ for a purely electrostatic case. Thus we have, $$ {\mathrm dp^\mu\over\mathrm d\tau}=-\frac{q}{c}\eta^{\mu\nu}F_{\nu\sigma} v^\sigma,$$ and we thus come to the final discovery that the proper version of the Lorentz force law in these units is exactly the same as in CGS, $$ F = q \left( E +\frac vc\times B\right).$$ There's probably an easier way to see that, but it's a nice application of the bilinear properties of the electromagnetic force tensor.
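As a sanity check on the signs, here is a sketch, assuming NumPy and arbitrary illustrative values for $q$, $c$, the fields and the velocity. It evaluates $-\frac{q}{c}\eta^{\mu\nu}F_{\nu\sigma}v^\sigma$ with the matrices written above and compares its spatial part, divided by $\gamma$, against $q\left(E + \frac vc \times B\right)$.

```python
import numpy as np

# Illustrative values; c is deliberately not 1 so that factors of c are tested.
c = 2.0
q = 1.5
E = np.array([1.0, 2.0, 3.0])
B = np.array([0.5, -1.0, 2.0])
v = np.array([0.3, 0.1, -0.2])   # 3-velocity with |v| < c

eta = np.diag([1.0, -1.0, -1.0, -1.0])
M = np.array([
    [0.0,  -E[0], -E[1], -E[2]],
    [E[0],  0.0,   B[2], -B[1]],
    [E[1], -B[2],  0.0,   B[0]],
    [E[2],  B[1], -B[0],  0.0],
])

gamma = 1.0 / np.sqrt(1.0 - (v @ v) / c**2)
v4 = gamma * np.concatenate(([c], v))     # upper-index 4-velocity

dp_dtau = -(q / c) * (eta @ M @ v4)       # dp^mu / d tau
force = dp_dtau[1:] / gamma               # dp/dt = (1/gamma) dp/dtau

expected = q * (E + np.cross(v, B) / c)   # three-dimensional Lorentz force
```

The arrays `force` and `expected` agree, and the time component `dp_dtau[0]` equals $\gamma q\,E\cdot v/c$, the power delivered by the electric field divided by $c$ (times $\gamma$).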

Another application of this fact is that $E^2-B^2=\frac12 F_{\mu\nu}F^{\nu\mu}$ is a Lorentz-invariant scalar field, and, exploiting a hidden symmetry in such antisymmetric tensors (Hodge star?), so is $E\cdot B \propto F_{\mu\nu} (\star F)^{\nu\mu}.$ Both of these results come from the coordinate invariance of the trace, which rests on a slightly more geometric fact: “any [m, n]-tensor can be expressed as a finite sum of outer products of vectors and [m–1, n]-tensors, or a finite sum of products of covectors and [m, n–1]-tensors.” The procedure to “contract” an [m, n]-tensor is then: choose two indices to contract; express the tensor as a finite sum of outer products of [m–1, n–1]-tensors times a vector times a covector; and feed each vector to its covector to create invariant scalars. A scalar times a tensor is a tensor, and a sum of tensors of the same shape is a tensor, so the result is again a tensor.
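The first invariant is easy to verify numerically. This sketch assumes NumPy; the field values, the boost speed and the helper name `invariant` are illustrative. It computes $\frac12 F_{\mu\nu}F^{\nu\mu}$ as a trace, compares it with $E^2 - B^2$, and repeats the computation with the components seen in a boosted frame.

```python
import numpy as np

E = np.array([1.0, 2.0, 3.0])
B = np.array([0.5, -1.0, 2.0])

eta = np.diag([1.0, -1.0, -1.0, -1.0])
F_lower = np.array([
    [0.0,  -E[0], -E[1], -E[2]],
    [E[0],  0.0,   B[2], -B[1]],
    [E[1], -B[2],  0.0,   B[0]],
    [E[2],  B[1], -B[0],  0.0],
])

def invariant(F_low):
    """Return (1/2) F_{mu nu} F^{nu mu}, raising both indices with eta."""
    F_up = eta @ F_low @ eta
    return 0.5 * np.einsum("mn,nm->", F_low, F_up)

# Boost along x with speed beta (in units of c), acting on upper-index components.
beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
L = np.array([
    [gamma,        -gamma * beta, 0.0, 0.0],
    [-gamma * beta, gamma,        0.0, 0.0],
    [0.0,           0.0,          1.0, 0.0],
    [0.0,           0.0,          0.0, 1.0],
])
L_inv = np.linalg.inv(L)

# Lower-index components transform with the inverse boost on each index.
F_lower_boosted = L_inv.T @ F_lower @ L_inv

value = invariant(F_lower)
value_boosted = invariant(F_lower_boosted)
```

The individual entries of `F_lower_boosted` differ from those of `F_lower` (the boost mixes $E$ and $B$), while `value` and `value_boosted` both equal $E^2 - B^2$.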

This is a great answer with a lot of interesting information. Can you please clarify whether the fact that "any [m,n] tensor can be expressed (...)" is related to index contraction? That is, we can't define index contraction without this? Also, does that hold for both pure and non-pure tensors? — Amit, Feb 13 '23 at 15:16
@Amit (1) You can define something similar to index contraction, but your “tensor” space is not a “tensor product” of vector and covector spaces which limits interpretability and your “co-covectors” are probably not “vectors” and a lot of other “infinite-dimensional weirdness” happens besides, so you really do need a basis to understand things... See quantum mechanics for the standard example of these problems. — CR Drost, Feb 15 '23 at 20:31
(2) Purity is not a property of a tensor but of the method used to derive it. For example when you derive the Christoffel symbols you get an indexed set of scalars which can be reassembled into a tensor—BUT, if you derive Christoffel symbols for a different coordinate system, and create a new tensor, the two tensors are not equal. In that precise sense the derivation yields a “not a tensor” thing. Meanwhile for the other big case (cross product) you can just specify an orientation tensor as an objective thing in your tensor space, purity marks a sensitivity to which orientation you chose. — CR Drost, Feb 15 '23 at 20:44
Thank you very much @CRDrost, indeed I need to take a closer look at some of these subjects. But regarding the notion of purity, just to be sure we're on the same page here, I was talking of the following definition: if we have $\xi\in U\otimes V$, and there exist $u\in U$, and $v\in V$ such that $\xi=u\otimes v$ then only is $\xi$ a pure/simple tensor. So if it's the same purity you're referring to, I can't see right now how that is not a property of a tensor... — Amit, Feb 15 '23 at 20:49
... Unless of course what you mean to say, if I try to apply it to this example, is that if we define $W = U \otimes V$ then $\xi$ now is pure with respect to $W$, and in that sense the basis of the tensor determines the purity but not the tensor itself? — Amit, Feb 15 '23 at 20:56
Ah, I see now. Yes in at least a restricted context those “pure” tensors are projections and you can (I think) detect that a tensor is a projection without reference to a specific basis... Might need an orientation tensor to do it? But yes the claim is that impure tensors are all sums of pure tensors. — CR Drost, Feb 15 '23 at 21:02

Ryder Rude · Answer 6 · 2023-02-14T02:06:01.670

In the mathematician's definition, any array of numbers is a vector. This is fine because such arrays obviously form a vector space, and you can talk about linear functions on them (tensors) and changes of basis. All of the tools of linear algebra apply.

Physicists have reserved the word "vector" for a special case of the above: the partial derivative operators acting on the scalar functions on a manifold. The partial derivative operators defined at a point on a manifold obviously do form a vector space, according to the mathematician's definition. The sum of two partial derivatives just corresponds to another partial derivative, so they're closed under addition. All of the tools of linear algebra apply. You can define tensors on these vectors.

Now, a co-ordinate transformation is like choosing a different basis for this specific vector space. When you transform the co-ordinates, the new co-ordinate system $(x',y',z')$ comes equipped with a new basis of partial derivatives in these directions: $\partial _{x'}$, $\partial_{y'}$, and $\partial_{z'}$. The partial derivative basis vectors corresponding to the older co-ordinate system, $\partial _x$, $\partial _y$ and $\partial _z$, will be related to these by the Jacobian matrix of the transformation.

The above is why physicists say that vectors are only those arrays whose components transform according to the Jacobian matrix under a co-ordinate transformation. A co-ordinate transformation is, almost by definition, a change of basis of this specific vector space.
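Written out, this is just the chain rule. As a sketch in index notation (the component symbols $v^j$ and $v'^i$ are introduced here for illustration), the two bases are related by

$$\partial_{x'^i} = \frac{\partial x^j}{\partial x'^i}\,\partial_{x^j}\,,$$

and requiring that the same vector $v = v^j\,\partial_{x^j} = v'^i\,\partial_{x'^i}$ be described in both bases forces its components to transform with the inverse matrix:

$$v'^i = \frac{\partial x'^i}{\partial x^j}\,v^j\,.$$

This is the familiar physicist's transformation law for a (contravariant) vector, recovered from a change of basis in the space of derivative operators.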

If you form an array out of $(\text{temperature}, \text{pressure}, \text{mass})$, it is an element of a vector space in the mathematical sense, but its components won't change under a co-ordinate transformation, because co-ordinate transformations are, almost by definition, a change of basis of the very specific vector space of partial derivative operators.
