In my study of general relativity, I came across tensors.
First, I realized that vectors (and covectors and tensors) are objects whose components transform in a certain way: the underlying reality is one, and observers in different reference frames describe the same thing differently.
This is all very intuitive.
However, I have read that the modern way to learn these concepts is to think of vectors as multilinear functions of covectors (and similarly, tensors as functions of vectors and covectors, and covectors as functions of vectors). The problem is that this is not intuitive to me, and I don't understand why this viewpoint has been adopted.
What is the advantage of thinking of vectors as functions over thinking of them as objects that have independent existence but whose components transform in a particular way?
What is the right mindset for understanding the mathematical concepts of general relativity?

- An appeal to intuition: a vector v is like a stick with one end at the origin. How long is your stick? You can measure it with a tape measure M (M for meter) that points in the same direction: you put M next to v and get a number: M(v) = 2. So a covector M is like a tape measure, and v is 2 M units (meters) long. What if I want to change units? I need to select a different tape measure. Say I switch to F = 3.3 M (a measurement in feet is 3.3 times the same measurement in meters). Now if I measure v I get F(v) ≈ 6.6. So v is 6.6 F units ("feet") long. – Amos Joshua Jun 21 '21 at 08:47
- Now say you have an unknown tape measure U but all its markings have disappeared. How can you tell how "long" U is? Why, use it to measure v, whose length you know in terms of M or F: say we find U(v) = 79.2 = 12F(v). So U = 12F (again, assuming they're all pointing in the same direction). Looks like U is measuring inches. So while you can use a tape measure (covector) to "measure" a stick (vector), you can also use a stick (vector) to measure a tape measure (covector). Also, if you double your units, sticks shrink by half whereas tape measures get twice as long (covariant vs contravariant; see the sketch after these comments). – Amos Joshua Jun 21 '21 at 08:47
- If you have ever studied linear algebra, those two interpretations of tensors are just like the relation between matrices and linear maps. A linear map can be expressed by a different matrix for every basis. Tensors work the same way. And if you have never studied linear algebra, learning a bit about matrices and linear maps could be the easiest path to understanding tensors. – Pere Jun 21 '21 at 11:04
- I think you have answered your own question: "the underlying reality is one and observers in different reference systems observe the same thing differently". The coordinate-free definition of tensors formalizes what this "underlying reality" is. – Kostya_I Jun 22 '21 at 06:29
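For what it's worth, the arithmetic in the two tape-measure comments above can be checked with a short one-dimensional sketch (a toy illustration in Python; the names M, F, U, v come from the analogy, not from any library):

```python
# Toy model of the tape-measure analogy: a "vector" is a stick of some
# length, and a covector is a function that eats a stick and returns a number.

def M(v):               # the "meter" tape measure
    return v            # reads off the stick's length in meters

def F(v):               # the "foot" tape measure: F = 3.3 M
    return 3.3 * M(v)

v = 2.0                 # the stick: 2 meters long
print(M(v))             # 2.0 -> v is 2 M units long
print(F(v))             # 6.6 -> the same stick is 6.6 F units long

# The unknown tape measure with U(v) = 79.2 satisfies U = 12 F,
# so it is marked in inches:
print(79.2 / F(v))      # ~12.0 (up to floating-point rounding)
```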
6 Answers
> However, I have read that the modern way to learn these concepts is to think of vectors as multilinear functions of covectors
This is actually not quite true, though the distinction is subtle.
In the perspective you describe, one starts with the vector space $V$ as the fundamental structure. It doesn't matter how you construct it: it could be the tangent space to a manifold (as in GR), some complex Hilbert space (as in QM), or the space of polynomials in some variable.
Covectors are then elements of the algebraic dual space $V^*$, consisting of linear maps from $V$ to the underlying field $\mathbb K$ (usually $\mathbb R$ or $\mathbb C$). $(p,q)$-tensors are multilinear maps which eat $p$ covectors and $q$ vectors and spit out a $\mathbb K$-number.
The linear maps $V^*\rightarrow \mathbb K$ are not elements of $V$, but rather elements of $V^{**}$, the algebraic dual of $V^*$. However, given any vector $X\in V$, we can define a unique map $f_X\in V^{**}$ to be the one which eats a covector $\omega$ and spits out $$f_X(\omega) := \omega(X)$$
This association $X\mapsto f_X$ between $V$ and $V^{**}$ is one-to-one, and so natural and obvious that we tend to think of elements of $V$ as simply being elements of $V^{**}$. That is what is meant by the statement that vectors can be thought of as functions of covectors - in reality, a vector $X$ is not a function of a covector, but rather can be uniquely associated to a function of a covector $f_X$ in a very natural way.
For finite-dimensional $V$, this association is also surjective, so there is a one-to-one pairing between elements of $V$ and elements of $V^{**}$, which makes it even more reasonable to say that elements of $V$ simply are elements of $V^{**}$. In infinite-dimensional spaces, such as those encountered in QM, this isn't true.
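If it helps to see this concretely, here is a minimal sketch with $V = \mathbb R^3$ (the particular vector and covector are made up for illustration):

```python
import numpy as np

# A covector is represented as a function eating a vector; the element
# f_X of V** associated to X is then a function eating covectors.

X = np.array([1.0, 2.0, 3.0])             # a vector in V = R^3

def omega(v):                             # a covector: a linear map V -> R
    return float(np.dot([4.0, 0.0, -1.0], v))

def f_X(covector):                        # the associated element of V**
    return covector(X)                    # f_X(omega) := omega(X)

print(omega(X))    # 1.0
print(f_X(omega))  # 1.0 -- the same number, by construction
```

The point is that f_X carries no information beyond X itself, which is why the identification is so natural.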
In saying this, my intention is not merely to engage in self-indulgent mathematical technicality.
> What is the advantage of thinking of vectors as functions over thinking of them as objects that have independent existence but whose components transform in a particular way?
You can certainly think of vectors that way. The key point is that they have a basis-independent existence, which can be captured by (i) endowing their components with the right transformation properties, or (ii) talking about them with no reference to a particular basis at all. In most circumstances, I prefer the latter approach when possible, but that's more a personal preference than an indictment of the former.
It's also worth noting that thinking of tensors as objects which eat covectors and vectors and spit out numbers is nothing you aren't already intuitively familiar with, in the sense that you know that contracting all the indices in an expression yields a scalar.
To me, it also makes understanding the transformation properties of tensor components far cleaner. For example, the metric tensor $g$ is a map which eats two vectors and spits out a number. If I pick some basis $\{\hat e_\mu\}$ and plug $\hat e_\mu$ and $\hat e_\nu$ into the slots of $g$, I get $g(\hat e_\mu,\hat e_\nu) \equiv g_{\mu\nu}$.
That's what the components $g_{\mu\nu}$ are - they are the result of feeding elements of a chosen basis to the tensor $g$. Not only does this make it obvious that the components of a tensor are basis-dependent, but it makes it clear what to do when changing basis from $\hat e_\mu \mapsto \hat \epsilon_\mu = R^\nu_{\ \ \mu} \hat e_\nu$:
$$g'_{\mu\nu} \equiv g\big(\hat \epsilon_\mu,\hat \epsilon_\nu\big) = g\big(R^\alpha_{\ \ \mu}\hat e_\alpha,R^\beta_{\ \ \nu} \hat e_\beta\big) = R^\alpha_{\ \ \mu} R^\beta_{\ \ \nu} g(\hat e_\alpha,\hat e_\beta) \equiv R^\alpha_{\ \ \mu}R^\beta_{\ \ \nu}g_{\alpha\beta}$$
where we freely used the multilinearity of $g$.
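If you want to see this law in action numerically, here is a quick numpy check (the components of $g$ and the change-of-basis matrix $R$ are invented for illustration):

```python
import numpy as np

g = np.array([[1.0, 0.5],
              [0.5, 2.0]])    # components g_{mu nu} in the basis {e_mu}

R = np.array([[2.0, 1.0],
              [0.0, 1.0]])    # change of basis: eps_mu = R^nu_mu e_nu

# g'_{mu nu} = R^alpha_mu R^beta_nu g_{alpha beta}
g_prime = np.einsum('am,bn,ab->mn', R, R, g)

# The same law written in matrix notation is g' = R^T g R:
print(np.allclose(g_prime, R.T @ g @ R))  # True
```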
This may all be perfectly obvious and natural to you if you have a lot of experience manipulating components, but for me it provides a very clean and natural understanding of the objects I'm manipulating.

- One more thing about the metric: you can also think of it as a bijection from the tangent space to the cotangent space (if you partially apply it to a vector, you end up with an element of the dual space). – lalala Jun 20 '21 at 21:06
- @lalala Yes - though it should also be said that the metric isn't so special in this regard, and that any non-degenerate bilinear map $V\times V\rightarrow \mathbb K$ would do; the symplectic form on the phase space of Hamiltonian dynamics is an example of an alternative, as in that context a metric is generally not present. – J. Murray Jun 20 '21 at 21:14
- Thanks for the elaborate explanation @J.Murray. Do you perhaps know of a resource, possibly one with some exercises, so I can hone my abilities in tensor calculus/algebra? Thanks in advance. – Çatlı Jun 21 '21 at 14:04
- My apologies for commenting before I had read the full post. A very bad habit. – Paul Sinclair Jun 22 '21 at 02:30
- @PaulSinclair No worries - it's a good point. I considered including a note about the continuous (bi)dual space, but found myself getting bogged down in technicalities about (semi)reflexivity and ultimately decided the answer would be cleaner if I left the topology for another day :) – J. Murray Jun 22 '21 at 02:52
Here's an analogy.
Consider two friends. Both friends have memorized the steps in the process of long division and use them routinely, but only one friend understands that division is defined to be the inverse of multiplication. Viewing the step-by-step process of long division as the "definition" of division is possible, but the abstract definition is simpler, more intuitively satisfying, and ultimately more empowering.
That's the advantage of the coordinate-free/component-free definition of a vector field, which in turn is used to define other tensor fields without using coordinates or components. The fact that the definition doesn't involve coordinates or components can be used to derive how (and why!) a tensor field's components in one coordinate system must change when we change the coordinate system. Sure, you could just memorize those transformation rules and use them as the "definition," but the abstract definition is simpler, more intuitively satisfying, and ultimately more empowering.

I don't think there is any particular advantage in thinking of vectors and tensors as functions - it just gives you another tool in your toolbox.
Starting with co-vectors: they are often introduced in linear algebra courses as linear functions that map the vectors in a vector space to the underlying field of scalars. If we change our co-ordinate system (our rules for representing vectors and co-vectors as lists of components), the scalars do not change. So if a co-vector is a physically meaningful function (and so not dependent on an arbitrary choice of co-ordinates), its components must transform in the "opposite" way to vector components when we change co-ordinates.
In the same way, we can think of a vector as a linear function that maps co-vectors to the field of scalars, and the components of a physically meaningful vector must transform in the opposite way to co-vector components when we change co-ordinates. This gives a reason why vectors and co-vectors that represent physically meaningful quantities transform in the way that they do.
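Here is a small numerical sketch of this "opposite" behaviour (the change-of-basis matrix and components below are made up purely for illustration):

```python
import numpy as np

# If the new basis vectors are eps_mu = A^nu_mu e_nu, then vector
# components transform with A^{-1} while co-vector components transform
# with A, so the scalar pairing omega(v) is unchanged.

A = np.array([[2.0, 0.0],
              [1.0, 1.0]])          # change of basis

v = np.array([3.0, 4.0])            # vector components, old basis
w = np.array([1.0, -2.0])           # co-vector components, old basis

v_new = np.linalg.inv(A) @ v        # contravariant transformation
w_new = A.T @ w                     # covariant transformation

print(w @ v)          # -5.0: the scalar omega(v) in the old basis
print(w_new @ v_new)  # -5.0: the same scalar in the new basis
```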
Extending this "functional" thinking to tensors, we now have four different ways of thinking about a (1,1) tensor (all four are illustrated in the sketch after this list):
- As a matrix of component values that transforms in a certain way when we change co-ordinates.
- As a linear function that maps a vector to a vector.
- As a linear function that maps a co-vector to a co-vector.
- As a linear function that maps a vector and a co-vector to a scalar.
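A brief sketch of one (1,1) tensor playing all four roles (components chosen arbitrarily for illustration):

```python
import numpy as np

T = np.array([[1.0, 2.0],
              [0.0, 3.0]])   # the matrix of components T^mu_nu (view 1:
                             # under a change of basis A it becomes A^{-1} T A)

v = np.array([1.0, 1.0])     # a vector's components
w = np.array([2.0, -1.0])    # a co-vector's components

print(T @ v)       # view 2: vector in, vector out -> [3. 3.]
print(w @ T)       # view 3: co-vector in, co-vector out -> [2. 1.]
print(w @ T @ v)   # view 4: (co-vector, vector) in, scalar out -> 3.0
```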

In math, there is a strong bias toward defining the smallest number of entities possible. If you can define an entity in terms of another one, then you do so.
With this in mind, covectors naturally have definitions rooted in the differentials of functions from a particular manifold to Euclidean space. Once co-vectors are defined, if vectors are then defined as functions of co-vectors, you don't need any additional entities to define them: everything is ultimately rooted in having defined the existence of functions to Euclidean space, and there are fewer independent entities to carry around in your framework.
The other thing, at least for me when I was learning, was that the formula for how covectors and vectors transform differently, and why I should care about their transformation laws, was deeply unintuitive in a way that understanding covectors and vectors as differentials and directional derivatives was not. So starting with the "coordinate free" version was helpful in understanding "why we are doing all of this".
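To make "covectors as differentials" concrete, here is a small sketch: the differential $df$ at a point is the covector that eats a direction vector and returns the directional derivative (the function, point, and direction below are invented for illustration):

```python
import numpy as np

def f(x, y):                 # a function to Euclidean space (here, to R)
    return x**2 * y

p = np.array([1.0, 2.0])     # the point where we take the differential
v = np.array([3.0, -1.0])    # a direction vector

# components of df at p: (df/dx, df/dy) = (2xy, x^2)
df_p = np.array([2 * p[0] * p[1], p[0]**2])

# df_p(v) versus a finite-difference directional derivative of f along v
h = 1e-6
print(df_p @ v)                          # 11.0
print((f(*(p + h * v)) - f(*p)) / h)     # ~11.0
```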

"What is the advantage of thinking of vectors as functions over thinking of them as objects that have independent existence but whose components transform in a particular way?"
While I think the other answers here are better, there is another perspective on this that I thought you might find useful, which is that (in a sense) by talking about "components" you are already thinking of a vector as a function of covectors without realising it.
The "components of a vector" are functions of a vector. The object "the x component of" takes a vector and returns a number. That's a function. We could write it as "x-component-of(v) = 3" to show it as a function explicitly. In fact, it's a covector. So here we have covectors being seen as functions of a vector.
Now we reverse this and consider a vector v and ask what its x, y, z, etc. components are. We are now looking at it as "v(x-component) = 3". When we say v(x-component) = 3, v(y-component) = 4, v(z-component) = 5 to specify a particular vector, we are doing so by applying an operation to three component directions and stating what numbers are returned.
We can switch from thinking of it as "the-x-component-of(v)" to "v's(x-component)" so automatically that we don't notice the subtle difference. But in using components as properties of and queries about vectors, we are implicitly treating them as functions mapping either a vector or component direction to a coordinate number.
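A tiny sketch of this shift in perspective (the names are invented, purely to mirror the notation above):

```python
import numpy as np

v = np.array([3.0, 4.0, 5.0])

def x_component_of(vec):        # a covector: eats a vector, returns a number
    return vec[0]

def v_as_function(covector):    # the same v, read as a function of covectors
    return covector(v)

print(x_component_of(v))              # 3.0 -- "the-x-component-of(v)"
print(v_as_function(x_component_of))  # 3.0 -- "v's(x-component)"
```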
I'll direct you to a post I made on a question of a more technical nature, detailing exactly the mathematical definition of a tensor that was most helpful for my understanding.
The short of it is that saying "tensors are certain linear algebraic objects that transform in a certain way" is deeply unsatisfying when one wishes to go anywhere beyond a topical understanding of the subject. It does not give any intuition as to why that method of transformation ought to be preferred, and it is entirely useless in developing the theory of manifolds (tensors' appearance there is exactly why they show up in GR).
Another critique of the "transformation" formalism is that it is a basis-dependent way of describing basis-transcendent phenomena. If you write down the definition of a tensor in the conventional physicist's way, you need to choose a basis; in the mathematical understanding, this is unnecessary.
As for the title of your question: as I mention in the post linked above, the understanding of vectors as tensors is almost entirely useless. It is more accurate to say that a vector describes a tensor, and the resulting tensor usually turns out to be wholly uninteresting (i.e. it is unrelated to the starting vector, or to any properties of interest about the space).
