
In mathematics we introduce many different kinds of notation, and sometimes even a single object or construction can be represented by many different notations. To take two very different examples, the derivative of a function $y = f(x)$ can be written $f'(x)$, $D_x f$, or $\frac{dy}{dx}$; while composition of morphisms in a monoidal category can be represented in traditional linear style, linearly but in diagrammatic order, using pasting diagrams, using string diagrams, or using linear logic / type theory. Each notation has advantages and disadvantages, including clarity, conciseness, ease of use for calculation, and so on; but even more basic than these, a notation ought to be correct, in that every valid instance of it actually denotes something, and that the syntactic manipulations permitted on the notation similarly correspond to equalities or operations on the objects denoted.

Mathematicians who introduce and use a notation do not usually study the notation formally or prove that it is correct. But although this task is trivial to the point of vacuity for simple notations, for more complicated notations it becomes a substantial undertaking, and in many cases has never actually been completed. For instance, in Joyal and Street's "The geometry of tensor calculus" it took some substantial work to prove the correctness of string diagrams for monoidal categories, while the analogous string diagrams used for many other variants of monoidal categories have, in many cases, never been proven correct in the same way. Similarly, the correctness of the "Calculus of Constructions" dependent type theory as a notation for a kind of "contextual category" took a lot of work for Streicher to prove in his book "Semantics of type theory", and most other dependent type theories have not been analogously shown to be correct as notations for category theory.

My question is, among all these notations which have never been formally proven correct, has any of them actually turned out to be wrong and led to mathematical mistakes?

This may be an ambiguous question, so let me try to clarify a bit what I'm looking for and what I'm not looking for (and of course I reserve the right to clarify further in response to comments).

Firstly, I'm only interested in cases where the underlying mathematics was precisely defined and correct, from a modern perspective, with the mistake only lying in an incorrect notation or an incorrect use of that notation. So, for instance, mistakes made by early pioneers in calculus due to an imprecise notion of "infinitesimal" obeying (what we would now regard as) ill-defined rules don't count; there the issue was with the mathematics, not (just) the notation.

Secondly, I'm only interested in cases where the mistake was made and at least temporarily believed publicly by professional (or serious amateur) mathematician(s). Blog posts and arXiv preprints count, but not private conversations on a blackboard, and not mistakes made by students.

An example of the sort of thing I'm looking for, but which (probably) doesn't satisfy this last criterion, is the following derivation of an incorrect "chain rule for the second derivative" using differentials. First here is a correct derivation of the correct chain rule for the first derivative, based on the derivative notation $\frac{dy}{dx} = f'(x)$:

$$\begin{align} z &= g(y)\\ y &= f(x)\\ dy &= f'(x) dx\\ dz &= g'(y) dy\\ &= g'(f(x)) f'(x) dx \end{align}$$

And here is the incorrect one, based on the second derivative notation $\frac{d^2y}{dx^2} = f''(x)$:

$$\begin{align} d^2y &= f''(x) dx^2\\ dy^2 &= (f'(x) dx)^2 = (f'(x))^2 dx^2\\ d^2z &= g''(y) dy^2\\ &= g''(f(x)) (f'(x))^2 dx^2 \end{align}$$

(The correct second derivative of $g\circ f$ is $g''(f(x)) (f'(x))^2 + g'(f(x)) f''(x)$.) The problem is that the second derivative notation $\frac{d^2y}{dx^2}$ cannot be taken seriously as a "fraction" in the same way that $\frac{dy}{dx}$ can, so the manipulations that it justifies are incorrect. However, I'm not aware of this mistake ever being made and believed in public by a serious mathematician who understood the precise meaning of derivatives, in a modern sense, but was only led astray by the notation.
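As an aside (not part of the original post), the contrast between the one-term and two-term formulas is easy to check with a computer algebra system. Here is a minimal sympy sketch, with $f=\sin$ and $g=\exp$ chosen arbitrarily as sample functions:

```python
import sympy as sp

x, u = sp.symbols('x u')

# Arbitrary sample functions for the composite z = g(f(x)).
f = sp.sin(x)        # inner function f
g_expr = sp.exp(u)   # outer function g, written as an expression in u

z = g_expr.subs(u, f)       # z = g(f(x))
d2z = sp.diff(z, x, 2)      # honest second derivative of the composite

fp, fpp = sp.diff(f, x), sp.diff(f, x, 2)
gp = sp.diff(g_expr, u).subs(u, f)      # g'(f(x))
gpp = sp.diff(g_expr, u, 2).subs(u, f)  # g''(f(x))

correct = gpp * fp**2 + gp * fpp  # g''(f(x)) f'(x)^2 + g'(f(x)) f''(x)
naive = gpp * fp**2               # what the faulty differential manipulation gives

assert sp.simplify(d2z - correct) == 0  # the two-term chain rule checks out
assert sp.simplify(d2z - naive) != 0    # the one-term formula does not
```

Concrete functions are used here (rather than abstract `sp.Function` objects) only to keep the equality checks robust; the same discrepancy appears for any $f$, $g$ whose second derivatives don't conspire to cancel the extra term.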

Edit 10 Aug 2018: This question has attracted some interesting answers, but none of them is quite what I'm looking for (though Joel's comes the closest), so let me clarify further. By "a notation" I mean a systematic collection of valid syntax and rules for manipulating that syntax. It doesn't have to be completely formalized, but it should apply to many different examples in the same way, and be understood by multiple mathematicians -- e.g. one person writing $e$ to mean two different numbers in the same paper doesn't count. String diagrams and categorical type theory are the real sort of examples I have in mind; my non-example of differentials is borderline, but could in theory be elaborated into a system of syntaxes for "differential objects" that can be quotiented, differentiated, multiplied, etc. And by saying that a notation is incorrect, I mean that the "understood" way to interpret the syntax as mathematical objects is not actually well-defined in general, or that the rules for manipulating the syntax don't correspond to the way those objects actually behave. For instance, if it turned out that string diagrams for some kind of monoidal category were not actually invariant under deformations, that would be an example of an incorrect notation.

It might help if I explain a bit more about why I'm asking. I'm looking for arguments for or against the claim that it's important to formalize notations like this and prove that they are correct. If notations sometimes turn out to be wrong, then that's a good argument that we should make sure they're right! But oppositely, if in practice mathematicians have good enough intuitions when choosing notations that they never turn out to be wrong, then that's some kind of argument that it's not as important to formalize them.

Mike Shulman
  • There must have been cases where someone defined $y$ as an implicit function of $x$ via an equation $f(x,y)=0$ and then wrote something like $dy/dx=(df/dx)/(df/dy)$ (which "follows" from treating apparent fractions as fractions, but of course gets the sign wrong). Whether this has ever made it past a blackboard into a preprint is less certain. – Steven Landsburg Aug 09 '18 at 17:26
  • I don't know, but we lost a Mars probe due to mixing up newtons with pounds: http://articles.latimes.com/1999/oct/01/news/mn-17288 – GH from MO Aug 09 '18 at 17:27
  • In his 2000 article Nick Laskin apparently got the notation $\nabla^\alpha f$ for the fractional derivative wrong and played with it as if it were a local operator. This error still persists among some physicists; for example, it is still there in Laskin's 2018 book; see here for further links. Not sure if this qualifies, so I leave this as a comment. – Mateusz Kwaśnicki Aug 09 '18 at 17:50
  • I think the only reason why we combinatorialists have (almost) never gotten wrong results out of this is that we tend to check our conjectures on the computer before proving them... As for incomplete proofs, however, combinatorics seems positively teeming with them. – darij grinberg Aug 09 '18 at 17:53
  • Often it's a matter of controversy -- e.g., I don't believe any argument that uses symmetric-function raising operators (and I know I'm not alone in that), but several authors use them without worrying. – darij grinberg Aug 09 '18 at 17:59
  • @darijgrinberg Can you explain more about this example for non-combinatorialists? – Will Sawin Aug 09 '18 at 18:33
  • @WillSawin: It's tricky. For an example of a paper heavily using raising operators, see https://arxiv.org/abs/1008.3094v2 . It tries to build them on a rigorous foundation (§2.1--2.2), by letting them act on Laurent polynomials instead of on symmetric functions; but it soon slips back into pretending that they act on symmetric functions themselves (e.g., the first computation in §2.4 relies on associativity of that "action"). I have long thought about asking here on MO if there is a good way of making sense of these operators; but I'm afraid that the answers will not ... – darij grinberg Aug 09 '18 at 18:49
  • ... necessarily be any clearer than the original sources, seeing that this is a matter of confusion rather than a specific question. Adriano Garsia has his own interpretation of raising operators (A. M. Garsia, Raising operators and Young's rule), suggesting that the operators should act on tableaux rather than on symmetric functions (see the sentences after equation 3.3); I'm not sure to what extent his suggestions can be used as a replacement for the uses of raising operators ... – darij grinberg Aug 09 '18 at 18:49
  • ... in Macdonald polynomial theory. Macdonald himself leaves the task of making the concept rigorous to the reader in his book (as he does with so many other things). I'm not sure to what extent this is an instance of what this question was asking for: it could well be that the proofs aren't as much wrong as merely presented without some necessary context. – darij grinberg Aug 09 '18 at 18:52
  • It would be useful if @darijgrinberg collected his string of comments into an answer, even if the answer does not exactly address the question. – Andrej Bauer Aug 09 '18 at 21:05
  • Yes, @darijgrinberg that would be useful, especially if it included enough explanation of what a raising operator is for a non-combinatorialist to understand the issue. – Mike Shulman Aug 09 '18 at 21:30
  • @AndrejBauer and MikeShulman: This will need a question, not an answer... I hope to get to it soon (next week?). – darij grinberg Aug 09 '18 at 22:48
  • As Gauss said, in regard to Wilson's Theorem, "In our opinion, truths of this kind should be drawn from notions rather than from notations." – Gerry Myerson Aug 09 '18 at 23:33
  • Would errors in early umbral calculus be of interest? – Bill Dubuque Aug 10 '18 at 01:55
  • I'm a little shy about going into details, but one that has bitten me is that a Lie group with a (non-bi-invariant) Riemannian metric has two different "exponential maps", both denoted $\exp$... – Nate Eldredge Aug 10 '18 at 02:15
  • Probably one really does just have to give up on the idea of regarding $\frac{\mathrm d^2y}{\mathrm dx^2}$ as a fraction, but I think that what your argument really shows is that one can't regard $\mathrm dx^2$ as a square. (The partial-derivative argument given by @StevenLandsburg is the one that convinces me that any attempt to treat derivatives as fractions is going to go wrong.) – LSpice Aug 10 '18 at 12:06
  • @Number Yes they would, at least to me. I was introduced to umbral calculus by Rota's version of it, so I've not seen the classical literature. – Robert Furber Aug 10 '18 at 17:24
  • @LSpice this is very tangential to the question, but there is a systematic meaning of differentials according to which it is valid to treat $dx^2$ as a square, whereas as far as I can tell there is nothing (whether or not it is a square) that you can divide a "second differential" $d^2y$ by to get the second derivative. – Mike Shulman Aug 10 '18 at 18:06
  • @LSpice For instance, if you take the differential of $dy = f'(x) dx$ using the product rule naively, you get $d(dy) = d^2y = (f''(x) dx)dx + f'(x) d(dx) = f''(x) (dx)^2 + f'(x) d^2x$. There is an $f''(x)$ in there, and $(dx)^2$ really is a square, but there's an extra term so that you can't just divide by it to get the $f''(x)$ out. Moreover, the extra term is exactly what makes the chain rule argument come out right instead of wrong. (This sort of "second differential" can be made precise using, among other approaches, iterated tangent bundles.) – Mike Shulman Aug 10 '18 at 18:08
  • @MikeShulman, I knew (in a distant way) about iterated tangent bundles, but didn't know that they provided a framework in which $\mathrm dx^2$ was an honest square. Thanks! – LSpice Aug 10 '18 at 20:01
  • I bet there must be some good examples involving the $o(f(n))$ and $O(g(n))$ notation. – Joel David Hamkins Aug 11 '18 at 21:29
  • @darijgrinberg: Related issue: plethystic notation – Alexander Woo Aug 11 '18 at 23:01
  • @MikeShulman : Your clarification of what you're looking for comes perilously close to a proof that no example can exist. If some notation has been systematically developed but does not accurately reflect the underlying mathematics, then surely this reflects incomplete understanding of the mathematics on the part of the developers of the notation, and you would categorize it as a non-example ("of fluxion type"). Unless maybe the math is perfectly well understood but a "bug" slips into the notation? This could happen in computer programming but it's hard to imagine with hand-crafted notation. – Timothy Chow Oct 30 '18 at 02:21
  • @TimothyChow Your faith in "hand-crafted notation" is greater than mine. Especially when the notation is highly complicated and the proof of its correctness is highly nontrivial, as for string diagrams and type theory. – Mike Shulman Oct 30 '18 at 03:18
  • @MikeShulman : I would be inclined to classify those examples as "computer programming" even if they are not fully implemented as such. If this is what you're interested in, then I think you should be asking not about "notation" but about "symbolic calculus." – Timothy Chow Oct 30 '18 at 15:57
  • @TimothyChow I suppose we could argue about terminology until doomsday. I would only point out that most people who make use of string diagrams use them on paper without involving any computers. – Mike Shulman Oct 30 '18 at 22:05
  • @TimothyChow In the abstract, I might agree with you that these complicated systems should be called something more substantial than "notation". But the point is that there are people who call them just "notation", and I'm looking for data to support or refute the argument that they should be taken more seriously than being considered "just notation". – Mike Shulman Oct 30 '18 at 23:10
  • @LSpice : This is late, but since I just replied to a new comment of yours farther down: The simplest argument that $\mathrm d^2y/\mathrm dx^2$ can't really be a ratio is that $\mathrm d^2y/\mathrm dy^2=0$, so $\mathrm d^2y=0$, so $\mathrm d^2y/\mathrm dx^2=0$. This is all about the numerator, not the denominator; that's why, as Mike said, there's no way to consistently get the second derivative by dividing the second differential, no matter what you propose to divide it by. – Toby Bartels Oct 05 '19 at 18:59
  • Frege's notation is arguably an instance. He doesn't state general comprehension explicitly, but it is built into his notation in a way that causes inconsistency. I can post an answer explaining this if you like. – Joel David Hamkins Nov 14 '22 at 15:40
  • @JoelDavidHamkins Would you say that the underlying mathematics that Frege was doing was precise and correct from a modern perspective? – Mike Shulman Nov 15 '22 at 01:57
  • Yes, certainly, although it was primarily a philosophical project in mathematical foundations. His system was inconsistent, but consistent fragments have been saved. The neologicists celebrate "Frege's theorem." – Joel David Hamkins Nov 15 '22 at 02:33
  • @JoelDavidHamkins In that case, yes, I would like to see an answer explaining it! – Mike Shulman Nov 15 '22 at 23:53
  • OK, I'll try to post this within the next few days. – Joel David Hamkins Nov 16 '22 at 00:26

8 Answers


Here is an example from set theory.

Set theorists commonly study not only the theory $\newcommand\ZFC{\text{ZFC}}\ZFC$ and its models, but also various fragments of this theory, such as the theory often denoted $\ZFC-{\rm P}$ or simply $\ZFC^-$, which does not include the power set axiom. One can find numerous instances in the literature where authors simply define $\ZFC-{\rm P}$ or $\ZFC^-$ as "$\ZFC$ without the power set axiom."

The notation itself suggests the idea that one is subtracting the axiom from the theory, and for this reason I find it to be an instance of incorrect notation, in the sense of the question. The problem, you see, is that the process of removing axioms from a theory is not well defined, since different axiomatizations of the same theory may no longer be equivalent when one drops a common axiom.

And indeed, that is exactly the situation with $\ZFC^-$, as was eventually realized. Namely, the theory $\ZFC$ can be equivalently axiomatized using either the replacement axiom or the collection axiom plus separation, and both approaches to the axiomatization are quite commonly found in practice. But Zarach proved that without the power set axiom, replacement and collection are no longer equivalent.

  • Zarach, Andrzej M., Replacement $\nrightarrow$ collection, Hájek, Petr (ed.), Gödel ’96. Logical foundations of mathematics, computer science and physics -- Kurt Gödel’s legacy. Proceedings of a conference, Brno, Czech Republic, August 1996. Berlin: Springer-Verlag. Lect. Notes Log. 6, 307-322 (1996). ZBL0854.03047.

He also proved that various equivalent formulations of the axiom of choice are no longer equivalent without the power set axiom. For example, the well-ordering principle is strictly stronger than the choice-set principle over $\text{ZF}^-$.

My co-authors and I discuss this at length and extend the analysis further in:

  • Gitman, Victoria; Hamkins, Joel David; Johnstone, Thomas A., What is the theory ZFC without power set?, Mathematical Logic Quarterly 62 (2016).

We found particular instances in the previous literature where researchers, including some prominent researchers (and also some of our own prior published work), described their theory in a way that actually leads to the wrong version of the theory. (Nevertheless, all these instances were easily fixable, simply by defining the theory correctly, or by verifying collection rather than merely replacement; so in this sense, it was ultimately no cause for worry.)

  • @EmilioPisanty I'm sorry, but I don't quite see the error to which you refer. I am not an expert in grammar, but "This is an apple that I am stating is red" seems fine to me, grammatically. – Joel David Hamkins Aug 11 '18 at 17:02
  • This is not an example of incorrect notation. It is an example in which a correct notation made people realize that they didn't agree on what ZFC is (i.e. what set of axioms is, exactly, agreed upon to be the set of axioms of ZFC). I assume here that by ZFC you meant some standard finite set of axioms (schemata) and not the whole of Th(ZFC). – Qfwfq Aug 11 '18 at 21:13
  • Everyone agrees what ZFC is, as a theory. What is not agreed is the meaning of this subtraction. The notation leads one incorrectly to think it is meaningful. – Joel David Hamkins Aug 11 '18 at 21:15
  • It's like talking about "the vector space $\mathbb{R}^3$ without the vector (1,1,0)." – Joel David Hamkins Aug 11 '18 at 21:23
  • I think it's more like talking about "a 3-dimensional real vector space without the vector (1,1,0)". – Tom Church Aug 12 '18 at 04:45
  • @TomChurch I don't really follow your comment. To my way of thinking, the analogy is that we have a well-known very specific space, ZFC, with several commonly used generating sets, and someone says "ZFC minus the power set", where the power set axiom is a generator common to those generating sets. The point is that removing a generator isn't well-defined as a process on the space, but only as a process on the generating sets. – Joel David Hamkins Aug 12 '18 at 11:13
  • Isn't a "theory," by definition, a particular generating set, though? That seems to be what Qfwfq is bringing up. – Daniel McLaury Aug 12 '18 at 16:14
  • @DanielMcLaury There is variant usage. Logicians commonly say that there can be different axiomatizations of the same theory, and under this terminology, a theory is any set of sentences that is closed under logical consequence. Alternatively, logicians allow arbitrary sets of sentences, but in this kind of usage, theories are typically considered only up to logical equivalence. I would find it strange for a set theorist to say that the different axiomatizations of ZFC constitute different theories, as opposed to different ways of axiomatizing the same theory. – Joel David Hamkins Aug 12 '18 at 17:29
  • So the analogy is not with the set obtained by removing $(1,1,0)$ from $\mathbb{R}^3$ (which is unambiguously $\mathbb{R}^3 \setminus \{(1,1,0)\}$) but rather with the (presumably $2$-dimensional) vector space obtained (allegedly) by removing $(1,1,0)$ from $\mathbb{R}^3$. – Toby Bartels Aug 20 '18 at 06:25
  • Assuming that such a geometric analogy is really needed, it would consist of removing an equation rather than removing a point or a set of points. E.g., if one takes the set of 4-tuples satisfying $w^2+x^2+y^2+z^2=w^4+x^4+y^4+z^4=1$, what does "removing the equation $w^2+x^2+y^2+z^2=1$" mean?... – YCor Nov 18 '22 at 07:14

This might not quite count, but if you start with a principal $G$-bundle $f:P\rightarrow B$, there are two natural ways to put a $G\times G$ structure on the bundle $P\times G\rightarrow B$ given by $(p,g)\mapsto f(p)$. Because it is standard notational practice to denote such a bundle by simply writing down the map $P\times G\rightarrow B$, there is nothing in the notation to distinguish between these structures, and therefore the notation leads you to believe they're the same.

By following this lead, Ethan Akin "proves" in the 1978 JPAA paper "$K$-theory doesn't exist" (https://doi.org/10.1016/0022-4049(78)90032-4) that the $K$-theory of $B$ is trivial, for any base space $B$. He reports that it took three Princeton graduate students (including himself) some non-trivial effort before they found the error.

This might meet the letter of your criterion by virtue of having made it into print, but probably violates the spirit because the author had already discovered the error, and indeed the whole point of the paper was to call attention to it.

David Roberts
  • That's a nice example, and an amusing paper to read! But I won't count it even by the letter of my criterion, since it doesn't seem to have been "believed publicly", i.e. Ethan found the error before publishing it -- and presumably even before he found the error, he knew there had to be an error somewhere. – Mike Shulman Aug 09 '18 at 18:04
  • This is interesting. When I was first learning about fibre bundles I actually remember wondering if this notational shortcut which you mention had ever caused any significant errors in the literature. – Hollis Williams Aug 30 '21 at 14:38

This will probably not be considered a serious mistake, but maybe it counts:

According to Dray and Manogue, if you ask the following question to scientists:

Suppose the temperature on a rectangular slab of metal is given by $T(x,y)=k(x^2+y^2)$ where $k$ is a constant.

What is $T(r,\theta)$?

A: $T(r,\theta)=kr^2$
B: $T(r,\theta)=k(r^2+\theta^2)$
C: Neither

most mathematicians choose B while most other scientists choose A.

(I don't know if this experiment was ever done on a large scale. I do know some people who studied mathematics and have tried to argue that A is the right answer.)

This question is called Corinne's Shibboleth in this article by Redish and Kuo, where it is discussed further.
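To make the two readings concrete, here is a small illustrative sympy sketch (mine, not from the article; the variable names are arbitrary):

```python
import sympy as sp

x, y, r, theta, k = sp.symbols('x y r theta k')

# Reading A (the physicists'): T names a quantity on the plane, given in
# Cartesian coordinates; T(r, theta) means "the same quantity expressed in
# polar coordinates", i.e. substitute x = r cos(theta), y = r sin(theta).
T_cartesian = k * (x**2 + y**2)
T_polar = sp.simplify(T_cartesian.subs({x: r*sp.cos(theta), y: r*sp.sin(theta)}))
assert sp.simplify(T_polar - k * r**2) == 0  # answer A: k r^2

# Reading B (the formal one): T is a function of two dummy arguments,
# and T(r, theta) just plugs r and theta into those slots.
T = sp.Lambda((x, y), k * (x**2 + y**2))
assert sp.simplify(T(r, theta) - k * (r**2 + theta**2)) == 0  # answer B
```

Both computations are internally consistent; the disagreement is purely over which convention the notation $T(r,\theta)$ is understood to invoke.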

  • Well, I would argue that this mistake is not due to the notation, but rather to how science students are taught mathematics. – Our Aug 10 '18 at 09:56
  • @onurcanbektas I won't directly object to your point of view. But if you start to wonder why science students are taught in such a strange way, you'll have to dig deeper. Personally I suspect that the problem goes back to the abuse of notation $y=y(x)$, which started with Jacobi (as far as I can tell). And if you read Jacobi, you'll see that he was very much thinking about notation. I've also written about this here. – Michael Bächtold Aug 10 '18 at 10:12
  • I'm not sure if this is a mistake so much as a different convention. To a mathematician, $T$ is the name of the function, and it doesn't matter what arguments I feed it. To a scientist, $T(x, y)$ is the name of the function, and $T(r, \theta)$ is a different function, so that there is no way to know what $T(1, \pi)$ means without further context. (An example of context: $T(r, \theta)\bigr|_{(r, \theta) = (1, \pi)}$.) It's not clear to me that either of these is objectively wrong. (If you say "always be more explicit", then I think mathematicians fall down in many other cases.) – LSpice Aug 10 '18 at 12:08
  • In my experience (and the linked article has a footnote alluding to this) mathematicians may use the physicists' notation as well when working with manifolds: Given a manifold $X$ and a function $T$ defined on $X$, one might choose coordinates $(x, y)$ on some open set $U$ and then write $T(x, y) = k(x^2 + y^2)$ to describe $T$ on $U$. But choosing different coordinates with $x = r\cos\theta$, $y = r \sin \theta$ one then has to accept $T(r, \theta) = kr^2$. (I would argue that writing $T(x, y)$ and $T(r, \theta)$ is not good notation to begin with; simply $T = k(x^2+y^2) = kr^2$ seems better.) – Eike Schulte Aug 10 '18 at 13:28
  • I think this is just plain wrong. I cannot imagine a realistic situation where a mathematician (other than a freshman answering to a professor who is known for silly 'gotcha' questions) would really think of (B). After all, the author meant something when they chose $(r,\theta)$ for their notation, right? Moreover, I find little objectionable about this notation: mathematically, $T$ is a function on a manifold, which has several standard (and standardly denoted) coordinate charts. It is defined in one chart and then calculated in the other. – Kostya_I Aug 10 '18 at 13:30
  • So the reason why scientists 'are taught in such a strange way' is that all physically meaningful quantities are functions (or vector/tensor fields, etc.) on manifolds, usually specified by picking a chart. To be pedantic, one could introduce coordinate maps and phrase the question as: $T(\varphi(x,y))=k(x^2+y^2)$. What is $T(\psi(r,\theta))$? This is not done, as it is impractical: it blows up formulae while carrying no useful information. The composition with coordinate maps is already implied by the choice of letters. – Kostya_I Aug 10 '18 at 13:50
  • @Kostya_I: Interesting perspective. "After all, the author meant something when they chose (r,θ) for their notation, right?" Apparently the physicist says "yes", the mathematician says "no". Per Redish/Kuo: "In other words, physicists assign meaning to the variables x, y, r, and θ—the geometry of the physical situation relating the variables to one another. Mathematicians, on the other hand, may regard x, y, r, and θ as dummy variables denoting two arbitrary independent variables." – Daniel R. Collins Aug 10 '18 at 15:53
  • Here's another problem one could pose: "Solve $ax+b=0$." The answer could be "$a = -b/x$". After all, who is to say that "$x$" is the unknown? – Timothy Chow Aug 10 '18 at 15:53
  • As far as the original question is concerned, I do kind of think that A is the "right answer" and that people choosing B are making a mistake because of the notation. – Timothy Chow Aug 10 '18 at 16:01
  • @TimothyChow: Could you expand? Which notational mistake are people who chose B making? – Michael Bächtold Aug 10 '18 at 17:00
  • @MichaelBächtold : They're interpreting $T$ to be a function with two arguments, whereas $T$ is intended to be a function with a single argument (namely, a point in the plane), with $T(x,y)$ referring to the value of $T$ at the point whose Cartesian coordinates are $x$ and $y$ and $T(r,\theta)$ referring to the value of $T$ at the point whose polar coordinates are $r$ and $\theta$. Fans of answer B might be happier if $T$ were defined as $T(p) = x(p)^2 + y(p)^2$ and then we were asked for an expression for $T(p)$ in terms of $r(p)$ and $\theta(p)$, but that notation would be clunky. – Timothy Chow Aug 10 '18 at 17:53
  • @TimothyChow: I think this interpretation amounts to what Eike suggested in his last line, and I agree with it. But since the notation $f(x)$ has only one established meaning in modern mathematics, I'd argue that people choosing A are making the notational mistake. – Michael Bächtold Aug 10 '18 at 18:16
  • Huh, this reminds me how annoying I find it when a referee asks me to change "stochastic process $X_t$" to "process $X$" or "process $(X_t)$", but at the same time they find it perfectly OK to write "a function $\sin x$". Coming back to Corinne's Shibboleth, why is there no answer "D: both A and B"? :-) – Mateusz Kwaśnicki Aug 10 '18 at 18:19
  • @MichaelBächtold : If this sort of thing showed up in an actual scientific paper, which it very well might, then 99 times out of 100 the intended meaning would be A, so interpretation B would certainly be a mistaken interpretation. In such a circumstance, I would not say that the notation is mistaken; at most, I would say that it's an "abuse" of notation, and mathematicians are not above abusing notation either. – Timothy Chow Aug 10 '18 at 19:32
  • @TimothyChow: I agree with that. I would only add that in principle there are ways of avoiding this abuse of notation, without making the notation more cumbersome. One was suggested by Eike. The other one would be to introduce a new notation like $T[x,y]$ instead of $T(x,y)$, to denote that the “observable” T is henceforth to be expressed in terms of $x,y$. As I wrote in the answer I linked to in my first comment, computer scientist have something quite similar called ascription. – Michael Bächtold Aug 10 '18 at 19:46
  • @MateuszKwaśnicki, do you actually know that your referees are OK with "the function $\sin x$"? I'm not! (I also often badly want to make this complaint when refereeing, but I don't: I do two passes over a paper when I referee it, one in which I gripe to myself about everything that bothers me to get it out of my system, and another in which I come back and decide what's important enough to put in my report.) – LSpice Aug 10 '18 at 19:57
  • @LSpice: Well, of course I do not know what they think. I generally prefer readability to utmost formal rigour, so instead of "a function $f : \mathbb{R} \ni x \mapsto x^2$" or "a function $f$ defined by $f(x) = x^2$", I would write "a function $f(x) = x^2$". In any case, this is off-topic here. – Mateusz Kwaśnicki Aug 10 '18 at 20:10
  • @DanielR.Collins, I think this quote has little to do with reality. While mathematicians 'may' use $i$ and $\pi$ as placeholders for arbitrary complex numbers, or $(r,\theta)$ for Cartesian coordinates of a point, in practice that never happens. Maybe apart from teaching students that it is formally possible. – Kostya_I Aug 10 '18 at 20:14
  • @MichaelBächtold, I don't agree that the notation $T(x,y)$ has only one established meaning. How about the answer "$T(r,\theta)$ is the field of fractions in formal variables $r,\theta$ over T"? Just like (A), it answers the question in a formally correct way, conforming to established notational convention, while completely ignoring the context. It is clear from the statement that $T$ is not a field or ring, but it is equally clear that it is not a function of two real variables. So, the answer (B), which assumes that it is, is incorrect. – Kostya_I Aug 10 '18 at 20:44
  • I meant "Just like (B)", of course. – Kostya_I Aug 10 '18 at 20:53
  • One more story in the same vein: Our chemistry students asked me once for help, as they had been asked the following question at their final exam in probability: "The parameters of the normal distribution are: (a) $(0,1)$; (b) $(\mu, \sigma)$; (c) $(\mu, \sigma^2)$; (d) $(\alpha, \beta)$." I found this question really amusing. – Mateusz Kwaśnicki Aug 10 '18 at 21:14
  • @Kostya_I: For what it's worth, the paper I was reading today used $\pi$ as the prime-counting function. Probably the time before that it was used as the parameter for population proportion. The simple fact is, if you don't define it explicitly in the current piece of writing, then it's poorly-defined. – Daniel R. Collins Aug 11 '18 at 02:45
  • As a computer programmer, my initial reaction was to answer "B", personally. I just looked at it and went "Okay, it's a function with an input of two numbers, that's getting two numbers passed into it." – nick012000 Aug 11 '18 at 03:42
  • @Kostya_I : so what you're saying is that we go through all the trouble of teaching students the meaning of the notations $T\colon \mathbb{R}^2 \to \mathbb{R}$ and $T(x,y)$, only to tell them later that if they applied it correctly to arrive at B they were posed a trick question? Because as far as I know, the alternative semantics of $T(x,y)$ which people are using to arrive at A are not clearly laid out in any book. Or do you know one? – Michael Bächtold Aug 11 '18 at 06:03
  • @Michael, 'trick question'? Quite the contrary. Assuming that this is a meaningful problem, the only sensible answer is (A). It is to answer (B) that one has to assume that this is a trick question, having no actual math contents, with phrasing and notation intentionally chosen to confuse and distract. I do agree though that the question should be phrased more clearly if intended for students. But the point is: can you point out a situation other than an artificial teaching one where the intended answer would be (B)? – Kostya_I Aug 11 '18 at 08:31
  • As for 'alternative semantics', I am not sure what you mean. Semantics always depend on context. Here the notation T is overloaded - the same letter denotes several functions with different domains (physicists actually stress that domains are different by calling them $(x,y)$ plane and $(r,\theta)$ plane). This is common in math: e.g., what is the domain of exp? It could be numbers, matrices, operators, elements of a Lie group, etc., and easily more than one of them in the same equation. No confusion arises: which function we mean is indicated by what we plug in. Just as in our example. – Kostya_I Aug 11 '18 at 08:51
  • 13
@Kostya_I: It was you who suggested this was a 'gotcha' question. If you take the modern definition of function seriously and consider the notation $T(x,y)$ in that context, then the only correct answer is B. Since for you it's obvious that A is correct, but you don't see that there must be "non-standard" semantics of $T(x,y)$ involved in arriving at A, I conclude that you're not using the modern definition. "What is the domain of exp?" I would never say it's the $x$-line or $y$-line or $\theta$-line. Could you define what that even means, and in which sense these lines differ? – Michael Bächtold Aug 11 '18 at 09:23
  • 4
    Has this notational ambiguity ever led to an actual mistake in published work? It’s a neat shibboleth, and the discussion spawned in comments is great fun; but as far as the original question is concerned, it seems to be just a potential answer so far, not an actual one. But it does appear to have genuine potential for having caused actual mistakes — so does anyone know of any such? – Peter LeFanu Lumsdaine Aug 11 '18 at 17:15
  • 10
    @Kostya_I: you say "which function we mean is indicated by what we plug in." So what if I ask you to plug in some specific numbers? E.g. what is $T(1.2, 1.5)$? – Tom Leinster Aug 11 '18 at 18:04
To save the formal correctness of the notation and at the same time not be clunky, one could put a little space before the parentheses when a change of "chart"/variables is left implicit, for example: $T(x,y)=T\;(r,\theta)$. – Qfwfq Aug 11 '18 at 21:36
  • 4
    @Qfwfq: Does this really help? Even after reading what you wrote, I had to read it again before I saw the "little space". We really aren't expecting to notice such typographical niceties, I think. – Carl Offner Aug 12 '18 at 00:42
@MichaelBächtold, the $(x,y)$-plane and the $(r,\theta)$-plane are two different copies of the Euclidean plane. As such, they are isomorphic, but not canonically so. When you plug $(r,\theta)$ into the function specified in $(x,y)$, you are implicitly using this isomorphism. But since it's not canonical, there's not a single situation in which it's useful. You may view this as a formalisation of what physicists mean by "you cannot add quantities of different dimensions" etc. – Kostya_I Aug 12 '18 at 09:01
As for 'modern definition of function', $T$ is arguably not defined to be a function of two real variables. It is defined as a physical quantity, and no physical quantity is a function of several variables - they only become such if a coordinate chart is chosen. @TomLeinster, for "what is $T(1.2,1.5)$", the answer is given above by LSpice. BTW, I really don't see how switching to $T=k(x^2+y^2)=kr^2$ would be of any help. On the left is a function (on a manifold) and on the right are symbolic expressions, how can they be equal? – Kostya_I Aug 12 '18 at 09:20
  • 4
    @Kostya_I: the question doesn't say "The temperature is $T$ ..." rather it suggest that $T(x,t)$ is the temperature. If you switch notation to $T=k(x^2+y^2)$ you are interpreting $T,x,y$ as maps from a manifold to $\mathbb{R}$ and the equation makes perfect sense. – Michael Bächtold Aug 12 '18 at 09:27
  • @MichaelBächtold, indeed, it does make perfect sense for me now, thanks. But does it also mean that we should never use expressions like $\partial x/\partial r$? – Kostya_I Aug 12 '18 at 09:55
  • @Kostya_I: "we should never use expressions like $\partial x/\partial r$?" Are you asking this because you read the Jacobi post, or are you also asking about $dx/dr$ notation? I just answered an older question about the latter here. – Michael Bächtold Aug 12 '18 at 10:55
  • 2
@Kostya_I You can use $\partial f/\partial r$ just fine when you interpret $r$, $\theta$ as coordinate functions. You just have to be aware that it depends on the full set $\{ r, \theta \}$ and not $r$ alone. (Basically, $(\partial/\partial r, \partial/\partial \theta)$ is the dual basis to $(\mathrm{d}r, \mathrm{d}\theta)$ and that depends on both $r$ and $\theta$.) Of course, this dependency is hidden by the notation, but in practice you use disjoint names for all sets of coordinates most of the time anyway. (And $x$ is just a (maybe local) function on the manifold, so you can use it for $f$.) – Eike Schulte Aug 12 '18 at 12:01
I'd say that $r$ can have three different meanings: a function on a manifold, a label for a coordinate in a chart (as in $\partial x/\partial r$), and a function of coordinates in another chart (as in $\partial r/\partial x$). A traditional way to proceed is to acknowledge this abuse of notation and live with it. A perfectly rigorous way would be to give three different names to these three objects; nobody does it, as it would be confusing and impractical. A third option, as @EikeSchulte suggests, is to insist that $r$ is a function on a manifold, so it is never a function of $(x,y)$... – Kostya_I Aug 12 '18 at 19:03
  • ... and $\partial r/\partial x$ is not a partial derivative, but a notation for a vector field $\partial/\partial x$ applied to $r$. This is already strange, since if it's denoted like a duck, walks like a duck, etc., why is it not a duck? But what's worse is that since we cannot just define $\partial/\partial \varphi$ for an arbitrary $\varphi$, we have to insist that $x$ is a special kind of function - a 'coordinate function'. This looks like an ad hoc solution to salvage an unsound notation, and essentially not much different from saying that $x$ also denotes a coordinate label. – Kostya_I Aug 12 '18 at 19:20
  • @Kostya_I: all those three meanings are the same and coincide with what Eike suggests: a coordinate in a chart is nothing else than a function $r:U\to \mathbb{R}$ on a manifold $M$ (defined on a subset $U$ of the manifold), and if you make yourself clear what it means to be "a function of something", then such a function $r$ can also be a function of other coordinate charts. So if $(x,y)$ are coordinates on the manifold, then such an $r$ will always be a function of them. – Michael Bächtold Aug 12 '18 at 19:26
  • The real difficulty here seems to be due to the confusing terminology: to be a function in the modern sense is not the same as to be a function of something in the original sense of the word. (Or actually it is, but that will only cause more confusion. That is why I prefer to use the word map for modern functions.) – Michael Bächtold Aug 12 '18 at 19:28
  • @MichaelBächtold, I don't see how you can claim that $r$ is 'nothing else than a function', and still make sense of $\partial /\partial r$. Coordinate functions carry more structure than 'just functions'. Otherwise, you seem to imply that there's a notion of partial derivative for "a function of something" that is different from a notion of a partial derivative of a map from $\mathbb{R}^n$ to $\mathbb{R}$. I wonder what this notion is and of what use it is. – Kostya_I Aug 12 '18 at 20:51
  • @Kostya_I I suspect we might be talking past each other, and this comment section is not the best place to clarify this. If you want we can continue this in chat. – Michael Bächtold Aug 12 '18 at 21:18
  • @Kostya_I I guess you’re right: Requiring $r$ to be a “coordinate function” in the sense that it belongs to some “invisible” set of coordinates is just as much abuse of notation as using $r$ to denote different things depending on context. So it comes down to preference in the end. Now of course, everybody would like their preferred way to be the standard way … – Eike Schulte Aug 13 '18 at 08:04
  • 3
    Never seen such a bunch of comments to one answer yielding moreover such a substantial discussion! – Wolfgang Aug 14 '18 at 19:49
  • @Carl Offner: it was a way to keep using the effect of the abuse of notation while keeping your conscience clean at the same time: you are writing down something that is essentially identical to the abuse of notation, so if you understand it that's fine, but if you have some doubts then the typography will notify you that a composition with a change of variables is understood. – Qfwfq Aug 17 '18 at 15:02
  • @Kostya_I : Indeed, you should never use such an expression as $\partial{x}/\partial{r}$! Or rather, you should never use it out of the blue (with only $x$ and $r$ previously introduced) but only as an abbreviation after you've given enough additional context to understand its meaning. This is a real problem in, for example, thermodynamics, where heat capacity is defined as $\partial{Q}/\partial{T}$, the partial derivative of heat with respect to temperature. (I'm abstracting away the issue that $Q$ isn't meaningful, which isn't important since $\mathrm{d}Q$ is reasonably well-defined.) – Toby Bartels Aug 20 '18 at 06:46
  • … To interpret $\partial{x}/\partial{r}$, you have to know what other variables are being held constant. As you can see at https://en.wikipedia.org/wiki/Heat_capacity there are two common precise meanings of heat capacity, one where pressure is held constant and one where volume is held constant. In some contexts, it's obvious which is meant, but in others, one had better say! I don't know if confusion between these has ever led to an error in print, but if it has, then that's an answer to Mike's question where the usual notation for partial derivatives is incorrect and at fault. – Toby Bartels Aug 20 '18 at 06:53
  • … Note that if $T\colon \mathbb{R}^2 \to \mathbb{R}$ is a map (as Michael would say), then there is no corresponding ambiguity in the meaning of $D_1 T$ and $D_2 T$, which you might also write (adopting a common convention) as $\partial{T}/\partial{x}$ and $\partial{T}/\partial{y}$. But this $T$ can't actually be a temperature, but at best a map that, when applied to the coordinates of a point in some particular coordinate system laid on a flat object, yields the temperature at that point. Personally, I never mix the notations and always use $D_1 T$ and $D_2 T$ for maps. – Toby Bartels Aug 20 '18 at 07:03
  • 1
[Note: not a continuation of my previous comment, for once!] Eike is right that we should just write $T = k(x^2 + y^2) = kr^2$ and be done with it. I blame neither the scientists nor the mathematicians for this, but rather the mathematics teachers, who want to teach elementary Calculus (and increasingly elementary Algebra) as a theory of functions like Dray & Manogue's $T$ when it should really be taught (for applied purposes) as a theory of variable quantities like Eike's $T$. In terms of modern mathematics, a variable quantity is formalized as a map on an abstract manifold, but … – Toby Bartels Aug 20 '18 at 07:11
  • 1
    … to go into that level of abstraction in an elementary course would be as out of place as it would be to go into a construction of the set of real numbers and prove the ordered-field axioms and other basic properties. (Which is to say, maybe for honour students, or for the French, but not for ordinary people, not even math majors at first.) But before the 20th century, Calculus was a theory of variable quantities (as Algebra usually still is, at least at first), as described in the question https://mathoverflow.net/questions/84221/variable-centric-logical-foundation-of-calculus – Toby Bartels Aug 20 '18 at 07:17
  • 1
… To really use notation like $T = k(x^2 + y^2)$, you need to have notation like $T|_{x=1.2,y=1.5}$, but if $T(1.2,1.5)$ is going to be ambiguous, then it's really not any worse, and better than $T(x,y)|_{x=1.2,y=1.5}$. Note that (in a context where it's clear that $y$ is to be held fixed, and assuming that $k$ has been constant all along), $\partial{T}/\partial{x}|_{x=1.2,y=1.5}$ works the same way; there is nothing like $\partial{T(1.2,1.5)}/\partial{x}$ that, when read literally, is asking for a partial derivative of a constant (in this case, of $3.69k$). [And now my comments are done!] – Toby Bartels Aug 20 '18 at 07:23
  • @TobyBartels, I'm not quite sure how to read the penultimate sentence of your comment, but isn't the usual "parenthetical notation" for evaluation of partial derivatives $(\partial T/\partial x)(1.2, 1.5)$ (arguably unambiguous since we already understand a coördinate system just by writing $\partial x$), or probably more usually $\dfrac{\partial T}{\partial x}(1.2, 1.5)$? – LSpice Oct 04 '19 at 16:41
@LSpice : That's not literally meaningful at all, and I have seen $\frac{\partial T(1.2,1.5)}{\partial x}$, especially in handwriting. In practice, of course, we understand it; in this example, it's $\partial T(x,y)/\partial x\rvert_{x=1.2,y=1.5}=D_1T(1.2,1.5)=2.4k$. But why not $\partial T(y,x)/\partial x\rvert_{y=1.2,x=1.5}=D_2T(1.2,1.5)=3.0k$? Only because we know that it's always $T(x,y)$ and never $T(y,x)$, which is a convention that we can reasonably adopt in this context but not across all of mathematics. – Toby Bartels Oct 05 '19 at 18:13
  • To really remove all ambiguity, I should have written $(\partial T(-,-)/\partial x)_y$ whenever I wrote $\partial T(-,-)/\partial x$ in my previous comment; the subscript indicates what quantity is being held fixed. (This is a point that @Kostya_I was making earlier.) – Toby Bartels Oct 05 '19 at 20:39
31

Ramanujan's notebooks are an interesting case study. As discussed in detail in Chapter 24 ("Ramanujan's Theory of Prime Numbers") of Volume IV of Bruce Berndt's series Ramanujan's Notebooks, Ramanujan made a number of errors in his study of $\pi(x)$, the number of primes less than or equal to $x$. It is hard to say for sure that these errors are specifically due to Ramanujan's notation rather than some other misconception, but I think a case can be made that his notation was a contributing factor. For example, Berndt writes:

It is not clear from the notebooks how accurate Ramanujan thought his approximations $R(x)$ and $G(x)$ to $\pi(x)$ were. (Ramanujan always used equality signs in instances where we would use the signs $\approx$, $\sim$, or $\cong$.) According to Hardy, Ramanujan, in fact, claimed that, as $x$ tends to $\infty$, $$\pi(x)-R(x) = O(1) = \pi(x) - G(x),$$ both of which are false.

One could therefore argue that Ramanujan's careless use of equality signs contributed to his overestimating the accuracy of his approximations. On the other hand, one could also argue that Ramanujan's mistake was more fundamental, traceable to his inadequate understanding of the complex zeros of the zeta function.

Ramanujan also used (in effect) the notation $d\pi(x)/dx$, and one could argue that some of his misunderstandings were traceable to not having a proper definition of the notation $d\pi(x)/dx$ and yet assuming that it denoted a definite mathematical object with specific properties. Ramanujan was aware of the need for some justification because Hardy voiced his objections, and attempted to defend his notation (in this context, $n=\pi(x)$):

I think I am correct in using $dn/dx$ which is not the differential coefficient of a discontinuous function but the differential coefficient of an average continuous function passing fairly (though not exactly) through the isolated points. I have used $dn/dx$ in finding the number of numbers of the form $2^p3^q$, $2^p+3^q$, etc., less than $x$ and have got correct results.

However, as Berndt explains, Ramanujan's defense is inadequate. For more discussion, I recommend reading the entire chapter.
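For a sense of the scale involved, here is a small numerical sketch of my own (it compares $\pi(x)$ against the offset logarithmic integral $\operatorname{Li}(x)=\int_2^x dt/\log t$ rather than Ramanujan's $R(x)$ or $G(x)$, and the function names are invented for illustration):

```python
import math

def prime_count(x):
    """pi(x): count the primes <= x with a simple sieve of Eratosthenes."""
    is_prime = [True] * (x + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(x) + 1):
        if is_prime[p]:
            for q in range(p * p, x + 1, p):
                is_prime[q] = False
    return sum(is_prime)

def Li(x, steps=100_000):
    """Offset logarithmic integral, integral from 2 to x of dt/log t (midpoint rule)."""
    h = (x - 2) / steps
    return h * sum(1.0 / math.log(2 + (i + 0.5) * h) for i in range(steps))

x = 10**4
print(prime_count(x))   # 1229
print(round(Li(x), 1))  # about 1245.1: already 16 away from pi(x)
```

Littlewood later proved that $\pi(x)-\operatorname{li}(x)$ changes sign infinitely often and is unbounded, so for approximations of this smooth kind, equality signs, and even $O(1)$ claims, are untenable.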

Timothy Chow
  • 78,129
  • 1
In particular, for $3$-smooth numbers Ramanujan's heuristics made him believe that his approximation was within $O(1)$ of the true value. Hardy and Littlewood later proved error terms of the form $O(x^\vartheta)$ and $O(\log x)$, depending on Diophantine properties of the logarithms of integers, and then showed that in fact the error term is always unbounded! –  Aug 10 '18 at 09:30
  • 4
    This is interesting, but if I understand it correctly, not I think an answer to the original question. It seems to me an instance of a single mathematician using a notation incorrectly, rather than a notation itself being incorrect (and leading to wrong results). – Mike Shulman Aug 10 '18 at 17:47
11

I am aware of a few articles that discuss certain Macdonald polynomials in the introduction (as motivation), and then proceed to study properties of another family of polynomials.

The polynomials actually studied are not the same as the Macdonald polynomials in the introduction (only similar), but the exact same notation/symbol is used for both families of polynomials.

10

Edward Nelson’s claimed proof of the inconsistency of Peano arithmetic (and weaker systems) contained an error that he saw only after Terence Tao refined some notation. Specifically, Tao reformulated Chaitin’s theorem from

Given a theory $T$, there exists an $\ell$ with the property that, if $T$ is consistent, then there does not exist an $x$ such that $T$ can prove $K(x)>\ell$

to

Given a theory $T$, there exists an $\ell(T)$ with the property that, if $T$ is consistent, then there does not exist an $x$ such that $T$ can prove $K(x)>\ell(T)$

In particular, this allowed Tao to note that $\ell(T) < \ell(T’)$ is possible even when $T’$ is a restricted version of $T$, contrary to an implicit assumption in Nelson’s proof.
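The point that the refined notation makes visible, that $\ell$ is a function of the theory and can grow when the theory is cut down, can be caricatured in a few lines (a toy model with entirely hypothetical numbers of my own; in reality $\ell(T)$ comes from the complexity of a proof-enumerator for $T$):

```python
def ell(theory_description: str) -> int:
    """Toy stand-in for Chaitin's constant ell(T): it tracks how complex the
    theory is to *describe*, not how strong the theory is."""
    checker_overhead = 1000  # hypothetical fixed cost of the proof-checking machinery
    return checker_overhead + len(theory_description)

T_full = "PA"
T_restricted = "PA, restricted to proofs whose formulas have bounded depth"

# The restricted theory proves strictly less, yet its constant is larger,
# contrary to the implicit assumption that ell could only shrink.
assert ell(T_full) < ell(T_restricted)
```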

8

In module theory, there is a choice of which side the scalars act on. Then there is also the choice of which side the endomorphisms of the module act on.

Let $M$ be a right module, with scalars coming from a ring $k$, and let $E={\rm End}(M_k)$. If we let $E$ act on the opposite side as $k$, so on the left, then we have a very nice associativity-like compatibility of actions, given by $$e(m\alpha)=(em)\alpha,$$ for any $e\in E$, $m\in M$, and $\alpha\in k$.
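This "move the parentheses" compatibility is nothing but associativity, which one can check concretely by taking $M = k = M_2(\mathbb{Z})$, with $\operatorname{End}(M_k)$ realized as left multiplications (a minimal sketch; the helper `matmul` is mine):

```python
def matmul(A, B):
    """Product of 2x2 integer matrices given as nested lists."""
    return [[sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]

e = [[1, 2], [0, 1]]  # endomorphism of M_k, written on the left
m = [[3, 1], [4, 1]]  # element of the right k-module M = M_2(Z)
a = [[0, 1], [1, 0]]  # scalar from k = M_2(Z), acting on the right

# e(m·a) = (e·m)·a: the compatibility law is plain associativity.
assert matmul(e, matmul(m, a)) == matmul(matmul(e, m), a)
# The scalars themselves do not commute, so the sides genuinely matter.
assert matmul(m, a) != matmul(a, m)
```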

On the other hand, if we have $k$ and $E$ act on the same side, then the compatibility rule for the actions is ugly and harder to keep track of. I've seen wrong proofs because authors let the endomorphisms act on the same side as the scalars but then misremembered the weird compatibility/multiplication rules.

Something similar happens with group actions, where if the action happens on the "wrong" side, then some inverses have to be added, and a natural associativity kind of rule instead has the order of the elements all messed up. (I've talked to some group theorists who were not even aware that their ugly group action rules could be fixed by acting on the other side.)
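The "inverses have to be added" fix for wrong-sided group actions can be checked concretely with permutations (a sketch of mine; `compose` is composition in the usual right-to-left order):

```python
def compose(g, h):
    """(g∘h)(i) = g(h(i)) for permutations of {0,...,n-1} given as tuples."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

g = (1, 2, 0)  # a permutation of {0, 1, 2}
h = (0, 2, 1)
x = 0          # a point being acted on

# Left action on the "correct" side: (gh)·x = g·(h·x) holds on the nose.
assert compose(g, h)[x] == g[h[x]]

# A right action built from the same maps needs an inverse: with x·g := g^{-1}(x),
# associativity in the right order, (x·g)·h = x·(gh), is restored.
xg = inverse(g)[x]
assert inverse(h)[xg] == inverse(compose(g, h))[x]
```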

There are a couple of historical reasons for this. First, we are used to composing functions on the left, but we also like working with left modules; writing the endomorphisms of a left module on the right then forces an unfamiliar right-hand composition rule.

Second, many people first learn about modules with scalars from a commutative ring (e.g., vector spaces). This sometimes leads to problems, as people get too comfortable moving scalars around willy-nilly, even though this generally only works over commutative rings. In general, we can view any right $R$-module $M_R$ as a left $R^{\rm op}$-module. Some authors are simply unaware of this "opposite ring" issue, since it magically disappears when working with commutative rings. I've seen authors make errors in proofs by claiming some sort of simultaneous left and right module structure over a general ring $R$, when in fact this only works if $R$ is commutative. One also has to be careful not to confuse the endomorphism ring of $M_k$ with the endomorphism ring of $_kM$ (even when $k$ is commutative)! This, I think, is where this answers the OP, because the "understood" way to manipulate the syntax (just moving scalars to the other side) is a misunderstanding of what is really happening (both moving and "oppositivizing").

In summary: any module notation that does not carefully keep track of which side the scalars and endomorphisms act on, and that does not put the endomorphisms on the opposite side from the scalars, inadvertently hides naturality and often leads to hard-to-remember multiplication rules, making errors in proofs much easier.

Pace Nielsen
  • 18,047
  • 4
  • 72
  • 133
  • I'm a little confused about your accidental isomorphism. Why does it fail to be natural? Isn't the automorphism $(-)^{\mathrm{op}}$ of the category $\rm CRing$ naturally isomorphic to the identity functor? (Actually, isn't it just equal to the identity functor?) Is there some other kind of naturality that fails? – Mike Shulman Nov 15 '22 at 01:55
  • 1
    @MikeShulman Yes, in the category of commutative rings. But rings, and particularly endomorphism rings, are not generally commutative. So, depending on the situation, care is called for. – Pace Nielsen Nov 15 '22 at 03:59
  • Right, but a noncommutative ring is not isomorphic to its opposite at all. What confuses me is your statement that "a commutative ring is accidentally isomorphic to its opposite ring. I've seen some authors make errors in proofs by claiming that this accidental isomorphism somehow lifts to a natural map," in which you were talking about commutative rings. – Mike Shulman Nov 15 '22 at 23:52
  • @MikeShulman I've edited that paragraph, to try to clarify what was meant. If it is still confusing, just let me know. – Pace Nielsen Nov 16 '22 at 00:06
Sorry, I'm confused: if $M$ is a left $R$-module and $E$ is an $R$-linear endomorphism, isn't the compatibility the same? It's just $f(ax)=af(x)$, which is just the definition of linear. Do you mean if someone puts the endomorphism action on the right? – davik Nov 16 '22 at 02:07
  • @davik If $M$ is a left $R$-module and $E$ is its endomorphism ring, and we write $E$ on the left, then yes the compatibility map becomes the awkward "commute through" equality $e(rm)=r(em)$, rather than the "move parentheses" equality $(rm)e=r(me)$ that occurs when $E$ is written on the right. On the left, $E$ uses the usual left-hand composition rule $(ee')m=e(e'm)$, but on the right it is more natural to use the right-hand composition rule $m(e'e)=(me')e$ so that again we just move parentheses. – Pace Nielsen Nov 16 '22 at 15:36
  • @davik If we follow your suggestion, to write endomorphisms on the left of left modules, then ${\rm End}(_R R)$ is naturally isomorphic to $R^{\rm op}$, rather than $R$ (which is what we get if we write endos on the right). – Pace Nielsen Nov 16 '22 at 15:37
  • Thanks for the correction. I'm not convinced that this is an issue of notation being wrong, though, but of people just not being aware of a subtle mathematical issue. – Mike Shulman Nov 16 '22 at 15:43
  • @MikeShulman But wasn't "not being aware" what you asked for? You define wrongness as follows (bold mine): "And by saying that a notation is incorrect, I mean that the "understood" way to interpret the syntax as mathematical objects is not actually well-defined in general, or that the rules for manipulating the syntax don't correspond to the way those objects actually behave." – Pace Nielsen Nov 16 '22 at 16:18
  • I also said "I'm only interested in cases where the underlying mathematics was precisely defined and correct, from a modern perspective, with the mistake only lying in an incorrect notation or an incorrect use of that notation." It seems to me that confusing left and right actions of a ring is a purely mathematical error, which may be exacerbated by poor notation (like many mathematical errors), but is not fundamentally about notation. – Mike Shulman Nov 17 '22 at 05:03
  • @MikeShulman Ok, I think I'm understanding now. Just to clarify, when you say "...or an incorrect use of that notation" it now appears to me that you don't mean a personal misuse of the notation that is based on a misunderstanding of how to use the notation, but rather I now think you mean that the use of the notation was the syntactically accepted use, but that the accepted syntactic manipulations do not always conform to the objects being modelled by the notation. Is that close? If so, doesn't that make the notation ill-defined in the first place? – Pace Nielsen Nov 17 '22 at 05:33
  • Yes, that's right. – Mike Shulman Nov 18 '22 at 01:04
  • As I said, "by saying that a notation is incorrect, I mean that the 'understood' way to interpret the syntax as mathematical objects is not actually well-defined in general, or that the rules for manipulating the syntax don't correspond to the way those objects actually behave." – Mike Shulman Nov 18 '22 at 01:05
  • @MikeShulman In that case, Part III of my new answer might meet this criterion. Set collection notation was not well-defined in general (being inconsistent, due to Russell's paradox), but it was used by prominent mathematicians and believed (initially) to refer to real objects of study. – Pace Nielsen Nov 18 '22 at 01:08
1

I'm not entirely sure what the difference between wrong notation and wrong "underlying mathematics" is, so I'm going to present a few different examples, and hopefully this clarifies the question. (Perhaps one of these examples provides an answer to the OP; maybe the one about set collection?)

I. Incorrect transfer of notation

The notion of prime factorization was studied by the early Pythagoreans and Euclid in his Elements. Over time, a notation/terminology for this behavior was built up. So if someone wrote “Let $a=p_1p_2 \dotsm p_k$ be the prime factorization of $a$.” then it was clear what was meant. This factorization was unique, up to order, and up to unit multiples, and couldn't be extended further.

When more general rings of integers began to be studied, the notation of prime factorization was transferred to this new setting. In that context, an incorrect proof of Fermat's Last Theorem was given. Why was it incorrect? Because the notation only applies to UFDs.
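The classical counterexample behind that failure can be verified mechanically: in $\mathbb{Z}[\sqrt{-5}]$ the element $6$ has two genuinely different factorizations into irreducibles (a quick sketch of mine, representing $a + b\sqrt{-5}$ as the coefficient pair $(a, b)$):

```python
def mul(u, v):
    """(a + b*sqrt(-5)) * (c + d*sqrt(-5)), as coefficient pairs (a, b)."""
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)

def norm(u):
    """Multiplicative norm N(a + b*sqrt(-5)) = a^2 + 5*b^2."""
    a, b = u
    return a * a + 5 * b * b

# 6 = 2 * 3 = (1 + sqrt(-5)) * (1 - sqrt(-5))
assert mul((2, 0), (3, 0)) == (6, 0)
assert mul((1, 1), (1, -1)) == (6, 0)

# Norms 4, 9, 6, 6: since a^2 + 5*b^2 never equals 2 or 3, none of the four
# factors splits further, so both factorizations are into irreducibles.
assert [norm(u) for u in [(2, 0), (3, 0), (1, 1), (1, -1)]] == [4, 9, 6, 6]
```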

This specific incorrect use of notation has had ripple effects in the field of abstract algebra, and has led many algebraists to be very careful when assuming behaviors under different assumptions. We don't want to repeat that fundamental mistake.

In modern mathematics, one effect of this is that many statements come with long lists of very precise hypotheses.

Other examples of "incorrect transfer" abound. As a noncommutative ring theorist, I see this all the time, when looking at transfers from the commutative setting.

II. Imprecise reference to modeled objects

Nowadays we take the notation "Let $f$ be a function" for granted. But it actually took a while for mathematicians to formalize exactly what a function is (and for that formalization to be almost universally accepted). A function could have been all sorts of different things, including Dirac's delta "function".

I also view Joel's answer, of ZFC-P, to be of the "imprecise reference" type. The original understanding of that notation was not precise enough to ground it in a specific theory.

III. Inconsistent notation

Some notation refers to nonexistent objects. For instance, in set theory Kunen's inconsistency theorem tells us (among other things) that there is no nontrivial elementary embedding $j\colon V\to V$. The notation $j$ I just introduced refers to no actual object.

This is a feature of mathematics, not a bug. It is common in many branches of mathematics to introduce notation for things that later turn out not to exist at all. There is nothing being modeled by the notation, and yet the notation is useful nonsense!

One of the famous examples of an inconsistent notation, that was initially believed by prominent mathematicians to be consistent, was that of arbitrary collection. As Russell and others noticed, the collection $\{x\, :\, x\notin x\}$ of all sets that aren't members of themselves seems innocuous enough, but it is self-contradictory.
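Written out, the contradiction is a one-liner: if the notation $\{x\, :\, x\notin x\}$ denoted a set $R$, then instantiating the defining condition at $R$ itself would give $$R\in R \iff R\notin R,$$ so the comprehension notation $\{x\, :\, \varphi(x)\}$ cannot denote an object for arbitrary $\varphi$.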

IV. Misleading or overused notation

One of the easiest traps to fall into, regarding notation, is when that notation itself is misleading.

For instance, mathematicians often overuse the equals sign "=", having it mean different things in different contexts. Occasionally, it is even used for a non-symmetric relation symbol, such as in $x=O(x)$, even though the symbol "=" itself is visually symmetric. We are trained to treat symmetrically written relation symbols as implicitly symmetric.

In a separate answer, I mentioned how modules over commutative rings exhibit a similar issue. If $k$ is a commutative ring and $M$ is a module over it, then we want to say $_k M=M_k$, because either module structure gives us the "same" information. Yet a left module is different from a right module, and their endomorphism rings differ in meaningful ways.

Matt F.'s answer also seems to fall in this category. It is dangerous to remove parameters from a notation.

Per Alexandersson's answer is another instance of this phenomenon. Using the same notation in the same article for two different types of objects is misleading.

V. Unformalizable notation

Some errors in proofs come from incorrect uses of language. (Here, I'm treating language itself as a notation for the mathematical concepts under study.) Some concepts we think are well-defined, since we use them in our everyday conversations, are in fact not. This is brought home very well with some of the classic liar paradoxes, such as "This sentence is false." Typically, many of these paradoxes are resolved by noting a failure to recognize the "use/mention distinction" in language.

Pace Nielsen
  • 18,047
  • 4
  • 72
  • 133
  • 1
An exception to the law of visual symmetry, which has always bugged me: the divisibility symbol $\mid$, as in $2 \mid 6$, is visually symmetric, but we had better not write $6 \mid 2$ (at least if our underlying ring is $\mathbb Z$). My Discrete-Mathematics students often confuse $a \mid b$ with $a/b$, and so write, e.g., $6 \mid 2$ or, even worse, $6 \mid 2 = 3$. I have long wanted something like a less clunky version of $\vert\!\!\!\!\!\rightarrow$, e.g., $2 \mathrel{\vert\!\!\!\!\!\rightarrow} 6$ but $6 \mathrel{\vert\!\!\!\!\!\leftarrow} 2$. – LSpice Nov 18 '22 at 01:36
  • I wouldn't count II, III, or IV. Re II, I'm interested in "cases where the underlying mathematics was precisely defined and correct, from a modern perspective", and here the notion of function was not precisely defined. Re III, introducing a notation for an object that turns out later to be impossible is not wrong, it's just the way variables and quantifiers work; and I would say that Russell's paradox is about the mathematics of unrestricted comprehension, not the notation for it. And IV seems like individual misuse of a notation, rather than the notation itself being incorrect. – Mike Shulman Nov 18 '22 at 02:06
  • @LSpice I agree, but it's hardly alone in that. The subtraction sign $-$ is also visually symmetric. – Mike Shulman Nov 18 '22 at 02:07
  • Your example I is more interesting. My first instinct was to say that this is also a mathematical mistake, not a notational one: people were implicitly assuming that all rings of integers had unique factorization. But then I realized that any sort of incorrect notation could be argued to be a mathematical mistake too: "people were implicitly assuming that the notation was well-defined and correct"! – Mike Shulman Nov 18 '22 at 02:12
  • But I still don't really feel like "let $a=p_1p_2\dots p_n$ be the prime factorization of $a$" can be called "a notation" in its own right. It's not a new system of syntax; it's just applying ordinary preexisting mathematical notation to an object that's being implicitly assumed to exist. – Mike Shulman Nov 18 '22 at 02:12
  • @MikeShulman, re, but $-$ is an operation, not a relation (or, I suppose, it is a ternary rather than a binary relation). I think, but could easily be wrong, that symmetric symbols for non-symmetric binary relations are rather rare. – LSpice Nov 18 '22 at 03:40
  • @MikeShulman "and I would say that Russell's paradox is about the mathematics of unrestricted comprehension, not the notation for it" I would argue it is about both the mathematics and the notation. The notational calculus you get when using notations like $\{x : x\notin x\}$ is incorrect, as is the underlying mathematical theory of unrestricted collection. Gödel's completeness theorem tells us that any consistent theory (in FOL) has a model, and vice versa. You can always change definedness issues with the syntax into definedness issues with the objects of study. – Pace Nielsen Nov 18 '22 at 04:00
  • I would not call a notational calculus incorrect just because the mathematical theory that it denotes is inconsistent. If the notations correctly interpret to objects of that theory, the notation is correct, even if the theory is inconsistent. If one were using unrestricted comprehension notation consciously and believing that it always denoted sets in ZFC, then that would be an incorrect notation, but that's not what pre-ZFC mathematicians were doing: they were working with a mathematical theory that really did have unrestricted comprehension (and hence was inconsistent). – Mike Shulman Nov 19 '22 at 02:29
  • @MikeShulman I'm not sure I understand your response. The point is that the notational calculus leads to inconsistent results. It would be like adding $1+1$ and sometimes getting $2$, but other times getting $3$ (even though the theory says $2\neq 3$). By the way, when working with an inconsistent theory, then there are no objects to be interpreted. There can be models of subtheories, but not of the entire theory. – Pace Nielsen Nov 19 '22 at 04:53
  • Even an inconsistent theory has a model in the trivial topos, where both $2=3$ and $2\neq 3$ hold. The inconsistency is in the theory, not the notation. – Mike Shulman Nov 19 '22 at 22:41
  • But even if there were no models at all, it would still be true (vacuously) that everything in the notational calculus can be interpreted as intended in every model! – Mike Shulman Nov 19 '22 at 23:00
  • @LSpice How about $2 \preceq 6$ or $2 \trianglelefteq 6$? They resemble $\leq$ and play its role in the divisibility lattice. – user76284 Dec 14 '22 at 01:37
  • @user76284, interesting idea. I'd definitely reject $2 \preceq 6$ in this context; students already don't distinguish $a \mid b$ from $a/b$, and I wouldn't want to rely on them distinguishing $a \preceq b$ from $a \le b$, especially when the former "often implies" the latter (whatever that means). But $a \unlhd b$ … maybe! – LSpice Dec 14 '22 at 02:12
  • @user76284, the more I think about it, the more I like the idea of $\unlhd$ as a visual combination of $\le$ and $\mid$. I'm not sure I'm brave enough actually to use it, but I like it! – LSpice Dec 14 '22 at 20:12
  • @LSpice You can always define it at the beginning of a text :) – user76284 Dec 14 '22 at 20:56