21

An example of bad mathematical notation that comes to mind, and that has caused complications throughout history, is the notation for imaginary numbers. The original notation used to represent imaginary numbers was "$\sqrt{-1}$", where the square root symbol was used to indicate the square root of a negative number. However, this notation can be confusing and misleading, as the square roots of negative numbers cannot be represented as real numbers.

The Swiss mathematician Leonhard Euler, in the late 18th century, proposed a new notation to represent imaginary numbers. He introduced the letter "$i$" to represent a square root of $-1$ (i.e. $i^2 = -1$). This notation simplified and clarified the representation of imaginary numbers, allowing a better understanding and manipulation of them in calculations and equations.

The original notation, using the square root symbol for imaginary numbers, led to misunderstanding and confusion in mathematics. For example, people might think that $\sqrt{-1}\cdot\sqrt{-1} = \sqrt{(-1)\cdot(-1)} = \sqrt{1} = 1$, which is clearly incorrect, since $i^2 = -1$. The introduction of the "$i$" notation for imaginary numbers helped solve these problems and allowed further development in the field of complex mathematics.
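The failure of the product rule is easy to check numerically. The following is only a sketch, assuming Python's standard `cmath` module, whose `sqrt` returns the principal square root (so `cmath.sqrt(-1)` is `1j`); the variable names are illustrative.

```python
import cmath

# Principal square roots, as computed by cmath.sqrt.
a, b = -1, -1

lhs = cmath.sqrt(a) * cmath.sqrt(b)  # i * i
rhs = cmath.sqrt(a * b)              # principal square root of 1

print(lhs)  # (-1+0j)
print(rhs)  # (1+0j)

# For non-negative real arguments the familiar rule does hold.
c, d = 4, 9
ok = cmath.sqrt(c) * cmath.sqrt(d) == cmath.sqrt(c * d)
print(ok)  # True
```

The rule $\sqrt{ab} = \sqrt{a}\sqrt{b}$ is a statement about one particular choice of square root, and that choice is not multiplicative once negative arguments are allowed.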

I am looking for other examples of bad notation and its bad consequences.

Naturally, bad and good are in the eye of the beholder, but I am eager to hear from members of this community about the topic: their examples, their opinions, and references where the subject is studied in a scholarly way.

  • 9
    This is a good question for ChatGPT. – Iosif Pinelis Apr 19 '23 at 14:04
  • 23
    I disagree that $\sqrt{-1}$ is necessarily a bad notation. It might be pedagogically bad for mathematically inexperienced first-timers in complex numbers, but that's not enough to disqualify it. Arguably, it has certain shock value, which has turned it into a symbol on its own, subconsciously buried into every mathematician's mind: it hits you even before you digest the other parts of a formula. Many classical (and even newer) books on SVCs and Complex Geometry use it instead of $i$, which frees the latter to be used as an index, e.g. Donu Arapura's book, Griffiths and Harris' book etc. – M.G. Apr 19 '23 at 15:12
  • 7
    Cont. In a similar fashion we have for example the notation $\mathbb{Q}(\sqrt{-d})$ in Algebraic Number Theory, the meaning of which is immediately clear. – M.G. Apr 19 '23 at 15:14
  • 14
    As suggested by @M.G., I think that $\sqrt{-1}$ is just fine. Where I see the problem is with thinking that "$\sqrt{-1}$ is the square root of $-1$", emphasis mine, because there is no such thing as the square root of $-1$ (unless you work in characteristic $2$). If you think that $\sqrt{-1}$ is a square root of $-1$, then you should be untroubled by the suggestion that $\sqrt{-1}\cdot\sqrt{-1} = \sqrt{1}$: it says that the product of two square roots of $-1$ (which need not even be the same!) is a square root of $1$, which is true. – LSpice Apr 19 '23 at 17:10
  • 3
    @M.G.: I'm wondering whether this might be an analysis vs. algebra cultural thing. – Jochen Glueck Apr 19 '23 at 17:13
  • 1
    @JochenGlueck: that is very likely, in my opinion. Yet, the "problem" in the OP is not a problem as most functional equations involving usual real functions fail for their complex counterparts, which are generally multivalued. This has clearly nothing to do with the rightness of the notation for $\sqrt{-1}$, people just have to forget about (or, better, adapt) the multiplicative law for square roots when going to $\mathbb{C}$. – Loïc Teyssier Apr 19 '23 at 17:46
  • 1
    @LSpice: my take on $\sqrt{-1}$ vs. multiple roots is that, even though there are multiple different roots, there is no preferred one (having a preferred one being an algebraic impossibility). You pick one, but it doesn't matter which, because any choice works equally well (as long as you are consistent in your choice). – M.G. Apr 19 '23 at 18:18
  • 2
    @JochenGlueck: could well be. My impression is that people tend to use $\sqrt{-1}$ instead of $i$ when they deal with more than 1-2 complex variables (be it analysis or geometry) or an indefinite amount, so that they can have for ex. $z_i$, $1 \leq i \leq n$, or some tensors in coordinates. Replacing $i$ by $\sqrt{-1}$ in single variable complex analysis or Riemann surfaces seems less beneficial. Personally, I prefer to write $i$ instead of $\sqrt{-1}$ whenever possible b/c it's shorter, but I prefer to read $\sqrt{-1}$ rather than $i$ :-) Also, $\sqrt{-1}$ seems perhaps a bit old-fashioned. – M.G. Apr 19 '23 at 18:40
  • 2
    There are a few notations I find irritating. For one, $\mathbb{Z}/n\mathbb{Z}$ is incredibly inconvenient to write or type. God forbid you have to apply any operations to it and parenthesize it as well. I also dislike $a \equiv b$ (mod $n$). I think $a \equiv_n b$ is much better. Neither of these notations are unclear, but they are just tedious compared to superior alternatives. – Bma Apr 19 '23 at 23:36
  • 2
    @Bma, re, aside from $\mathbb Z/n$, which is shorter but otherwise surely not much more convenient as a subject of operations and parentheses, what are the superior alternatives to $\mathbb Z/n\mathbb Z$? (Although there is disagreement on this, as a $p$-adicist I regard $\mathbb Z_n$ as taken, at least for $n$ prime.) – LSpice Apr 20 '23 at 00:51
  • 2
    @LSpice That was one I had in mind actually. I think $\mathbb{Z} /n$ saves quite a bit of time and angst, although it’s still pretty rough. $\mathbb{Z}_n$ is definitely not an option. I’m almost drawn to using $C_n$ instead ($C$ for cyclic group). – Bma Apr 20 '23 at 01:02
  • 2
    @Bma, re, I use $\mathrm C_n$ when I can get away with it. That's two of us! – LSpice Apr 20 '23 at 01:09
  • 5
    The use of $\lim$ as an operator is a serious problem for my analysis students. They have trouble remembering that you can't even write $\lim a_n$ until you have first proved that the limit exists. So an equation like $\lim a_n = 5$ may be neither true nor false but actually ill-defined. By contrast, using $\to$ as a relation, the formula $a_n \to 5$ at least has a definite truth value no matter what sequence $a_n$ is. – Nate Eldredge Apr 20 '23 at 01:34
  • 1
    @LSpice Another choice is $\mathbb{Z}_{/n}$. I think it's a bit ugly/clunky (although this is the notation I personally use), but at least it's kinda convenient – Emily Apr 20 '23 at 02:11
  • 2
    I think that there are bad notations and confusing notations: confusing notations often use the same symbol for two different things, depending on context. My favourite example of a confusing notation is $f^{-1}$. It can be $1/f$, or the preimage, or the inverse of a bijection. All three are perfectly fine. – Roland Bacher Apr 20 '23 at 06:15
  • 2
    The composition $A\xrightarrow{f}B\xrightarrow{g}C$ is denoted by $g\circ f$, in which the order of the maps is reversed. I rather prefer $(x)f$ over $f(x)$. – Bumblebee Apr 20 '23 at 10:38
  • 2
    There are quite a few examples of confusing notation in theoretical physics, like a quantity will be written down as if it were a scalar when it is actually a matrix. – Hollis Williams Apr 20 '23 at 11:15
  • In electrical engineering (EE) we use "j", as "i" is reserved for current -- from Wikipedia: French phrase intensité du courant (intensity of current, which we usually just call current). – JosephDoggie Apr 20 '23 at 19:46
  • Notations fixing a unique square root of something can confuse even people who ought to know better. The Algebra II textbook that my high school used stated in chapter 8 (as an explicit, albeit unproved, theorem) that $\sqrt{ab}=\sqrt{a}\sqrt{b}$ for all real $a$ and $b$. Then in chapter 9, it introduced complex numbers, and I pointed out that the "theorem" actually only held for real $\sqrt{a}$ and $\sqrt{b}$. The teacher did not want to get into it, for fear of confusing the other students. – Buzz Apr 21 '23 at 18:48
  • I have long had a problem with precisely $\sqrt{-1}$, because it neglects to specify if it should mean $i$ or $-i$. – Daniel Asimov Aug 20 '23 at 08:18
  • I also have a big problem with the math police closing a perfectly interesting and extremely useful question like this one. When that happens, everyone loses. (Except for the math police, who apparently imagine they are doing a good deed. They are not.) – Daniel Asimov Aug 20 '23 at 08:20

8 Answers

16

In my opinion, bad notations are often confusing notations, which have several possible meanings.

For example, the use of parentheses for too many things can be confusing:

  • argument of functions,
  • priority in operations,
  • open intervals (I do prefer the French notation $]a,b[$),
  • ordered pairs, triples, ...,
  • row matrices,
  • cycles (permutations).

This has troubled me a lot when teaching polynomials. For example, $Q(X-\alpha)$ usually stands for the composition $Q \circ (X-\alpha)$, whereas $(X-\alpha)Q$ is used for the product. The same happens when working with linear maps on $\mathbb{R}^n$, because we use the same notation when we multiply a vector $(x_1,\ldots,x_n)$ by a real number and when we apply a map to this vector.

Using superscripts that can be misunderstood as exponents is also problematic. In my opinion, the two-dimensional Euclidean unit sphere should be denoted by $\mathbb{S}_2$ and not $\mathbb{S}^2$.

In differential calculus, when I was a student, I was puzzled the first time the teacher computed the second differential of $g \circ f$ by differentiating $D(g \circ f) = (Dg \circ f) \circ Df$. When trying to understand the computation, I finally realized that the symbol $\circ$ used twice in the right-hand side does not have the same meaning at the two occurrences. The correct statement is $D(g \circ f)(x) = Dg(f(x)) \circ Df(x)$ for all $x$.

I now give two notations that have a lot of advantages, but occasionally cause trouble.

In probability theory, the intuitive notation $f(X)$ for $f \circ X$ may lead to errors, for example in the formula for the conditional expectation of a bounded function of two independent random variables $X$ and $Y$, namely $E[f(X,Y)|X] = g(X)$ where $g(x):=E[f(x,Y)]$. It is very tempting but false to write $g(X)=E[f(X,Y)]$.
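A small exact computation makes the distinction concrete. This is a sketch under assumptions that are not in the text above: $X$ and $Y$ are taken to be independent fair coins with values in $\{0,1\}$, and $f(x,y) = x + xy$ is an arbitrary illustrative choice.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy model: X and Y are independent fair coins with values in {0, 1}.
values = [0, 1]
p = Fraction(1, 2)

def f(x, y):
    # An arbitrary bounded function of two variables, chosen for illustration.
    return x + x * y

def g(x):
    # g(x) := E[f(x, Y)], where x is a fixed number, not a random variable.
    return sum(p * f(x, y) for y in values)

# E[f(X, Y) | X] is the random variable g(X): one value for each outcome of X.
cond_exp = {x: g(x) for x in values}

# E[f(X, Y)] is a single number, obtained by averaging over both X and Y.
full_exp = sum(p * p * f(x, y) for x, y in product(values, values))

print(cond_exp)  # {0: Fraction(0, 1), 1: Fraction(3, 2)}
print(full_exp)  # 3/4
```

Here $g(X)$ takes the values $0$ and $3/2$, whereas $E[f(X,Y)] = 3/4$ is a constant, so substituting $X$ for $x$ inside the expectation changes the meaning of the expression.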

Polynomials of matrices also lead to confusion, like the false proof of the Cayley–Hamilton theorem by replacing $X$ by $A$ in the equality $\chi_A(X) = \det(XI-A)$. Distinguishing the $0$ in $K[X]$ from the $0$ in $\mathcal{M}_n(K)$ is a simple way to convince students that this proof is false.

Last, I hate notations like $$f(x) = g(x), \quad x \in \mathbb{R},$$ which may mean $f(x) = g(x)$ for some $x \in \mathbb{R}$, or $f(x) = g(x)$ for every $x \in \mathbb{R}$, depending on the circumstances.

A good way to prevent a lot of problems is to keep in mind:

  • the definitions,
  • in which sets the objects live,
  • for which elements equalities or properties hold.
  • 9
    $D(g \circ f) = (Dg \circ f) \circ Df$. WTF? That should be $D(g \circ f) = (Dg \circ f) \cdot Df$, then there's nothing wrong with it. – wlad Apr 19 '23 at 22:09
  • 21
    The obvious problem with the French notation for open intervals is bracket pairing — it looks syntactically ill-formed. Your sentence "(I do prefer the French notation $]a,b[$)" highlights this issue: it looks like the matching sets of brackets are of the form ($\ldots$] and [$\ldots$), which is not what is intended. ]Even worse is a negative number of opened brackets.[ – R. van Dobben de Bruyn Apr 19 '23 at 22:42
  • 8
    In the same vein, "$a \leq x,y \leq b$" is a horror show. Both possible meanings are widespread in the literature. – darij grinberg Apr 20 '23 at 00:21
  • 16
    @wlad, re, I think that $D(g \circ f) = (Dg \circ f) \circ Df$ comes from thinking of the derivatives as linear operators (to be composed) rather than as matrices (to be multiplied). – LSpice Apr 20 '23 at 00:56
  • 8
    @R.vanDobbendeBruyn: I don't see how $(1,2]$ is less ill-formed in bracket-pairing than $]1,2]$. You just can't parse expressions containing intervals in the usual way, that's all. – Loïc Teyssier Apr 20 '23 at 08:27
  • 2
    @LoïcTeyssier again stating the obvious: there is a difference between not closing every bracket with the same type or haphazardly opening and closing brackets everywhere. A syntax where ( and [ always increase the parenthesis state by one and ) and ] decrease it still has a semantics-independent parity check ($\mathbf N$-valued, not $\mathbf Z/2$-valued). This also serves a role in readability: opening a bracket makes you look ahead, whereas closing one makes you look back at what came before it. – R. van Dobben de Bruyn Apr 20 '23 at 21:57
  • 2
    As a topologist, I do think of the superscript in $\mathbb{S}^2$ as an exponent, as $\mathbb{S}^2 = \mathbb{S}^1\wedge\mathbb{S}^1$ is the smash product of two 1-spheres, and the $n$-sphere is the smash product of $n$ 1-spheres. This is very useful. – Steve Costenoble Apr 21 '23 at 00:17
  • 4
    @LoïcTeyssier I can see how we can have some fun with that notation if we also allow multiplication of an interval by a scalar (e.g., $3]a,b] = ]3a,3b] = ]a,b]3$). Then if I refer to $[a,b[c,d]e,f]$ and $[g,h]$, am I referring to three intervals $[a,b[c = [ac,bc[$ and $d]e,f] = ]de,df]$ and $[g,h]$, or am I referring to a 3-tuple and a 2-tuple, where the 3-tuple consists of $a$, the interval $b[c,d]e = [bce,bde]$, and $f$? – Timothy Chow Apr 21 '23 at 01:10
  • @R.vanDobbendeBruyn ]Thanks a lot, now I'm going to spend all day with negative unresolved tension. – Carl-Fredrik Nyberg Brodda Apr 21 '23 at 06:04
  • @darij grinberg, please could you give one or two references? – Humberto José Bortolossi Apr 27 '23 at 23:47
  • @HumbertoJoséBortolossi: Theorem 2.1 (b) in https://arxiv.org/abs/math/0005260v1 is an example where it means "$a\leq x$ and $y \leq b$". In matrix algebra, it usually means "both $x$ and $y$ belong to $[a, b]$". – darij grinberg Apr 28 '23 at 00:34
  • @LSpice: I'm quite late to this post, but I believe that one correct way to write that $D(g\circ f)(x)=Dg(f(x))\circ Df(x)$ for all $x$ is $D(g\circ f)=\operatorname{comp}\circ\,(Dg\circ f,Df)$, where $\operatorname{comp}$ is the function given by $\operatorname{comp}(\phi,\psi)=\phi\circ \psi$ for linear maps $\phi$ and $\psi$. It can be shown that $\operatorname{comp}$ is a smooth function, which is an ingredient in the proof in Dieudonné's Foundations of Modern Analysis that the composition of smooth functions (between Banach spaces) is smooth (Chapter 12, p. 183). – Joe Mar 02 '24 at 20:54
15

Suppose that $A$ is an oracle; then it is standard to write $\mathsf{P}^A$ for the complexity class $\mathsf{P}$ relativized to $A$. As I have mentioned elsewhere on MO, this is incredibly confusing notation. It can lead to the following spurious argument that has confused generations of students. Assume that $\mathsf{P}=\mathsf{NP}$. Then for all oracles $A$, $\mathsf{P}^A=\mathsf{NP}^A$. But by Baker–Gill–Solovay, we know that there exists an oracle $A$ such that $\mathsf{P}^A\ne \mathsf{NP}^A$. This is a contradiction. Hence $\mathsf{P}\ne \mathsf{NP}$.

Timothy Chow
  • 78,129
  • 1
    Do you have a preferred solution to this? What about P^0 and NP^0? – user21820 Apr 20 '23 at 13:36
  • 2
    @user21820 I admit I don't have a perfect solution, but I think there ought to be some way to indicate that the oracle is being applied to the machine model and not the language. Maybe something like $\mathsf{P}_{T^A}$ instead of $\mathsf{P}^A$ where $T$ symbolizes the Turing machine model. One might still use $\mathsf{P}^A$ (or maybe $\mathsf{P}_A$) for brevity when there is no danger of confusion, but there should be a way to revert to a more precise notation when necessary to avoid confusion. – Timothy Chow Apr 20 '23 at 14:28
  • 2
    For the benefit of someone whose last computability theory was back in their undergraduate days, what is the wrong step here? Is it that equal complexity classes need not have equal relativisations? – LSpice Apr 20 '23 at 23:39
  • 5
    @LSpice Yes. A complexity class is a set of strings. Contrary to what the notation seems to suggest, relativizing to an oracle is not an operation that applies directly to the set of strings; it's a modification of the machine. There may be two different "conditions" which cause the machine to accept equivalent sets of strings, but that doesn't imply that imposing the corresponding conditions on the modified machine will still cause it to accept equivalent sets of strings. – Timothy Chow Apr 21 '23 at 00:49
  • 1
    Minor correction: A complexity class is a set of languages, and a language is a set of strings. – Timothy Chow Apr 21 '23 at 04:35
  • Hmm, I like $P_{T^A}$ even less. It doesn't solve the issue you raised... The reason I suggested "P^0" is that as long as you think of "P" as the base model and "P^A" as the result of adding "A" to the base model "P" to get a complexity class, then everything is fine, since "^0" is equivalent to no oracle. Unfortunately, "^0" is really cumbersome. – user21820 Apr 21 '23 at 07:08
  • Another possible solution is that P and NP and other such terms should be defined as models(!) rather than complexity classes, and then the question of equal classes must be expressed differently, say via "P ≡ NP". It is then clear that it may be that P ≡ NP but P^A ≢ NP^A for some A, just like 6 ≡ 10 (mod 4) but 6/2 ≢ 10/2 (mod 4). – user21820 Apr 21 '23 at 07:12
  • @user21820 Redefining $\mathsf{P}$ and $\mathsf{NP}$ to be something other than complexity classes strikes me as too extreme. But whatever the limitations of my suggestion $\mathsf{P}_{T^A}$, I think it does address the issue I raised. One is not tempted to infer from $\mathsf{P}_{T^\varnothing} = \mathsf{NP}_{T^\varnothing}$ that $\mathsf{P}_{T^A} = \mathsf{NP}_{T^A}$ for all $A$. Or at the very least, the notation alerts you to the fact that $A$ is not being applied to $\mathsf{P}$ or to $\mathsf{NP}$ directly. – Timothy Chow Apr 21 '23 at 11:51
  • @user21820 Rereading more carefully what you wrote, I believe that what I'm suggesting is not so different from your suggestion. Part of what I'm saying is that the "full notation" for $\mathsf{P}$ would be $\mathsf{P}_{T^\varnothing}$. One would use the abbreviation $\mathsf{P}$ only when there is no danger of confusion. – Timothy Chow Apr 21 '23 at 11:55
  • Adding an oracle can change the complexity of the classes involved, since some problems that are solved using the "magic" oracle operation cannot be simulated with the same amount of resources. It is then a matter of finding sufficient resources elsewhere to determine whether the oracle machines can be simulated without the oracle, and of checking that those simulations do not use too many resources. But adding an oracle operation can also complicate matters, since enumerating all possible outputs of the oracle machine is not as simple. For example, having the oracle produce random bits to produce thread ids => races – Esa Pulkkinen Apr 21 '23 at 17:29
  • Re, in the spirit of @user21820's suggestion, mightn't a remedy be to regard a complexity class $\mathsf C$ not as a set of languages, but rather as "the machine" in whatever sense; then to denote the corresponding set of languages by, say, $\DeclareMathOperator\Lang{Lang}\Lang(\mathsf C)$, so that $\Lang(\mathsf P) = \Lang(\mathsf{NP})$ obviously need not imply $\Lang(\mathsf P^A) = \Lang(\mathsf{NP}^A)$? – LSpice Apr 22 '23 at 01:43
  • 1
    @LSpice The trouble with that suggestion is that the sets of languages $\mathsf{P}$ and $\mathsf{NP}$ really are the primary objects of interest. They are extremely robust and "machine-independent." Oracles, at least IMO, are not so interesting in their own right; they're tools for helping us figure out what sorts of things are going to be hard to prove and what sorts of things might be more tractable. – Timothy Chow Apr 22 '23 at 01:56
  • @TimothyChow: That's right; my P^0 was exactly your P[T^∅]. And yes you could solve the issue by making very clear (to students) that P is merely abbreviation for "P^0". The more radical alternative I suggested was just a fun way that allows us to actually retain the usual appearance of these notations except for changing "=" to "≡". XD – user21820 Apr 22 '23 at 04:30
9

Elements of groups (or monoids, semigroups, non-commutative algebras) are composed from left to right, whereas functions, maps, and functors are composed from right to left. Things get very confusing when working, for example, with actions of the symmetric group (it is even worse when using cycle notation). I think reverse Polish notation (arguments first) is a brilliant idea, but it is probably too late now to impose it.
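To see how much the convention matters, here is a minimal sketch in Python. The representation of permutations as tuples and the helper names `then` and `after` are illustrative choices, not standard notation.

```python
# Permutations of {0, 1, 2} stored as tuples: p[i] is the image of i.
s = (1, 0, 2)  # swaps 0 and 1
t = (0, 2, 1)  # swaps 1 and 2

def then(p, q):
    """Left-to-right product: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def after(p, q):
    """Right-to-left composition p ∘ q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

print(then(s, t))   # (2, 0, 1)
print(after(s, t))  # (1, 2, 0)
```

The same written product "$st$" therefore denotes two different permutations depending on the convention, and `then(s, t)` agrees with `after(t, s)`.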

Roland Bacher
  • 17,432
  • 2
    On the contrary ("it is probably too late now for imposing it") in semigroup theory it is very common to compose from right to left, writing arguments first. – Carl-Fredrik Nyberg Brodda Apr 20 '23 at 07:48
  • I guess you wanted to say 'compose from left to right' (otherwise you have the same problem)? In group theory, it is also common to consider right actions (I can never remember this : right is with respect to the group element). – Roland Bacher Apr 20 '23 at 08:44
  • 1
    What I mean is just that, for example, in the full transformation monoid on a set $X$ (being just the set of all functions $f \colon X \to X$), one writes $(x)fg$ for $(g \circ f)(x)$, with $x \in X$. So the multiplication $fg$ corresponds to composing from right to left (unless that is taken to mean the opposite thing :)! ). – Carl-Fredrik Nyberg Brodda Apr 20 '23 at 08:57
  • 1
    That's called a 'right action' (if I am not mistaken) in the context of group theory. – Roland Bacher Apr 20 '23 at 10:17
  • 2
    Yes, that sounds right :-) (I always mix up left and right when it comes to left/right cosets vs. left/right ideals, and especially the associated action). – Carl-Fredrik Nyberg Brodda Apr 20 '23 at 14:00
  • Action notation is also asymmetric, whereas the operation (multiplication) is normally associative and commutative. This causes innumerable problems in interpreting the notation when the exact domain and codomain of the operations are not stated. – Esa Pulkkinen Apr 21 '23 at 16:48
4

1) Ornstein–Uhlenbeck process

Consider the SDE for the Ornstein–Uhlenbeck process:

$$X_t:=X_0+\int_{h=0}^{h=t}\theta(\mu- X_h)dh + \int_{h=0}^{h=t}\sigma dW_h.$$

The solution can be found by applying Itô's lemma to the function $F(X_t,t) := X_t e^{\theta t}$, which leads to:

$$X_te^{\theta t}=X_0+\int_{h=0}^{h=t}\left(e^{\theta h}\theta\mu\right)dh+\int_{h=0}^{h=t}\left(e^{\theta h} \sigma\right)dW_h.$$

The final step is then to isolate $X_t$ on the LHS by dividing through by $e^{\theta t}$:

$$X_t=X_0e^{-\theta t}+\int_{h=0}^{h=t}\left(e^{\theta(h-t)}\theta\mu\right)dh+\int_{h=0}^{h=t}\sigma e^{\theta(h-t)} dW_h.$$

The short-hand notation for the Ornstein–Uhlenbeck process leads to confusion:

$$dX_t:=\theta(\mu- X_t)dt + \sigma dW_t.$$

Applying Itô's lemma and continuing in the short-hand notation leads to:

$$d(X_te^{\theta t})=\left(e^{\theta t}\theta\mu\right)dt+\left(e^{\theta t} \sigma\right)dW_t.$$

Using the short-hand notation, I have seen students mistakenly cancel out the terms $e^{\theta t}$ that would normally be written as $e^{\theta h}$ in the long-hand notation inside the integrals.

2) Geometric Brownian Motion

Even the very well-known SDE for the Geometric Brownian Motion written in short-hand notation leads to confusion:

$$dS_t=\mu S_tdt+\sigma S_tdW_t$$

Again, the terms on the RHS would normally be written as $S_h$ inside an integral, which makes it obvious that the terms cannot be "taken out" of the integral until they are integrated.

In the short-hand notation, I have seen far too often attempts to divide through by $S_t$ and write:

$$\frac{dS_t}{S_t}=\mu dt+\sigma dW_t$$

The "next step" would then be to "integrate both sides" (with respect to what variable?) and write:

$$\ln(S_t)-\ln(S_0) = \mu t+\sigma W_t$$

which is wrong: the solution is actually $\ln(S_t)-\ln(S_0) = \mu t - \tfrac{1}{2} \sigma^2 t + \sigma W_t$, again by Itô's lemma.
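For reference, the extra term comes from the second-order term in Itô's lemma. The following is a sketch of the computation for $\ln S_t$, written in the short-hand notation and assuming the usual rules $(dW_t)^2 = dt$ and $dt\,dW_t = (dt)^2 = 0$, so that $(dS_t)^2 = \sigma^2 S_t^2\,dt$:

$$\begin{aligned}
d(\ln S_t) &= \frac{1}{S_t}\,dS_t - \frac{1}{2}\,\frac{1}{S_t^2}\,(dS_t)^2 \\
&= \left(\mu\,dt + \sigma\,dW_t\right) - \frac{1}{2}\,\sigma^2\,dt \\
&= \left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW_t .
\end{aligned}$$

In long-hand notation this reads $\ln(S_t) - \ln(S_0) = \int_{h=0}^{h=t}\left(\mu - \tfrac{1}{2}\sigma^2\right)dh + \int_{h=0}^{h=t}\sigma\,dW_h$, which evaluates to the solution stated above.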

In conclusion, I would argue that the short-hand notation for SDEs is rather unfortunate, particularly for new students to the field. I would encourage anyone new to stochastic calculus to use the long-hand notation until they become very comfortable with the subject.

3

Littlewood, J. E., A mathematician’s miscellany, London: Methuen & Co. VII, 136 p. 18 diagrams. (1953). ZBL0051.00101.

In §12, Littlewood shows two proofs of the same thing: first a "beginners' proof" and then a "civilised proof".

Gerald Edgar
  • 40,238
2

Any notation that has become indelibly ambiguous, like $\mathbb{N}$ (does it mean the set of nonnegative, or just positive, integers?), has become bad notation, though not necessarily through any fault of its own.

Another example, in my opinion, is the backslash, which was introduced (as far as I know) by Hu to mean set subtraction: $X \setminus Y = \{x \in X \mid x \notin Y\}$, apparently to avoid overloading the minus sign.

But the minus sign used to mean set subtraction seems to have caused no confusion that I know of, whereas the backslash can also be used for quotienting a structure $Y$ on the right by a substructure $X$ on the left.

  • 13
    For me this mostly highlights how this question seems skewed from field to field; in my area I would probably never encounter something like quotienting on the right and the backslash $\setminus$ is used exactly for set subtraction, whereas the minus sign between sets $X-Y$ is used in the context of e.g. convex analysis in connection with the Minkowski sum $X+Y$ which would conflict with set subtraction badly. (I think mostly for singletons $Y$ but still.) – Hannes Apr 21 '23 at 06:28
-2

I have long felt that the convention that writes the numerator above the denominator in fractions is the wrong way round. The consequences don't bother mathematicians but sometimes cause beginners to stumble.

When children first encounter a fraction like $3/5$ they think "divide something into $5$ parts and take $3$ of them". In that description you see the $5$ before the $3$. Writing "$3/5$" counterintuitively names the number you take before telling the reader how many parts there are.

When adding fractions, you must deal first with the denominators to find a common one. Only then do you think about the numerators. When you teach that to schoolchildren you require them to read from bottom to top.

In calculus, $dy$ is (roughly speaking) the change in $y$ caused by the change $dx$ in $x$. That sentence mentions the effect before the cause, which is not how causality works.

LSpice
  • 11,423
  • 1
    3/5 is a symbolic notation for "three fifths". So it is indeed written in the natural order. – Emil Jeřábek Apr 21 '23 at 19:59
  • @EmilJeřábek That convention is natural only because it's the convention. Years ago the community could have chosen to write that rational number as "$5/3$". All I am saying is that I wish they had. – Ethan Bolker Apr 21 '23 at 20:34
  • 6
    You misunderstood what I wrote. This is the order that was used spontaneously in common spoken language centuries before anyone invented a notation for it. The notation follows common usage, not the other way around. – Emil Jeřábek Apr 21 '23 at 21:12
  • @EmilJeřábek Thanks. I learned something. I still wish it had been the other way around. – Ethan Bolker Apr 21 '23 at 21:16
  • You make an interesting point, and I think it is worth mentioning where hysterical raisins may be interfering with pedagogy, and explicitly evaluating which one should be given priority. But I do not understand your plaint about $dy$. If I say "I am wet because it's raining", then I am mentioning the effect before the cause, against the order of causality, but I think that no-one is confused. ("Because it's raining, I am wet" sounds more awkward to me.) So why is "the change in $y$ caused by [a given] change in $x$" problematic? – LSpice Apr 23 '23 at 17:04
-7
  • $\newcommand\E{\mathsf E}\newcommand\P{\mathsf P}\newcommand\Eb{\mathbb E}\newcommand\Pb{\mathbb P}\newcommand\R{\mathbb R}\newcommand\C{\mathbb C}$ChatGPT gives a few examples of bad notation used more or less long ago, including (i) Roman numerals; (ii) the original notation for logarithms, "which used geometric figures and decimal numbers"; (iii) the use of x for both multiplication and variables.

I think everyone would indeed agree that Roman numerals are inconvenient for arithmetical or other numerical operations.

I have seen the use of (cursive!) $x$ for multiplication of real numbers (!) even on MO! — Something like $axb$, to denote $ab$.

I cannot vouch for the original notation for logarithms. (Added later: On further questioning, ChatGPT said that its previous claim that the original notation for logarithms "used geometric figures and decimal numbers" was incorrect.)

  • However, I think the commonly used notation, $\log$, for $\ln$ is bad. Indeed, $\ln$ is more (and completely) specific and shorter than $\log$. So, I don't see a good reason to use $\log$ for $\ln$. (I know that this suggestion may excite some passions.)

  • Another example of commonly used bad notation is $\Pb$ and $\Eb$ to denote the probability and the expectation. It is better to use $\P$ and $\E$, or simply $P$ and $E$, leaving the blackboard-bold font for $\R$ and $\C$, and the like.

  • Also, the standard convention used to be to write something like $\E X$ and $\E XY$, without any brackets or parentheses — with apparent understanding that $\E$ is a linear (and integral) operator, and we still commonly write $Tx$ rather than $T(x)$ if $T$ is a linear operator. Nowadays, people mostly write, I guess under the influence of computer science, $\E(X)$ and $\E(XY)$, or $\E[X]$ and $\E[XY]$, which in some cases makes formulas hard to read, with the necessity of going through all those parentheses or brackets. (I understand that this remark may excite some passions, too.)

  • Alas, it is becoming more and more common (or maybe even "cool") to use the same symbol to denote a random variable and any of its values. This can clearly create confusion.

  • Another complaint: the rather common use of $f(x)$ to denote a function $f$. In one post on MO, I even saw something like $\langle f(x),g(x)\rangle$ to denote $\langle f,g\rangle$. (Of course, $\langle f(x),g(x)\rangle$ will make sense only if the values of the functions $f$ and $g$ are in an inner-product space.)

  • Yet another complaint is the common use of something like "for all $0\le x\le1$", which I find impossible to read aloud — in place of "for all $x\in[0,1]$" or "for all $x$ such that $0\le x\le1$".


Rejoinder: I expected passionate reactions to some of the items above.

Any use of ChatGPT seems to continue to excite some strong passions. In this case, a user (say CF) wrote in a comment: "It seems the irony that the generated list is numbered with Roman numerals is lost on ChatGPT." However, it should have been clear from the first three lines of my post that the lower-case Roman numerals (in parentheses) are mine. I also tried to explain this in my response 10 minutes after this comment by CF. There has been no response from CF to my repeated requests to deal appropriately with this counterfactual comment (which has garnered 10 upvotes).

Another user wrote in a comment (which has garnered 8 upvotes): "I tried to ask Chat GPT some group theory questions yesterday. Absolute garbage, wrong answers." It is not quite clear what this has to do with my post. Yet another user wrote in a comment: "anytime [ChatGPT] gets a fact right that’s a happy accident." -- In this case, ChatGPT immediately gave 5 suggestions, of which at least 2 were good (I think); is this a bad score for an immediate response? Anyhow, I think what should be judged foremost is, not the tools used, but the quality of the post itself -- which is the eventual product.

My complaint about phrases like "for all $0\le x\le1$" has also met some rather passionate opposition. The same user CF suggested that there was no problem "to change [the] grammar" to read the similar phrase "for all $1\le k\le n$" as "for all $k$ between $1$ and $n$". My response to this was that I cannot see any compelling reason for us to have our own grammar rules here given such perfectly grammatical alternatives as "for all $x\in[0,1]$", "for all $k\in[n]$", and "for all $k\in\{1,\dots,n\}$" (more details on this can be found in my comments, especially the most recent ones as of today, 23 April 2023).

Overall, my answer has had 8 upvotes (thank you!) and 12 downvotes. The overall score, -4, is disappointing. However, I am happy that some users have found my post useful, to some degree. Thanks to everyone who read this post.

Iosif Pinelis
  • What is the precedence of $\mathsf E$? It looks like you're saying $\mathsf EXY$ should mean $\mathsf E(XY)$, but probably $\mathsf EX+Y$ shouldn't mean $\mathsf E(X+Y)$. – LSpice Apr 19 '23 at 14:48
  • 21
    Do you know why the Romans invented neither algebra nor statistics? Because they considered $X$ to be a constant (equal to $10$). – Denis Serre Apr 19 '23 at 14:49
  • @LSpice : You are quite right. – Iosif Pinelis Apr 19 '23 at 15:01
  • @DenisSerre : Interesting. – Iosif Pinelis Apr 19 '23 at 15:02
  • 1
    I have seen x used for cross product on MO today. – Ben McKay Apr 19 '23 at 15:10
  • 10
    It seems the irony that the generated list is numbered with Roman numerals is lost on ChatGPT. – Carl-Fredrik Nyberg Brodda Apr 19 '23 at 15:42
  • @Carl-FredrikNybergBrodda : Actually, the lower-case Roman numerals (in parentheses) are mine. I find them convenient for enumeration and separation of items of comparatively small length. But, of course, it is hard to do arithmetical or other numerical operations with Roman numerals. – Iosif Pinelis Apr 19 '23 at 15:52
  • 8
    I tend to disagree with the reason why you describe "for all $0 \le x \le 1$" as bad notation. (Though one could argue that it is ungrammatical from a purely linguistic point of view.) Whenever I read mathematics aloud I will anyway paraphrase what I read to make it easier to digest. Your example illustrates this quite well: when reading "for all $x \in [0,1]$" aloud, I would say something like "for all x in the interval from 0 to 1" rather than reading literally what is written on the paper. Similarly, I would read "for all $0 \le x \le 1$" aloud as "for all x between 0 and 1". – Jochen Glueck Apr 19 '23 at 16:58
  • 1
    In the construction $a x b$ for multiplication, I suppose one is meant to take implicitly $x = 1$. \ I think one issue is that we surely all can read "for all $x \le 1$" or "for all $x \ge 0$" with no hiccoughs, and logic, in contradiction to grammar, suggests that surely these two together are fine. \ MathJax is not as smart/aggressive as TeX about swallowing whitespace, so command definitions must end on the same line as the following text to avoid a spurious blank space. I edited accordingly. – LSpice Apr 19 '23 at 17:06
  • @JochenGlueck : I don't think that we need to "anyway paraphrase what [we] read". Rather, every symbol in a math formula needs to be expanded -- according to its definition. For instance, the compact formula "$x\in[0,1]$" contains three symbols: (i) $x$; (ii) $\in$ (meaning "is in"); and (iii) $[0,1]$ (meaning "the closed interval from $0$ to $1$"). So, just expanding the formula "$x\in[0,1]$" by substituting the meanings of the symbols, we read: "$x$ is in the closed interval from $0$ to $1$", with no problems and without any need to paraphrase anything. Do you not agree? – Iosif Pinelis Apr 19 '23 at 17:14
  • 9
    I tried to ask Chat GPT some group theory questions yesterday. Absolute garbage, wrong answers. – JP McCarthy Apr 19 '23 at 17:22
  • 1
    @LSpice : Thank you for comments and edits. I think we should avoid contradicting grammar, though. – Iosif Pinelis Apr 19 '23 at 17:22
  • Re, I think that @JochenGlueck's suggestion was, in my awkward paraphrase, that the expansions of some symbols are context dependent. For example, some people write "Let $x\in\mathbb R$" to mean "Let $x$ be an element of $\mathbb R$", but "Suppose that $x\in\mathbb R$" to mean "Suppose that $x$ is an element of $\mathbb R$"—I don't like this, but it's common. So one could argue that, in the context "for all $0\le x \le1$", the expansion is "for all $x$ such that $0\le x$ and $x \le1$". – LSpice Apr 19 '23 at 17:22
  • @IosifPinelis: I do agree that your suggestion (paraphrase every symbol) works well in this example. I also think, though, that it's a fine line between expanding the symbols according to their definition and paraphrasing. If, for instance, a compactness argument plays a crucial role, I would most likely say "for all x in the compact interval from 0 to 1". If "the closed interval from 0 to 1" is the definition of [0,1], I thus already started paraphrasing. And I'll do so much more ruthlessly when it comes to more complicated formulas. – Jochen Glueck Apr 19 '23 at 17:27
  • For this reason, I'm quite open to writing things which might also need to be paraphrased a bit when reading them aloud. The point made in @LSpice's comment is also a good one, I think. – Jochen Glueck Apr 19 '23 at 17:28
  • 1
    @JPMcCarthy : I think ChatGPT can be a useful tool, and it has given me correct answers about some rather nontrivial (to me) math on a couple of occasions, even though on most other math-involved occasions it was mostly nonsense. The question is how to make it useful. In this case, for this posted question, it did supply some useful ideas, even if some of them were misleading. So, I did try to sort them out. – Iosif Pinelis Apr 19 '23 at 17:29
  • 1
    @LSpice : Thank you for your further comments. I still think we should avoid contradicting grammar, especially when it is so easy to do. – Iosif Pinelis Apr 19 '23 at 17:32
  • @JochenGlueck : If you do want to emphasize the compactness, I think it is much better to just write: "for all $x$ in the compact interval $[0,1]$", instead of "for all $0\le x\le1$" or even instead of "for all $x\in[0,1]$". However, I think "paraphrasing" is hardly ever unavoidable or warranted; mere expanding of symbols according to their definitions should almost always suffice; actually, at this point I cannot think of any exceptions. – Iosif Pinelis Apr 19 '23 at 17:41
  • @IosifPinelis: While paraphrasing might be avoidable in most cases, I don't think that it should be avoided. E.g., in a course on ordered Banach spaces I recently wrote "$\forall x, y \in E \; \exists z \in E: \; z \ge x, y$" on the blackboard. While writing it I said "For all $x$ and $y$ in $E$ there exists $z$ in $E$ which is an upper bound of $x$ and $y$". A few seconds earlier I had used the expression "which dominates both $x$ and $y$" for the same property. I find this more lively and more intuitive than sticking to "such that $z$ is greater than or equal to $x$ and $y$" throughout. – Jochen Glueck Apr 19 '23 at 22:50
  • @JochenGlueck : I don't think this is paraphrasing. This is an example of using synonyms, which indeed makes the speech more lively and less repetitive. Similarly, we use "belongs to" or "is a member of" in place of "is in". – Iosif Pinelis Apr 19 '23 at 23:48
  • 1
    @JochenGlueck : On yet another thought, I would suggest that the term "paraphrasing" does not seem appropriate here at all. Indeed, paraphrasing means restating a grammatical phrase into another grammatical phrase of the same or close meaning. But the phrase "for all $0\le x\le1$" is not grammatical in the first place. So, what you were trying to do was, not paraphrasing, but repairing the phrase, turning it into a grammatical one. – Iosif Pinelis Apr 20 '23 at 01:14
  • 3
    Replacing "for all $0 \le x \le 1$" with "for all $x \in [0,1]$" is one thing. But replacing "for all $1 \le k \le n$" with "for all $k \in [1,n]$" is another. – Nate Eldredge Apr 20 '23 at 01:30
  • 1
    @NateEldredge : Yes, there may be some very subtle difference here. Yet, both phrases "for all $1\le k\le n$" and "for all $k\in[1,n]$" are bad, each in its own way: the former one is ungrammatical (and possibly ambiguous: is $n$ fixed here or not?), while the second one misses the condition that $k$ be an integer (which you apparently meant here). A good replacement would be "for all $k\in[n]$" (assuming that the notation $[n]$ is universally understood, or explained) or "for all integers $k\in[1,n]$" or "for all $k\in\{1,\dots,n\}$". – Iosif Pinelis Apr 20 '23 at 02:31
  • Your remarks about random variables remind me that I find it very confusing that conditional entropy $\mathrm{H}(Y|X)$ is an expected value and not a random variable. – Timothy Chow Apr 20 '23 at 04:26
  • @TimothyChow : Indeed, this notation may be confusing. – Iosif Pinelis Apr 20 '23 at 15:02
  • @IosifPinelis Honestly I don’t see why “for all $1 \leq k \leq n$” is ungrammatical. Yes, it’s ungrammatical if you read the individual symbols one by one, after another. But in languages we can combine symbols to change their grammar (for a relevant and a bit silly example, the grammar of “AI”, being a noun, is quite different from the grammars of the article “A” and the pronoun “I”). So why not let the grammar of “$1 \leq k \leq n$” in this context be as we all use it? I also don’t see how the ambiguity of $n$ is somehow more of an issue in $1\leq k \leq n$ than it is in $k \in [1,n]$. – Carl-Fredrik Nyberg Brodda Apr 20 '23 at 22:12
  • @Carl-FredrikNybergBrodda : I can respond to your latter comment. Before I do so, I would like to see your feedback to/acknowledgment of my response to your previous comment: "It seems the irony that the generated list is numbered with Roman numerals is lost on ChatGPT." Do you agree with that response of mine or not? – Iosif Pinelis Apr 20 '23 at 23:38
  • @Carl-FredrikNybergBrodda : Also, can you let me know how you propose to change grammar so that the phrase "for all $1\le k\le n$" could be read aloud, and how would you read it aloud? Please try to be quite specific here. – Iosif Pinelis Apr 20 '23 at 23:49
  • 2
    @IosifPinelis I would read it aloud, and let its grammar be, as "for all $k$ between $1$ and $n$", or a small variation of this -- because that's how it's used. I'm a descriptivist, not a prescriptivist. – Carl-Fredrik Nyberg Brodda Apr 21 '23 at 05:58
  • 2
    I upvoted this answer. But regarding arithmetic, addition and subtraction are easier with Roman numerals than with the usual decimal notation (of course, for numbers small enough for the Roman numerals to be used). – Martin Argerami Apr 21 '23 at 12:18
  • @MartinArgerami : Thank you for your comment and the upvote. – Iosif Pinelis Apr 21 '23 at 13:51
  • @Carl-FredrikNybergBrodda : Again, let us try to do everything in order. So, first, before dealing with "for all $1\le k\le n$", let us deal with your previous comment: "It seems the irony that the generated list is numbered with Roman numerals is lost on ChatGPT", which is factually incorrect. It would be nice to finally see your response to that. – Iosif Pinelis Apr 21 '23 at 13:57
  • I’m not sure you can make ChatGPT useful as a knowledge base—that’s just not what it is. It’s expert at modeling textual communication and was trained on a massive and diverse corpus, but anytime it gets a fact right that’s a happy accident. It has no concept of facts, just symbols and how they’re statistically strung together by humans. We’re the intelligent ones. It’s not. – bob Apr 22 '23 at 03:16
  • @bob : I don't know what you mean by a knowledge base, and I never said that it can be used as such. Whether it is intelligent or not is a matter of how intelligence is defined, and this is quite irrelevant here. What I said was that it is a tool, which could be useful if used appropriately. In the present case, it immediately gave 4 or 5 suggestions, of which at least 2 were good -- not bad at all. But nobody forces you to use it! – Iosif Pinelis Apr 23 '23 at 01:32
  • @Carl-FredrikNybergBrodda : You have been given at least three explicit opportunities to deal with your counterfactual comment "It seems the irony that the generated list is numbered with Roman numerals is lost on ChatGPT". However, you have not taken any one of these opportunities. I don't know why you chose to do so, but at this point I just want this fact to be recorded. – Iosif Pinelis Apr 23 '23 at 01:35
  • @Carl-FredrikNybergBrodda : You wrote: "I would read ['for all $1\le k\le n$'] aloud, and let its grammar be, as 'for all $k$ between $1$ and $n$'." -- I hope we can agree that, if we write a mathematical text in (say) English, then we should generally follow the rules of English grammar. Perhaps we can also agree that, whenever we choose to deviate from those rules and "let [our own] grammar be [something else]" (in your words), we should have a good, compelling reason for that. – Iosif Pinelis Apr 23 '23 at 02:02
  • Previous comment continued: In our case, according to the standard use of the quantifier "for all", its object is a symbol that should immediately follow it, as in $\forall x$. So, what symbol is the object of the "for all" in "for all $1\le k\le n$"? It can only be $1$, which leads to nonsense. The statement $1\le k\le n$ is a statement, and not a symbol, and therefore this statement cannot normally be the object of this quantifier. Moreover, we have good, fully grammatical alternatives, including "for all $k\in[n]$" and "for all $k\in\{1,\dots,n\}$". – Iosif Pinelis Apr 23 '23 at 02:03
  • Previous comment further continued: So, I cannot see any compelling reason for us to have our own grammar here. – Iosif Pinelis Apr 23 '23 at 02:03
  • 3
    I do not think AI-based answers (and questions) should be allowed here (at MO). – Moishe Kohan Apr 23 '23 at 17:43
  • 1
    @MoisheKohan : But why? And what do you mean, exactly, by "AI-based"? What other tools would you forbid as well? – Iosif Pinelis Apr 23 '23 at 17:51
  • 2
    As in "ChatGPT gives...." If you happen to share ChatGPT's opinion on this, please say so directly instead of referring to ChatGPT. – Moishe Kohan Apr 23 '23 at 18:21
  • 1
    @MoisheKohan : First of all, you did not seem to answer any of my questions: (i) Why do you think that AI-based answers should not "be allowed here (at MO)"? (ii) What do you mean, exactly, by "AI-based"? (iii) What other tools would you forbid as well (and why)? – Iosif Pinelis Apr 23 '23 at 18:33
  • @MoisheKohan : As for "ChatGPT's opinion", I don't think it has opinions. I said at the beginning of my post, "ChatGPT gives a few examples of bad notation used more or less long ago, including (i) [...]; (ii) [...]; (iii) [...]." Then I expressed my opinion on each of these three examples, in more nuanced and detailed ways than simply stating "I share (or don't share) this 'opinion'". What could I possibly have done wrong here? – Iosif Pinelis Apr 23 '23 at 18:41