173

What's the most harmful heuristic (towards proper mathematics education) you've seen taught/accidentally taught/were taught? When did handwaving inhibit proper learning?

Gerry Myerson
  • 39,024
  • 10
    In view of many of the answers to this question, it might help to have in the statement a definition of heuristic as it is applied to mathematics. – Pete L. Clark Apr 26 '10 at 03:56
  • 11
    In fact, the harmful entity in most answers is not a heuristic at all! – Victor Protsak May 22 '10 at 15:07
  • 1
    Calculus. In many small Universities (mine included) students have to take Calculus before Real Analysis, and I think that this does some serious damage. – Nick S Jan 24 '21 at 00:53

39 Answers

237

Not the most harmful, but a fun example (credit due to Tony Varilly):

"You can't add apples and oranges."

False. You can in the free abelian group generated by an apple and an orange. As Patrick Barrow says, "A failure of imagination is not an insight into necessity."
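A throwaway sketch of the formal sums in question (plain Python; the fruit names are just for illustration): an element of the free abelian group on $\{\text{apple}, \text{orange}\}$ is a finite integer combination of the generators, and addition is coordinatewise.

```python
from collections import Counter

# Elements of the free abelian group on {"apple", "orange"}, stored as
# dicts mapping each generator to its integer coefficient.
def add(a, b):
    total = Counter()
    for element in (a, b):
        for generator, coeff in element.items():
            total[generator] += coeff
    return dict(total)

print(add({"apple": 2}, {"orange": 3}))   # {'apple': 2, 'orange': 3}
print(add({"apple": 1}, {"apple": -1}))   # {'apple': 0}, i.e. the identity once zero terms are dropped
```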

Dave Penneys
  • 5,335
  • 55
    This almost belongs in the mathematical jokes question. ;) – GMRA Oct 25 '09 at 02:13
  • 39
    Two apples plus three oranges equals five pieces of fruit. What's the problem? – Gerry Myerson Aug 19 '10 at 05:50
  • 43
Indeed. Take the free abelian group A generated by the set of all types of fruit and consider the natural homomorphism onto the free abelian group generated by {Fruit} induced by sending each generator of A to the single generator of ... – Steven Gubkin Oct 02 '10 at 23:59
  • 6
    So according to Steven's remark, there is of course a universal way to add apples and oranges. (If this observation is not in Mathematics Made Difficult, then it ought to be.) – Todd Trimble Apr 17 '13 at 19:59
  • 27
    Just occurred to me to wonder whether we shouldn't be adding apples and oranges in the free abelian grape. – Gerry Myerson Feb 15 '15 at 04:59
  • 1
Well, in the usual math notation, two apples plus three oranges can be written explicitly as $2e_{\mathrm{apple}}+3e_{\mathrm{orange}}$... – Matemáticos Chibchas Feb 15 '15 at 19:59
  • 20
Isn't the saying "you can't compare apples and oranges"? I'm not aware of a natural order structure on the free abelian group generated by an apple and an orange. – Paul Siegel May 29 '15 at 02:29
  • 5
I'd like to add that this comes up in differential geometry as well. When trying to define the covariant derivative, the intuitive idea of a limit doesn't make sense, since you'd have to subtract vectors from different tangent spaces. In this sense they live in different spaces, so it is like subtracting an apple from an orange. But if you define a connection between tangent spaces, you get a lot of powerful geometric objects that make this idea rigorous. So indeed, you cannot add apples and oranges... until you find out how to relate apples and oranges. – Chris Rackauckas Apr 13 '16 at 01:10
  • 1
The point is that you can't combine them as like terms, being inherently unlike. In other words, there is no solution (in the free abelian group of fruit) to the equation 3 |apple> + 2 |orange> = 5 |x>.

This is far from being harmful, as every elementary algebra teacher knows and as every intro quantum mechanics professor knows.

    – Michael Maloney Jun 26 '16 at 17:09
193

This isn't really a heuristic, but I hate "functions are formulas". For most students it takes a really long time to think of a function as anything other than an algebraic expression, even though natural algorithmic examples are everywhere. For example, some students won't think of

$$f(n) = \begin{cases} 1 & \text{if } n \bmod 2 = 0 \\ -1 & \text{otherwise} \end{cases}$$

as a function until you write it as $f(n) = (-1)^n$.
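A minimal illustration of the algorithmic reading (plain Python, purely illustrative): the case split is a perfectly good definition of the function, whether or not anyone ever produces the closed form.

```python
def f(n):
    # The "algorithmic" description: no closed-form formula required.
    if n % 2 == 0:
        return 1
    return -1

# The same function that the formula (-1)**n computes, checked on a few values.
assert all(f(n) == (-1) ** n for n in range(20))
```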

Qiaochu Yuan
  • 114,941
  • I think this actually comes up even as late as upper division linear algebra, when they start to talk about general linear transforms, or, something even harder for students, the space of linear transforms from V to W. I worked with a student for quite a long time on this – Michael Hoffman Oct 24 '09 at 21:27
  • 2
    I think that by the second year of high school normally smart students should perfectly get the point with this. – Qfwfq Apr 25 '10 at 19:06
  • 36
I'm a high school student and I can safely say that most of my peers just don't get what a function is. The only ones who do seem to have learned from programming. Then again, all the really mathematically talented students in my very small school also program...

    Functions seem to get slipped in somewhere along the line without a proper introduction, and then it is assumed that students know it from there on in.

    – Christopher Olah Apr 25 '10 at 22:21
  • 15
    Actually I still have a lot of trouble going back the other way, to "functions are polynomial formulas, not maps" in algebraic geometry and/or combinatorics. – Elizabeth S. Q. Goodman Jan 27 '12 at 08:31
  • 3
Moreover, even for the best of mathematicians in the 18th and much of the 19th century, "function" meant "analytic function"... – Michael May 28 '15 at 21:06
  • 6
    That's precisely the Euler view: in his works, "continuous functions" were those you may write with a single analytic expression. So 1/x was continuous, while "0 for $x<0$, $x$ for $x\ge 0$" wasn't. – Kolya Ivankov Oct 20 '16 at 09:56
  • 8
    Much worse than this heuristic is the "official" definition of a function $X\to Y$ as a subset of $X\times Y$ satisfying various axioms. – Amritanshu Prasad Oct 20 '16 at 18:05
  • 7
    @AmritanshuPrasad, are you saying that doesn't match your intuitive picture of a function? This sort of intuitive friendliness is why I define a partial function as the converse of an injective relation; it's so much clearer that way. – LSpice Oct 20 '16 at 18:39
  • 1
@LSpice My comment was not an answer to the question, in the sense that this definition is not a heuristic. It registers my disagreement with Qiaochu's answer. My point is that sometimes functions should be thought of as formulas, rather than as relations of a special kind, and this is something that mathematicians often do (for example when we talk about formal power series or polynomials as functions). – Amritanshu Prasad Nov 04 '16 at 08:55
  • @AmritanshuPrasad, I agree with you. I was just joking. – LSpice Nov 04 '16 at 18:13
  • 1
    putting on my constructivist hat Well if you define a function over the naturals by its values for odd and even numbers, then you must prove that this indeed defines a function for all $n:\mathbb N$. Some people can even claim this way that there are discontinuous functions $\mathbb R \to \mathbb R$! – Anton Fetisov Aug 25 '17 at 22:54
160

A tensor is a multidimensional array of numbers that transforms in the following way under a change of coordinates...

I saw that for years, and I never understood it until I saw the real definition of a tensor.


[Clarification] Sorry, I did leave that very vague. A tensor is a multilinear function mapping some product of vector spaces $V_1\times \cdots \times V_n$ to another vector space. In the context of differential geometry, we're really talking about a tensor field, which assigns a tensor to every point that acts on the tangent and/or cotangent spaces at the point.

A more abstract definition is possible by considering tensor products of vector spaces, but the definition using multilinear functions is (to me) extremely intuitive and general enough for a first encounter. It also leads naturally enough to the abstract concepts anyway, as soon as you start thinking about the set of all tensors of a particular rank and its structure.

The "multidimensional array" definition suffers from conflating object and representation. The array is an encoding of the underlying multilinear function, and it's perfectly reasonable if understood in that way (to partially reply to Scott Aaronson's comment). Unfortunately, the encoding depends on an arbitrary choice (coordinate system), while the underlying function obviously doesn't, so it gets very confusing if you try to use it as the definition.

Regarding accessibility (also referring to Scott Aaronson's comment): I don't really agree: I think multilinear functions are pretty accessible. Assuming a familiarity with vector spaces and linear transformations, multilinear functions are a natural and very tangible extension of those ideas. And since multilinearity is the key concept underlying tensors, if you're going to deal with tensors, you should really just bite the bullet and deal with the concept.
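To make the object/representation distinction concrete, here is a small numpy sketch (the particular bilinear form and change of basis are arbitrary choices for illustration): the array transforms under a change of basis, while the multilinear function it encodes gives the same values either way.

```python
import numpy as np

# A bilinear form T: V x V -> R on V = R^3, encoded in the standard
# basis by the array T_ij = T(e_i, e_j).
T = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

def evaluate(u, v, array):
    # The array is a lookup table for the multilinear function: feed it
    # coordinates taken in the same basis the array was written in.
    return u @ array @ v

# Change of basis: columns of P are the new basis vectors in old coordinates.
rng = np.random.default_rng(0)
P = rng.normal(size=(3, 3))          # generically invertible

# The *array* changes, by the usual transformation law T'_kl = P_ik P_jl T_ij.
T_new = P.T @ T @ P

# The *function* does not: the same two abstract vectors, expressed in
# either basis, produce the same number.
u_old, v_old = rng.normal(size=3), rng.normal(size=3)                # old coordinates
u_new, v_new = np.linalg.solve(P, u_old), np.linalg.solve(P, v_old)  # same vectors, new coordinates
assert np.isclose(evaluate(u_old, v_old, T), evaluate(u_new, v_new, T_new))
```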

Darsh Ranjan
  • 5,932
  • 4
    What's "the real definition of a tensor"? Element of a tensor product? A section of a tensor bundle? – Victor Protsak May 22 '10 at 15:09
  • 17
I second this wholeheartedly!!!!!! @Victor Protsak: For me, "the real definition of a tensor" is something like the following. "Let M be a smooth manifold; let FM be the space of smooth functions on M, let VM be the space of smooth vector fields on M, and let V*M be the space of smooth covector fields on M. A (k,l) tensor is a multilinear map from (V*M)^k x (VM)^l to FM." There might be more abstract and versatile definitions, but this one seems to work pretty well in the context of general relativity, which is where the definition Darsh Ranjan quoted tends to show up (in my experience). – Vectornaut May 22 '10 at 22:01
  • 1
    OK, a proposed intuitive point of view: say you have an ordered pair of vectors. If you multiply one of them by a nonzero scalar and the other by the reciprocal of that scalar, you've got a different ordered pair of vectors. But in both cases you have the same tensor. All the other algebraic requirements that are supposed to be satisfied are just there to make sure the algebra works out neatly the way it should. But this one is where the basic intuition is. – Michael Hardy Jun 16 '10 at 15:08
  • 2
    For that matter, teaching linear algebra and doing echelon forms and so on does not strike me as very enlightening. When I saw a matrix for the first time, it was already in the context of linear maps and for given bases of source and target spaces. – Thierry Zell Aug 18 '10 at 21:54
  • 5
    I second, third and fourth that, Darsh. That particular definition of tensor set back my understanding of differential geometry by at least a year. – Cosmonut Oct 03 '10 at 03:39
  • 19
    The trouble I have is that none of the alternative definitions on offer seem accessible to someone first learning about tensors! Related to that (in my mind), they don't make clear how one would actually represent a tensor on a computer (e.g., how many degrees of freedom are there, and what do we do with them?). So, is there a way to explain what tensors are that satisfies those constraints but also leads to fewer wrong intuitions? – Scott Aaronson Apr 18 '13 at 05:26
  • 4
    I agree with Scott Aaronson. In fact, the physicist way of defining tensors as things that change correctly under coordinates gives a nice way to define tensor fields on manifolds (Simply a smooth collection of multi-index beasts on different open sets such that on the intersection they are related by an appropriate transformation (the transition functions of the tensor bundle)). I am not sure if this "heuristic" actually gives rise to wrong intuitions. – Vamsi May 29 '15 at 00:24
  • 3
    @Vamsi - from personal experience: maybe not wrong intuitions, but rather absence of intuition. – Darsh Ranjan Jun 26 '15 at 17:57
  • 1
    I completely agree with @DarshRanjan: in fact, one of the main reasons why I never really learnt general relativity, is because I never understood that definition (how can numbers vary? Shouldn't it be, in any case, a multidimensional array of functions? Any assumptions on smoothness of those functions? I never found satisfactory answers to those questions). I can finally say that I'm starting to get the meaning of a tensor, thanks to the definition that you wrote in this answer, which I just read. – David Fernandez-Breton Nov 03 '16 at 14:09
  • 2
    An advantage of the "physicist's definition" is that it treats all tensors, of any rank, on an equal footing. With the definition favored by Darsh Ranjan and Vectornaut, we have to first build up the notions of scalar, vector, and covector, then define all the other ranks, and then go back and do some more equivalence classing to deal with the fact that the definition is not consistent, because a vector is not literally the same as the dual of its own dual.[...] –  Jan 19 '17 at 00:14
  • 2
    [...] It also doesn't give a definition of a scalar that maps nicely onto how physicists think about scalars. E.g., a physicist does not think of a time coordinate as a scalar, but a time coordinate is certainly a "smooth function on M." A scalar has to be invariant under a change of coordinates. –  Jan 19 '17 at 00:15
  • The pedagogical difficulty with tensors is that there are many more possible operations with them than with matrices and vectors. If I'm comfortable with composition of linear maps and smooth vector fields, then it's not a big jump to tensors on a manifold with the right definition. But then my teacher has to actually compute something, and the quickest way to write down tensor contractions is to use these multi-index beasts, and we have to go through an explanation of these to get the calculation, and by this point I've forgotten the point and just see indices. – Tim Carson Sep 20 '17 at 04:56
  • 5
    The definition of tensors as linear maps, although common, is not the "right" definition. It requires you to define (1,0) rank tensors as linear maps from the dual space, and relies on the isomorphism between a vector space and its double dual. But that isomorphism fails unless your vector space is finite dimensional. A tensor should not be a linear map, It should just be an element of the appropriate tensor product. – ziggurism Oct 27 '18 at 01:09
A bit late, but the definition of tensors that I got the most mileage out of wrapping my head around (and connecting to the other more rigorous definitions) was that a tensor (for two vector spaces) is simply a linear combination of the symbols $e_i\otimes f_j$, where $\{e_i\},\{f_j\}$ are bases of the two vector spaces. – Yongyi Chen Nov 06 '20 at 22:38
137

Along the same lines as Qiaochu's and Zach's responses, the commonly taught heuristics pertaining to functions, differentiability and integration are a pet hate of mine.

I certainly left school thinking of functions as formulas involving combinations of elementary functions and having a very poor understanding of the relevance and correct relationship between integration and differentiation, the worst manifestation of which, now that I'm a bit older, seems to have been that

Differentiation is a nice, computable operation and tells you about functions; integration is hard and tells you about areas under curves.

Areas under curves never seemed interesting. As an analyst, my personal feelings towards them are now almost entirely reversed and I think of integration as my friend and differentiation as the enemy.

Differentiation uses up regularity; integration smooths.
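A small numerical illustration of that last slogan (numpy; the test function, grid and noise level are arbitrary): differentiating noisy samples divides the noise by the step size, while integrating them averages it away.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 2001)
h = x[1] - x[0]
rng = np.random.default_rng(1)
noisy = np.sin(x) + 1e-3 * rng.normal(size=x.size)   # small, rough perturbation of sin

# Finite-difference derivative: the 1e-3 noise gets divided by h, so the
# error is of order 1e-3 / h instead of 1e-3.
deriv_error = np.max(np.abs(np.gradient(noisy, h) - np.cos(x)))

# Trapezoid-rule antiderivative: the noise is averaged out, and the error
# in the running integral stays of order 1e-3 or smaller.
antideriv = np.concatenate([[0.0], np.cumsum((noisy[1:] + noisy[:-1]) / 2) * h])
integ_error = np.max(np.abs(antideriv - (1 - np.cos(x))))

print(deriv_error, integ_error)   # differentiation amplifies the noise, integration damps it
```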

Spencer
  • 1,771
  • I went through the same reversal as you recently. Slightly different but related reason. My explanation is here. – Dan Piponi Mar 01 '10 at 05:03
  • 78
That's because on formulas differentiation is nice and integration is hard, but on computable functions differentiation is hard and integration is nice. In theory, we have a denotational semantics between formulas and functions that should transport these notions back and forth, but we really really don't. There are tons and tons of papers in computer algebra which basically boil down to this massive gulf between abstract analysis (the study of functions given by properties) and concrete analysis (the study of functions given by formulas). – Jacques Carette Mar 13 '10 at 03:50
  • 15
    I'm upvoting this partially because I agree, but mostly because you used the term "pet hate" as opposed to "pet peeve". – James Weigandt Apr 26 '10 at 00:51
  • 1
    @Jacques: that's really well-phrased! I had an "a-ha" moment reading your comment. – Neel Krishnaswami Apr 26 '10 at 09:40
  • I had a similar reversal in a different area. In numerical analysis, numerical integration is (relatively) well understood, stable, and generally nice, while numerical differentiation is kind of a mess. But until graduate school I would have said the opposite. – Andrew T. Barker Apr 18 '13 at 10:02
  • 5
    I’m reading this as a second year undergraduate student and I didn’t go through this kind of reversal yet. I’d be glad if someone would give a short explanation in layman terms why it’s the other way around for computable functions! – Lenar Hoyt Aug 21 '13 at 01:26
  • 12
@user8823741, in general, numerical differentiation is an unstable process. Think about how the derivative is defined; you are in effect subtracting two nearly equal quantities to get a tiny result, and then dividing that tiny result by another tiny value to get a result that is often far from tiny. That's a lot of opportunities for a computer to slip up. – J. M. isn't a mathematician Jun 09 '16 at 04:12
  • 1
    @J.M. Unless you do complex-step differentiation which gives you near machine-tolerance accuracy by avoiding the round-off error completely. Of course, then you need a function that can work with complex numbers. – Thomas Antony Jun 26 '16 at 18:44
  • 1
    For anyone who's been waiting 6 and a half years, @DanPiponi's link got mangled, but (I think) was meant to point to http://mathoverflow.net/questions/11540/what-are-the-most-attractive-turing-undecidable-problems-in-mathematics#15770 . – LSpice Oct 20 '16 at 18:43
  • @JacquesCarette Could you provide some elementary example on what you said or elementary literature on it? I got a bit curious about it. – Red Banana Dec 10 '20 at 03:03
  • 3
@BillyRubina There are no elementary examples, else this would be well-known. But recall that Weierstrass's example of a continuous but nowhere differentiable function is computable. So it would be a torture test for any purported differentiation algorithm. As far as I know, there is no literature that makes this point, because there's no one who studies both kinds of analysis simultaneously. – Jacques Carette Dec 10 '20 at 12:29
116

The "FOIL" (first+outside+inside+last) mnemonic for multiplying two binomials is terrible. It suppresses what is really going on (three applications of the distributive property) in favor of an algorithm. In other words, it is teaching a human being to behave like a computer.

The legacy of FOIL is clear when you ask your students to multiply three binomials, or two trinomials. Students usually either have no idea what to do, attempt it but get lost in the algebra, or succeed but complain about the arduousness of the task.
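Spelled out as repeated distribution rather than a four-letter recipe, the same computation scales to any number of terms:

$$(a+b)(c+d+e) \;=\; a(c+d+e) + b(c+d+e) \;=\; ac+ad+ae+bc+bd+be,$$

and nothing about the method changes when a third factor, or a trinomial, is thrown in.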

CJP
  • 1
  • 1
    I am usually the first in line to bash blind application of algorithms, but, interestingly, I've never had the problem that others do with FOILed students. My calculus students insist on referring to multiplying any two (non-monomial) expressions as “FOILing it out”, but they seem—even the ones who also think that $x^3 - x = x^2$—perfectly able to multiply, say, two trinomials correctly, even though the acronym ‘FOIL’ makes no sense in that context. – LSpice Apr 25 '10 at 18:14
  • 22
    I can't stand FOIL! It seems to indicate to students that order matters here. I don't see what FOIL adds, but it certainly detracts from the idea of just multiplying all the pairs and adding. Instead of teaching the idea (which they'll never forget), they now have something memorized (easy to forget). And I once had a student erase their correct work because they accidentally did FLOI or something and rewrite the same thing in a different order. – Matt Apr 25 '10 at 18:43
  • 10
    As much as I dislike teaching mathematics "algorithmically", there is a reason why FOIL is taught as such: by forcing the user to adopted an algorithm, you can minimize mistakes. Doing things "in order" is a good habit, which should be encouraged. It is unfortunate the trend where "educators" take good practices, and distil from it something all but recognizable... – Willie Wong Apr 25 '10 at 21:18
  • 42
    As a high school teacher, I usually encountered students after their first exposure to FOIL, so I made a point to revisit the process and introduce "Super-FOILing" (which, of course, was just applying the distributive property to two polynomials of any length).

    Yes, yes: I hammered proper terminology and all the conceptual stuff, too, but starting off with "Ah, so you can FOIL ... but can you SuperFOIL?" really made the ears perk right up!

    In a way, prior exposure to FOIL was helpful to me, providing an accessible object lesson that math is always "bigger" than any of us are ever taught.

    – Blue Apr 25 '10 at 21:23
  • 2
    I partially disagree, because for the sake of computations algorithms and notations actually do not have to show "what is really going on": the more non-relevant information they suppress, the better to efficiency. Nevertheless, sooner or later a student should be told what's behind the sum of fractions, or the multiplication of polynomials. – Pietro Majer Jun 25 '10 at 11:05
  • 7
    "teaching a human being to behave like a computer" -- or like a dog. Show students an expression like $(x-1)(x-3)$, and many will have a Pavlovian response "FOIL!" even if it doesn't do them any good. – Todd Trimble Aug 25 '12 at 20:43
  • 57
Todd: Ask students on an exam to solve an equation such as $(x-1)(x-2)(x-3)(x-4)=0$. I've done this a couple of times. A very common attempted solution was to expand things out (often making mistakes along the way), contemplate the new, messy equation, and declare, "It can't be factored!". Sad... – Pedro Teixeira Apr 17 '13 at 21:35
  • 24
@PedroTeixeira Surely this is the way to proceed! Let $y = x - 2.5$; then $x = y + 2.5$, and the equation becomes: $0 = (y + 1.5)(y + 0.5)(y - 0.5)(y - 1.5) = (y^2 - 2.25)(y^2 - 0.25)$, which holds when $y^2 = 2.25$ or $y^2 = 0.25$. For each square root we obtain two possible $y$-values; add back the $2.5$ to each to get the four possible $x$-values. A similar approach can be found by observing $(x-1)(x-4) = (x-2)(x-3) - 2$; now denote the LHS by $y$ so that the original equation becomes $y(y+2) = 0$; solve for $y$ using the quadratic formula, etc. – Benjamin Dickman Aug 19 '14 at 06:44
  • The reflex to multiply out any pair of binomials is similar to the reflex to take the determinant of any matrix. Some students are so used to being asked to compute determinants that they replace any small square matrix with a scalar, including when asked to solve $A\vec{x}=\vec{b}$. – Douglas Zare Oct 25 '16 at 20:31
  • 1
    @DouglasZare As long as they stick to square ones. I have seen so many students "compute" the determinant of non-square matrices (in many very weird ways). – Tobias Kildetoft Aug 10 '17 at 11:21
  • @ToddTrimble I remember doing the same thing in high school - I was given something like (x-1)(x-5) = 0 and asked for the roots, my solution was to multiply them together and use the quadratic formula – mc-lunar Oct 27 '18 at 01:55
110

"Stacks are schemes with groups attached to points."

I don't know how much damage this has caused, but I never understood how it was actually helpful to anybody. Not only is it hand-wavy (which is okay for a heuristic), but it's hand-wavy in a way that can't really be corrected (because it's false). My feeling is that people who adopt this heuristic are trapped. If they use the heuristic to come up with a result, it's very hard to sharpen the reasoning to turn it into a proof. You have to just start from scratch and not use the heuristic.

  • 9
    How do I up-vote answers multiple times?!?! – Kevin H. Lin Oct 24 '09 at 23:02
  • 18
    By leaving a comment explaining that the answer is so great others just have to upvote it. You convinced me, by the way, to give my last daily vote :). – Ilya Nikokoshev Oct 24 '09 at 23:50
  • 8
    Anton: ok, the heuristics of "groups attached to points" is very incomplete, but... so how do you (heuristically) imagine a stack, you really think of it as a forest of objects and arrows over the category of schemes?? [*/G] ? Orbifolds? Orbifold curves? Gerbes? – Qfwfq Apr 25 '10 at 19:14
  • 5
    @unknown: How do you (heuristically) imagine schemes? It's fine to use terminology like "fat point" so long as you keep in mind that the "fatness" of a point is not all the information there is: Spec(k[ε]/ε³) is different from Spec(k[x,y]/(x²,xy,y²)), even though they're both "fat points of order 3". Similarly, points of stacks do indeed have automorphism groups, but it is important not to think that that's all there is to it. I guess my point was that I feel like too many people take this heuristic as the definition, so they are not sufficiently mindful of its limitations. – Anton Geraschenko Apr 25 '10 at 22:56
  • 5
    This seems to be a carbon copy of a very useful, in my opinion, heuristic "orbifolds are manifolds with groups attached to points". – Victor Protsak May 22 '10 at 15:15
  • 8
    This seems to me to be one of those heuristics which is very useful as a first approximation, but very misleading if one starts to think of it as the whole story. – Peter LeFanu Lumsdaine Sep 27 '10 at 18:21
  • 3
    This heuristic is analogous to the heuristic that groupoids are sets with groups attached to their points. We forget how the points interact with each other. – HeinrichD Nov 10 '16 at 10:06
  • Wait... this isn't true? – Nico A Oct 28 '18 at 14:56
106

Two-column proofs

Usually the only proofs that students see before graduating from high school are the "two-column" proofs of geometry class, and convincing them that the essence of a mathematical proof lies not in the form but in the logical deductive argument takes a lot of work.

  • 12
    Do students even see the two-column proofs any more? From some things I've read I've gotten the impression that those have been pushed aside in favor of just not proving anything at all. – Michael Lugo Oct 28 '09 at 23:11
  • 28
    They certainly do. – Akhil Mathew Oct 29 '09 at 00:06
  • 1
    Excellent answer. While there's some great responses in this list, this one definitely gets my vote for most harmful. – Cam McLeman Apr 26 '10 at 03:27
  • 10
    If students are taught that two-column proofs are the only kind there is, then I agree that they could be harmful. However, I think the framework of two-column proofs can be extremely helpful in teaching students to think through the underlying structure of a proof before trying to write it out in paragraph form, because it helps them avoid vague hand-waving arguments. When I teach undergrads how to do proofs, I have them write two-column proofs first, and then explain that "This is what the proof looks like naked. But to take it out in public, you need to put clothes on it." – Jack Lee Aug 18 '10 at 17:30
  • 41
    ...what is a two-column proof? – Piero D'Ancona Aug 18 '10 at 19:57
  • 17
    A two-column proof is a proof arranged as a series of numbered statements, with the statements in the left-hand column and corresponding justifications in the right-hand column. This used to be the way proofs were universally taught in US high-school geometry courses. They're still taught this way, but somewhat less universally, I think. – Jack Lee Aug 18 '10 at 20:39
  • 6
    I realize I am very late to the party here, but I couldn't resist commenting that the high school student I am currently tutoring is required to do these types of proofs. In fact, when I explain to people that research mathematicians prove theorems, the most common response I get is "I hated doing proofs in geometry!" Upon examination, I always find that they did two-column proofs, and this is their only association with the term. – Jeremy West Dec 06 '10 at 18:21
  • I've never seen a two-column proof. – Michael May 28 '15 at 21:13
  • @MichaelLugo Speaking as someone who took highschool geometry three years ago...so six years after you wrote that comment...yep, they're still a thing. – auden Aug 19 '19 at 21:53
96

"Generalization for the sake of generalization is a waste of time"

I think that generalization for the sake of generalization can be rather fruitful.

Gil Kalai
  • 24,218
  • 185
    Whoever first said that had in mind one or two specific examples of empty or shallow generalizations, and generalized based on those examples, purely for the sake of generalization. – Tracy Hall Aug 18 '10 at 22:18
  • 15
    I'm not sure if this statement can be generalized ... – Hagen von Eitzen Dec 17 '14 at 15:55
85

"Truth is binary. If a theorem has been proven once, there is no need in a second proof."

  • 2
    Is this a quote? If yes, who do you quote? – eins6180 Dec 02 '14 at 21:23
  • 10
    I am not quoting anything. I am merely trying to clarify that both of my sentences are part of the false heuristic, rather than the first being the false heuristic and the second being its refutation. Maybe I should have used parentheses, but I don't want to be that guy. – darij grinberg Dec 02 '14 at 22:44
  • 3
    A genuinely better proof, in the sense that you feel pretty certain your argument is easier to follow/more intuitive and perhaps even shorter, is ALWAYS of value. And we should not discourage people from publishing their work when they chance upon such an improvement. It makes the field more accessible to newcomers and speeds up advancement. – Michael Cotton Oct 06 '17 at 04:34
  • I don't think this statement is false. A single proof is sufficient to show that something is true, and a second is not necessary. While I do agree that improved proofs benefit society, I think you have missed the point of this statement. – user400188 Sep 24 '20 at 09:53
  • 2
    @user400188: It's not false as a statement; it is harmful as a heuristic. – darij grinberg Sep 24 '20 at 09:58
  • 1
    Sorry, I read your first comment and got that impression. In retrospect it fits perfectly with the question asked. – user400188 Sep 24 '20 at 10:03
84

Linear algebra purely as row manipulations. I've written about this here:

Students stuck in a rut of thinking of matrices as a clever way to arrange numbers will get lost and confused; I know this because I was one of those students. I had to “de-program” what I was taught in high school before I could grasp what was going on.
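A small sketch of the alternative view (numpy; the particular map and basis are just for illustration): a matrix records a linear map in a chosen basis, and matrix multiplication is composition of maps, which is exactly what pure row manipulation hides.

```python
import numpy as np

# The linear map D: p(x) |-> p'(x) on polynomials of degree <= 3, written
# in the basis (1, x, x^2, x^3).  Column j holds the coordinates of D(x^j).
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3],
              [0, 0, 0, 0]], dtype=float)

p = np.array([5.0, 0.0, -1.0, 2.0])   # the polynomial 5 - x^2 + 2x^3

print(D @ p)        # [ 0. -2.  6.  0.]  i.e. p'(x) = -2x + 6x^2
print(D @ D @ p)    # [-2. 12.  0.  0.]  i.e. p''(x) = -2 + 12x: composition is matrix multiplication
```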

Jason Dyer
  • 2,585
  • 3
    Agreed. It's really hard to internalize what all those intermediate steps in a row reduction actually mean. – Qiaochu Yuan Oct 24 '09 at 21:11
  • 3
    I had no idea why matrices would exist until beginning the linear algebra class I'm currently in. They seemed perverse and non-sensical. They really don't belong in high school math, frankly. I didn't even remember how to multiply them until I refreshed myself recently. – DoubleJay Oct 25 '09 at 16:47
  • By the time I got to linear algebra last year, I had already totally forgotten how to multiply matrices. Luckily, for proofs, the definition of matrix multiplication is a better way to prove something than drawing out (with ...'s) a big nxn matrix. – Harry Gindi Apr 25 '10 at 15:43
  • 7
    I didn't come across matrices until university, but I wholeheartedly agree that linear algebra should not begin with matrices and their operations. I didn't get a proper view of linear algebra (especially the determinant, which was basically taught by giving the definition and making the students calculate the determinant of a general four-by-four matrix by hand) until I read Sheldon Axler's "linear algebra done right". There the pedagogical idea was to begin with linear mappings and noting as a side note how they can be presented with these funny squares of numbers etc... – Rami Luisto Apr 17 '13 at 20:13
  • This. Matrices in high school were one of the things that pushed me towards going for Physics instead of Math in college. Only when I had my linear algebra course did I get what matrices were about. – finitud Apr 25 '14 at 15:36
  • 10
    Picking a basis in a vector space is the root of much evil – Hagen von Eitzen Dec 17 '14 at 15:52
78

Similar to Tom's answer,

a vector is a mathematical quantity with both a magnitude and a direction.

Useful for distinguishing between speed and velocity but little else. The above is a typical definition from a physics textbook I had on the shelf; here in British Columbia, vectors are introduced in high school physics but not high school math. By the time students get to linear algebra in first- or second-year university, it can be hard to convince them that a real number (much less a polynomial) can be a vector. Usually, you have to resort to "a real number does too have a direction: positive or negative" and even then they don't believe you because

a scalar is a mathematical quantity with a magnitude and no direction

and so if real numbers are vectors, how can they be scalars?

Don't even ask about function spaces.
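For what it's worth, the point to get across is that the vector-space axioms, not magnitude and direction, are what matter. In the space of polynomials of degree at most $2$, taking $p = 1 + 3x$ and $q = x^2 - x$, the combination

$$2p + q = 2 + 5x + x^2$$

is a perfectly good vector, with no arrow in sight.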

Ross Churchley
  • 631
  • 19
    My mother had an old "Advanced Calculus" book lying around when I was in high school. It mentioned this old chestnut and commented that it is a poor definition because some things are vectors but have neither magnitude nor direction (like scalars) and some things have both but are not vectors (like trains). – Ryan Reich Nov 19 '11 at 07:06
  • 6
    +1: it's just wrong for so many reasons. For one thing, it sounds sort of like a reduction of math to physics or something. For another, you need something like an inner product to make sense of it. But worst of all, it's totally ass-backwards when it comes to abstract mathematics, because "vector" has no independent meaning. Rather, a "vector" just means an element of some given vector space, which is a set equipped with ... so it's the concept of vector space which is primary, not vector! Paul Halmos had a similar rant in his automathography. – Todd Trimble Aug 25 '12 at 19:54
  • 8
    If you are trying to say that $\mathbb R$ is a real vector space, do people really object that $-3$ and $+3$ only have magnitudes, and not directions? I prefer an actual definition over a misleading characterization, but I don't think this one leads to big problems. – Douglas Zare Apr 17 '13 at 20:47
  • 25
    I recently heard someone joke that a movie must be a vector, since it has both length and direction. – Gerry Myerson Oct 20 '16 at 22:45
  • The magnitude-and-direction definition doesn't even really work in physics. In relativity, you can't define a vector by its magnitude and direction, because a nonzero vector can have a zero magnitude. –  Jan 18 '17 at 13:51
67

One extremely harmful heuristic I held until fairly recently: identifying math with algebraic manipulation. When asked to prove an identity or an inequality I would often dive straight into algebraic manipulation of the relations that I knew, wasting many many hours of my time. I have found that it is much more useful to try and test statements against examples I already know, and to try and rephrase identities and inequalities in terms of a statement in natural language that I have some intuition for.

Kevin Teh
  • 775
58

"Categories can be specified by objects alone." It's easy to get this impression, because people who are familiar with the categories in question already know the morphism structure, and don't bother to specify it. There is a related heuristic concerning the composition law, but it doesn't seem to burn people as often.

S. Carnahan
  • 45,116
  • 9
    Similar abuses of language include naming a model category by its fibrant objects ("the model category of quasicategories") or a 2-category by its 1-morphisms ("the 2-category of spans"). – Reid Barton Oct 24 '09 at 22:03
  • 29
yet nobody is brave enough to name categories after their arrows, as if we said "category of continuous mappings" for Top, etc. – Pietro Majer Jun 25 '10 at 10:55
  • 1
    It's worse! A classmate of mine once gave a talk about that very problem, where he explained how sloppiness with the very nature of the arrows had set back the understanding of a problem many years. I wish I could remember the topic, but it's been over 10 years. Still, even though I don't work with categories (explicitly), the essence of this chilling tale remains with me to this day. – Thierry Zell Aug 18 '10 at 22:02
  • 9
    @Pietro With the exception of Ehresmann and his school. :-) – Robert K Mar 13 '11 at 15:24
  • 9
    I'd like to hear a convincing example where this has really been a problem. Usually there's a default notion of morphism (think of the category of sets, for instance), and in my experience, when anyone departs from the default, they make a point of it (e.g., the category or bicategory of sets and relations -- see, I didn't specify the 2-cells just now!). I hope Thierry can remember the details of his tale. – Todd Trimble Aug 25 '12 at 19:40
  • 11
    Ironically, I just had an example the other day (linear codes) where it wasn't completely clear to me what the correct notion of isomorphism should be!! So this is me answering my former (August 25 2012) self. – Todd Trimble Apr 18 '13 at 15:24
  • 6
    @PietroMajer, famous counterexample: The category of cobordisms. And maybe "Linear categories", if you wish. – Manuel Bärenz Sep 10 '14 at 14:55
Typically a category seems to be named after the objects, unless there is already a category with the same objects (Set and Rel). – Christopher King Feb 14 '15 at 03:47
  • 5
    @ToddTrimble: The category of metric spaces. There are 6 natural classes of morphisms which are also used in practice: continuous maps, short maps, isometric maps, Lipschitz maps, equicontinuous maps, quasi-isometries. – HeinrichD Nov 10 '16 at 10:12
  • A version of this is trying to identify a group with its collection of irreducible representations or their characters. There is more in Tannaka duality. – Kapil Oct 27 '18 at 05:12
  • 1
    I'd argue that categories really should be named after their objects, but that you can only tell what the objects really are by looking at the structure of the morphisms. For example the category $\mathbf{Rel}$ of "sets" and relations should really be called $\mathbf{FreeSupLat}$. – Oscar Cunningham Oct 27 '18 at 09:37
48

"A continuous function is one you can draw without raising the pencil"

This has terrible disadvantages when generalizing from functions defined on a real interval to functions on non-connected sets, non-compact sets, and general topological spaces.
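A standard instance of the mismatch: $f(x) = 1/x$ is continuous at every point of its domain $\mathbb{R}\setminus\{0\}$, yet its graph cannot be drawn without lifting the pencil; and on a discrete domain such as $\mathbb{Z}$, every function whatsoever is continuous.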

Bruno Stonek
  • 2,914
  • 104
oh and I heard of a student claiming that "x+1" is not continuous because you need to raise the pencil at least twice when you write it. – Pietro Majer May 22 '10 at 16:57
  • 10
@Pietro: Se non è vero, è ben trovato! ("If it's not true, it's a good story!") – Victor Protsak May 23 '10 at 07:06
  • 5
    Victor: compliments, very good knowledge of Italian -and Italians – Pietro Majer May 23 '10 at 22:35
  • 9
    Pietro, that's just too funny (albeit in a sad way). For that matter, $x$ is discontinuous, unless you're in the habit of making your $x$'s look like $\alpha$'s. – Todd Trimble Aug 25 '12 at 19:58
  • 1
Yes, and certainly because of that I remember being shocked when I realized that a map like $x^2 \boldsymbol{1}_{\mathbb Q}(x)$ is continuous (with all derivatives continuous!) at zero. – Adrien Hardy Feb 15 '15 at 10:32
  • 1
@AdrienHardy Sorry, but $x^2\boldsymbol1_{\mathbb Q}(x)$ has no second derivative at $0$, since it's not differentiable in a neighborhood of $0$. –  Nov 25 '15 at 19:44
  • 3
    If $x+1$ is discontinuous because you need to raise the pencil, does it even pass the vertical line test? – Joe Berner Oct 20 '16 at 13:26
  • @JoeBerner "$1$" certainly doesn't! – Neal Oct 20 '16 at 14:37
  • 5
The idea that continuity means no jumps and holes and then differentiability means no pointy places or vert ramps is actually pretty useful for students as long as you stress that you're only talking about real functions. – Michael Cotton Oct 06 '17 at 05:13
46

That there is something weird and unsavory about field extensions that are not separable and that serious contemplation of such things should be put off to the indefinite future.

(In fact, much of the richness and "pathology" of geometry in characteristic p is easily understood once one has a firm grasp of how field extensions behave.)
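A first example worth meeting early: for a prime $p$, the extension $\mathbb{F}_p(t) \subset \mathbb{F}_p(t^{1/p})$ is purely inseparable, since the minimal polynomial $x^p - t$ factors over the larger field as $(x - t^{1/p})^p$, a single repeated root; in particular the extension has no nontrivial automorphisms.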

Pete L. Clark
  • 64,763
  • 7
    Moreover, the heuristic that there is something weird about the "theory of the automorphism groups" of inseparable extensions. Rather, the automorphisms that do exist are perfectly fine; it's just that inseparable extensions are more rigid, so there are fewer of them. – Jay Apr 27 '10 at 02:36
  • 11
@Jay True in one sense, false in another. I remember in grad school several of us got interested in computing the group scheme of automorphisms of an inseparable extension. Its length is more than the degree, although all of that length is nilpotent, so you don't see it in the actual automorphisms. – David E Speyer Apr 11 '11 at 12:16
42

In elementary school, there are false principles which take a lot of effort to overcome:

  • Math problems have one answer.
  • There is one right method.

These may be ok (though the second is debatable) when you are working on $1+2$, but not when you are supposed to isolate a variable, to graph a function, to recognize how you can apply the chain rule, to solve a complicated word problem, or to prove something. Many students don't think math is a place to experiment or to apply creativity. They are afraid to take incorrect steps even when it is no longer convenient or possible to say what the right first step is.

There is an interesting app called Dragonbox. It is very popular in Norway. When children think of algebra as a puzzle or game, they feel free to experiment, and they quickly learn to do things like isolate variables which usually give algebra students trouble. See also Terry Tao's blog posts on gamifying algebra. Students can learn to solve the problems, but have difficulty because these incorrect principles get in the way.

Douglas Zare
  • 27,806
38

The opposite of Qiaochu's dictum is just as misleading - "formulas are functions". There are a lot of non-denoting expressions! It's just that mathematicians don't tend to write non-denoting terms very often. Of course, there's a good reason for that - you can't prove anything interesting about non-denoting terms (or rather, way too much). But then students never get the intuition that there are expressions which are 'junk', nor tools to prove that something is 'junk'.

My favourite 'junk' expression is $$1/\frac{1}{\left( x - x \right) } $$

Lest you think this is not very important, try to "teach" first-year calculus to a computer, and you'll see how these non-denoting terms are most troublesome.
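For instance, here is what a naive evaluator (plain Python, purely illustrative) does with that expression: the inner subexpression already fails to denote anything, so no value ever comes out.

```python
def junk(x):
    # 1 / (1 / (x - x)): the inner 1/(x - x) is division by zero for every x.
    return 1 / (1 / (x - x))

try:
    junk(2.0)
except ZeroDivisionError as e:
    print("no value:", e)   # the term is 'junk': it does not denote a number
```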

36

"Vectors are directed line segments." When worded this way, this utterance is only acceptable if the student is satisfied with getting on his or her bicycle at the end of class and never returning to mathematics again.

  • 6
    Well...in principle, you could define a vector of, say, R^2 to be an equivalence class of "directed line segments". – Qfwfq May 11 '10 at 12:26
  • 4
    That's verbatim how I learned the definition of vector. But the "equivalence class" part of it changes everything (and did not go over too well with many of the other students; it was junior-high after all...) – Thierry Zell Aug 18 '10 at 22:06
  • This was (more or less) the definition I heard when I was 7 or 8. I think it's great for a seven or eight-year-old, but probably not so great for an undergraduate mathematics major. :) – apnorton Oct 17 '14 at 19:41
  • 12
    Could you say in more detail what's wrong with this one? In an affine space, a directed line segment is indeed the same thing as a tangent vector. And there's no need for equivalence classes—line segments based at different points live in different tangent spaces, so they shouldn't be identified (although all the tangent spaces are canonically isomorphic through translation). I certainly agree that it's harmful to give the impression that all vectors are directed line segments, but I think it's very true and useful to point out that all directed line segments are vectors. – Vectornaut Feb 15 '15 at 04:06
31

Not sure if this qualifies exactly, but I can never remember which theorems of group theory apply to finite groups, and which ones apply to groups in general. Anytime I remember a result, I have this sinking feeling that it appears in a textbook preceded by "for the remainder of this section, let G be a finite group." I'm not sure how well-founded this fear is (other than the theorems that obviously don't make sense for infinite groups, like the Sylow theorems).

Gabe Cunningham
  • 1,861
  • 18
    By the way, the Sylow theorems make sense (and are true, I think) for infinite groups if you make a few modifications. A p-Sylow subgroup is a maximal subgroup which is a p-group. The first theorem (existence) is obvious by Zorn's lemma. The second (that all p-Sylows are conjugate) is interesting. The third is interesting if the index of a p-Sylow is finite or if the number of p-Sylows is finite. – Anton Geraschenko Oct 24 '09 at 21:47
  • 8
    There are also profinite Sylow theorems, yielding the existence of a maximal pro-p subgroup. The proofs are relatively straightforward extensions of the finite proofs. – S. Carnahan Oct 24 '09 at 21:54
  • 3
    This got me in a lot of trouble in my first-year graduate algebra class. I also had a habit of forgetting that infinite groups even exist, which is the same sort of thing. – Michael Lugo Oct 24 '09 at 22:06
  • 1
    @ML: right. I don't think textbooks can be fairly construed to be confusing about which results apply only to finite groups. BUT most undergraduate algebra textbooks I have seen certainly give the impression that finite groups are more important, more natural, and more studied than infinite groups, when many if not most mathematicians would say that the reverse is true. – Pete L. Clark Mar 01 '10 at 00:08
  • I was tempted to try adding something like this to the false beliefs question.

    At a higher level, the same becomes true with the properties "finitely generated" or "residually finite".

    – Jonathan Kiehlmann May 09 '11 at 08:35
  • Michael Artin's Algebra is a very good antidote to the typical finite-group-obsessed undergraduate algebra textbook. I used a lot of material from Artin's book when teaching group theory. I certainly found this more interesting, and I think the students did too. – Chris Brav Jan 18 '12 at 12:20
  • 2
    This is a sad truth, really. Because in practice outside of pure algebra we almost never care about finite groups. In analysis and dynamics at least, the groups are almost always infinite and have natural topologies. Haha – Michael Cotton Oct 06 '17 at 05:23
22

A natural (iso)morphism is one that is "canonical", or defined without making "choices", or that is defined "in the same way" for all objects.

This is a heuristic I found in every introductory text on category theory I can remember reading (and usually followed with the single/double dual of a vector space as an example) and it took me quite a while to realize that this is not only inaccurate, but just plainly wrong.

Explanation of "wrongness": A natural morphism is a morphism between two functors. That is, a morphism in the category of functors between two categories. And as such, should be thought as usual as mapping the "data" in a way that preserves the "structure" and choices have really nothing to do with it.

For example, thinking of a group $G$ as a one object category, functors from it to the category of sets form the category of $G$-sets. A morphism of $G$-sets is a map of sets preserving the action of $G$ and not a map of sets that "does not involve choices". Same goes for other familiar categories of functors (representations, sheaves etc.)

Another example is the category of functors from the one object category $G$ again to itself. To give a natural map (isomorphism) from the identity functor of $G$ to itself is just to pick an element of the center of $G$. I don't imagine anyone describing it as doing something that "doesn't involve choices".

Moreover, every category $C$ is the category of functors from the terminal one-object-one-morphism category to $C$. Hence, every morphism in any category is a "natural morphism between functors" so there is really no point in specifying a heuristic for when a morphism is "natural". This is utterly meaningless.

In the other direction, it is easy to write down "canonical" object-wise maps between two functors that fail to be natural in the technical sense. Consider the category of infinite well ordered sets with weakly monotone functions. The "successor function" is definitely defined "in the same way" for all objects, but is not a natural endomorphism of the identity functor in the technical sense.
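Spelled out, naturality of a transformation $\eta$ of the identity functor would require, for every weakly monotone $f\colon X\to Y$ and every $x\in X$,

$$\eta_Y(f(x)) = f(\eta_X(x)).$$

Take $X = Y = \mathbb{N}$, let $\eta$ be the successor map $n\mapsto n+1$, and let $f$ be the constant map $f(n)=0$ (which is weakly monotone): the left side is $1$ while the right side is $0$, so the square does not commute even though $\eta$ is "defined in the same way" for every object.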

Explanation of "harmfulness": Well, I guess it is clear that a completely wrong heuristic is a bad one, but I'll just point out one specific example that is perhaps not so important, but shows the problem clearly. When showing that every category is equivalent to a skeletal category, there is a very "non-canonical" construction of the natural isomorphisms. I saw several people get seriously confused about this.

Some thought: One might argue that this heuristic was advanced by the very people who invented category theory (like Mac Lane) and thus, it is perhaps a bit presumptuous to declare it "plainly wrong". My guess is that at the time people were considering mainly large categories (like all sets, all spaces, all groups etc.) as both domain and codomain of functors and were focusing on natural isomorphisms. In such situations it is unlikely that the functor will have nontrivial automorphisms (or it will have very few and "uninteresting" ones), and therefore a natural isomorphism will in fact be unique, so maybe this is the origin of the heuristic (it is just a guess, I am not an expert on the history of category theory).

This relates to the point that, by definition, if specifying an object does not involve choices, then it is unique (this is a tautology). So when we say that an isomorphism is "canonical" we usually mean that, given enough restrictions, it is unique (and not just natural in the technical sense). For example, the reason we identify the set $A\times (B \times C)$ with the set $(A\times B)\times C$ is not because there is a natural isomorphism between them, but because if we consider the product sets together with their projections to $A,B$ and $C$, then there is a unique isomorphism between them. And this is in line with the general philosophy of identifying objects when (and only when) they are isomorphic in a unique way. In contrast, we don't identify two elements of a group $G$ just because they are conjugate (conjugate elements correspond to naturally isomorphic functors between the one-object categories $\mathbb{Z}\to G$), precisely because this natural isomorphism is not unique.

Well, I did not intend this to get so lengthy... I was just anticipating some "hostile" responses defending this heuristic, so I tried to be as convincing as possible!

KotelKanim
  • 2,270
  • 9
    I think there is a version of this heuristic that is mostly accurate and useful: almost any "canonical" construction is functorial or a natural tranformation. As you point out, this isn't always the case (and the converse certainly isn't the case in general), but in my experience the exceptions that arise in practice are quite rare and it is not difficult to get an intuition for detecting the rare cases when it fails. A special case of this that is actually literally always correct is that any canonical construction is functorial/natural with respect to isomorphisms. – Eric Wofsey May 29 '15 at 13:34
20

Almost any heuristic can be "most harmful" if used by a teacher in a situation where the audience does not know why it makes sense, and without an explanation. This is especially dangerous in the frequent case that the heuristic does not actually seem reasonable to a person seeing it for the first time, since it makes sense only in some ways but not others. It might require months of experience for an uninitiated person to understand how and why it applies.

For example, the heuristic of schemes as manifolds is such -- every algebraic geometer understands it, but it actually is harmful to a person who is seeing schemes for the first time (such a person would very likely interpret this heuristic as saying that affine schemes are trivial to understand). The same applies to "integration is the inverse of differentiation", and to some of the other answers to this question.

Of course, these heuristics are also the most useful ones, once you (and any audience you might have) actually understand them. The whole point of learning math is to gain more such heuristics, and to make the ones you have more precise. For this reason, it seems to me that the use of such heuristics on an unprepared audience is the most common problem in lectures by the very best mathematicians.

A related problem is the abundance of statements that are not strictly true, but "correct in spirit". Again, this may be very useful in research or when talking to a person of appropriate sophistication, but it is very bad for students if such statements are used carelessly and without explanation.

P.S. This whole answer is generalization for the sake of generalization. Was it a waste of time, I wonder?

17

Also not really a heuristic, but "differentiation is easy," as encoded in the following two sub-heuristics:

  • Differentiation is just repeated application of the product and chain rules, and
  • Most functions are differentiable most of the time.

Edit: Someone doesn't seem to like this answer, so I'll expand. Students who leave calculus with this impression enter analysis with a disadvantage: differentiation is not a property that "most" functions have in any reasonable sense, not even continuous ones, and to compute the derivative of a function that isn't given as a sum of compositions of "elementary" functions requires an entirely different mindset than the one that values the product and chain rule.
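A standard example of that shift in mindset: the function $f(x) = x^2\sin(1/x)$ for $x\neq 0$, with $f(0)=0$, is differentiable everywhere, but $f'(0)$ cannot be reached by the product and chain rules; one has to go back to the definition,

$$f'(0) = \lim_{h\to 0}\frac{h^2\sin(1/h)}{h} = \lim_{h\to 0} h\sin(1/h) = 0,$$

while for $x\neq 0$ the rules give $f'(x) = 2x\sin(1/x) - \cos(1/x)$, which has no limit as $x\to 0$. So the derivative exists everywhere but is not continuous.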

Qiaochu Yuan
  • 114,941
  • 16
    I think your argument is more effective against a slogan like, "all interesting functions are differentiable". In my (limited) experience, differentiation tends to be algorithmic in practice, although it can be unstable in numerical applications. This is in contrast to integrals, which exist much more often and tolerate numerical error well, but are generally very difficult to compute. – S. Carnahan Oct 24 '09 at 22:38
  • 3
    Somewhat related is the assertion that "differentiation is more fundamental", since it is "easier" and usually taught first.

    Not only is this misguided for the reasons you and Scott cite, but following Roger Penrose we can also turn the argument upside down in the complex plane by using Cauchy's theorem to define the derivative of a function by means of a contour integral. I've always hoped there was some alien civilization in another spacetime where derivatives were actually introduced this way.

    – jvkersch Mar 01 '10 at 12:39
17

I wish to draw attention to Pete Clark's very relevant initial comment. The term heuristic is often taken as a synonym for non-rigorous method, based only on intuition or experience. I personally dislike this sense of the word in mathematics, and I suspect it is not even historically correct (now I'm curious to check its use in the classic authors). The etymology of the adjective, from the verb εὑρίσκω (to find, discover), means "aimed at finding". As I see it, it is exactly the method we follow when looking for a solution of a problem: using all the implications of being a solution in order to identify a candidate solution. Of course, the heuristic is only half the job, and it is only rigorous if followed by part 2: checking the solution. But there's a very smart idea in it. For instance: to solve an equation, transform it, but do not check the equivalence of each single step; just follow a chain of implications. So, what is harmful is not the heuristic method, but leaving out the (often less creative) part 2. That said, here's my example: let $F$ be a smooth function bounded below (or a functional), with only one critical point. Then one would argue:

Any minimum point of $F$ satisfies $F'(x)=0$, whose only solution is $x_0$. Hence, $x_0$ is the minimizer.

False, unless one checks that $F(x_0) \leq F(x)$ for all $x$ (the "direct method" in the Calculus of Variations) or proves the existence of a minimizer beforehand (the indirect method). Many students make this mistake... but not only them!
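For a concrete one-variable illustration of how this goes wrong (a standard example, added here for emphasis), take
$$F(x)=\frac{1}{1+x^2}.$$
$F$ is smooth and bounded below by $0$, and $F'(x)=-\dfrac{2x}{(1+x^2)^2}$ vanishes only at $x_0=0$; but $x_0$ is the global maximizer, and the infimum $0$ is not attained anywhere. The argument above, applied verbatim, would proclaim the maximizer to be the minimizer.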

Jiahao Chen
  • 1,870
Pietro Majer
  • 56,550
  • 4
  • 116
  • 260
16

Any attempt to draw a fat Cantor set is a bad heuristic in my opinion. I saw such a diagram as an undergrad and believed for a while that there were intervals contained in the fat Cantor set. I don't think it's possible to express in a picture that a fat Cantor set has positive Lebesgue measure and empty interior.
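If it helps, here is a small computational sketch (my own, in Python) of the usual Smith–Volterra–Cantor construction: at step $n$, remove an open middle interval of length $4^{-n}$ from each of the $2^{n-1}$ closed intervals remaining. The total length removed is $\sum_{n\ge1} 2^{n-1}4^{-n}=\tfrac12$, so measure $\tfrac12$ survives, yet the longest surviving interval shrinks to zero — exactly the combination a static picture cannot convey.

    intervals = [(0.0, 1.0)]   # closed intervals remaining after each step

    for n in range(1, 11):
        gap = 4.0 ** (-n)      # length of the open middle interval to remove
        next_intervals = []
        for a, b in intervals:
            mid = (a + b) / 2
            next_intervals.append((a, mid - gap / 2))
            next_intervals.append((mid + gap / 2, b))
        intervals = next_intervals
        total = sum(b - a for a, b in intervals)
        longest = max(b - a for a, b in intervals)
        print(f"step {n:2d}: measure left = {total:.6f}, longest interval = {longest:.2e}")

    # The remaining measure tends to 1/2 while the longest interval tends to 0:
    # positive Lebesgue measure, but no interval survives (empty interior).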

Kevin Teh
  • 775
  • I'm upvoting because until now I'd only ever heard of fat Cantor sets in passing, and if you hadn't said this, I probably would have been misled in exactly the same way you were. – Vectornaut Feb 15 '15 at 04:11
  • 7
    An animation would be better (now possible with computers). Zoom in on it and see that the seemingly-"interval" areas have holes, then zoom in on the seeming-"interval" areas there, and so forth, until one "gets the point". – The_Sympathizer Feb 15 '15 at 06:45
    Understandable. They often do a bad job explaining that any Cantor set is just an embedding of $2^\omega$ into the space. And it's not that hard to show later that this is essentially all your proper closed subsets. :/ – Michael Cotton Oct 06 '17 at 05:27
14

"Teach the subject before its applications."

Some important constructions seem quite pointless until you understand the rationale for them. For example, I recall finding the lectures in freshman linear algebra on constructing Jordan Normal Form extremely boring and pointless until JNF came up in the context of solving linear ODEs a year later. "That's what Jordan Normal Form is for!" - I thought - "I wish I knew that a year ago!"

Gerry Myerson
  • 39,024
Michael
  • 2,175
  • 19
    As a counterpoint, I never understood Jordan normal form until I learned that it was a special case of the classification of finitely generated modules over a PID. In other words, my difficulty with Jordan normal form came from teaching this application of representation theory before the subject! – Vectornaut May 28 '15 at 22:14
  • 2
    Both of your points are true. It is a good idea to bring up the Jordan normal form before the theory of modules over a PID, but it is not at all necessary to teach its proof and the algorithm before the general case of a PID. – darij grinberg May 29 '15 at 00:12
  • Well, I think most good teaching is either motivated theory or theoretically sound applications, because these two things should almost never live without each other. – Juan Sebastian Lozano Oct 20 '16 at 20:38
11

Writing a proof as a chain of expressions connected by equals signs whether they are appropriate or not.

9

Two bad principles that taste worse together: Decimals are the true numbers. Rounding makes no difference.

Since students learn about decimals after they've learned about whole numbers and fractions, they might assume that decimals are always the preferred way to represent real numbers, and so everything should be converted to decimals. Meanwhile, since in general one cannot be expected to write out an infinite decimal expansion, they might assume that stopping after two decimal places makes no difference.

I'm not saying that approximations are bad. But it's bad to approximate if you have no sense of your error tolerance, or even of the fact that you're introducing an error at all.

Here are two perverse outcomes.

  1. Imagine a problem whose answer is, say, $\pi/4$, and a solution that ends like this: $$\text{blah blah blah} = \pi/4 = 3.14/4 = .785.$$ I'm sure that there are some situations where it's important to know that your answer is between $.78$ and $.79$. But much of the time, conversion to decimals obscures what's going on.
  2. (Small sample size alert!) About half of my calculus students will, on the first day of class, mark the equation $\frac{1}{3} = 0.33$ as "true". (A quick numerical check of the gap follows below.)
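A quick numerical check of both perverse outcomes (a throwaway Python sketch, nothing more):

    from math import pi

    # Outcome 2: 1/3 is not 0.33; the gap is small but real, and it grows
    # under further computation.
    print(1/3 - 0.33)                 # about 0.0033, not 0
    print(300 * (1/3), 300 * 0.33)    # 100.0 versus 99.0

    # Outcome 1: rounding pi to 3.14 before dividing already moves the
    # answer in the third decimal place, with no warning that it happened.
    print(pi / 4, 3.14 / 4)           # 0.7853981... versus 0.785
    print(abs(pi - 3.14) / 4)         # the error silently accepted, ~4e-4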
  • 3
    What fraction of your students do you want to mark $1/3=0.33$ as true? There are different conventions people use, like the way mathematicians use "if" to mean "iff" in definitions like "$x$ is even if there is some integer $k$ so that $x=2k$." It's perfectly reasonable to say $1/3=0.33$ in some contexts. It looks strange because we don't usually use the $=$ sign to mean that, but others do, such as in the $f(n) = O(g(n))$ notation. – Douglas Zare Oct 21 '16 at 05:31
  • 5
    As you will know because I've told you this in person, I frequently encounter students who think that $\sqrt2 \approx 1.41$ but $\sqrt2 = 1.41421356$ (since it's all the digits displayed on the calculator). – LSpice Nov 28 '17 at 20:10
9

"Differentiation and integration are inverse operations."

To many calculus students, this is their conception of the fundamental theorem. There's truth to this heuristic, of course, but one needs to be constantly informed by a much deeper understanding of integration (and differentiation) in order to properly wield this correspondence in most situations beyond those encountered in a first course in calculus.
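One standard place where the naive version breaks down (added here for concreteness): the Cantor function $c\colon[0,1]\to[0,1]$ is continuous and nondecreasing with $c'(x)=0$ for almost every $x$, yet
$$c(1)-c(0)=1 \neq 0=\int_0^1 c'(x)\,dx,$$
so "integrate the derivative" fails without an extra hypothesis (absolute continuity). In the other direction, Volterra's function has a bounded derivative that is not Riemann integrable, so "differentiate, then integrate" cannot even be set up within the Riemann theory.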

Zach Conn
  • 269
  • 6
    Generalizing differentiation and integration leads us to see that they differ as left- or right-sided inverses. One side generalizes to the Lebesgue differentiation theorem; the other leads to bounded variation and absolute continuity. –  Apr 25 '10 at 14:41
  • 25
    I disagree with this: I think it is a fantastic heuristic, indeed the single most important heuristic of first year calculus. To argue against it is mostly to say "I don't like heuristics", it seems to me. – Pete L. Clark Aug 02 '12 at 08:21
  • 4
    Well, I didn't really have first year calculus in mind when I wrote this answer. Sure, it's a great heuristic at that level, but it's not so great later on. I guess the lesson here is that you can't really talk about a heuristic without talking about the context as well. My answer was less about the heuristic being bad, and more about it being bad to cling onto a heuristic as you transition into territory where it ceases to be so fantastically useful. – Zach Conn Nov 18 '12 at 05:21
  • It sounds like Zach is saying that some unlearning has to take place if they go on in math. That's true, but at the same time there are so many viewpoints on what differentiation "is" (see for example Thurston's list in the beginning of his Proofs and Progress paper) that it's hard to get more than just a few across in a semester or even year-long course, so I suppose some unlearning will have to take place anyway. The inversion heuristic has an advantage of being memorable. – Todd Trimble Oct 27 '18 at 11:29
9

"you'll need a computer for that".

8

From Keith Devlin's article

http://www.maa.org/devlin/devlin_06_08.html

"Multiplication is repeated addition."

This is true when multiplying natural numbers, but it is only a special case of the scaling operation that multiplication becomes on the reals. We know it is also a rotation in the complexes, but that should probably be left out at the beginning, although it might be interesting to think about how one would include it.

Devlin also mentions "exponentiation is repeated multiplication."

  • 5
    It's an incomplete heuristic, one that does work only for very special cases. But does this mean it is a bad heuristic? The only case where I can imagine getting bitten by it is when defining a linear map, forgetting the $f\left(\lambda x\right)=\lambda f\left(x\right)$ condition. On the other hand, here is a much more malign heuristic: Lie brackets are commutators. Very dangerous when you consider the tensor algebra of a Lie algebra. – darij grinberg Apr 10 '11 at 21:20
  • 11
    On the other hand, "the exponential map is an infinitely repeated infinitesimal multiplication" is a very good heuristic to have, particularly in Lie groups... – Terry Tao Dec 13 '11 at 19:20
  • 10
    But this rule has such a nice direct application: it shows that all rings (with unit) admit a map from $\mathbb Z$. – Elizabeth S. Q. Goodman Jan 27 '12 at 08:48
6

The "size" of a finite-dimensional vector space is proportional to its dimension.

In fact, the "size" of a finite-dimensional vector space is almost always better thought of as being exponential in its dimension. This is easiest to see for (finite-dimensional) vector spaces over finite fields, which have finite cardinality. But it's a better heuristic even for vector spaces over infinite fields.

Internalizing the correct intuition makes it clear why forming the (algebraic) direct product of two vector spaces causes their dimensions to add, and not to multiply as you might naively expect based on the fact that taking the direct product of groups multiplies their orders.
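A quick sanity check over the two-element field (a small Python sketch of my own): the number of vectors in $\mathbb{F}_2^n$ is $2^n$ while the dimension is $n$, and under direct sums the dimensions add while the cardinalities multiply.

    from itertools import product

    def card(n):
        # Number of vectors in F_2^n, counted by brute-force enumeration.
        return len(list(product((0, 1), repeat=n)))

    for n in range(1, 6):
        print(f"dim = {n}:  |F_2^n| = {card(n)}")   # 2, 4, 8, 16, 32

    # Direct sum F_2^2 (+) F_2^3: dimensions add, cardinalities multiply.
    print(card(2) * card(3), "==", card(2 + 3))     # 32 == 32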

Another confusing point to which this misconception leads regards the advantage that quantum computers give over classical ones. The difference is sometimes stated as "quantum computers have a state space that's exponentially large in the number of qubits," but this is highly misleading, because classical computers also have a state space that's exponentially large in the number of bits. The better intuition is: since quantum computers have a state space whose dimension is exponentially large in the number of qubits, the state space itself is actually doubly exponential in the number of qubits, while the state space of a classical computer is only singly exponential in the number of bits.

The reason why this misconception is so widespread is that early courses in linear algebra almost always begin with vector spaces over infinite fields (usually $\mathbb{R}$ or $\mathbb{C}$), which have infinite cardinality, so the dimension is the only finite number available. This practice leads to misleading intuition for general vector spaces.

tparker
  • 1,243
5

"Mathematical knowledge is contained and communicated primarily by documents."

I'm not sure if this is a heuristic, but in terms of beliefs that inhibit learning, this is definitely the one that hurt my mathematical development the most.

I would say the correct statement is "Mathematical knowledge is contained primarily in the minds of mathematicians and communicated primarily by informal oral communication."

This problematic belief grew out of the way that I (and pretty much everyone else) was taught mathematics at the undergraduate and beginning graduate level. In this setting texts are a central authority and a complete, well-written resource for the knowledge needed to solve any mathematical problem encountered.

In the world of mathematical research, this is no longer the case. I finally figured this out by reading Thurston's essay "On proof and progress in mathematics", which I would strongly recommend for any beginning mathematician.

Maybe it is possible to do research mathematics using papers as a primary resource, but I believe this is highly inefficient. I spent several years trying to learn the noncommutative standard model by reading the available papers on the subject and made no real progress. Looking back, I don't think I ever had a chance of succeeding with this approach.

I would guess that to be successful in mathematics, it is absolutely vital to become regularly involved in conversations with working mathematicians, as awkward and intimidating as that might be.

Kevin Teh
  • 775
  • 5
    This has nothing to do with heuristics, whatsoever! – Mariano Suárez-Álvarez Apr 17 '13 at 20:52
  • 2
    Sorry but I disagree. There is so much buried knowledge in the unread works of the past that I wouldn't be surprised if it surpasses the knowledge of currently living mathematicians. Take into account that many authors forget their own papers after a couple of decades... – darij grinberg Apr 17 '13 at 22:56
  • 3
    I disagree with the disagree-ers. Sure, this answer is a little more "meta" than the question likely intended, but not overwhelmingly so: if we take "a heuristic in math" to mean "a rule of thumb for how to prove things in math", then this answer is arguably on target, even though it is more methodological and less domain-specific. Besides, I think it's an important message to have out there, a realization that every mathematician will have to come to in order to be successful. Even though it's not the sort of issue you'll find discussed in papers :). – Tim Campion Nov 14 '13 at 21:52
  • 2
    Especially a lot of the intuition is communicated orally and informally. And it is impossible to do mathematics without having an intuition about what you're doing. – Manuel Bärenz Sep 10 '14 at 15:13
5

Talking about "functions" when we are actually talking about equivalence classes of functions that are equal almost everywhere

An element of an $L^p(X)$ space is usually called a "function", and is usually denoted by letters that are typically used for functions ($f$, $g$, $h$, etc.).

It seems to be a harmful heuristic to act "as if" $L^p(X)$ were made of functions, as a function is really something that should give you a value for each point $x$ in $X$. I am aware that this is now common practice, but I am sure it would help to introduce an actual name besides "function" for "equivalence classes of functions modulo equality almost everywhere". This concept is fundamental, and should be given a proper name. I don't have a proposal for such a name, but I kind of wish someone in the past did.
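A standard illustration of why the distinction matters: the indicator function $\mathbf{1}_{\mathbb{Q}}$ of the rationals and the zero function are equal almost everywhere (the rationals have Lebesgue measure zero), so they represent the same element of $L^p(\mathbb{R})$, yet they disagree at every rational point. "The value at $0$" is therefore not a well-defined notion for that element, which is exactly why evaluation at a point is not a functional on $L^p(\mathbb{R})$.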

Phil-W
  • 975
  • 6
  • 13
  • 4
    So, what actual harm does it cause? – Gerry Myerson Jan 18 '17 at 11:42
  • 1
    @GerryMyerson It can lead to errors in the theory of Hilbert spaces. Many people incorrectly think of the Hilbert space $L^2(\mathbb{R})$ as the space $\mathcal{L}^2(\mathbb{R})$ of square-integrable functions. But $\mathcal{L}^2(\mathbb{R})$ isn't an inner product space at all, because the naive "inner product" isn't positive definite. Moreover, the "square-integrable function" intuition often leads people to believe that orthonormal bases of $L^2(\mathbb{R})$ have the cardinality of the continuum, which would imply that the Hilbert space is non-separable, when in fact it is separable. – tparker Oct 20 '18 at 03:35
  • 7
    @GerryMyerson It can lead people to ask for "values" of elements of $L^p(X)$ at points of $X$. For an individual element of $L^p(X)$ this may actually be made sense of (due to some kind of regularity), but it is surely misleading to think that there is a linear functional like "evaluation at a point". – Kapil Oct 27 '18 at 05:19
  • 1
    Why not build the term "ekafunction" from the Sanskrit eka, meaning "one"? The sound "ek" evokes "Equivalence Class", while the meaning of the prefix suggests that one single equivalence class corresponds to several different genuine functions. – Sylvain JULIEN Oct 27 '18 at 08:26
5

The following refers to the school system in Germany, it may be different in other countries:

In my opinion, one really bad heuristic happens in elementary school, when children learn arithmetic with natural numbers. They learn that addition and subtraction are two entirely different things, because they are taught $a+b=b+a$ but $a-b\neq b-a$. Thus addition is commutative, and subtraction is not. At that level, numbers are solely understood as enumerations of objects.

Then they learn about numbers with units, such as lengths or prices or weights. Also they learn that numbers might have geometric meaning, e.g. as lengths of line segments. But still no concept of negative numbers.

Years later, when they finally get to know negative numbers as well, they have so thoroughly internalized that subtraction is something different from addition that they have difficulty grasping that $a-b=a+(-b)=(-b)+a$, i.e. that subtraction is nothing other than addition of a negative number.

I think that postponing negative numbers for so long is a mistake, and that children in elementary school would be perfectly capable of understanding them.

Nithilher
  • 81
  • 3
  • 7
  • 1
    Not a heuristic, but my least-favourite mannerism in additive groups is the habit of reading '$-x$' as 'negative $x$'. This is at best meaningless (for example, when $x$ is a non-real complex number) and at worst false (students with this habit cannot understand how $|x| = -x$ can be true, since the left-hand side is positive (not necessarily, but that's not the real issue) and the right-hand side is negative …). – LSpice Nov 28 '17 at 20:08
4

"Basic (and useful) mathematics is about calculations and higher (pure) mathematics is about proof."

One reason I think this is harmful is that there is no sharp line between calculations and proofs. Very often a certain calculation is essentially the proof except for a few logical connectives. Conversely, in formal logic, one can create a "calculus" that makes proofs appear to be calculations.

Another reason is that it leads students (and more importantly teachers!) to think that a drastic change of mindset is required to learn higher mathematics.

It is indeed true that analysis is quite different from calculus even though there is a strong linkage. However, the former also leads to better techniques for calculating things. Putting too much emphasis on the "proof" aspect of analysis tends to put off a lot of students who enjoyed playing with polynomials, trigonometry and calculus. Conversely, many students who like to work with proofs are encouraged to believe that what they are doing is somehow "superior" (higher) to "mere" calculus; they then do not do enough computational drills, which ill-serves them if they actually take up mathematics!

Kapil
  • 1,546
  • 1
    I think a drastic change of mindset is required to learn higher mathematics. That doesn't mean throwing away computation, but embracing proof - partly recognizing the importance of proof, and partly coming to grips with proofs having structure beyond that of just algebraic manipulation (e.g. the first time a student learns about proof by induction). I do agree that devaluing computation is bad, but I think you're underestimating the conceptual change needed to move into higher mathematics. – Noah Schweber Oct 27 '18 at 15:24
2

Division by Zero is Infinity. This was taught in my seventh-grade class while the teacher was explaining the concept of infinity and its definition(s).

False: division by zero is undefined. (If $1/0$ were some number $c$, we would need $0 \cdot c = 1$, which no number satisfies.)

0

In linear algebra, things that are specified by a single number are scalars and things that are specified by a collection of multiple numbers are vectors (or higher-rank tensors).

This is wrong for at least two reasons. First, it blurs the distinction between a one-dimensional vector space over a field and the field itself. Second, and perhaps more problematically, it gives the incorrect impression that (e.g.) if $\vec{V}(\vec{r}) = (V_x, V_y, V_z)$ is a vector field, then the individual component $V_x(\vec{r})$ is a scalar field and transforms accordingly under coordinate rotations.
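Here is a small numerical check of the second point (an illustrative Python sketch; the field and the rotation are arbitrary choices of mine). A scalar field keeps its value at the corresponding physical point under a rotation, $\phi'(\vec r\,') = \phi(\vec r)$, whereas a vector field transforms as $\vec V'(\vec r\,') = R\,\vec V(\vec r)$, so the component $V_x$ by itself is not invariant:

    import numpy as np

    theta = np.pi / 2                      # rotate 90 degrees about the z-axis
    R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])

    def V(r):
        # A constant vector field pointing along x: V(r) = (1, 0, 0).
        return np.array([1.0, 0.0, 0.0])

    r = np.array([1.0, 2.0, 3.0])
    V_rotated = R @ V(r)                   # how the vector components transform

    print("V_x before the rotation:", V(r)[0])        # 1.0
    print("V_x after the rotation: ", V_rotated[0])   # ~0.0 -- not a scalar field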

tparker
  • 1,243
-2

Perhaps one of the worst is the permutation expansion of the determinant (the Leibniz formula: a hideous sum over $n!$ signed permutations, usually taught alongside Cramer's rule) presented as a method of computing determinants in linear algebra classes. (I don't know why it is even mentioned.) So often have I seen students (correctly) compute the determinants of $3 \times 3$ matrices and then reuse only those 6 permutations to try to calculate the determinants of $4 \times 4$ matrices.
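For reference, here is the expansion in question as a short Python sketch (assuming what is meant is the Leibniz permutation formula; the matrix is an arbitrary example of mine). Already at $n=4$ it has $4! = 24$ signed terms, not $6$, and it only gets worse, which is why row reduction, not this formula, is how determinants are computed in practice.

    from itertools import permutations
    import numpy as np

    def leibniz_det(A):
        # Determinant as the sum over all n! signed permutations.
        n = len(A)
        total = 0.0
        for perm in permutations(range(n)):
            sign = 1
            for i in range(n):             # sign = (-1)^(number of inversions)
                for j in range(i + 1, n):
                    if perm[i] > perm[j]:
                        sign = -sign
            prod = sign
            for i in range(n):
                prod *= A[i][perm[i]]
            total += prod
        return total

    A = np.array([[2., 1., 0., 3.],
                  [1., 4., 1., 0.],
                  [0., 2., 5., 1.],
                  [3., 0., 1., 6.]])
    print(sum(1 for _ in permutations(range(4))))   # 24 terms for a 4 x 4 matrix
    print(leibniz_det(A), np.linalg.det(A))         # the two agree, up to rounding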

-2

The excluded middle (a law, or a heuristic?).

On a more general level, given any closed question "Is it A or B?", the heuristic says it must be one or the other, disregarding the possibility that the question is wrong, stupid, irrelevant, or incomplete.

The principle of the excluded middle disregards intuitionistic logic, and it has been harmful in steering people away from direct (constructive) proofs, which are often clearer, yet can be harder to find.

Intuitionism is also rather natural: being against anti-communists does not mean you are a communist.

  • 6
    Perhaps more proponents of intuitionism should have readily available examples for the glaring question: what are some natural settings where classical logic is faulty next to an (intuitionistic) alternative. Compelling answers to this question are much scarcer than suggestions to consider intuitionism. – AndrewLMarshall Aug 10 '11 at 23:51
  • 7
    A topology is an example of a Heyting algebra, not a Boolean algebra. How's that? – Todd Trimble Aug 25 '12 at 20:02