110

I'm not teaching calculus right now, but I talk to someone who does, and the question that came up is why emphasize the $h \to 0$ definition of a derivative to calculus students?

Something a teacher might do is ask students to calculate the derivative of a function like $3x^2$ using this definition on an exam, but it makes me wonder what the point of doing something like that is. Once you have seen the definition and learned the basic rules, you can calculate the derivative of a lot of reasonable functions quickly. I tried to turn that around and ask myself if there are good examples of a function (that calculus students would understand) where there isn't already a well-established rule for taking the derivative. The best I could come up with is a piecewise defined function, but that's no good at all.
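
For reference, the exam-style computation in question is short. A sketch of the limit-definition calculation, writing $f(x) = 3x^2$ (the name $f$ is introduced here only for illustration):

$$
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{3(x+h)^2 - 3x^2}{h} \\
      &= \lim_{h \to 0} \frac{6xh + 3h^2}{h} \\
      &= \lim_{h \to 0} \,(6x + 3h) \\
      &= 6x.
\end{aligned}
$$

This agrees with what the power rule gives, which is exactly why students see the exercise as redundant.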

More practically, this question came up because when trying to get students to do this, they seemed rather impatient (and maybe angry?) at why they couldn't use the "shortcut" (that they learned from friends or whatever).

So here's an actual question:

What benefit is there in emphasizing (or even introducing) to calculus students the $h \to 0$ definition of a derivative (presuming there is a better way to do this?) and secondly, does anyone out there actually use this definition to calculate a derivative that couldn't be obtained by a known symbolic rule? I'd prefer a function whose definition could be understood by a student studying first-year calculus.

I'm not trying to say that this is bad (or good), I just couldn't come up with any good reasons one way or the other myself.

EDIT: I appreciate all of the responses, but I think my question as posed is too vague. I was worried about being too specific, so let me just tell you the context and apologize for misleading the discussion. This is about teaching first-semester calculus to students straight out of high school in the US, most of whom have already taken a calculus course in high school (and either didn't do well or are retaking it for whatever reason). These are mostly students who have no interest in mathematics (the cause for this is a different discussion I guess) and usually are only taking calculus to fulfill some university requirement. So their view of the instructor trying to get them to learn how to calculate derivatives from the definition on an assignment or on an exam is that the instructor is just making them learn some long, arbitrary way of doing something that they already have better tools for.

I apologize but I don't really accept the answer of "we teach the limit definition because we need a definition and that's how we do mathematics". I know I am being unfair in my paraphrasing, and I am NOT trying to say that we should not teach definitions. I was trying to understand how one answers the students' common question: "Why can't we just do this the easy way?" (and this was an overwhelming response on a recent mini-evaluation given to them). I like the answer of $\exp(-1/x^2)$ for the purpose of this question though.

It's hard to get students to take you seriously when they think that you're only interested in making them jump through hoops. As a more extreme example, I recall that as an undergraduate, some of my friends who took first year calculus (depending on the instructor) were given an oral exam at the end of the semester in which they would have to give a proof of one of 10 preselected theorems from the class. This seemed completely pointless to me and would only further isolate students from being interested in math, so why are things like this done?

Anyway, sorry for wasting a lot of your time with my poorly-phrased question. I know MathOverflow is not a place for discussions, and I don't want this to degenerate into one, so sorry again and I'll accept an answer (though there were many good ones addressing different points).

Steven Sam
  • 10,197
  • 36
    Maybe I misunderstand your question. But what would be the point of teaching students the symbolic rules as axioms without explaining to them how they are derived? Would you advocate teaching maths undergraduates the combinatorial properties satisfied by character tables of finite groups, so that they can work out the tables in most cases, without proving any of the properties or maybe even without explaining what a character is? – Alex B. Sep 27 '10 at 05:41
  • 76
    I think your only alternative is to present the "magic" differentiation rules with no justification. It is already common for students to have a black-box view of mathematics; I don't think you want to encourage it.

    Perhaps you want to begin with the definition via limits and then derive the rules from there. Emphasize to your students that "Why didn't we just use the rule from the start?" is not a valid question. The rule is a consequence of the definition, not a self-evident truth.

    – Austin Mohr Sep 27 '10 at 05:42
  • 2
    I guess it's hard to get across what I'm trying to ask (maybe I don't even understand), but I am not trying to say that we should get rid of definitions in the first place (maybe the headline is misleading). Maybe in line with Austin's comment that students already have a black-box view of mathematics would be the question of how to get students to care (or why should they care) about the definitions in the first place (in the case of derivatives it seems particularly easy for students to not care once the symbolic rules are in place). – Steven Sam Sep 27 '10 at 06:08
  • 4
    One can do calculus based on infinitesimals, which are probably somewhat easier to manipulate than limits: http://www.math.wisc.edu/~keisler/calc.html – Michael Greinecker Sep 27 '10 at 09:20
  • 3
    Each time you meet a new "basic" function, you need to compute its derivative before you can apply your standard toolbox (product rule, chain rule) to derivatives of that function composed with other functions. So with polynomial, exponential, trigonometric, etc. functions their derivatives have to come from somewhere. On the one hand you can refer to a uniform approach to start finding their derivatives (which is the $h \to 0$ limit definition) or you can just tell students the answers and then everything seems like more of a black box. And in economics, derivatives use $h = 1$ as "small"! – KConrad Sep 27 '10 at 10:08
  • 19
    There is an article by Solomon Friedberg entitled "Teaching mathematics graduate students how to teach" in the Notices of the AMS (52) 2005, where the question you ask and its didactical implications are part of a "case study". – Holger Partsch Sep 27 '10 at 11:35
  • 11
    -1, as I don't see the point in asking such a question. That's simply the most effective definition of derivative (the nonstandard analysis one would require a knowledge of logic that no freshman is supposed to have!). By the way, in Italy it is perfectly normal to learn and use the epsilon-delta definition of limit/continuity/derivative at the last year of high school... – Qfwfq Sep 27 '10 at 12:14
  • 8
    If calculus class were devoted to the project of getting students to learn to appreciate mathematics by a process that resembles mathematics (which they aren't, perhaps with good reason), then one could do this by simply holding off on the introduction of the power, product, chain and quotient rules. The geometric problem of computing tangent lines is natural and easy to motivate; the limit definition is reasonably easy to motivate from the geometric problem; and then students could spend a reasonable amount of time flailing around trying to compute derivatives of different functions. (Cont'd) – JBL Sep 27 '10 at 14:04
  • 5
    In the process, they would discover that computing derivatives from scratch (as a special case of computing limits of formulas) is quite difficult; this would allow them to appreciate the rules of differentiation, and where they come from. (I think there is a similar problem in the way the Fundamental Theorem of Calculus is taught -- if antidifferentiation and definite integration are taught simultaneously, one loses appreciation for the miracle that we can compute definite integrals without resorting to Riemann summation or the like.) – JBL Sep 27 '10 at 14:07
  • 2
    There is such a course at many universities. Called "Calculus for Business" or "Calculus for Life Sciences" or such. Only calculus for engineers, physical scientists and mathematicians emphasizes the limit. But then at small colleges, where they have a "one size fits all" course, what to do? – Gerald Edgar Sep 27 '10 at 15:32
  • 1
    @Gerald: This is one of the biggest challenges for me when teaching calc 1: half of the class (if I'm lucky) will go on to take at least 3 more semesters of the stuff (to diff eq), and they must know the calc 1 material forward and backwards. Then there's always one half that only needs a calc 1 on their transcript (e.g. for med school). A few might be somewhere in between (Chem majors maybe?). These guys have to work hard to get that credit, but then again, that might be the rationale behind requiring it. (We do have a separate calc for business.) – Thierry Zell Sep 27 '10 at 20:15
  • 2
    Sorry for repeating, I included this in another comment below.

    But if any student asks why the definition is needed, ask him/her to differentiate $f(x)= x^2 \sin(\frac{1}{x})$ for $x \neq 0$ and $f(0)=0$.

    – Nick S Sep 27 '10 at 22:49
  • 1
    @Thierry: I'm a chemist by training; as I recall the treatment we had only gave a passing definition of limits (so no $\epsilon-\delta$) and the derivative as a limit, and much of the remainder was taught as algorithms handed down from a mountain (though I appreciate they took the time to define the natural logarithm as an integral and the exponential as the inverse, and then deduce the requisite properties from those). – J. M. isn't a mathematician Sep 28 '10 at 03:06
  • 3
    Short answer: because a derivative IS a limit. Even in algebra, the shortest definition of the derivative (of a polynomial or rational function over any ring) at a point is to take the limit of the difference quotient at this point. "Limit", of course, means that we cancel as much as we can from numerator and denominator, and then evaluate at the point. If you want to make stuff simpler, you can try eliminating the precise definition of the analytic notion of "limit", but that again comes at the cost of functions such as $e^x$ and $\sin x$. – darij grinberg Sep 28 '10 at 11:10
  • 3
    The edit has clarified the issue, but I still see two questions rolled into one -- which may explain why the discussion is sometimes at cross purposes: 1. Why do we define the derivative as a limit? 2. Should we expect the students to compute derivatives using limits? It's in 2 that the bigger controversy appears to reside. And I would agree that any problem that hints at jumping through hoops arbitrarily is not desirable; fortunately, we have here examples that show how the limit definition can be needed even at the calculus level. – Thierry Zell Oct 09 '10 at 04:08
  • 1
    I'm starting to teach calculus for business majors now, and the book (contrary to some remarks above) does use the limit definition of derivative. But it does not give any definition of limit! I don't really see the point of this. If we're going to hand-wave limits (and completely fudge the definite integral, which this book also does), then why not hand-wave derivatives? (And if we leave limits until later, then we get to useful applications faster.) – Toby Bartels Apr 03 '11 at 22:17
  • 1
    I'm going to stuff my favourite calculus pet peeve in a comment here: the claim (which I have seen!) that $\lim_{x \to 0} \sin(x)/x$ is computed using L'Hôpital's rule. (Why is it a pet peeve? Well, aside from mere pedagogical preference, what limit do you compute to find $\sin'(0)$? In a word---I can't resist the pun---it's circular.) – LSpice Dec 15 '13 at 13:40
  • "What benefit is there in introducing to calculus students the $h→0$ definition of a derivative?" -- you have to define it anyway, don't you? I agree that calculating a derivative of something like $x\mapsto 3x^2$ by definition can be boring, it is better to ask a student to derive the product or the quotient rule for the derivative, for example. – Alexey Muranov Dec 26 '14 at 09:50
  • You can do derivatives over ℤ without travelling off onto philosophical side roads. With a lagged difference operator (setting $h=1$) you can show that diff(1,4,9,16,25) = 3,5,7,9. This is simple enough that even non-university students could understand. One can't treat sinc this way, but maybe you could introduce h↓0 second (talk about 1/.00000003 and 1/−.000000003), after they understand symbolic differentiation of polynomials. – isomorphismes Sep 14 '15 at 20:34
  • More generally what's the point of teaching real analysis if it's mangled into a commodity called "calculus"? – sfmiller940 Jun 21 '19 at 21:51
  • I find it important when teaching an important notion, to present different points of view and their interrelation, rather than a unilateral approach. The $h \to 0$ thing may be the first time students are exposed to manipulating quantifiers in a "non-trivial" way, which has to be done at some point. Finding the derivative of $x^2$ can make the link with the symbolic approach. Here is a provocative analogue: why would we teach addition and multiplication in elementary school while calculators can do it? One reason could be they learn what is behind the scene (and a first example of algorithm). – François Brunault May 03 '21 at 14:35
  • Because if we don't teach the derivative as a limit, we end up with questions like this: https://matheducators.stackexchange.com/q/21203/127 – Gerald Edgar Aug 09 '21 at 12:30
  • 1
    @Qfwfq : "the nonstandard analysis definition would require a knowledge of logic that no freshman is supposed to have". One could just as easily say that the epsilon-delta definition would require a knowledge of Cauchy sequences or Dedekind cuts that no freshman is supposed to have. In calculus, we simply postulate and work with the basic properties of the real numbers, without actually proving that anything has those properties. I don't see where there's any difference in principle between that and simply postulating the properties of the non-standard reals. – Steven Landsburg Aug 10 '21 at 22:02
  • @Steven Landsburg : we're probably using different definitions of "freshman calculus". In some countries this not-yet-rigorous version of elementary real analysis simply doesn't exist: people learn to deal with all the rigorous definitions and all the proofs from the get go. There, every freshman is supposed to get that knowledge of epsilon-deltas etc during "calculus" because "calculus" is done that way. So the question is probably location-dependent. (...) – Qfwfq Aug 12 '21 at 22:34
  • (...) In my own experience, in high school (second year) we defined real numbers by Dedekind cuts (a construction easily forgotten and immediately replaced by intuitive use of the usual properties! But we were told exactly what $\sqrt{2}$ is in that framework). Then in the fifth year of high school real analysis was done pretty rigorously including epsilon-deltas, continuity, derivatives, fundamental theorem of calculus (but with Riemann integral done with no proofs and with only semi-rigorous definition; and no mention of Cauchy sequences). – Qfwfq Aug 12 '21 at 22:34
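
A quick numerical look at the example $f(x) = x^2 \sin(\frac{1}{x})$ for $x \neq 0$, $f(0) = 0$ raised in the comments above. This is only a sketch, and the function names are made up for illustration. The difference quotient at $0$ simplifies to $h \sin(1/h)$, which is squeezed to $0$ as $h \to 0$. The formula obtained from the product and chain rules, $2x\sin(1/x) - \cos(1/x)$, is valid only for $x \neq 0$ and keeps oscillating between roughly $-1$ and $1$ as $x \to 0$, so it cannot be used to read off $f'(0)$.

```python
import math


def f(x):
    """x^2 * sin(1/x) away from 0, with the value 0 filled in at x = 0."""
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0


def difference_quotient_at_zero(h):
    """(f(0 + h) - f(0)) / h, which simplifies to h * sin(1/h)."""
    return (f(h) - f(0.0)) / h


def rule_based_derivative(x):
    """Formula from the product and chain rules; only valid for x != 0."""
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


# The difference quotient shrinks with h, while the rule-based formula
# does not settle down to any single value near 0.
for k in range(1, 7):
    h = 10.0 ** (-k)
    print(h, difference_quotient_at_zero(h), rule_based_derivative(h))
```

The first column of results is bounded in size by $|h|$, which is the squeeze argument a student would write out by hand.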

34 Answers

141

This is a good question, given the way calculus is currently taught, which for me says more about the sad state of math education than about the material itself. All calculus textbooks and teachers claim that they are trying to teach what calculus is and how to use it. However, in the end most exams test mostly for the students' ability to turn a word problem into a formula and find the symbolic derivative for that formula. So it is not surprising that virtually all students and not a few teachers believe that calculus means symbolic differentiation and integration.

My view is almost exactly the opposite. I would like to see symbolic manipulation banished from, say, the first semester of calculus. Instead, I would like to see the first semester focused purely on what the derivative and definite integral (not the indefinite integral) are and what they are useful for. If you're not sure how this is possible without all the rules of differentiation and antidifferentiation, I suggest you take a look at the infamous "Harvard Calculus" textbook by Hughes-Hallett et al. This for me and despite all the furor it created is by far the best modern calculus textbook out there, because it actually tries to teach students calculus as a useful tool rather than a set of mysterious rules that miraculously solve a canned set of problems.

I also dislike introducing the definition of a derivative using standard mathematical terminology such as "limit" and notation such as $h\rightarrow 0$. Another achievement of the Harvard Calculus book was to write a math textbook in plain English. Of course, this led to severe criticism that it was too "warm and fuzzy", but I totally disagree.

Perhaps the most important insight that the Harvard Calculus team had was that the key reason students don't understand calculus is because they don't really know what a function is. Most students believe a function is a formula and nothing more. I now tell my students to forget everything they were ever told about functions and tell them just to remember that a function is a box, where if you feed it an input (in calculus it will be a single number), it will spit out an output (in calculus it will be a single number).

Finally, (I could write on this topic for a long time. If for some reason you want to read me, just google my name with "calculus") I dislike the word "derivative", which provides no hint of what a derivative is. My suggested replacement name is "sensitivity". The derivative measures the sensitivity of a function. In particular, it measures how sensitive the output is to small changes in the input. It is given by a ratio whose denominator is the change in the input and whose numerator is the induced change in the output. With this definition, it is not hard to show students why knowing the derivative can be very useful in many different contexts.
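
The "sensitivity" ratio can be tried out numerically on any box that turns a number into a number. A minimal sketch, assuming a small fixed input change `dx`; the names `sensitivity` and `area_of_square` are hypothetical and chosen only for this illustration:

```python
def sensitivity(box, x, dx=1e-6):
    """Induced change in the output divided by a small change dx in the input."""
    return (box(x + dx) - box(x)) / dx


def area_of_square(side):
    """A sample box: feed it a side length, it spits out an area."""
    return side * side


# How sensitive is the area to a small change in the side length at side = 3?
print(sensitivity(area_of_square, 3.0))
```

The printed value is close to 6: near a side length of 3, each small unit of extra side length buys about 6 units of extra area. Nothing here needs a formula for the box, only the ability to feed it inputs.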

Defining the definite integral is even easier. With these definitions, explaining what the Fundamental Theorem of Calculus is and why you need it is also easy.

Only after I have made sure that students really understand what functions, derivatives, and definite integrals are would I broach the subject of symbolic computation. What everybody should try to remember is that symbolic computation is only one and not necessarily the most important tool in the discipline of calculus, which itself is also merely a useful mathematical tool.

ADDED: What I think most mathematicians overlook is how large a conceptual leap it is to start studying functions (which are really processes) as mathematical objects, rather than just numbers. Until you give this its due respect and take the time to guide your students carefully through this conceptual leap, your students will never really appreciate how powerful calculus really is.

ADDED: I see that the function $\theta\mapsto \sin\theta$ is being mentioned. I would like to point out a simple question that very few calculus students and even teachers can answer correctly: Is the derivative of the sine function, where the angle is measured in degrees, the same as the derivative of the sine function, where the angle is measured in radians? In my department we audition all candidates for teaching calculus and often ask this question. So many people, including some with Ph.D.'s from good schools, couldn't answer this properly that I even tried it on a few really famous mathematicians. Again, the difficulty we all have with this question is for me a sign of how badly we ourselves learn calculus. Note, however, that if you use the definitions of function and derivative I give above, the answer is rather easy.
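
One way to probe the degrees-versus-radians question is to treat the two sine functions as two different boxes and compare their sensitivities numerically. A sketch under that reading; the helper names are hypothetical, and `math.radians` is the standard-library conversion from degrees to radians:

```python
import math


def sin_radians(x):
    """Sine of an angle of x radians."""
    return math.sin(x)


def sin_degrees(x):
    """Sine of an angle of x degrees: a different box from sin_radians."""
    return math.sin(math.radians(x))


def sensitivity(box, x, dx=1e-6):
    """Induced change in the output divided by a small change dx in the input."""
    return (box(x + dx) - box(x)) / dx


# Compare the two boxes at input 0.
print(sensitivity(sin_radians, 0.0))   # close to 1
print(sensitivity(sin_degrees, 0.0))   # close to pi / 180
print(math.pi / 180)
```

The two sensitivities differ by the factor $\pi/180$, the number of radians per degree, which is also what the chain rule gives for $x \mapsto \sin(\pi x / 180)$.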

Deane Yang
  • 26,941
  • 17
    I emphatically agree that students don't know what a function is. But then again, it is a deceptively deep concept. As for modern treatments that emphasize other things than the standard, did you ever look at "Calculus in Context", the five colleges calculus? Available at http://www.math.smith.edu/Local/cicintro/cicintro.html My problem with this approach though is that even if you can convince me easily that it's the right thing to do mathematically, how will it mesh with the courses in other disciplines that students take, which will expect much more traditional material. – Thierry Zell Sep 27 '10 at 12:55
  • 5
    Any changes to math courses should of course be done only in close collaboration with other departments who rely on the math courses. But I think you'll find that, with the computational tools available, many of them will be quite sympathetic to a "concept first, hand computation second" approach. Besides, I don't argue against teaching symbolic computation, just delaying it. And I do also like the Calculus in Context book but have not had experience using it. I suspect it works best with students with a stronger background than the ones I teach. – Deane Yang Sep 27 '10 at 13:39
  • 3
    Granted, the thing about $\sin \theta$ is easy if you do a symbolic computation and consider the chain rule. – Harry Gindi Sep 27 '10 at 16:58
  • 17
    Harry, that is exactly how any pure mathematician, including me, would do it. But that's the hard way. For an engineer or physicist, who thinks in units and dimensional analysis and views the derivative as a "sensitivity" as I've described above, the answer is dead obvious. – Deane Yang Sep 27 '10 at 17:51
  • 3
    I would add that when we used it as an audition question, we assumed that everyone would do it the "hard way" and would see that each sine function could be written in terms of the other and apply the chain rule. We weren't looking for the right answer, just the right thought process. I was quite stunned to see people not only not finding the right thought process but also guessing the wrong answer. It was around then that I started to suspect that not only are we teaching calculus students very badly, we're also teaching Ph.D. students badly, too. – Deane Yang Sep 27 '10 at 17:54
  • 1
    Deane - you have any decent materials for a semester of calculus without any functions defined symbolically? I do what you suggested for the first two weeks (not enough, but I have a syllabus to get through), and the third time around I finally got around to typing up some notes so students have something to read in preparation for class. That took a couple weeks out of my summer, and didn't turn out all that great. – Alexander Woo Sep 27 '10 at 18:33
  • 1
    Deane, I find your method very interesting, and I agree with your conceptual approach. Still, I prefer not to wait a whole semester before starting symbolic manipulations. The style I like is: start from the problems (even simple, but real); emphasize the need for abstraction to treat them better; introduce the theory (not too much) and prove theorems; then go back to applications, and, lastly, by another important step of abstraction, develop the useful notation and formalism, and the rules of calculus. We can follow this procedure at each chapter of calculus: series, derivatives, integrals &c. – Pietro Majer Sep 27 '10 at 19:47
  • 6
    Alexander and Pietro, unfortunately I said "I would like to see...", which means I don't really get to banish symbolic methods for a whole semester. In fact, I advise being pragmatic and teaching in a fashion that will not alienate you from your department or school administration. That said, if you want to slip in more understanding (which I claim actually helps students learn the symbolic methods better), I recommend taking problems from the Harvard calculus textbook, as well as their precalculus text ("Functions Modeling Change"). Especially those where no formula is given for the function. – Deane Yang Sep 28 '10 at 02:07
  • We use the Harvard book, which I like a lot. I also like Ostebee and Zorn, if you manage to find a copy. – Alexander Woo Oct 01 '10 at 03:36
  • 17
    Your description of a function as a box seems to miss the most important part: that whenever you put a given number in, you always get the same output. That is, the box behavior should be single-valued. (Otherwise, we might imagine a black-box that accepts a given input and outputs a random number, perhaps different every time, and although this accords with your description, it is not a function.) – Joel David Hamkins Jan 24 '11 at 11:36
  • 4
    Deane writes about "the first semester." For at least half of my students, the first semester of calculus is the only semester of calculus they will ever see. If they don't see, say, the product rule first semester, they'll never see it. These are largely students in financial studies. – Gerry Myerson Jan 29 '11 at 23:49
  • 6
    Gerry, I have two reactions to your comment. One is that if you are teaching a single semester "terminal" calculus course, then you definitely have to choose and compress your topics carefully. The second is that I still consider it much more important to teach such students how to use a derivative as a useful measure of "sensitivity", rather than how to compute derivatives symbolically. In fact, using the sensitivity approach (as often seen in physics and engineering courses), the product rule appears very naturally. – Deane Yang Jan 30 '11 at 04:20
  • 15
    I do consulting in the financial sector, and first and second derivatives are widely used but never symbolic differentiation. The derivatives are always computed numerically and under the hood. What your finance students need to know is how to interpret and use these numbers. This is presented rather well in my view in the first few chapters of the Harvard Calculus text. – Deane Yang Jan 30 '11 at 04:23
  • 3
    Deane Yang, thank you for your thoughtful comments. – Gerry Myerson Jan 31 '11 at 11:17
  • 3
    I'm not sure that I understand the sine problem; it seems too easy. You're talking about the function that maps any $x$ to the sine of $x$ radians and the function that maps any $x$ to the sine of $x$ degrees, right? These are different functions, so they'll have different derivatives, barring some coincidence; intuition leads us immediately to the right answer. We still check for coincidence; two functions have the same derivative (if and) only if they differ uniformly by a constant. To rule this out, it's enough to notice that these two functions agree at $0$ but not at $1$. – Toby Bartels Apr 03 '11 at 22:32
  • 2
    OK, that's still longer than saying, hey, $\mathrm{rad}^{-1}$ and $\mathrm{deg}^{-1}$ are different units, so of course the derivatives expressed in these units are different. (Probably not longer to guess the right answer, just to check it with confidence.) But how could it trip anybody up? – Toby Bartels Apr 03 '11 at 22:36
  • 4
    Toby, the problem is easy, if you understand and use the actual meanings of the words "function" and "derivative". But it is a measure of how badly we teach calculus that even people with Ph.D.'s don't always answer freshman calculus problems correctly, because they are going into a reflexive "freshman calculus mode", which does not involve or use the actual meanings of the words "function" and "derivative". If you think about it, most freshman calculus courses do not require students to know or use the precise meanings of these words. So most students don't. – Deane Yang Apr 03 '11 at 22:45
  • 3
    Darn, I was kind of hoping that I'd misunderstood the question ... – Toby Bartels Apr 04 '11 at 00:24
  • 6
    It should be noted that when we first thought of asking the question about the derivative of sine of $x$ degrees versus sine of $x$ radians, the intent was not to test whether the potential instructor knew the right answer or not. We assumed that they did but wanted to see how well they could explain why to a student. It was a rude shock to discover that more than one person with a Ph.D. in math did not even know the right answer. – Deane Yang Apr 04 '11 at 01:21
  • 5
    @Toby Bartels: "You're talking about the function that maps any x to the sine of x radians and the function that maps any x to the sine of x degrees, right?" For first-year calculus students who happen to have taken differential topology, there's a third option. The sine function is most naturally defined as a map from the unit circle to the real line, and its derivative turns tangent vectors of the circle into tangent vectors of the line. The derivative clearly doesn't depend on how you measure angles; in fact, it doesn't depend on the concept of "angle" at all. (CONTINUED--->) – Vectornaut Jun 24 '11 at 06:19
  • 4
    (--->CONTINUED) If you do pick a way of measuring angles (that is, a parameterization of the unit circle), you can think of the sine function as a map from the real line to the real line. The derivative of this "sine on the line" does depend on how you measure angles. Of course, this probably can't be explained to first-year calculus students... and if it can, it probably shouldn't be. :( – Vectornaut Jun 24 '11 at 06:19
  • 1
    One frustrating thing about teaching calculus is to try to prove (or give some idea of the proof) of derivatives of exponential and trig. functions. This is easily done if one teaches the definite integral first. Then one may define ln(x) as the integral of 1/u from 1 to x, and invert to get the exponential (this is done in Spivak and Apostol). Also, to differentiate sin(x), you need to use arclength, which is again done easily first using integrals. The alternative rigorous approach is to define the exponential and radians via approximation by rationals, which may be done before integration. – Ian Agol Sep 29 '11 at 05:47
  • 3
    Agol, this is an issue only if you feel obligated to give a rigorous proof of everything. Defining the logarithm first and the exponential only as an inverse to the logarithm is for me pedagogically backwards, since the exponential functions are much more easily motivated. Heuristic arguments for the derivatives of exponential and trig functions are easily given. You can then assert the formulas as axioms. – Deane Yang Sep 29 '11 at 14:48
  • @Vectornaut - you are saying first year calculus students shouldn't understand the chain rule? – Steven Gubkin Mar 09 '12 at 18:27
  • 2
    Personally, I don't think introducing $e^x$ as the inverse to the natural logarithm is the best way to do it. I actually wait until we've talked about derivatives, etc., then introduce $e^x$ as the solution to a particular differential equation. At this point my students have done some work with differential equations (via a CAS such as Mathematica), so this is not too foreign for them. I am a big believer that every calculus class should have a significant CAS component. So while I give an intuitive definition of a limit in class, most of my students see them mostly in guided calculation. – Steve D May 04 '12 at 04:27
  • I like "sensitivity" instead of derivative, thanks for that. I also like "stability" instead of "continuity." I try to emphasize the issue of control...as in what would happen if you tried to drive a car where the position of the wheels didn't vary continuously with how you turned the steering wheel. – David Feldman May 19 '12 at 05:04
  • 4

    In my department we audition all candidates for teaching calculus and often ask this question.

    So I might not get that job, but my answer would have been "it depends." If you really do calculus right, the derivative of a function is not another plain-old function, but rather a 1-form. So in that more correct framework the passage from a fixed function to its derivative is coordinate free and that means units don't matter. Of course if you really mean to change function instead of changing units, you have a different function and you get a different derivative.

    – David Feldman May 19 '12 at 05:15
  • 1
    Seems like this thread is still open to discussion. On the derivative of $\sin\theta$ and $\sin x$, I think the problem is that, while most people are OK with functions, they sort of have formulas in mind. That is fine if one just thinks of this as a formula. However, if one really thinks about $f$, the sine function, as a function on the circle (a map from the circle to the real line, defined independently of any coordinate), then one soon realizes that the "derivative" doesn't make sense; what makes sense is the one-form $df$. That's the real conceptual jump, which was made by E. Cartan. – temp May 21 '12 at 05:04
  • 1
    David, you don't seriously believe that we should teach 1-variable calculus in terms of 1-forms, do you? Co-ordinate-free mathematics is important and useful to some of us (it certainly is for me), but, it seems to me, not for the vast majority of our students. And, yes, when I say "sin of x degrees", I do mean a different function than "sin of x radians". – Deane Yang May 21 '12 at 07:40
  • Seems David Feldman just said exactly what I wanted to say in my last comment. I learnt math from books translated from Russian; they at least mention "the invariance of 1-forms" when they talk about the chain rule. People might find this pointless, as calculus students may not need that fact. But a year or so later, when I really understood what "the invariance of 1-forms" meant, I was really happy. And for not so advanced people, that's a reliable way to memorize the chain rule, right? – temp May 22 '12 at 01:51
  • 1
    @Deane Yang: With all this bad teaching, how can you hope to get a good education in math?! – The_Sympathizer Oct 23 '14 at 13:12
  • Depends on what you mean. If you need it for your work, you'll learn it properly through what you do. That's always the best way, because you understand why it's needed. If you just want to learn it, you succeed either by having a really good teacher or studying it on your own (and not being satisfied until you understand it inside out). – Deane Yang Oct 23 '14 at 13:41
  • I've read through the whole dialog in comments about the sine function, and I'm still mystified. Among the mathematicians who got it wrong, what was the wrong answer that they gave? What was the reasoning that they offered? –  Dec 22 '14 at 15:39
  • 5
    Ben, the ones who got it wrong just guessed that the derivative formulas would be the same for degrees as radians. The fact that they guessed instead of making some effort to work it out was in fact what flabbergasted me. I would also note that many (maybe most?) mathematicians are uncertain about whether the constant is $\pi/180$ or $180/\pi$. This is not such a serious issue, but it bothers me that any engineer or physicist would be able to answer instantly. – Deane Yang Dec 22 '14 at 17:04
  • The use of the term sensitivity for the derivative is incredibly silly to me, especially for physics majors. The natural introduction is through the Newton quotient, velocity, and trajectories of balls. Then generalize to limiting rates or ratios of changes of other quantities. – Tom Copeland Mar 22 '16 at 05:56
  • I view sensitivity as a way to indicate clearly why the derivative is a useful tool for analyzing functions outside the context of the motion of objects.. Also, I don't think the terminology is the crucial issue. I believe students stumble here, because we have not ensured that they have a working (and not just a formal) understanding of what a function is. – Deane Yang Mar 22 '16 at 20:33
  • Students have an implicit, working understanding of functions. They know that given any group of people they can assign a unique height (mod units in feet and inches or centimeters) to each person but given a height they might not be able to assign only one person to it. If they are confused, it can only be by an obfuscating formalism. – Tom Copeland Mar 22 '16 at 20:50
  • That's not my experience at all. Here is a simple example from finance: The price of an option on a stock is a function of the stock price. Suppose that when the stock price is 100, the derivative of the option price with respect to the stock price is 2. Suppose today's stock price moved by $0.20. Estimate the change of the option price. Will a student say immediately that the option price changes by $0.40? Or will they try to figure out what rule or formula they should use to figure this out? – Deane Yang Mar 22 '16 at 21:04
  • I can only suppose that you would ensure that they think of the problem geometrically in terms of a graph of the dependent and independent variables of suitably averaged prices, which would be the natural line of thinking if one started from the physics approach first. I can appreciate how difficult it might be to come up with a derivative from the real data that would have predictive value. What should be the window for the time-averaging, ..., etc.? – Tom Copeland Mar 22 '16 at 21:46
  • I would hope that anyone who has learned calculus properly would know immediately (without having to think about the graph) that, under reasonable circumstances, the option price would change by approximately $0.40. In fact, option traders, including those who have no formal training in calculus, do use the derivative in exactly this way. They even know that the second derivative tells them whether the estimate is an over- or underestimate. And they understand that there are extreme circumstances where the estimate is not useful at all. This is what I mean by working knowledge of calculus. – Deane Yang Mar 23 '16 at 18:44
  • So they have at least an implicit knowledge (and most likely explicit knowledge from books on quantitative vs. fundamental analysis of the market) of when and how to fit a curve (a function) to sections of massaged data, in agreement with my first point. Btw, I taught freshman calculus and physics without calculus during graduate studies at an engineering university. No motivated student had a problem with the courses, in particular with concepts involving ratios of changes. – Tom Copeland Mar 24 '16 at 14:17
  • No. The traders just know that there is a black box called the Black-Scholes formula (or more sophisticated variants) that, given a volatility and stock price, spits out the option price. They want to hedge their position by shorting the right amount of stock that will offset any changes to the option price. The derivative tells them the right "hedge ratio". It's all very simple and uses nothing but the basic concept of a derivative as a sensitivity. This simple view is useful in many other contexts. So I believe it's a shame to flood students with so many other things but not this. – Deane Yang Mar 24 '16 at 19:23
  • 1
    By the way, viewing a derivative as a "sensitivity" is nothing but a way to describe the linear approximation of a function. But I believe it's a way that makes it easier to understand and use the approximation than the usual formula we teach. – Deane Yang Mar 24 '16 at 19:25
  • 1
    @TomCopeland, students have a rough-and-ready understanding of concepts that mathematicians would recognise as applications of functions, but they don't understand functions themselves! As one illustration of this, witness the difficulty faced by students who, where $x(t)$ gives the displacement of a particle from the origin after $t$ time units, are asked to distinguish between the graph of $x$ versus $t$ and the trajectory of the particle. – LSpice Mar 07 '20 at 22:04
  • 1
    @Lspice, after coming back to America and getting some experience with 11 and 12-th graders studying calc, I understood how even functional notation is difficult for many, maybe most, students to handle. I attribute it to doctrinaire, opportunistic admin and overtaxed teachers who follow texts more or less blindly and promote students rather than deal with toxic parents ... . Don't get me started on the dumbing-down of America by big business (as educ. has become) and big gov't. E.g., witness the authors for Sci. Am. 30 years ago (researchers) compared to now (professional science writers). – Tom Copeland Mar 08 '20 at 13:51
  • 1
    I'm not a professional mathematician but I majored in math and the sin problem is completely easy to me. After all, I learned calculus from Hughes-Hallett! – Thierry Oct 03 '21 at 19:56
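
For readers following the degrees-versus-radians exchange in the comments above, here is a minimal numerical sketch of the constant in question. It assumes that "sine of $x$ degrees" means $\sin(\pi x/180)$; the helper names `sin_deg` and `difference_quotient` are hypothetical and chosen only for illustration. The chain rule then suggests a derivative of $\frac{\pi}{180}\cos(\pi x/180)$, and a small-$h$ difference quotient can be compared against that prediction.

```python
import math

def sin_deg(x):
    """Sine of an angle given in degrees (assumed meaning of 'sin of x degrees')."""
    return math.sin(math.radians(x))

def difference_quotient(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the h -> 0 limit."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 30.0
numeric = difference_quotient(sin_deg, x)
# Chain-rule prediction: the factor is pi/180, not 180/pi.
chain_rule = (math.pi / 180) * math.cos(math.radians(x))
print(numeric, chain_rule)
```

The two printed values should agree to many digits, while both differ markedly from $\cos(30^\circ)$, which is what the radian formula would give if applied without the conversion factor.
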
98

I'm teaching Calc 1 this semester, and I've stumbled onto something that I like very much.

First of all, I start (always) by having my students draw bunches of tangent lines to graphs, compute slopes and draw the "slope graphs" (they also do "area graphs", but that's not relevant to this answer). They build up a bit of intuition about slope and slope graphs.

Then (after a few days of this) I ask them to give me unambiguous instructions about how to draw a tangent line. They find, of course, that they are stumped.

In the past, I went from this to saying "we can't get a tangent line, but maybe we can get an approximately tangent line" and develop the limit formula.

This semester, I said, "we have an intuitive notion of tangency; suppose someone offered a definition of tangency -- what properties would it satisfy?" We had a discussion with the following result: tangency at point $x = a$ should satisfy:

  1. tangency (of one function with another) should be an equivalence relation
  2. if two linear functions are tangent at $x= a$, they are equal.
  3. a quadratic has a horizontal tangent line at its vertex.
  4. if $f$ and $g$ are tangent at $x = a$, then $f(a) = g(a)$.
  5. if $f_1$ is tangent to $f_2$ at $x = a$ and $g_1$ is tangent to $g_2$ at $x = a$ then $f_1 + g_1$ is tangent to $f_2 + g_2$ at $x = a$ and similarly for the products.
  6. the evident rule for composition.

Using these rules, we showed that if $f$ has a tangent line at $x = a$, it has only one. So we can define $f'(a)$ to be the slope of the tangent line at $x = a$, if it exists!

The axioms are enough to prove the product rule, the sum rule and the chain rule. So we get derivatives of all polynomials, etc., assuming only that tangency can be defined.
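
As an illustration, here is a sketch of one way the product rule might follow from the axioms (the group-project write-up is not reproduced here, so the details are an assumption). Write $\ell_f(x) = f(a) + f'(a)(x-a)$ and $\ell_g(x) = g(a) + g'(a)(x-a)$ for the tangent lines to $f$ and $g$ at $x = a$. By axiom 5, $fg$ is tangent to $\ell_f \ell_g$ at $x = a$, and

$$\ell_f(x)\,\ell_g(x) = f(a)g(a) + \bigl(f'(a)g(a) + f(a)g'(a)\bigr)(x-a) + f'(a)g'(a)\,(x-a)^2.$$

By axioms 3 and 4, $(x-a)^2$ is tangent to the zero function at $x = a$, so axiom 5 (together with reflexivity) lets us drop the quadratic term without changing the tangency class. Transitivity then says that $fg$ is tangent to the line $f(a)g(a) + \bigl(f'(a)g(a) + f(a)g'(a)\bigr)(x-a)$, whose slope is the usual product rule.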

Then (limits having presented themselves in the computation of area) I defined $f$ to be tangent to $g$ if $\lim_{x\to a} {f(x) - g(x) \over x-a} = 0$. We derive the limit formula for the derivative, and check the axioms.
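
A sketch of how the limit formula might come out of this definition, taking the candidate tangent line to be $\ell(x) = f(a) + m(x-a)$ (it must pass through $(a, f(a))$ by axiom 4): $f$ is tangent to $\ell$ at $x = a$ exactly when

$$\lim_{x\to a} \frac{f(x) - f(a) - m(x-a)}{x-a} = 0 \quad\Longleftrightarrow\quad \lim_{x\to a} \frac{f(x) - f(a)}{x-a} = m,$$

so the slope $m = f'(a)$ of the tangent line, when it exists, is the familiar limit of difference quotients.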

EDIT: Here's some more detail, in case you're wondering about implementing this yourself. I had the initial discussion about tangency in class, writing on the board. A day or so later, I handed out group projects in which the axioms were clearly stated and numbered, and the basic properties (as outlined above) given as problems.

The students' initial impulse is to argue from common sense, but I insisted on argument directly from the axioms. There was one day that was kind of uncomfortable, because that is very unfamiliar thinking. I had them work in class several days, and eventually they really took to it.

Jeff Strom
  • 12,468
  • 25
    This is very nice. – Deane Yang Sep 27 '10 at 14:20
  • 1
    I second Deane's comment. – Mark Meckes Sep 27 '10 at 17:12
  • 1
    Is there really a unique equivalence relation satisfying these rules? I do not see how these rules could ever access a function which is not a polynomial. If not, saying that you can define f'(a) to be the slope of the tangent line at x=a presupposes that you have chosen one of the many equivalence relations which satisfy these properties. – Steven Gubkin Nov 03 '10 at 15:41
  • 1
    So I see that I can get more than polynomials: By using the chain rule and the product rule I can actually get any algebraic function. But I still do not see how to get any transcendental function. – Steven Gubkin Nov 04 '10 at 03:01
  • 6
    See this MO question: http://mathoverflow.net/questions/44774/do-these-properties-characterize-differentiation/44782#44782 – Steven Gubkin Nov 04 '10 at 14:23
  • I don't claim these properties fully characterize differentiation. I've toyed with adding a "uniform convergence" type axiom, but not for Calculus 1 purposes. Unfortunately, all of my analysis knowledge is half-remembered, so I'd have to sit down and think about this, and I have not found the time yet. – Jeff Strom Nov 04 '10 at 18:12
  • 6
    @Jeff: The MO question I linked to shows that for C^\infty functions, these axioms do characterize differentiation. – Steven Gubkin Nov 11 '10 at 15:21
52

I'm going to answer this part:

does anyone out there actually use this definition to calculate a derivative that couldn't be obtained by a known symbolic rule?

Yes. $\sin(x)$.

My point is that of course we can just learn the derivative of this function, but then we could just learn the derivative of any function. So looking for a "complicated function" that needs the limit definition is pointless: we could just extend our list of examples to include this function. It's a bit like the complaint that there's no closed form for a generic elliptic integral: all we really mean is that we haven't given it a name yet.

In fact, one could do $x^2$ like this, or even $x$, but I think that $\sin(x)$ has a good pedagogical value. If you can get them first to ponder the question, "What is $\sin(x)$?" then it might work. I'm teaching a course at the moment where I'm trying to get my students out of the "black box" mentality and start thinking about how one builds those black boxes in the first place. One of my starting points was "What is $\sin(x)$?". Or more precisely, "What is $\sin(1)$?". If you take that question, it can lead you to all sorts of interesting places: polynomial approximation of continuous functions, for example, and thence to Weierstrass' approximation theorem.
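
To make the "What is $\sin(1)$?" question concrete, here is a minimal sketch of one polynomial approximation, assuming the Maclaurin series $x - x^3/3! + x^5/5! - \cdots$ is the approximation chosen (other schemes would serve equally well). The function name `sin_taylor` is hypothetical, and the library value `math.sin(1.0)` appears only as a reference for comparison.

```python
import math

def sin_taylor(x, terms):
    """Partial sum of x - x^3/3! + x^5/5! - ... using the given number of terms."""
    total = 0.0
    term = x  # first term of the series, x^1 / 1!
    for k in range(terms):
        total += term
        # Next term: multiply by -x^2 / ((2k+2)(2k+3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

for n in range(1, 6):
    approx = sin_taylor(1.0, n)
    print(n, approx, abs(approx - math.sin(1.0)))
```

Each extra term shrinks the error quickly, which opens the black box a little: the number $\sin(1)$ is something one can compute to any desired accuracy with nothing but arithmetic.
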

Many students will just want the rules. But if the students refuse to learn, that's their problem. My job is to provide them with an environment in which they can learn. Of course, I should ensure that what they are trying to learn is within their grasp, but they have to choose to grasp it. So I'm not going to give them a full exposition on the deep issues involving the ZF axioms if all I want is for them to have a vague idea of a "set" and a "function", but I am going to ensure that what I say is true (or at the least is clearly flagged as a convenient lie).

Here's a quote from Picasso (of all people) on teaching:

So how do you go about teaching them something new? By mixing what they know with what they don't know. Then, when they see vaguely in their fog something they recognise, they think, "Ah, I know that." And it's just one more step to, "Ah, I know the whole thing.". And their mind thrusts forward into the unknown and they begin to recognise what they didn't know before and they increase their powers of understanding.

We all remember professors who forgot to mix the new in with the old and presented the new as completely new. We must also avoid the other extreme: that of not mixing in any new things and simply presenting the old with a new gloss of paint.

Andrew Stacey
  • 26,373
  • 13
    +1 for "if the students refuse to learn, that's their problem. My job is to provide them with an environment in which they can learn." – Mark Meckes Sep 27 '10 at 14:44
  • 22
    @Mark and Sean, I have to admit that I'm a little put off by the cheerleading for this particular phrase -- stripped from the context Andrew provided it, it comes across rather as, "I don't do a bad job of teaching, my students do a bad job of learning." I think this is an attitude those of us who teach should be careful to avoid, in general. (Of course, everyone who has ever taught has come across specific cases where it might be applied.) – JBL Sep 28 '10 at 13:12
  • 3
    @JBL: point taken, although I disagree with your restatement. "My job is to provide them with an environment in which they can learn" is the sentiment of someone who takes doing that job well seriously. I really liked the line because of several recent conversations about students who don't take notes, skip class, rush through homework, and don't ask questions. Such students are the exception rather than the rule, but they can get under one's skin. For times like that, I thought Andrew's line would be a good substitute for the glib old saw about a horse and water. – Mark Meckes Sep 28 '10 at 13:57
  • 2
    @Mark: Yes, I didn't think that either you or Sean agreed with my rephrasing, just that it seems to me that this statement (in isolation) has a little of that ring to it. Students who behave as you describe are extremely irritating, but I think teachers would do well to avoid sounding like we think this is the norm :) – JBL Sep 28 '10 at 15:06
  • @JBL: I agree with your underlying point, but disagree with the point you actually made. I think that Mark quoted enough context by quoting the second sentence. I also disagree with your rephrasing because my original statement said, "If the students ..." whereas your rephrasing seems more like "When the students ...". There's loads more to say on this, but MO is (thankfully) a lousy place to say it. So I will content myself with saying that I know Mark and I know that he takes teaching very seriously so I see his "cheerleading" of that sentence coming from the best possible motives. – Andrew Stacey Sep 28 '10 at 16:07
  • But the question is not whether the students refuse to learn but whether the environment that we are providing is good. Explaining an intuitive concept with a complicated definition is not conducive to that good environment. – Toby Bartels Apr 03 '11 at 19:51
42

While I think that, ideally, students even in a freshman calculus course should receive some historical notions about the development of the ideas of infinitesimal calculus, I also think that the true definition of the derivative of a function should be given there, that is, via the first-order approximation. A function $f:(a,b)\to\mathbb{R}$ is differentiable at $x$ if there exists $m$ such that $$f(x+h)=f(x)+mh+o(h)\quad \text{as } h\to 0.$$

The fact that the coefficient $m$ (the derivative) can be characterized, and sometimes efficiently computed, as a limit of a quotient, has certainly to be observed, and should be applied immediately to treat some elementary functions like $x^2$, $1/x$ or $e^x$, as usual. But I would never give it as a definition.
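
As a short worked illustration of the first-order definition (a sketch for two of the functions just mentioned), the coefficient $m$ can be read off directly from an expansion:

$$(x+h)^2 = x^2 + 2x\,h + h^2, \qquad \frac{1}{x+h} = \frac{1}{x} - \frac{1}{x^2}\,h + \frac{h^2}{x^2(x+h)} \quad (x \neq 0).$$

In each case the last term divided by $h$ tends to $0$ as $h \to 0$, so it is $o(h)$, and the coefficients of $h$ give $m = 2x$ and $m = -1/x^2$ respectively.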

I think there is a philosophical issue here. It may seem simpler to define something as the result of a procedure for getting it than to define it via a characteristic property. But the latter way is superior and, in the long run, simpler. And in the case of students who will end their mathematical education there, I prefer that they at least see the true idea behind it, rather than being able to compute the derivative of $\cos(e^x)$: when will that be of use to them?

The definition via first-order expansion is very natural, and more understandable to freshman students. It has a more direct geometrical meaning. It reflects the physical idea of linearity of small increments (as in Hooke's law of elasticity, etc.). It is much closer to the practical use of derivatives in approximations. It makes all the elementary theorems of calculus easier (consider how needlessly complicated the proof of the theorem for the derivative of a composition becomes when a useless quotient is introduced). Finally, it is closer to the generalization to the Fréchet differential, which is a good thing for those students who will continue their studies in mathematics.

A funny remark, from my experience. Ask students who received the definition of the derivative as the limit of the incremental quotient to compute $\lim_{x\to 0 }\sin(x)/x$. Will anybody say that it's the derivative of $\sin(x)$ at $0$, that is, $\cos(0)=1$? No, they will try to use the "rule of de L'Hopital"!

Pietro Majer
  • 56,550
  • 4
  • 116
  • 260
  • +1. This reminds me of the following question: without making use of the fact that $\lim_{x\to 0}\frac{\sin x}{x}=1$, are there any other ways of developing the formulae for the derivatives of the trigonometric functions? It seems there's no way to escape this (which makes it even funnier that people actually use l'Hôpital for this). – J. M. isn't a mathematician Sep 27 '10 at 12:40
  • 26
    From my prof back in the days: "Some of you might have heard of a thing called L'Hopital's rule. It has a lot of hypotheses that no-one ever checks, and students always apply it when the quotient is in the wrong form, so I won't teach it and you'd better not use it." And now I do the same... (I didn't mind when he said that because I was one of the ones who'd never heard of it.) – Thierry Zell Sep 27 '10 at 12:42
  • 1
    Yes! I think that introducing the differential of a function of several variables would be much easier for students if they had this point of view on the derivative. – Benoît Kloeckner Sep 27 '10 at 15:57
  • 1
    @Thierry If I was chairman of the mathematics department when your professor was teaching and I'd heard that, he'd have been fired on the spot unless he had tenure, and if he had tenure, I'd have made sure he never taught it again. He should have been ashamed of himself. Really. If you think this material is usually presented so lousily, do something about it! – The Mathemagician Sep 27 '10 at 16:53
  • J.M. - If you can make them believe that $e^{ix} = \cos(x) + i \sin(x)$ then it is pretty easy once they know the derivative of $e^x$. But that's a pretty tall order there – Simon Rose Sep 27 '10 at 17:17
  • 9
    @Andrew: and if I was the chairman in another maths department, I'd immediately engage him with double salary. De l'Hôpital himself would be embarrassed to know somebody's still wasting time with such an awkward theorem as the one that bears his name. Theorems, like cakes, don't always come out well; that thing came out very badly, and left a mess in the oven. Today, it may be at most of some historical interest. Teach the Landau notation instead! (Btw, as you probably know, Edmund Landau was fired from Göttingen in 1933, with the pretext of his way of teaching a calculus course.) – Pietro Majer Sep 27 '10 at 19:13
  • 16
    @Pietro: what you give is essentially Caratheodory's definition, as alluded to in my answer. It's so close to the usual definition that I don't really believe that students have a significantly easier time with it. However, I believe that when you teach calculus, this definition inspires you and you do a very good job teaching it, more so than you would with the standard definition. I suspect that most "the students find it easier when..." statements are like this, but that's fine -- finding the version that you can get behind enthusiastically and explain well is part of good teaching. – Pete L. Clark Sep 27 '10 at 19:49
  • @Andrew: the prof's point though was that you never need l'Hopital's rule. – Thierry Zell Sep 27 '10 at 20:18
  • 1
    @Pete: This is a good observation, and I certainly agree with it. There's also a vague and soft irony in it, that at the moment I'm not able to answer otherwise than upvoting your comment. – Pietro Majer Sep 27 '10 at 20:34
  • 5
    About that funny remark: even saying $\lim_{x\to 0} \sin(x)/x$ is the derivative of $\sin(x)$ at 0 may be viewed as cheating, since the typical textbook approach is to use a geometric argument to prove $\lim_{x\to 0} \sin(x)/x = 1$ and then use that limit to prove that $\frac{d}{dx} \sin(x) = \cos(x)$. – Mark Meckes Sep 28 '10 at 14:27
  • 1
    Without L'Hôpital's Rule, how do your students evaluate limits such as $\lim_{x \searrow 0} x^x$ (or $\lim_{x \searrow 0} x \ln x$)? Not that I can't think of a way to do it, but I'd like to know what you (those who wouldn't teach the rule) would suggest to their students as the method of attack. – Toby Bartels Apr 03 '11 at 23:02
  • 2
    Well for $\lim_{x\to0+} x\log x$ I guess they would make a substitution $x=e^{-t}$ to get $\lim_{t\to+\infty} -te^{-t}=0$ (and for the latter, they know how to use inequalities such as $e^t\ge t^2/2$ for $t>0$ to conclude). – Pietro Majer Apr 04 '11 at 06:29
  • @Mark Meckes: I don't think so; actually what I'm saying is that applying de l'Hopital's rule to compute that limit would be cheating. Recognising that this limit is the derivative of $\sin(x)$ at $0$ is just calling a thing by its name, which is quite a fair fact, especially if one already knows the name. But I agree that one should also keep in mind the way these identities were introduced, to avoid circular or redundant arguments. – Pietro Majer Sep 15 '13 at 11:09
37

I agree with the above comments.

The point of my comment-question "What competing definition do you have in mind?" was to emphasize something that seems to be under-emphasized in the question itself: the reason we speak of derivatives as limits is because that's the definition of the derivative, and we want to give a definition of the concept that is going to be discussed for much of the semester.

[It is possible to give other definitions of a derivative, but they are all variations on the same theme and, in particular, all use either the concept of limit or the (equivalent!) concept of continuity. For instance, Caratheodory has a nice definition of the derivative in terms of functions vanishing to first order, but this is not going to be any more palatable to the freshman calculus student.]

[Added: I admit that I forgot about nonstandard analysis when I wrote the above paragraph. That indeed has a somewhat different feel from the usual limits and continuity. On the one hand, although I have never taught calculus this way, I rather doubt that doing so would suddenly make the difficult concepts of continuity and differentiability go over easily. On the other hand, I certainly couldn't decide to teach a nonstandard approach to calculus because it would be...nonstandard. The curriculum among different sections, different classes and different departments has to have a certain minimal level of coherence, and at the moment the majority of the grad students and faculty in every math department I have ever seen are not familiar enough with nonstandard analysis to field questions from students who have learned calculus by this approach.]

If we don't give a definition of the most important concept in the course, then we lose all pretense of developing things in a logical sequence. In particular, it's hard to see how to discuss the derivations of any of the basic rules the students will actually be using to compute derivatives, and thus we would be forced to reduce calculus to a (long!) list of algorithms based on certain unexplained rules.

Nevertheless I take your question seriously, since I have taught a fair amount of freshman calculus in recent years. It is absolutely correct that a lot of students get impatient, angry and/or confused at the limit definition of the derivative (or really, at anything having to do with limits and/or continuity). I do derivations of things like the product rule and the power rule rather quickly in class, because I know that something like half the class isn't following and doesn't care to follow. And yet I do them anyway (not all of them, but more than half) because, to me, not to do them makes the course something I could not bring myself to teach (and, by the way, would put it well below the level of the AP calculus class I had in high school: I feel honor-bound to give to my calculus students not too much less than was given to me). Thus there is a real disconnect between the calculus class that I want to teach and the calculus class that something like half of the students want to take. It's discouraging.

I would be happy to hear that I am making a false dichotomy between giving the limit definition of the derivative and just giving algorithms to solve problems. I definitely experiment with different kinds of explanation beyond (and instead of!) just a formal proof. Here are some things I have tried:

1) Take the definition of continuity as primary, and define the limit of a function at a point as the value at which one can (re)define the function to make it continuous. I think this should be helpful, since I think most people have an intuitive idea of a "continuous, unbroken curve" and much less of the limit of a function at a point.

2) Emphasize physical reasoning. The last time I taught freshman calculus, I spent the entire first day talking about velocities: first average velocity, then instantaneous velocity. If a differentiation rule has a plausible physical interpretation -- e.g. the chain rule says that rates of change should multiply -- then I often give it.

3) Emphasize "chemical reasoning", i.e., dimensional analysis. I often give the independent variable and the dependent variable units and emphasize that the units of the derivative are different from the units of the original function. In this way one can see that the conjectured product rule $(fg)' = f'g'$ is dimensionally wrong and thus nonsense. (And again, the chain rule is "obvious" from a unit conversion perspective.) Similarly dimensional analysis should stop you from saying that the volume of a cylinder is $\pi rh$.
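
To spell out the unit check in item 3 with hypothetical units chosen purely for illustration, suppose $x$ is measured in seconds, $f$ in meters and $g$ in kilograms. Then

$$[(fg)'] = \frac{\mathrm{m}\cdot\mathrm{kg}}{\mathrm{s}}, \qquad [f'g'] = \frac{\mathrm{m}}{\mathrm{s}}\cdot\frac{\mathrm{kg}}{\mathrm{s}} = \frac{\mathrm{m}\cdot\mathrm{kg}}{\mathrm{s}^2}, \qquad [f'g + fg'] = \frac{\mathrm{m}}{\mathrm{s}}\cdot\mathrm{kg} + \mathrm{m}\cdot\frac{\mathrm{kg}}{\mathrm{s}} = \frac{\mathrm{m}\cdot\mathrm{kg}}{\mathrm{s}}.$$

The conjectured rule has one factor of seconds too many in the denominator, while the correct rule matches. In the same way, $\pi r h$ has units of area rather than volume.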

Unfortunately none of these things have worked with the portion of the class that doesn't want to hear anything but how to solve the problems.

Added: To more directly address your specific question: yes, there are problems one can ask of freshman calculus students which require them to use the limit definition of the derivative rather than (just) the differentiation rules, but I do not recommend asking many of these questions, since the students find them very difficult. A personal example: when I was teaching Math 1A (first semester calculus) as a graduate student at Harvard, we had communal exams but the course head (who was a tenured professor of mathematics, hence a very brilliant person) had the final say. On the first exam, we decided that one of the questions was too hard, so at the last minute the course head replaced it with the following one (which he did not show to us):

Consider the function $f(x)$ defined as $x^a \sin(\frac{1}{x^2})$ for $x \neq 0$ and $f(0) = 0$. What is the smallest integer value of $a$ such that $f$ is (i) continuous, (ii) differentiable, (iii) twice differentiable?

I had the good fortune to grade this problem. Out of $200$ or so exams, the median score was $0.5$ out of $12$. About three students wrote down the right numerical answer for part (iii), but this was not supported by any work or reasoning whatsoever.
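
For reference, here is a sketch of how the limit definition settles the three parts, assuming "twice differentiable" means that $f''$ exists at every point, including $0$. Away from $0$ the usual rules apply, giving $f'(x) = a x^{a-1}\sin(1/x^2) - 2x^{a-3}\cos(1/x^2)$, so everything hinges on the quotients at $0$:

$$\frac{f(h) - f(0)}{h} = h^{a-1}\sin\left(\frac{1}{h^2}\right), \qquad \frac{f'(h) - f'(0)}{h} = a\,h^{a-2}\sin\left(\frac{1}{h^2}\right) - 2h^{a-4}\cos\left(\frac{1}{h^2}\right).$$

Continuity at $0$ requires $x^a \sin(1/x^2) \to 0$, which holds exactly when $a \geq 1$. The first quotient tends to $0$ exactly when $a \geq 2$, giving $f'(0) = 0$. The second quotient has a limit exactly when $a \geq 5$, because for $a = 4$ the term $-2\cos(1/h^2)$ keeps oscillating and for smaller $a$ it is unbounded. Under these assumptions the answers would be $a = 1$, $a = 2$ and $a = 5$.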

Added: by the way, it's not as though the above question is "bad" in the sense that it's not testing mathematical competence and depth of understanding of calculus. I think it absolutely is, just at a level way above that which one should be testing in a freshman class for non math majors. For the next few years, when the story came up in a social setting involving mathematical hotshots, after telling it I would press them for an answer to part (iii) on the spot. Most people I asked did not get it. (Note that I would not of course give them pen and paper and a quiet spot to think about the problem for some period of time. I generally required an answer after a minute or so. Let's hold PhD mathematicians to higher standards than freshman non-majors after all!) For instance, I watched a cloud pass over one Fields Medalist's face as he got very confused. After a while though I stopped using this as a pop quiz in addition to a story: I can't explicitly remember why, but I'd like to think it dawned on me how obnoxious it was to put people on the spot like that...

Pete L. Clark
  • 64,763
  • Thanks for the answer and sharing your experiences with this. Of course as a mathematician I understand that we give the definition because otherwise we have no logical foundation to work with, but I also understand that there is a huge disconnect between the mathematician's mentality and what students expect out of a freshman course. As a young person I really know nothing about good teaching so it's good to hear about what kind of balance is possible. – Steven Sam Sep 27 '10 at 06:18
  • 4
    +1. Differentiation can be done without limits too, http://en.wikipedia.org/wiki/Formal_derivative I interpreted the question as distinguishing between derivatives in analysis and "generating functions". Even most trigonometric functions have combinatorial meaning and so their derivatives can be computed formally. But as you say that misses the point of calculus (continuity, physical reasoning etc.). – Gjergji Zaimi Sep 27 '10 at 08:37
  • 3
    The freshman-level example is especially biased because students tend to believe that any problem with two letters in it is very hard. But I've had a reasonable degree of success with a problem of this type, using a value for a (disclaimer: at a good school, though no Harvard). But it's also because I'd spent some time on this in class; you can't spring this on students out of the blue like that professor did and expect they'll do well. – Thierry Zell Sep 27 '10 at 11:41
  • 2
    Teaching undergraduate calculus using nonstandard analysis is not out of the question. I haven't done it but I know others who have, using for example Henle and Kleinberg's Infinitesimal Calculus. – Timothy Chow Sep 27 '10 at 14:50
  • Was your «who was a tenured professor of mathematics, hence a very brilliant person» parenthesis ironic? :P – Mariano Suárez-Álvarez Sep 27 '10 at 16:55
  • 1
    I really like the unit analysis example. I might keep that in mind next time I teach. – Simon Rose Sep 27 '10 at 17:12
  • 5
    @Mariano: no, I was dead serious. The point was that this person was far too bright to realize that this was a ridiculously hard question for freshman calculus. – Pete L. Clark Sep 27 '10 at 19:33
  • 2
    @Thierry: I like your quote of "any problem with two letters in it is too hard." – dvitek Sep 27 '10 at 20:36
  • 6
    A much simpler version of your problem is the following (I actually used this problem in the past):

    Consider $f(x)= x^2 \sin( \frac{1}{x})$ for $x \neq 0$ and $f(0)=0$.

    Find $f'(0)$.

    This function is differentiable at zero but $f'$ is not even continuous at 0 (so no power series representation), so I doubt that any other approach than the limit would work.

    I think this is one of the simplest examples which explains why the limit is needed.

    Also, what if one discovers a completely new great function $f$, how does one find $f'$?

    – Nick S Sep 27 '10 at 22:45
  • 1
    To belatedly reply to Timothy Chow: when I said that I couldn't teach a nonstandard calculus course if I wanted to, what I meant was that I alone couldn't do it. It would have to be a decision made at the departmental level, and I don't think it would be an easy sell. – Pete L. Clark Nov 06 '10 at 18:05
27

I wanted to add one further point to the many good answers already given here: "black box" symbolic computation, in the absence of understanding the formal definitions, can work when everything goes right, but is very unstable with respect to student errors (which are sadly all too common). Knowledge of definitions provides a crucial extra layer of defence against such errors. (Of course, it is not the only such layer; for instance, good mathematical or physical intuition and conceptual understanding are also very important layers of defence, as is knowledge of key examples. But it is a key layer in situations which are too foreign, complicated, or subtle for intuition or experience to be a good guide.)

For instance, without knowing the formal definition of the derivative, a student could very easily start with a true formula such as

$$ (x^2)' = 2x$$

and do something like "substitute x=3" to obtain the false formula

$$ (9)' = 6.$$

(An example that I have actually seen: someone attempted to prove Fermat's last theorem by starting with the equation

$$ a^n + b^n = c^n$$

and then differentiating with respect to $n$. Ironically, a variant of this type of trick actually works when solving FLT over polynomial rings, but that's another story...)

Now, without bringing in the definition of a derivative (and of a function), how could you explain to the student what went wrong here in a way that the student will actually remember? Saying that one can use the law of substitution or the trick of differentiating both sides in some situations, but not in others, is likely to be recalled inaccurately, if at all (and may have the side effect that the student may view such basic moves as substitution as somehow being "suspect", thus avoiding it in the future).

Terry Tao
  • 2
    People say I'm mean for asking for the derivative of $\pi^2$, but I think it's a memorable example for the students. Another place the blind symbol manipulation goes wrong is on $\sin^{-1} x = \arcsin x$. Many students are willing to assume that if $f(x)=g(x)$ then $f'(x)=g'(x)$, which is only a consequence for some meanings of the first equation. – Douglas Zare Nov 06 '10 at 16:43
  • 2
    @Terry: I agree that the black-box use is an issue, but some might argue that it's possible to fix without necessarily going all the way to a limit definition, because the root problem is conceptual understanding of functions, and more precisely of the derivative as a function. A simple aphorism would suffice: "chug then plug, don't plug and chug!" More seriously, one could discuss the difference between the derivative as a function and the derived number at a point (slope) purely graphically, with no references to limits. – Thierry Zell Nov 07 '10 at 00:01
  • 4
    @Douglas: I didn't think the derivative of $\pi^2$ was a mean thing to ask! Who thinks so? students? colleagues? Never mind, I'll be sure to borrow it for next time. (Even meaner would be to ask for the derivative of $e^2$, btw.) As for the blind symbol manipulation, I cannot believe that it took me all these years before seeing for the first time (in an exam) that the derivative of $\arctan x$ was $\mathrm{arcsec}^2 x$. In retrospect, I should have been expecting this for a long time! – Thierry Zell Nov 07 '10 at 00:09
  • This didn't stand out among the students' complaints. Other instructors flagged differentiating $\pi^2$ as a trick question, and I see their point. I still like the question, but I moved it from the quiz to classroom discussion, and revisited it as an example of the chain rule. – Douglas Zare Nov 07 '10 at 08:41
  • 1
    I think this type of error is avoidable if you introduce $x$ not as a variable, but as a special symbol for identity function. Even more, you can use bold $\mathbf{x}$ for the identity function and normal $x$ for a value. – Anixx Jan 02 '11 at 18:35
  • 23
    When my father was a judge at a high school math fair, a student gave a presentation on calculus. During the question period after the presentation, he asked the student "If f(x) = 3^2, what is the derivative of f(x)?" The student said "6". My father then asked "If f(x) = 9, what is the derivative of f(x)?" The student said "0". He asked the first question again and the student still said "6". Of course this student did not go on to the next round of the math fair. – KConrad Jan 23 '11 at 20:31
  • How about ‘Don't use substitution in formulas that don't really make sense?’. Of course, you're right that people need to be taught what things mean so that they know what makes sense, but why do we use bastard notation like ‘$(x^2)' = 2x$’ in the first place? Substitute into $\mathrm{d}x^2 = 2x\,\mathrm{d}x$, and the result is true. – Toby Bartels Apr 03 '11 at 20:14
21

The derivative of $x|x|$ is best computed via the "limit" definition. A more general example is $xf(x)$ where $f$ is any continuous function, and we are computing the derivative at $x=0$.
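A minimal numerical sketch of the limit computation at $0$, assuming nothing beyond the difference quotient itself. The helper names and the sample choice $f(x) = |x| + 3$ are illustrative only, not from the answer. For $x|x|$ the quotient at $0$ is $|h|$, and for $xf(x)$ it is $f(h)$, so the limits are $0$ and $f(0)$ respectively.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h

def g(x):
    # The function from the answer: x|x|
    return x * abs(x)

def xf(x):
    # x * f(x) with the hypothetical continuous, non-differentiable f(x) = |x| + 3
    return x * (abs(x) + 3.0)

steps = [10.0 ** -k for k in range(1, 6)]

# Quotients at 0 from the right and from the left; both behave like |h|.
right = [difference_quotient(g, 0.0, h) for h in steps]
left = [difference_quotient(g, 0.0, -h) for h in steps]

# Quotients of x*f(x) at 0 behave like f(h), which tends to f(0) = 3.
general = [difference_quotient(xf, 0.0, h) for h in steps]

for h, r, l, q in zip(steps, right, left, general):
    print(f"h={h:.0e}  x|x|: {r:.6f} (right) {l:.6f} (left)   x*f(x): {q:.6f}")
```

Both one-sided quotients of $x|x|$ shrink with $h$, which is the whole argument: no product rule for $|x|$ at $0$ is needed.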

  • 7
    If you want example with nonzero derivative consider $x|x|+x$. – Igor Belegradek Sep 27 '10 at 17:12
  • 1
    For the more general example, one should also ask that $f$ is not differentiable at $0$. (By varying such $f$, we can get any derivative at all, or none. Taking $f(x) = x \sin(1/x)$, extended continuously, gives a very interesting example that's come up elsewhere on this page.) – Toby Bartels Apr 03 '11 at 23:08
  • This example can be done straightforwardly without limits, using infinitesimals (either informally or in NSA). –  Oct 06 '12 at 02:10
21

I am surprised that no answer has explicitly mentioned the fundamental theorem of calculus yet: that is a classic, and important, instance of calculating the derivative using the limit definition. So, for example, the integral sine function

$$\int_0^x \frac{\sin t}{t} dt $$

has important applications in signal processing and the cumulative distribution function of the normal distribution $N(a,\sigma^2)$

$$\frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^x e^{-\frac{(t-a)^2}{2\sigma^2}}dt$$

is the bread and butter of probability and statistics. Neither function is elementary, and their derivatives, while significant, would be impossible to calculate by other means.
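As a rough numerical illustration for the sine integral, one can approximate the integral by quadrature and compare its difference quotient with the integrand. This is only a sketch: the Simpson rule, the step count `n`, and the sample point `x = 1.5` are assumptions made here, not part of the answer.

```python
import math

def sinc(t):
    # Integrand sin(t)/t, extended by its limit value 1 at t = 0.
    return 1.0 if t == 0.0 else math.sin(t) / t

def si(x, n=2000):
    """Composite Simpson approximation of the integral of sin(t)/t from 0 to x (n even)."""
    step = x / n
    total = sinc(0.0) + sinc(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * sinc(i * step)
    return total * step / 3

x, h = 1.5, 1e-4
quotient = (si(x + h) - si(x)) / h   # difference quotient of the integral at x
integrand = sinc(x)                  # value the fundamental theorem predicts

print(quotient, integrand)
```

The two printed numbers should agree to several digits, even though no closed formula for the integral was ever used.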

I also disagree with the comment that piecewise defined functions "are not good at all" for illustrating the definition of the derivative based on limits. In fact, piecewise polynomial functions, in the form of splines, are used in mechanical engineering (e.g. to design the shape of the car body), and provide a neat opportunity to relate conceptual and computational aspects of derivatives.

  • 6
    +1, the guys who deal in splines always make a big deal out of left and right continuity for making the approximant $C^n$ (for whatever value of $n$ is needed by the application). I too feel that the error function and the sine integral are too important not to at least be given a passing mention in the context of the Fundamental Theorem. – J. M. isn't a mathematician Sep 28 '10 at 01:55
  • I agree with both points here. – Deane Yang Sep 28 '10 at 02:08
20

The way that Calculus is traditionally taught gives a false impression that every function worth looking at can be differentiated using the rules of differentiation. This comes from a misconception that any function worth looking at can be described by an algebraic formula, or using trigonometric or logarithmic functions.

That's just not the case: the most common everyday functions don't have any formulas. Some examples:

  1. Price of a company stock over several decades.
  2. Volume of water in a water tower over the course of a week
  3. Median price of a house in your area (adjusted for inflation), over the course of 100 years.
  4. US National Debt over the last two hundred years.
  5. US Deficit

For such functions, rate of change has a very real meaning. I find that students who had Calculus in high-school are stumped if I give them an example like that and ask them to graph the rate at which, say, the US national debt has changed throughout US history, and how that relates to the deficit.

Understanding the derivative as both rate of change and the slope of the tangent line helps, and the only good way to tie those concepts together is by using limits.

  • 3
    @Anna: I agree heartily. To take it a step further: I have often thought that the traditional calculus sequence lacks an applied component which severely limits its usefulness to those who are not going on to physics and math. As you say, when given a real world function of interest, you are generally not given an algebraic expression for it. Rather, in order to apply the methods of calculus in a quantitative way, there needs to be a step where you create a mathematical model of the function. I was never taught how to do this step myself, and it seems not to be at all trivial... – Pete L. Clark Jan 29 '11 at 22:38
  • 2
    Once or twice I tried to remedy this by asking exam questions like: "Write down an explicit function which has local minima at $\pm 2$ and approaches infinity as $x$ approaches $\pm\infty$." I was hoping that a student would see that a simple function satisfying these conditions is a fourth degree polynomial with double roots at $\pm 2$, thus $f(x) = (x-2)^2 (x+2)^2$. But they had a lot of trouble with this, and the answer to my question "Does teaching the standard curriculum give them the tools to answer this question?" was "No." But wouldn't it be great if students could do this? – Pete L. Clark Jan 29 '11 at 22:44
  • 2
    @Pete: I have been thinking a lot about how the calculus we teach is too neat and not applied enough these days, and I've been thinking about it not because of my calculus classes, but because of my higher-level courses where I've had to use "messy" or "exotic" stuff like Taylor expansions pretty often to motivate some avenues of investigation. I wish my students were more comfortable with this, and expected less of the neat closed-form formula results. Though of course, we should teach them to appreciate when closed-form stuff happens too! – Thierry Zell Mar 12 '11 at 01:25
  • 3
    I appreciate your point, but your examples are all discontinuous functions, so in fact the notion of a limit doesn't quite work. For example, the US deficit can only change in steps of one cent, the amount of water only in steps of one molecule. These could all be used as examples to show that the Cauchy-Weierstrass limit does not actually connect to reality. Perhaps a better point to make with these examples would be that students could benefit by understanding discrete/numerical calculus as well as the classical calculus of continuous real functions. –  Oct 06 '12 at 02:47
19

The definition of derivatives is useful in exercises about functional equations. Ever solve $f(x+y)=f(x)f(y)$ ? A more elaborate one is $[f(x):f(y):f(z):f(t)]=[x:y:z:t]$, functions preserving the cross-ratio (= anharmonic ratio).

However, we should not neglect the interest of the black-box side of mathematics. We should remember that it is this aspect which has made mathematics so unavoidable in Science. Somehow, it contributes to the "unreasonable effectiveness of Mathematics in the Natural Sciences" (E. Wigner's famous statement). After all, the definition of derivatives has the same status as the constructions of ${\mathbb Z},{\mathbb Q},{\mathbb R},{\mathbb C}$. One can spend a year without thinking about them, while using these fundamental objects every hour, by applying rules. Do you remember the construction of the polynomial algebra $k[X]$? How would you define $\pi$? In a more advanced situation, chemists have efficient rules to deal with characters of representations of finite groups, and they do not need to read a justification, or to remember it, even though the first Chapter of J.-P. Serre's book was intended to be read by his chemist wife. Mathematics is the tool box of Science. It is even a tool box for itself, in the sense that new topics use the older ones. To go further, we must accept older truths. Of course, it is way better to accept them for good reasons, that is, because we have completely understood the definitions. But if half of a classroom, who do not intend to do mathematical research, neglect the definition and prefer focusing on the rules, there is no problem at all, provided they apply the rules correctly. There are many ways to learn rules, one of them being solving a lot of exercises.

Denis Serre
  • 1
    +1. But, I think a point some other people have been making is that students should at least see the formal definition. But I agree that you shouldn't beat it over the head of students who aren't happy... – Matthew Daws Sep 27 '10 at 12:07
  • 5
    If scientists want their students to learn a sequence of formal rules, I think they should be the ones to teach them. Personally I want to teach mathematics, and mindless manipulation of symbols is not math. – Steven Gubkin Nov 11 '10 at 15:31
17

An example I like is $\exp(-\frac{1}{x^2})$ and the "bump functions" one can construct with it.

First of all, this example is important in differential geometry (e.g. Whitney's embedding theorem) and complex analysis (as an example of a real $C^\infty$ function which isn't holomorphic).

Second, even in first-year calculus it's an important illustration of the concept of derivative and of Taylor's theorem. It's important in my opinion to understand why all derivatives at zero are zero (i.e. because it goes to zero faster than any polynomial) but even so the function is changing values.

Pablo Lessa
  • 4
    The example is well-chosen, but your parenthesis sounds misleading to me :there exist functions that go to zero faster than any polynomial at zero, while they are not even twice derivable. – Benoît Kloeckner Sep 27 '10 at 15:53
  • 1
    You're right! Thank you. To be honest, I didn't consider this possibility at the time of writing. I'll leave the parenthesis as is, since it still has some content and your comment is right below. – Pablo Lessa Sep 27 '10 at 16:57
  • 1
    @ Benoi: Are there examples of such functions which are continuous in a neighborhood of 0? I ask because the only examples I can find are along the lines of the characteristic function of the rationals times exp(-1/x^2). – Steven Gubkin Nov 02 '10 at 22:59
  • 4
    Take $f(x) = \exp(-1/x^2)$ and $g$ a continuous nowhere differentiable function. The function $h(x) = f(x)g(x)$ is continuous, goes to zero faster than any polynomial when $x \to 0$ but isn't differentiable at any point other than $0$. Hence $h$ isn't twice differentiable. – Pablo Lessa Nov 09 '10 at 12:10
  • $\exp(-1/x^2)$ is at the heart of all distribution theory. – Jochen Wengenroth Sep 09 '20 at 09:41
16

Since this is community wiki, I'll feel free to share a possibly relevant anecdote; feel free to delete if you don't think this is an answer.

I once had a freshman calculus student ask me if they'd be required to learn the "Greek method" for calculating derivatives. When I looked puzzled, he explained to me that the "Greek method" involved taking a limit as $h$ goes to zero, and that this was the method the ancient Greeks had used back before anyone realized that all you have to do is bring down the exponent and then lower it by one.

  • 9
    Of course the Greek method uses a $\Delta$ rather than an $h$. –  Dec 15 '14 at 06:49
  • 6
    @quid: Well, of course. The Greeks were very primitive and had not yet discovered the letter $h$. – Steven Landsburg Dec 15 '14 at 06:56
  • 7
    We always learn from students. Not really on topic, but I need to tell this recent one. A student showed me a couple of exercises on limits he had done (a bit worried). "Is it 1?" "Correct!" I said. "And this one? I got -2..." "Correct too, very good" said I. And he... "But isn't this in contradiction with the principle of uniqueness of the limit?". – Pietro Majer Dec 15 '14 at 13:36
13

This is a question that I also struggle with sometimes. On the one hand, I understand the value of sweeping things under the carpet when students are not ready for them yet. When I learned Calculus in High School, we talked about -- but never properly defined -- limits (I can't recall if we did the limit derivatives). Yet, we managed to go pretty far into the material, e.g. establishing recurrence relations for integrals of the type $\int_a^b e^{-x}\sin(n x)\,dx$. This lack of definition was a very frustrating point for me, and when I finally learned about $(\epsilon,\delta)$ two years later, a wave of relief washed over me. Yet, I'm pretty sure that my cohorts did not feel the same way, hence my sympathy for teachers who want to keep things simple by hiding the definition.

At the same time, I don't want my Calc course to be a series of magic tricks, so I always insist on the logical construction of the course: we want to investigate slopes of tangents. We want to work exactly, not approximately. This is why we'll get into limits in the first place (not very historical, but a logical development). So what do I do?

  • I briefly cover $(\epsilon,\delta)$ without really applying it. Just to show the difference between a "wordy" definition and a mathematical one.
  • I insist on the fact that the limit laws are derived from this rigorous definition. (You can sketch the proof for the sum of limits for instance).
  • This sets up for students how the mathematical edifice is built: abstract definitions to formalize intuition, big gun theorems proved rigorously from these definitions (limit laws, derivative laws...). Add a few examples to the mix and then you're set up for practical, mechanical computations (the stuff that computers do).
  • I am upfront about the fact that I don't expect my students to use the $(\epsilon,\delta)$ definition, though I like them to memorize it. The payoff will be later.
  • I also stress that, historically, calculus was done without this definition for a long time: so it can be done, they will be able to do it, but it also has its limitations when dealing with more abstract material.

In a course that is set up in this way, it is quite natural to cover the limit definition of derivatives. There are a lot of good reasons why one should do that anyway, some of which have already been addressed. Functions which are defined piecewise do require this, and that includes important examples like $\exp(-1/x^2)$ and fun ones like $x^2\sin(1/x)$. The rigorous derivation of the derivative of $\sin x$ is another good example.
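For the "fun" example $x^2\sin(1/x)$ (extended by $0$ at the origin), a small numerical sketch shows both halves of the story: the difference quotient at $0$ is squeezed to $0$, while the derivative computed by the rules away from $0$ keeps oscillating. The sample points $1/(k\pi)$ are chosen here purely for illustration.

```python
import math

def f(x):
    # x^2 sin(1/x), extended by f(0) = 0
    return x * x * math.sin(1 / x) if x != 0 else 0.0

def fprime_away_from_zero(x):
    # Product and chain rule; this formula is only valid for x != 0.
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

steps = [10.0 ** -k for k in range(1, 8)]

# Difference quotient at 0: f(h)/h = h sin(1/h), bounded by |h|.
quotients = [f(h) / h for h in steps]

# Derivative at the points 1/(k*pi): close to -cos(k*pi), i.e. alternating -1, +1.
samples = [fprime_away_from_zero(1 / (k * math.pi)) for k in range(100, 106)]

print(quotients)
print(samples)
```

The first list shrinks to $0$, giving $f'(0) = 0$ from the definition; the second list never settles, which is why plugging $0$ into the rule-based formula cannot work.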

There are also wrong ways of doing this. In the comments, Holger pointed to the case study in the Notices article Teaching mathematics graduate students how to teach. Here, the problem asked to use the definition of derivative to compute the slope of a certain cubic at a point. By the time the exam rolls around, you have easier ways of doing this, so of course the students would feel that this is an arbitrary and confusing question.

[Actually, I took so long to write this that I've been ninja'd by Pietro on this example.] One example that I have yet to see though is Taylor series: defining the derivative in this way makes it obvious that $$ f(a+h) = f(a)+f'(a)h+o(h)$$ and sets you up for the higher order ones. Yes, you can see that from the graph too, but at that level most of my students have a terrible time reasoning from an abstract graph.

Given how fundamental these ideas are, especially in Physics, I can never stress enough these kinds of relationships in my course.

Thierry Zell
12

I hope my answer is read as a response to the question asked, rather than as either a defense of or disagreement with the choices the pedagogists (is that a word?) make.

I think one of the main reasons to teach derivatives in terms of the $h\to 0$ limit is that it captures the dual notions of "instantaneous velocity" and "slope", which are respectively physical and geometric.

(Ok, now I will mention some personal opinions about teaching calculus. I love physics, and sometimes pretend to be a physicist, so for me the geometric/physical meanings of calculus are very important. So I would love if they were emphasized more. Unfortunately, we do not do enough in introductory calculus classes in that direction, and it is very hard to present functions and ask students to find the slopes of their graphs without essentially teaching them these black-box techniques. So I don't know whether it's worth it: maybe we should just do the algebraic part of calculus --- it's the only thing we tend to test anyway. I also don't really think that MO is the best place to get into that discussion, though, and I don't think that OP intended it as such.)

  • Interesting answer. I must stress the connections with Physics more than I realized, since a student asked me just last week if I taught Physics also. You mention that we tend to only test the algebraic part of calculus; clearly, we should put our money where our mouth is. Because I emphasize the theory, I make sure there are at least some non-algebraic questions on my tests (you can make up some easy ones). If a prof thinks that the audience can only handle the algebraic stuff, I'm ok with it, but then you should be teaching only that. – Thierry Zell Sep 27 '10 at 12:00
  • I agree that there should be at least somewhat more physics in calculus courses than there currently is. To do it otherwise seems like another example of the decortication of calculus: the subject takes place in a certain intellectual context and as a response to certain scientific problems. A lot of students get really worried when you talk physics though, since they think you're expecting them to have some outside knowledge that they in fact don't have... – Pete L. Clark Jan 29 '11 at 22:18
  • 1
    ...The irony of this for me personally is that I haven't taken a physics class since high school, so all the physics I now know is that which comes up when explaining closely related topics in mathematics. For instance, if I am supposed to talk about first order differential equations, I feel compelled to talk at least a little about second order differential equations, especially those of the form F(x) = c x'', because this is the best motivation I know as to what solving differential equations means and why it's important:... – Pete L. Clark Jan 29 '11 at 22:21
  • ...the equation gives a law governing the behavior of some object over time, and the solution to the equation tells you what the consequences of that law are for the behavior of the object. In particular, my favorite example is m x'' = - kx. To solve this equation without writing down Hooke's Law, a picture of a spring, carefully explaining the physical significance of the minus sign...what a disservice that would be! By making it physical, you give the student a chance to use her intuition: "Well, if I were forced to satisfy this differential equation, what would I do?" – Pete L. Clark Jan 29 '11 at 22:27
  • Look at Morse and Feshbach's Methods for Theoretical Physics for an intuitive explanation of the form of the Schrodinger equation for the wave function for a free particle. – Tom Copeland Mar 22 '16 at 06:13
12

So far no one's mentioned (or did I miss it?) that if you make students compute $$ \lim_{w\to 5} \frac{w^6 - 5^6}{w - 5}, $$ then some of them will use L'Hopital's rule to do that, if you don't tell them not to.

Here's an example of something I have students do with the limit definition of the derivative: http://wnk.hamline.edu/~mjhardy/1170/notes/quiz.10.19.pdf

They find all sorts of creative ways of getting things wrong when doing this. Here's another: http://wnk.hamline.edu/~mjhardy/1170/homework/13th.pdf

I think after they've done several like this, they actually do learn what this is for, and that it's not being used as a way to avoid quick and efficient ways of computing derivatives.

But I have them thinking about instantaneous rates of change without using limits on the first day of the course: http://wnk.hamline.edu/~mjhardy/1170/handouts/September.8.pdf

Michael Hardy
10

Well, the definition of derivative is probably one of the best applications of the notion of limit, from a didactical point of view. If you define the derivative as a limit process then students who understand it will not miss the geometric flavour: the slope of the tangent line is the limit as $h\rightarrow 0$ of the slope of the line through $(x, f(x))$ and $(x+h, f(x+h))$. I think this is beautiful and relatively simple, once you get the students to think about it for a minute. Plus, it answers the question "When do we agree that the graph of $f$ admits the existence of a tangent line at $(x, f(x))$"?

Of course one has to keep in mind that for most students the useful thing to learn is how to compute practically a derivative without using the definition but rather applying a collection of rules. Nevertheless I think it is important to give them an idea of where all these rules come from. Think about those students who want to get a math major? No?

In Italy, in the so-called "scientific high school", the schools that provide you with the widest and most basic education (you learn a bit of everything) with a focus on math, physics, perhaps chemistry, etc., we are taught the limit using the $\epsilon$-$\delta$ definition, and the derivative from its definition. This is to say that I think it is possible to have students learn these theoretical aspects of calculus, if high school kids do.

  • 6
    As you probably know, the Italian government is now planning to gradually change the high school teaching programs into three main topics: "Religion", "Use of guns", "Commercials". So the content of the maths programs of the Italian high school is becoming soon a historical topic. If this is the trend, I guess (I hope) Italy itself will become soon a historical topic too :-( – Pietro Majer Sep 27 '10 at 12:22
  • I also learnt calculus in high school using the $\epsilon$-$\delta$ definition. This was an ordinary public (state-run) high school in the United States. – Toby Bartels Apr 03 '11 at 21:11
  • It can create issues for students (due to their typical lack of logic skills in first year) to think about derivatives giving a criterion for the existence of tangent lines. This is because the first thing they think of after a line is a circle, and the tangent to a circle is hard to describe in this way. – Glen Wheeler Oct 21 '14 at 04:03
  • @GlenWheeler, as far as the concept of the slope of the tangent line via a limit goes, it works just as well to consider $\lim\limits_{\substack{(x, y) \in C \\ (x, y) \to (x_0, y_0) \\ x \ne x_0}} \dfrac{y - y_0}{x - x_0}$; this makes sense for any curve, though it fails for points on vertical segments. – LSpice Mar 07 '20 at 22:27
10

A problem I like to give students to solve shortly after introducing the derivative is to evaluate $f'(2)$ for $f(x) = x^x$. Of course, this function can be rewritten as $f(x) = e^{x\ln x}$ but in my experience students don't think of this. In fact, students who have seen Calculus before almost universally reach the solution $f'(2) = 4$ which they get from the mistaken idea that $f'(x) = x\cdot x^{x-1} = x^x$. The only students that usually get this problem correct are those that haven't yet learned any of the computational methods and only know the definition.

I teach the limit definition and emphasize the physical and geometric interpretations, and then move from that to the concept of the tangent line and linear approximation. I think these concepts encapsulate most of what is significant (intuitively) about the definition. I dislike exam questions that require students to compute derivatives using the limit definition when they know a "better" way to do it. It isn't too hard to write a problem where no formula for the function is given and ask students questions about the sign or approximate magnitude of the derivative or whether or not the function should even have a derivative. For students who do not intend to pursue mathematics, this seems appropriate to me. Even those who become mathematicians will almost surely see these ideas again in complete detail in an elementary analysis course.
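The $x^x$ exercise can be sketched numerically as follows. This is only an illustration of estimating $f'(2)$ from difference quotients; the step sizes are arbitrary choices made here, and the comparison value uses $f(x) = e^{x\ln x}$, so that $f'(x) = x^x(1+\ln x)$.

```python
import math

def f(x):
    return x ** x

def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h

# Estimates of f'(2) for shrinking h.
estimates = [(h, difference_quotient(f, 2.0, h)) for h in (1e-1, 1e-2, 1e-3, 1e-4)]

# Comparison value from f(x) = exp(x ln x): f'(2) = 4 (1 + ln 2).
exact = 4 * (1 + math.log(2))

for h, q in estimates:
    print(f"h={h:.0e}  quotient={q:.5f}")
print(f"4(1 + ln 2) = {exact:.5f}")
```

The quotients settle near $6.77$, visibly far from the power-rule guess of $4$.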

  • 2
    How do you differentiate x^x using only the limit definition? This seems like a tall order to me. – Steven Gubkin Nov 11 '10 at 15:35
  • 4
    I only meant that they would numerically estimate it at x = 2 using the limit definition. It is easy to estimate using the definition, but if they try to differentiate and plug in 2 they will probably get the wrong answer. – Jeremy West Nov 11 '10 at 18:13
8

The standard definitions of limit, continuity and derivative are things of beauty mathematically - flexible and well-honed like fine woodworking tools. But to get calculus students to care, and appreciate their meaning and significance, takes some motivation.

A pretty good way to motivate $\epsilon$-$\delta$ is that it has to do with determining what control on input error ($\delta$) is needed to guarantee meeting a given tolerance for output error ($\epsilon$). How accurately do you have to aim a spacecraft to ensure it enters Martian orbit without burning up the way Beagle 2 did, costing hundreds of millions? Students can appreciate this is a serious question, and that it is fair to insist they be able to handle simple examples like $f(x)=-100x+50$, $\epsilon=10^{-2}$. (In large lectures for freshmen, I wouldn't do much more than Lipschitz examples or something carefully designed so $\delta$ is easy to find without cases. Many calculus students are adults but, ahem, need practice with inequalities.)
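For the simple example $f(x)=-100x+50$ with $\epsilon=10^{-2}$, the bookkeeping can be checked with exact rational arithmetic. A minimal sketch, assuming a target point `a = 3/7` picked arbitrarily for illustration; since $|f(x)-f(a)| = 100|x-a|$, the choice $\delta = \epsilon/100$ works at every point.

```python
from fractions import Fraction

def f(x):
    return -100 * x + 50

epsilon = Fraction(1, 100)
delta = epsilon / 100   # |f(x) - f(a)| = 100 |x - a|, so delta = epsilon / 100 suffices

a = Fraction(3, 7)      # arbitrary target point, chosen only for illustration

# Sample inputs strictly within delta of a and record the worst output error.
offsets = [delta * Fraction(k, 1000) for k in range(-999, 1000)]
worst = max(abs(f(a + d) - f(a)) for d in offsets)

print(delta, worst, worst < epsilon)
```

Using `Fraction` avoids floating-point noise right at the tolerance boundary, which matters when the inequality is the point of the exercise.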

One can tell engineering students who just want the formulas that they'll be surprised to find that in a couple of years they'll be estimating "sensitivity coefficients" numerically from black-box software or experiment. Gee, sensitivity coefficients are just derivatives, and they'll be estimated from the definition, not symbol-pushing.
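
A minimal sketch of what such an estimate might look like, assuming the software can only be called, not inspected. The model `black_box`, the helper `sensitivity`, and the step size are all hypothetical stand-ins, and the sketch uses the symmetric difference quotient, a close relative of the one-sided quotient in the definition.

```python
def black_box(params):
    # Hypothetical stand-in for opaque simulation software with two inputs.
    a, b = params
    return a * a * b + 3.0 * b


def sensitivity(model, params, index, rel_step=1e-6):
    # Symmetric difference quotient in one input, all other inputs held fixed.
    step = rel_step * max(1.0, abs(params[index]))
    up = list(params)
    down = list(params)
    up[index] += step
    down[index] -= step
    return (model(up) - model(down)) / (2.0 * step)


nominal = [2.0, 5.0]
coeffs = [sensitivity(black_box, nominal, i) for i in range(len(nominal))]
print(coeffs)  # estimates of the two partial derivatives at the nominal point
```

For this particular stand-in the partial derivatives are $2ab = 20$ and $a^2 + 3 = 7$ at the nominal point, so the printed values can be checked by hand.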

Speaking of which, it's nice to express the error in the definition explicitly: $$ \frac{f(x)-f(a)}{x-a} = f'(a)+E_a(x), $$ and do the algebra that occurs to few to write $$ f(x)=f(a)+ f'(a)(x-a)+ E_a(x)(x-a). $$ This makes the nature of linear approximation a bit more apparent.

8

One way to avoid limits without losing too much is to teach the calculus of finite differences. Conceptually, the move from numbers to lists-of-numbers as first-class mathematical objects is easier than the move from numbers to real-valued-functions-of-a-real-variable, and the easier move also forms a good stepping stone to the harder one. One can develop the calculus of finite differences mutatis mutandis and thereby make the transition to infinitesimal calculus essentially painless. (So, for example, one should work not with polynomials per se, but with linear combinations involving rising or falling powers).
All the black box rules have their analogues, and all are reasonably easy to see and/or prove. Passing the limit, when it happens, comes as a welcome simplification.
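
As a small sketch of the analogy, assuming the usual falling power $x^{\underline{n}} = x(x-1)\cdots(x-n+1)$ and the forward difference $\Delta f(x) = f(x+1)-f(x)$, the power rule becomes $\Delta x^{\underline{n}} = n\,x^{\underline{n-1}}$. The function names below are made up for the illustration.

```python
def falling_power(x, n):
    # x (x - 1) (x - 2) ... (x - n + 1), with the empty product equal to 1.
    result = 1
    for k in range(n):
        result *= x - k
    return result


def forward_difference(seq):
    # The list-of-numbers analogue of differentiation.
    return [b - a for a, b in zip(seq, seq[1:])]


n = 3
xs = list(range(0, 8))
values = [falling_power(x, n) for x in xs]
diffs = forward_difference(values)
expected = [n * falling_power(x, n - 1) for x in xs[:-1]]
print(diffs)
print(expected)
```

Everything here is integer arithmetic on lists, so the "power rule" can be checked exactly with no limit in sight.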

Aside from the conceptual challenge of functions themselves, students find limits difficult because of their quantifier complexity. I have never understood why standard algebra pedagogy suppresses quantifiers, thus, for example, leaving many students unable to distinguish between unknowns (literals bound by existential quantifiers), variables (literals bound by universal quantifiers) and constants (literals that belong to the language itself). Students who miscalculate the derivative of $\pi^2$, mentioned elsewhere, don't get this distinction. People who become mathematicians usually "got it" without anyone spelling all this out, and then they learned about quantifiers studying logic in college, so they regard quantifiers as sophisticated and advanced. But most students don't "get it," and I think this accounts for the huge attitude downturn when they get to algebra.

David Feldman
  • 17,466
6

It's funny that actually many students believe that the symbiosis is always the other way around, i.e. derivatives are used to compute limits (l'Hopital etc.). My favorite example of an elegant calculation of a derivative using the limit definition comes from basic physics. Ask the students why the acceleration of an object performing uniform circular motion is always perpendicular to the velocity. One could come up with a non-elegant solution by writing the equations of motion and using a derivatives table, or one could observe the nice geometric proof of considering an infinitesimal isosceles triangle formed by the two velocity vectors that are a few seconds apart and notice that $\Delta \vec{v}$ is the base of this triangle and points toward the center of the circle.

Gjergji Zaimi
  • 85,056
  • 2
    If you use the derivative table for a scalar product (applied to the square product), from v.v=const, you get 2v.a=0 -- that doesn't look that non-elegant. In fact, I use this very example to show my students (who are more into physics) that this rule gets results they're already well-acquainted with. That convinces them more than the proof... – Julien Puydt May 19 '12 at 13:16
6

$\frac{\sin x}{x}$ at $x = 0$ should be a good example.

P.S.: Talking about esoteric definitions, if you can introduce stationary point without derivatives, you can then introduce derivatives using sheaves, like you would introduce vectors on a smooth manifold. It would broaden the consciousness of your freshmen, he-he ^_^

6

I believe it gives a good conceptual and practical reason for why we would want to study calculus in the first place. As it was first introduced to me, we can always talk about the average speed a car has traveled over a certain distance, even a very small distance. But without a speedometer, how can we say "I'm travelling XX km/h right now"? There enters the limit definition, where we want the instantaneous rate of change!

If we only presented the formal rules for differentiation, we would run into the same objection that high school students who dislike math raise: "But my calculator can just do it! Why do I need to learn this?!" If the fundamentals are not taught, one day they will be forgotten.

There are certainly other rigorous approaches to the derivative out there, such as the delta-epsilon method, which most first-year students do not grasp as easily as the $h \rightarrow 0$ definition. This approach is typically reserved for the math majors who go on to take a course in analysis, not the general first calculus course for all science majors.

While I do not use this definition in practice, I am primarily not calculating derivatives, so take that for what it's worth I suppose.

Alex
  • 362
6

It is worth noting that there is a lot of historical precedent for teaching it as a limit, which occurs already in Euclid. I.e. Euclid characterizes the tangent to a circle as the unique line such that between it and any other line through the same point, one can interpose a secant (Prop. 16, Book III). (Strictly, he says equivalently that one cannot interpose another line between the tangent and the circle itself, i.e. every other line through the point is a secant.) Thus the tangent is the limit of those secants. Thus I believe one can easily say that the limiting point of view is the original one of Euclid. From this point of view, the idea of limit is the one used so fruitfully by the Greeks, and the contribution of the mathematicians of later times is to make that notion more precise.

On the other hand, if you want to avoid the conceptual difficulty students have with limits, you can follow Descartes instead, at least for derivatives of polynomials, and characterize the tangent line as the unique line such that subtracting its equation from the original function gives a polynomial with a double root at the given point. This leads to motivating the Zariski cotangent space, as $M/M^2$.
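
For a concrete instance of the double-root criterion (a worked example added for illustration, with $m$ standing for the unknown slope): take $f(x)=x^2$ and the line through $(a,a^2)$ with slope $m$. Then $$ x^2 - \bigl(a^2 + m(x-a)\bigr) = (x-a)(x+a-m), $$ and the right-hand side has a double root at $x=a$ exactly when $a+a-m=0$, that is, when $m=2a$. No limit is taken at any stage.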

Both points of view also have a nice dynamic interpretation as realizing the tangent as the unique line intersecting the curve doubly at the point, understood as the limit of the two secant intersections, and measured by the presence of a double root.

But if you want a defense of limits, I suggest Euclid Prop. 16, Book III as ample precedent.

If you want a defense of making students practice using the limit definition, I propose that, as noted above, this is the only way to get them to appreciate the fundamental theorem of calculus. That theorem cannot be appreciated by memorizing rules for derivatives. One must understand the definition and apply it to an abstractly defined area function. I suggest that one reason many students do not understand why the fundamental theorem of calculus is true is that (again as noted above) they have grasped neither what an abstractly defined function is nor what a derivative truly means.

So if you want them to understand the relation between the derivative and the integral, then I agree with others that they need to know what a function is and what a derivative is. The reasoning here is that once someone understands something, he can use it in more settings than could possibly be covered by any set of rules.

Another practical benefit of testing the use of the $h \to 0$ definition to obtain derivatives of simple functions is that it forces practice in algebra, trig identities, and exponentials, skills which most of my students are almost completely lacking.

However, I recommend you teach it any way that makes sense to you. After all, you understand it, so whatever you say based on that understanding will be useful. Make up your mind what seems important to you, and go for it!

roy smith
  • 12,063
6

The answer I give my students is that mathematicians want to know what a word (in this case 'derivative') means in all cases, and the definition of the derivative is a communal agreement about what to say in strange cases such as the absolute value function. (Well, since I banish symbolic stuff from the first two weeks, I say 'function whose graph has a sharp corner like this one (draws on board)'.)

If students press further, I point out that in a literature class they are expected to learn the communal agreement on the difference between a 'simile' and a 'metaphor'. It helps that I am at a liberal arts institution and not a technical one.

Let me also use this opportunity to share a pedagogical trick:

I find it helpful (third time I've tried it) to break up the definition of $f^\prime(2)$ into two parts:

1) Define a new function $E_2$ by the formula $$E_2(x)=\frac{f(x)-f(2)}{x-2}.$$ 2) Take the limit of $E_2$ at 2.

To pull this off, you do need to take the function $E_2$ somewhat seriously: graph it, write formulas for it, etc.

Rationale:

1) It always helps to break up complicated definitions into smaller pieces.

2) It emphasizes that you take limits of functions (in the sense of machines that accept a single number as input and give a single number as output) rather than of symbolic expressions.

3) Students get to really understand why a discontinuous function or something like the absolute value function is not differentiable (at the relevant point).
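
One way to take $E_2$ seriously is to build it as an actual function and tabulate it near 2. The sketch below is only an illustration: the helper `make_E` and the two sample functions are assumptions, not part of the classroom material described above.

```python
def make_E(f, a):
    # Build the difference-quotient function E_a; it is undefined at x = a.
    def E(x):
        if x == a:
            raise ValueError("E_a is not defined at a")
        return (f(x) - f(a)) / (x - a)
    return E


E2_square = make_E(lambda x: x * x, 2.0)        # f(x) = x^2
E2_corner = make_E(lambda x: abs(x - 2.0), 2.0)  # graph with a sharp corner at 2

for h in (0.1, 0.01, 0.001):
    print(h, E2_square(2.0 - h), E2_square(2.0 + h),
          E2_corner(2.0 - h), E2_corner(2.0 + h))
```

For $x^2$ the values on both sides of 2 approach 4, so $E_2$ has a limit at 2. For the corner the values stay at $-1$ on the left and $+1$ on the right, so $E_2$ has no limit there, which is exactly the failure of differentiability in Rationale 3.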

  • This semester I am teaching Calculus I following Rogawski's textbook. The chapter on limits spends some serious effort on getting comfortable with average rates of change, going as far as creating a table of values (easy to do with a graphing calculator) and numerically estimating the limit, before going on to derivatives in the next chapter devoted to derivatives. By the way, I don't see how this approach could accomplish a worthy goal stated in Rationale 3. – Victor Protsak Sep 29 '10 at 07:12
5

If it is just a question of definition but not a question of computation, I have heard when I was a student the following definition:

Let $f$ be a real continuous function, class $C^0$. Let $$ \Delta f(x,x') = \frac{f(x') - f(x)}{x'-x} \quad\mbox{defined on}\quad {\bf R}^2 -\{x=x'\} $$ If $\Delta f$ admits a continuous extension on the diagonal $\{x=x'\}$ then it is unique, and $f$ is said to be of class $C^1$. The function $f'(x) = \Delta f(x,x)$ is then called the derivative of $f$.

Of course this is the standard definition, nothing new under the sun, but the $\epsilon-\delta$ calculus is hidden, and of course, not for long :-)

Patrick I-Z
  • 2,259
  • 3
    It's an interesting alternative to the usual, but ultimately I'm afraid it's even more challenging than the usual one. Of course, here I'm not talking about using the definition to perform computations, but even to grasp the intuitive meaning, resorting to functions of two variables strikes me as more than a student can comfortably handle. – Thierry Zell Jan 27 '11 at 16:14
  • 1
    This also has the technical problem of not covering the case when a function is differentiable but not continuously so. (Of course, this might appear in a context where one is uninterested in such functions.) We can fix this by making it slightly more complicated: ask whether $\Delta{f}(a,-)$, the restriction of $\Delta{f}$ to a given vertical line $\{x = a\}$, admits a continuous extension; if so, then this is unique, and we define $f'(a)$ to be the new value $\Delta{f}(a,a)$. – Toby Bartels Apr 03 '11 at 21:22
4

Another alternative way of teaching calculus is via infinitesimals (for example the book Elementary Calculus: An Infinitesimal Approach by Keisler). The way of thinking about calculus via infinitesimals is obviously very natural, and mathematicians (e.g. the pioneers of calculus, Euler etc.) used arguments involving infinitesimals long before they should really have been allowed to do so. Keisler's book (and in general the area of "non-standard analysis") makes rigorous our intuition regarding infinitesimals, and gives a set of rules that teach us how to reason formally with them. In my opinion this system is intuitive, but the student can never really have a proper understanding of what they are doing "from the ground up" without some basic knowledge of model theory. The limit approach is less intuitive, but at least a student doesn't have to just accept some rules without truly understanding what's behind them. Possibly this infinitesimal approach is a halfway house between teaching it properly with limits and just teaching rules of differentiation to people who aren't interested.

Adam Harris
  • 1,895
  • Edward Nelson's "IST" version of non-standard analysis is vastly more user-friendly. Alain Robert's book on non-standard analysis takes that viewpoint, and is a marvel of lucidity. – paul garrett Dec 14 '14 at 23:33
3

As an alternative to the definition of the concept of derivative by using limits, there is also the definition used in a book titled Calculus Unlimited.

The value of the derivative $f'(a)$ is $\ge m$ if (but not only if) there is some open interval about $a$ within which $f(x) \left\{ \begin{array}{c} > \\[4pt] < \end{array} \right\} f(a) + m(x-a)$ according as $x \left\{ \begin{array}{c} > \\[4pt] < \end{array} \right\}a$.

But as for using any definition to find $\dfrac{d}{dx} x^3$, one could simply omit that nonsense and give them problems like this:

$$ (fg)'(x) \overset A = \lim_{w\to x}\frac{f(w)g(w)-f(x)g(x)}{w-x} \overset B = \lim_{w\to x}\frac{\overbrace{f(w)g(w) - f(w)g(x)} + \overbrace{f(w)g(x) - f(x)g(x)}}{w-x} $$

$$ \overset C =\lim_{w\to x} \left(f(w)\frac{g(w)- g(x)}{w-x} + g(x)\frac{f(w)-f(x)}{w-x} \right) $$

$$ \overset D =\left( \lim_{w\to x} f(w)\right) \left(\lim_{w\to x} \frac{g(w)-g(x)}{w-x}\right) + \left(\lim_{w\to x} g(x)\right)\left(\lim_{w\to x} \frac{f(w)-f(x)}{w-x}\right) $$

$$ \overset E = f(x)g'(x) + g(x) f'(x). $$

(a) What statement is proved by the argument above?

(b) One of the steps labeled $A$ through $E$ above uses the definition of "derivative" twice. Identify it and explain your choice.

(c) One of the steps uses the definition of "derivative" just once. Identify it and explain your choice.

(d) Two of the steps use only algebra and require no knowledge of limits. Identify them and explain.

(e) One of the steps uses properties of limits discussed in Chapter 2. Identify it and explain.

(f) One of the steps uses the fact that differentiable functions are continuous. Identify it and explain.

Michael Hardy
  • 11,922
  • I think that the first definition should be characterising when $f'(a) > m$, not when $f'(a) \ge m$. Otherwise, with $m = 0$, it incorrectly tells us that there is an interval around $x = 0$ where $f(x) = x^2\sin(1/x)$ is always greater than $0$ on one side, and always less than $0$ on the other side. – LSpice Sep 08 '20 at 12:10
  • @LSpice : You're quite mistaken. If $a=0$ and $f(x) = x^3,$ then the line $y=0$ goes from above the curve to the left of $0$ and below to the right, but the slope of that line is not less than $f'(a).$ And nothing in the first definition above implies $x^2\sin(1/x)$ is always positive on one side of $0$ and negative on the other side. Rather, the definition says that for every line through $(0,0)$ with positive slope, there is an interval about $0$ for which the line lies above $x^2\sin(1/x)$ to the right of $0$ and below to the left, and similarly for lines with negative slope. $\qquad$ – Michael Hardy Sep 08 '20 at 16:22
  • 1
    I read your 'if' statement as an equivalence, which was why I objected; and so I was wrong because I misread but also, as you point out, because my proposed correction doesn't work. Thank you for the clarification (of my misreading) and for your counterexample. – LSpice Sep 08 '20 at 17:29
2

Since this was recently bumped up to the top of the list, I would challenge the basic assumption expressed in the title of the question. In fact we don't all teach calculus using limits; I teach it using infinitesimals. The basic ingredient that replaces epsilon-delta limits in this approach is the shadow relating an infinitesimal-enriched continuum and an Archimedean continuum. Once students understand the basic notions of the calculus such as continuity and derivatives, we present the epsilon-delta paraphrases of the infinitesimal definitions.

Mikhail Katz
  • 15,081
  • 1
    What is the background of students whom you teach this way? What is their intended future trajectory? – LSpice Sep 08 '20 at 12:11
2

There was a recent article in the American Math Monthly, Analysis with Ultrasmall Numbers, that might be of interest. For a summary of its implementation in a high school classroom, see http://maths.york.ac.uk/www/sites/default/files/odonovan-slides.pdf.

A quick skim of its implementation seems to suggest that it provides a groundwork for some of the informal manipulations used in calculus-based physics classes.

Jason
  • 2,762
1

I'm interested in the differentials-based approach advocated by Dray and Manogue at Bridging the Vector Calculus Gap. This is for multivariable calculus, but they do discuss the one-variable version (pdf). As they mention, people have reviewed calculus (especially for science courses) in these terms, but has anybody lately taught it this way?

(Also, the theory behind this approach is a little unclear beyond the first derivative, which is what led me to this question.)

Toby Bartels
  • 2,654
  • I suppose that I should mention now that I have taught it this way, twice, in a terminal mathematics course intended for non-STEM majors (especially business): http://tobybartels.name/MATH-1400/. – Toby Bartels Dec 24 '11 at 06:44
  • Coming back to this years later, I've now been teaching the STEM Calculus sequence (Calculus 1, Calculus 2, Calculus 3), and I make heavy use of differentials in Calculus 1 and Calculus 3 (while for Calculus 2, where it makes less of a difference, this depends on whether they've already taken Calculus 1 with me). See http://tobybartels.name/calcbook/. – Toby Bartels Feb 28 '23 at 17:44
1

I am a STUDENT in 11th grade who has just finished BC Calculus.

I don't have a PhD or even a high school diploma, obviously.

But I think that beginning with h->0 is essential. Otherwise, we don't have any definition of a derivative.

But more to the point, I don't think it's strictly necessary to learn the concepts completely before you do symbolic calculations.

How, then, would the teacher make sure that the students eventually learn them before it's too late? Leibniz's lovely notation.

Newton's notation is essentially a meaningless shorthand. Prime is arbitrarily chosen to mean a derivative. I don't like that. (I like it as a shorthand, but there is no real meaning behind it).

But Leibniz's has actual meaning: dy/dx is analogous to delta-y/delta-x.

If we are always used to writing dy/dx=... or df(x)/dx=..., then it is no great stretch to write things like df(x)=...dx. And this leads us nicely into differential equations by separation of variables, and concepts such as substituting variables when you integrate.

In my humble opinion, using Newton's notation should be avoided as much as possible, because it doesn't make it clear what you are doing and turns people into robots, mindlessly following the rules of differentiation.

I don't think that Newton's method is all bad. I think it may be good when you are taking derivatives of higher orders, because once you have the concept of derivatives down, it's more important to see that you are taking the derivative of another derivative. (The "exponents" in Leibniz's notation make it a bit confusing).

If I were a Calculus teacher (and I very well may become one someday), I would all but scrap Newton's notation.

  • What puzzled me as a calculus student decades ago was the meaning of the dx and the dy in dy/dx. I thought I'd solved the problem by understanding these components not to have independent meaning outside the "quotient" dy/dx . But then I met differentials in the context of multi-variable calculus and felt despair. Even worse, I saw differentials presented very formally - as "expressions" that "varied" in particular ways under coordinate changes. Only years later, in graduate school, did I finally understand dy and dx as linear variables living in cotangent spaces. – David Feldman May 19 '12 at 06:49
  • 1
    Which is all to say Leibniz's notation has its pedagogical dangers too! – David Feldman May 19 '12 at 06:49
  • What's wrong with viewing a dx as "a small change in x" and a dy as "a small change in y (resulting from the small change in x)"? – Steven Gubkin May 19 '12 at 07:22
  • 4
    The primes for derivatives are not Newton's notation. That is Lagrange's notation. Newton used a dot for derivatives, which is absent from math books but still used by physicists. See http://en.wikipedia.org/wiki/Notation_for_differentiation – KConrad Jun 30 '13 at 16:38
0

Euler's method for ODEs is a straightforward application of the definition of a derivative, and I have always introduced it that way. Of course, if you teach undergraduates in the US, you should also point out that Euler never played hockey for Edmonton, but should have.
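
A minimal sketch of that connection: replacing $y'(t)$ by the difference quotient $\bigl(y(t+h)-y(t)\bigr)/h$ in $y' = f(t,y)$ and solving for $y(t+h)$ gives the update $y_{n+1} = y_n + h\,f(t_n, y_n)$. The test problem $y' = y$, $y(0)=1$ and the step size below are choices made for this illustration.

```python
import math


def euler(f, t0, y0, h, steps):
    # Repeatedly apply y(t + h) ~ y(t) + h * y'(t), with y'(t) = f(t, y).
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)
        t += h
    return y


# Test problem: y' = y with y(0) = 1, whose exact solution is e^t.
approx = euler(lambda t, y: y, 0.0, 1.0, 0.001, 1000)
print(approx, math.e)
```

With this step size the approximation at $t=1$ is $(1.001)^{1000}$, which agrees with $e$ to about two decimal places; halving `h` roughly halves the error.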

-1

I try to answer here only to this part of the question: "does anyone out there actually use this definition to calculate a derivative that couldn't be obtained by a known symbolic rule?".

You wrote that using functions defined piecewise by different analytical expressions is no good for you, but I wouldn't be so categorical. This nice little educational paper by H.T. Kaptanoglu shows how far you can go using a function defined by a very simple closed-form expression everywhere except at one point: https://www.researchgate.net/publication/265989067_In_praise_of_yx_a_sin1_x

(I add that I completely agree, on the more general question concerning why to present the derivative that way, with the answer by Deane Yang.)

-4

Dear friends (and foes;-), limits are not needed to understand and do calculus. You can read my article at http://www.mathfoolery.com/Article/simpcalc-v1.pdf and my recent translation of a 1981 talk by V.A. Rokhlin at http://mathfoolery.wordpress.com/2011/01/01/a-lecture-about-teaching-mathematics-to-non-mathematicians/ I hope it will get you thinking.

The following is my response to fedja, it may be of interest to those who haven't read my article. First, let me indicate briefly my suggestions on how to approach calculus. The most important principle (of V.I. Arnold, explicitly stated by him in his recent Lectures on Partial Differential Equations) is to concentrate on examples, calculations, and applications, staying away from the generalities before they become necessary and the ideas behind them are well understood in concrete situations.

I will mostly discuss differentiation, see my article and my talk slides at http://www.mathfoolery.com/talk-2010.pdf for more details. In high school they teach kids how to factor algebraic expressions, long and synthetic division of polynomials, the fact that if $f(a)=0$ then $x-a$ divides $f(x)$ (for a polynomial $f$). Why not use it and develop differentiation of polynomials? Indeed, if you ask a high school student to make sense of $\frac{x^2-a^2}{x-a}$ for $x=a$, (s)he is very likely just to factor the numerator, cancel $x-a$ and stick in $x=a$ to get $2a$ (or $2x$). This stuff is purely algebraic and all the differentiation rules are immediate.
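
A sketch of this purely algebraic route, assuming polynomials are stored as coefficient lists with the highest degree first (the function names are made up for the illustration). Synthetic division of $f(x)$ by $x-a$ gives $f(x) = (x-a)\,q(x) + f(a)$, so $q$ is the factored form of the difference quotient and the derivative at $a$ is $q(a)$.

```python
def divide_by_linear(coeffs, a):
    # Synthetic division by (x - a); returns (quotient coefficients, remainder).
    partial = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c
        partial.append(carry)
    return partial[:-1], partial[-1]


def evaluate(coeffs, x):
    # Horner evaluation of a polynomial given by its coefficient list.
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def derivative_at(coeffs, a):
    # (f(x) - f(a)) / (x - a) is the quotient q(x); cancel and substitute x = a.
    quotient, _remainder = divide_by_linear(coeffs, a)
    return evaluate(quotient, a)


print(derivative_at([1, 0, 0], 3))       # f(x) = x^2 at a = 3
print(derivative_at([1, 0, -2, 5], 2))   # f(x) = x^3 - 2x + 5 at a = 2
```

With integer inputs the whole computation stays in exact integer arithmetic, mirroring the "factor, cancel, substitute" procedure a high school student would carry out by hand.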

Parenthetically, once they are established for polynomials, they are forced upon us for any reasonable understanding of differentiation, at least for uniformly differentiable functions, because of the Weierstrass approximation theorem, for example. Also differentiation as an aspect of factoring becomes apparent.

We also can similarly develop differentiation of rational functions and roots, and use implicit differentiation to do other algebraic functions. This gives us already a lot to play with and to apply. The sine function can be treated geometrically, as an aspect of kinematics of the uniform rotation. This broadens the range of potential applications.

To get to the geometric and intuitive meaning of differentiation, we can notice that $x^n-a^n-na^{n-1}(x-a)$ has a double root at $x=a$, or we can look at the expression $f(x+h)$, $f$ being a polynomial, as a polynomial in $h$ with coefficients depending on $x$. It has a constant term $f(x)$, the linear term $f'(x)h$ and all the higher order terms, so $f(x+h)-f(x)-f'(x)h=h^2r(x,h)$ where $r$ is a polynomial. This way, if we restrict $x$ and $h$ to some finite interval, we arrive at our basic estimate, uniform in $x$ and $h$:

$$|f(x+h)-f(x)-f'(x)h| \le Kh^2$$

which indicates how close a polynomial is to its affine approximation given by its differential.
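
As a worked instance of the basic estimate (an illustration with a specific polynomial and interval chosen for the example): for $f(x)=x^3$ we have $f'(x)=3x^2$ and $$ f(x+h)-f(x)-f'(x)h = 3xh^2 + h^3 = h^2\,(3x+h), $$ so $r(x,h)=3x+h$. Restricting to $|x|\le 1$ and $|h|\le 1$ gives $|3x+h|\le 4$, and the estimate holds with $K=4$.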

This inequality allows us to explain why polynomials with positive derivatives are increasing. We simply notice first that if $f' \gt C$ and $0 \lt h \lt C/K$, then $f(x) \lt f(x+h)$, and therefore $f(A) \lt f(B)$ when $A \lt B$. Then, by applying this fact to $f(x)+Cx$, we see that $f(B)-f(A) \gt C(A-B)$ for any $C \gt 0$ when $f' \ge 0$, and therefore $f(A) \le f(B)$. This is called the monotonicity principle, and it is the most complicated theorem in this approach to calculus. Everything else follows from it.

Now, to broaden our scope, we promote our basic estimate to the definition status and call the functions that satisfy this definition (uniformly) Lipschitz differentiable (LD). Derivatives of polynomials are polynomials, and differentiation of polynomials is related to their factoring. Likewise, derivatives of LD functions are Lipschitz. Indeed, we can rewrite our basic estimate as $|\frac{f(x)-f(a)}{x-a}-f'(a)|\le K|x-a|$, then notice that $ \frac{f(x)-f(a)}{x-a} = \frac{f(a)-f(x)}{a-x}$ and conclude that $|f'(x)-f'(a)| \le 2K|x-a|$. Moreover, $f$ is LD if and only if $f(x)-f(a)$ factors through $x-a$ in the class of Lipschitz functions of 2 variables, $x$ and $a$. Differentiation rules are straightforward.

I suggest developing integration in parallel with differentiation (since they work and are understood better together), starting with simple examples of the Newton-Leibniz theorem and working our way to approximating definite integrals by approximating the integrands by functions that are easy to integrate, say piecewise-linear functions, then showing integrability of, say, Lipschitz functions, establishing positivity of the integral, and proving Newton-Leibniz.

Now I can get to fedja's objections. The Lipschitz theory takes care of all the piecewise-analytic functions, and that's almost everything that we deal with in elementary calculus. When we run into functions that don't fit into the Lipschitz theory ($x^{3/2}$, for example), we broaden our definitions by replacing $h^2$ in our basic estimate (with $|h|^{3/2}$ for our example). Then we observe that the theory still holds for the weaker estimates. The Hölder estimates, i.e. the ones we get by replacing $h^2$ with $|h|^{1+\gamma}$, $0 \lt \gamma \lt 1$, give us much more room to play. All moduli of continuity are not needed for any problem involving only a finite number of functions, and this covers the vast majority of problems we encounter in calculus.

Now, in the classical treatment the extreme value theorem is used to prove the Lagrange theorem, which is used to prove that a function with positive derivative is increasing. But in our approach we have a direct proof of this fact, so we don't need it. And we don't need the intermediate value theorem to prove the Newton-Leibniz theorem, since it can be proven directly using positivity of the integral. By the way, both of these theorems are non-constructive.

One may ask about minima and maxima within this approach. The monotonicity principle takes care of this topic, since it assures us that the point where the derivative changes its sign from plus to minus will be a local maximum; the similar obvious result is true for a local minimum.

Fedja also said that the inverse function theorem fails miserably within Lipschitz functions. My guess is that he was talking about the theorem that says that the inverse of a monotonic continuous function on a closed interval is continuous. This is true, of course, but it may be a good thing, since it raises our awareness of the fact that the inverses of very nice functions can be computationally horrendous. It also can be a motivation to consider some other moduli of continuity. As for the inverse function theorem about local invertibility of the differentiable functions, its treatment within the Lipschitz class is not much different from the standard, and is somewhat simpler.

In any case, fedja and I should probably take our dialogue elsewhere. Thanks for all the comments.

I also want to mention that similar approaches to calculus and introductory analysis have been tried with a good measure of success by Hermann Karcher at Bonn University and Mark Bridger of Northeastern University. See Karcher's lecture notes with an English summary at http://www.math.uni-bonn.de/people/karcher/MatheI_WS/ShellSkript.pdf and the 2007 book "Real Analysis: a Constructive Approach" by Mark Bridger, where he defines differentiation via factoring of $f(x)-f(a)$ into $x-a$ in the class of continuous functions. Karcher said (in a recent e-mail to Dick Palais): "I taught my last Calculus course by first using only Lipschitz continuity. At the end of the first semester I reached uniform convergence of functions and continuity. From the second semester on the procedure was the standard one. The students liked it a lot, I still meet one or the other and they still smile." There is also a nice book by Peter Lax, "Calculus with Applications," where he deals with uniform instead of pointwise notions.

Added on 1/28/2011. I have just got an e-mail from Hermann Karcher. It says: "it was nice to hear from you again. I read Rokhlin's talk. Maybe one has to go even farther: Sometimes I think, everybody needs his own explanation, and a successful teacher is good in guessing what each individual child needs. I also read your paper. You won't be surprised that I am familiar with your arguments. I am now retired for 7 years. My last three semester beginners course was my most successful one. Today I attended the PhD colloquium of the student of a younger colleague. That student (and some of the younger people in the audience) had been in that last course of mine. They were happy to see me again and say how much fun those three semesters had been. For various reasons I was then in a situation where I could completely ignore, how the standard course in analysis proceeds. During the first semester I did my own stuff, ending up at continuous functions and their uniform convergence at the END of the first term. The next two semesters proceeded as usual - but all the fun we had came from building the foundations differently and the fun stayed with us. I wish you similar experiences. Don't try to convert too many grown ups, just enjoy teaching. Best regards --Hermann Karcher."

By the way, an English translation of Karcher's lecture notes is in progress. I also heard today via Facebook from Ursula Weiss, who is a math professor in Germany (we were both graduate students at Brandeis in the late 1980s), that she has just finished translating the first chapter.

Misha
  • 1
  • 2
    As you noticed yourself, Lipschitz is not enough. You need all moduli of continuity. Now, the modulus of continuity (even in your exposition) is a function that is blah-blah-blah and continuous at 0. But what does that last condition mean if there are no limits or epsilon-deltas? I thought of teaching this way but the intermediate value theorem and the fact that every continuous function attains its maximum do not get any easier for Lipschitz functions and the inverse function theorem just fails miserably. – fedja Jan 25 '11 at 12:44
  • Dear fedja, thanks for reading my article. Your impulse to dismiss it is understandable since the article makes your pet general notions of limits and continuity and your beloved pure existence theorems about intermediate values and attainment of maximum value look less relevant. But these belong to introductory analysis rather than calculus, and they are indeed rather irrelevant to the computational content of the subject. I will get to the inverse function theorem later, when I counter your objections one by one in the remarks I will add to my answer. – Misha Jan 25 '11 at 21:06
  • What does the Weierstrass approximation theorem tell us about differentiation of continuous functions? – Pete L. Clark Jan 26 '11 at 01:12
  • Also, the link in your third paragraph does not work for me. – Pete L. Clark Jan 26 '11 at 01:16
  • @Pete L. Clark: the Weierstrass approximation theorem tells us that the differentiation rules still hold for uniformly (=continuously) differentiable functions, because they hold for polynomials. Sorry for the broken link, will fix it asap. – Misha Jan 26 '11 at 01:42
  • @Misha: I would like to understand some of your ideas, but your writing (both in this answer and in the linked documents I was able to access) is rather obscure to me. Is there some reason you do not write things more carefully? For instance: "we promote our basic estimate to the definition status and call the functions that satisfy this definition (uniformly) Lipschitz differentiable (LD)." An inequality is not a definition: you need, for instance, quantifiers. It would be very helpful to include a sentence beginning "A uniformly Lipschitz differentiable function is..." – Pete L. Clark Jan 26 '11 at 01:47
  • @Misha: it does? Could you give a precise statement and a reference? I take it you are aware of the fact that uniform convergence of a sequence of functions does not imply convergence of the sequence of derivatives. – Pete L. Clark Jan 26 '11 at 01:49
  • 1
    @Pete L. Clark The estimate is uniform (in some interval), and a uniform estimate can be taken for a definition. Look, if you try to understand the ideas, look for the ideas, not for some petty inaccuracies that can be corrected in a rather obvious way. From what I do with this inequality it's obvious that it is meant to be a uniform estimate in both x and h. I write for people, not for computers. On Weierstrass: we approximate uniformly the derivatives, then everything works out, see any book where term-by-term differentiation of a sequence of functions is treated – Misha Jan 26 '11 at 03:09
  • @Misha: Your response is very disappointing. I am trying to understand what you're writing. I would have thought that as an author that was your goal: to be understood. When I say that more precise definitions would improve my understanding, I mean just that. Regarding Weierstrass approximation: once again, derivatives do not appear in the statement of this theorem, and the metric involved is not one which guarantees anything about convergence of the derivatives. If you mean a result involving uniform convergence of the derivatives, please say so... – Pete L. Clark Jan 26 '11 at 03:13
  • ..."Then everything works out." Again, because this is a very vague statement, I don't understand at all what you mean. Don't you want me to understand what you mean? If not, why are you posting here? – Pete L. Clark Jan 26 '11 at 03:14
  • The little problem is that those "pet existence theorems" I mentioned are actually very "computational". The existence of extrema theorem is the basis for Rolle, Rolle is the basis for mean value and second order Taylor and those are bread and butter for all estimates from determining concavity to the Newton method. If you don't have Rolle, you are stuck with "there exists some constant", which is computationally useless. As to the IVT, I use bisection far more often than Newton when I need real roots of something. – fedja Jan 26 '11 at 04:41
  • @Pete L. Clark: It was a parenthetical remark of a philosophical nature that is not used anywhere else in the post. I didn't mean to supply all the details. Picking on it is totally unfair. The hint I gave you, that we approximate the derivatives, should be sufficient for anybody who took an introductory analysis to figure it all out. I assumed that it would be enough for you, since you are a professor of mathematics. – Misha Jan 26 '11 at 04:58
  • 2
    Another thing you forget is that in the differential calculus the task is not to pass down the differentiation tool but to convey the general idea of "local approximate linearity", which you can't do without explaining what the words "local" and "approximate" mean, which inevitably leads to some version of the epsilon-delta language. I gave up on an idea similar to yours (it is a natural reaction to BC students and you are by no means the first who came up with this) not as much because I had problems with some particular issue, but because I couldn't put the whole thing together. – fedja Jan 26 '11 at 05:13
  • @fedja: About IVT: a very similar result holds for a Lipschitz (or any given modulus of continuity) function. It says that by bisection you can always get an approximate solution $a$ to $f(x)=0$, not in the sense that it is close to the exact solution, but in the sense that $f(a)$ is as small as you want. The approximate Rolle also holds in a similar sense, that you can find $a$ such that $f'(a)$ is as small as you want. It takes care of both of your objections. You can see a nice critique of IVT in the 2007 book "Real Analysis: a Constructive Approach" by Mark Bridger (pages 67 and 163). – Misha Jan 26 '11 at 05:50
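The approximate IVT described in the comment above can be sketched in a few lines of Python. This is only an illustration, not code from the thread: the name `approx_root`, the sample function $x^3+x-1$ on $[0,1]$, and the tolerance are assumptions chosen for the example. The point is that the stopping rule is a test on $|f(a)|$, and a Lipschitz constant gives an explicit bound on the number of halvings.

```python
def approx_root(f, lo, hi, tol):
    """Bisect [lo, hi] until some point a satisfies |f(a)| <= tol.

    Assumes tol > 0 and that f(lo), f(hi) do not have the same strict sign.
    If |f(x) - f(y)| <= L * |x - y| on [lo, hi], then the k-th midpoint m_k
    satisfies |f(m_k)| <= L * (hi - lo) / 2**k, so the loop stops after at
    most about log2(L * (hi - lo) / tol) steps.
    """
    f_lo, f_hi = f(lo), f(hi)
    if abs(f_lo) <= tol:
        return lo
    if abs(f_hi) <= tol:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while True:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if abs(f_mid) <= tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid               # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid  # sign change is in the right half


# Hypothetical example: f is Lipschitz on [0, 1] with constant 4.
f = lambda x: x**3 + x - 1
a = approx_root(f, 0.0, 1.0, 1e-6)
```

Note that the function returns a point where $f$ is small; it makes no claim about the distance from `a` to an exact root.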
  • @fedja: Locally uniform differentiability with a given modulus of continuity makes it perfectly clear what local approximate linearity means. It is the basic estimate from our definition of the derivative! Where there is (uniform) epsilon-delta, there is a (uniform) modulus of continuity, differentiability, etc. Of course, tracking the moduli may be cumbersome, but at the beginning they work well. Hermann Karcher of Bonn University taught an intro analysis like this, but he jumped from Lipschitz to general uniform continuity. Ask Dick Palais. I am in Cambridge, MA if u r @BU we can meet. – Misha Jan 26 '11 at 06:11
  • 9
    "As small as you want" is exactly the expression you tried to eliminate. Once you start saying "there exists $a$ such that $f'(a)$ is as small as you want", I see no difference with "for every $\varepsilon>0$..." and you defeat the whole purpose of the alternative exposition. Moreover, in place of one standard "for every ..., there exists...", you introduce several nonstandard ones. The main aim of the whole exercise is to kill this construction, not to multiply it. – fedja Jan 26 '11 at 06:21
  • @fedja: There is a nice way to treat continuity by (Susan Bassein, An Infinite Series Approach to Calculus) defining first that an increasing function is continuous if it doesn't skip any values (i.e. you can draw its graph in 1 stroke). Then you can use it as a modulus of continuity to define continuity in general. Of course you want your moduli of continuity to be nice (say, subadditive) for the theorems to be pretty. But in practice (especially in introductory calculus) Holder estimates take care of almost anything. – Misha Jan 26 '11 at 06:24
  • Tracking arbitrary moduli in compositions is a nightmare, besides, as I said, you cannot even define the abstract modulus of continuity, so we've got to stick to Lipschitz, which doesn't allow even to take square roots of non-negative functions. You can, probably, figure out how to teach each particular thing (except the inverse function) this way, but I'm very far from being convinced that it'll be consistent as a whole and will not be a dead end if someone decides to extend his knowledge of analysis. It still seems cheaper to me to spend 2 lectures on the standard definition of the limit. – fedja Jan 26 '11 at 06:33
  • Well, you are saying essentially that we can not deal with any approximations without dealing with limits, and I am not buying it. Moreover, jumping into limits, or paying a lip service to them by using the notation before students learn anything of substance, is much worse than introducing them when they become necessary and after the simpler instances of approximation are well digested. – Misha Jan 26 '11 at 06:33
  • You don't have to define them before you use the concrete examples, such as Lipschitz and Holder. Subadditivity of the moduli of continuity eliminates the nightmares, I have worked it out. – Misha Jan 26 '11 at 06:37
  • ----defining first that an increasing function is continuous if it doesn't skip any values---- OK, I'll sort of buy that, but by now it is a full-fledged analysis course, just turned inside out: what is usually taught as various consequences of continuity becomes a sophisticated definition of it. And how exactly do you propose to check that $x^{2/3}$ doesn't skip values? – fedja Jan 26 '11 at 06:38
  • 2
    ---You don't have to define them before you use the concrete examples, such as Lipschitz and Holder. --- Hah? It is not about "before" or "after". It is about whether you can "ever" do it consistently. I'm totally fine with doing a few exercises with Lipschitz functions first (sum, product, composition, etc.), then asking what exactly was important about that $|x-y|$ that popped up everywhere, but at this stage epsilon-delta come into play and I redo all we did with the standard definition. This trick does work and I do not argue about that. The question is how to avoid the second part. – fedja Jan 26 '11 at 06:46
  • 2
    About 90% of the students will never use limits as such, and they will never understand them. If you teach them at the beginning. If you teach them after you work with, say Lipschitz estimates, the students will have a much better chance. Don't forget about convergence of the series etc. These are the limits you can not avoid. I am just against teaching pointwise derivatives as limits and doing generalities before examples. Compare the explanation of $f$ increasing if $f'>0$ in the standard and simplified approach and see the difference! – Misha Jan 26 '11 at 06:52
  • You don't need the second part to work with piecewise analytic functions. If you are still in the Boston area we can meet. – Misha Jan 26 '11 at 07:05
  • You can solve $x^{2/3}=y$ explicitly: $x=y^{3/2}$. Most students will have no problem with it, especially after they are done with Lipschitz-based theory. – Misha Jan 26 '11 at 07:11
  • ----If you teach them after you work with, say Lipschitz estimates, the students will have a much better chance.---- Sure. Have I ever claimed the opposite? I only claimed that one needs to do it, not that it should be done during the first five minutes of the first lecture. Two-three Lipschitz theorems that do work well can easily be done first, as I said, but I prefer to pass from that to standard continuity. By the way, once you have continuity, you can say that $f'(a)=L$ if the function defined as the difference ratio for $x\ne a$ and $L$ for $x=a$ is continuous at $a$, if you want. – fedja Jan 26 '11 at 07:11
  • ----Compare the explanation of $f$ increasing if $f'>0$ in the standard and simplified approach and see the difference!---- OK, let's do it. Standard: $f(y)-f(x)=f'(z)(y-x)$ (MVT) and the product of two positive numbers is positive. What's "simplified"? ----If you are still in the Boston area we can meet.---- Sure, the only little problem is that I am about 900 miles away from Boston. – fedja Jan 26 '11 at 07:17
  • Look, most students want to learn how to use calculus in solving problems in the fields of their interest, they care little about fine points of mathematical ideology. After they get some experience on the elementary level and develop some intuition and skills, they can refine their ideology when needed. All the problems of substance together with the solutions stay the same, it's the ideology that is rearranged a little. Uniform estimates correspond to our intuition and work much better than pointwise derivatives and pointwise continuity by limits. It kills the subject to start with them. – Misha Jan 26 '11 at 07:24
  • But the existence of such $z$ in MVT is very mathematically subtle, although intuitively obvious. In the Lipschitz theory everything is explicit, you get naturally from polynomials to it, it is still earnest mathematics, not hand-waving. – Misha Jan 26 '11 at 07:31
  • 9
    @Misha: If you interpret "What do you mean when you write X?" as picking on you, we cannot have a conversation. It's also distressing that you don't find the fact that a professor of mathematics finds your writings on calculus hard to follow as being anything other than criticism. As I said, I came to this answer with some interest in your point of view. After receiving your reaction, this is no longer the case. – Pete L. Clark Jan 26 '11 at 08:02
  • 1
    Sorry, you don't come across as a person who wants to understand the ideas and is willing/able to fill in the gaps from hints and context. You come across as a person who picks on formal minutia. – Misha Jan 26 '11 at 09:08
  • 10
    @Misha: please don't say that I don't want to understand the ideas. I have said multiple times that I do want to do so, so suggesting otherwise is essentially accusing me of intellectual dishonesty and/or bad faith. That's pretty insulting given the demonstrable amount of time I have put into asking and answering questions on this website. If you want to claim I am not competent to follow your arguments: that's fine; go ahead. Perhaps you could clarify what your audience is, though, if you do not intend your calculus writings to be easily understandable by a PhD mathematician. – Pete L. Clark Jan 26 '11 at 11:34
  • ----It kills the subject to start with them.---- Not the "subject" but the "weak students". The subject is still in fairly good health :). Now, how about $\sqrt[3]{x+x^5-1}$? We'll need to use intervals not coming close to the (unknown!) zero to have uniform estimates. Suppose I just want to explain that it is differentiable wherever it is not $0$. That is not a fine point but a very basic claim. How does it reconcile with your definitions? I cannot say that it is "uniformly differentiable" there (it is false) and I cannot explicitly tell the admissible domains of uniformity either. – fedja Jan 26 '11 at 12:03
  • 5
    Also, Misha, you don't understand Pete at all. When teaching, we have to take care not of the "big picture" (I have a pretty clear one of the entire undergraduate analysis sequence) but of "pesky details". The standard approach is tuned up in this respect. The one you propose is not (or, at least, not yet). Pete just wanted to see if some points that are clearly out of tune can be tuned without putting the already tuned ones out of tune. I want to see pretty much the same. – fedja Jan 26 '11 at 12:10
  • 1
    @fedja: ----It kills the subject to start with them.---- Not the "subject" but the "weak students". 1)Don't you want the weaker students to understand it too, maybe not all the technicalities, but the essence of it, so they could use it in their further studies/research? Then why push the inessential technicalities first? 2)----The subject is still in fairly good health :) ----Not really, those weak students are smarter than you think, and many strong students get turned off and decide not to pursue mathematics. A prof. from Kyoto complained that smart kids don't go into math any more – Misha Jan 26 '11 at 18:07
  • @fedja: Now, how about $1/(x+x^5-1)$? Stay away from singularities if you want to stay uniform, your epsilons and deltas also deteriorate there. One-size-fits-all definitions are a mathematical fallacy, they may be good for grandiose mathematical theorizing, but when you get to anything practical or more specialized, they don't fit that well. – Misha Jan 26 '11 at 18:17
  • 2
    The epsilons and deltas are allowed "to deteriorate" (i.e., to be not uniform) by the nature of the double quantifier definition and that is the whole point of having them the way they are. Anything practical, you say? Let's discuss it. Give me an example of a practical problem for which the sort of real analysis training you proposed would be more beneficial than the traditional one. Or just any example of a practical problem you dealt with.

    Strong students do turn off for many reasons but epsilon-delta definitions are certainly not one of them. They can digest them.

    – fedja Jan 27 '11 at 02:15
  • 1
    Weak students? Yes, but the task is to bring them up to the level of modern science, not to reduce the modern science to their level. "Grandiose mathematical theorizing" is there for a reason and the reason is that, for all we know, the nature loves complex structures more than simple ones. But let's go back to your approach and restrict ourselves to the basic real analysis. You want to explain to your students that some functions are nice and some aren't. So, what base class of nice functions will you start with? The classics is "continuous at a given point". – fedja Jan 27 '11 at 02:32
  • 2
    I found this post very stimulating, and voted it up. – Emerton Jan 27 '11 at 03:27
  • @fedja:---The epsilons and deltas are allowed "to deteriorate"---And you have to take it into account when you calculate with such functions. The classical theory sweeps it under the rug. ---Give me an example---Different moduli of continuity prepare strong students to come to grips with variety of functional spaces that are useful in PDEs, for example. It shows them that definitions are not carved in stone but should be flexible, depending on the problem. Differentiation as factoring draws parallels with algebraic geometry and functional analysis.---Practical---Stock market fluctuations – Misha Jan 27 '11 at 04:24
  • @fedja:---real analysis training you proposed---I am more concerned about calculus on an elementary level.---Weak students?...the task is to bring them up to the level of modern science, not to reduce the modern science to their level---You call pointwise notions "modern science?" To me they look like a dubious myth from 19th century. Maybe we can use the ideas of modern mathematics to make calculus simpler? That's what I have been trying to do.---So, what base class of nice functions will you start with---Polynomials, Lipschitz, analytic, Holder, (locally) uniformly continuous. – Misha Jan 27 '11 at 04:58
  • @fedja: Here are some references for untraditional treatments of calculus and elementary analysis: 1) Real Analysis: a Constructive Approach by Mark Bridger, 2007 2) Ch. 1 & 2 of Analysis by Its History by Hairer and Wanner, 1996. 3) Lecture notes by Hermann Karcher (in German with an English summary, hopefully will be translated into English soon) at http://www.math.unibonn.de/people/karcher/MatheI_WS/ShellSkript.pdf – Misha Jan 27 '11 at 05:12
  • OK, we'll discuss the philosophy later (don't worry, I'll not try to escape from it; I just do not want to create a thread with two things in parallel: nobody will be able to read it). ---So, what base class of nice functions will you start with---Polynomials, Lipschitz, analytic, Holder, (locally) uniformly continuous.--- Fine. The next thing is to formally define your class so that the students can recognize the membership. The classics is epsilon-delta. What definition(s) will you use? (you are always welcome to reduce your list any time you want but keep it non-empty). – fedja Jan 27 '11 at 11:53
  • @fedja yesterday---By the way, once you have continuity, you can say that $f'(a)=L$ if the function defined as the difference ratio for $x \ne a$ and $L$ for $x=a$ is continuous at $a$, if you want.---Yes, absolutely, and even more, $\lim_{x \rightarrow a} f(x)=L$ if the function defined as $f(x)$ for $x \ne a$ and $L$ for $x=a$ is continuous at $a$. Continuity first, like E. Cech did it in the 30s (according to Jerry Uhl)! – Misha Jan 27 '11 at 13:43
  • So, fedja, where do you want to take our discussion, if you want to continue? – Misha Jan 27 '11 at 16:18
  • @fedja:----The classics is "continuous at a given point".---You got it! Now, for polynomials, $x-a$ divides $f(x)-f(a)$, and the ratio, evaluated at $x=a$, gives you $f'(a)$. If the same is true in the class of functions continuous at $a$, you call $f$ differentiable at $a$. Differentiation is just factoring! Mark Bridger in his 2007 book defines $f$ as uniformly differentiable if this factoring holds in the class of functions uniformly continuous in both $x$ and $a$. We can play the same game with Lipschitz, Holder, or any other class of functions admitting some given modulus of continuity. – Misha Jan 27 '11 at 16:20
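For polynomials, the "differentiation is just factoring" idea in the comment above can be carried out mechanically with synthetic (Horner) division. The sketch below is an illustration only; the function names and the coefficient convention (highest degree first) are assumptions, not anything taken from the papers mentioned in the thread. It divides $p(x)$ by $x-a$, so that $p(x)-p(a)=(x-a)\,q(x)$, and then evaluates $q$ at $a$.

```python
def divide_by_linear(coeffs, a):
    """Divide p(x) by (x - a) using synthetic division.

    coeffs lists the coefficients of p, highest degree first.
    Returns (quotient coefficients, remainder); the remainder equals p(a),
    so p(x) - p(a) = (x - a) * q(x).
    """
    partial = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c
        partial.append(carry)
    remainder = partial.pop()  # last Horner value is p(a)
    return partial, remainder


def evaluate(coeffs, x):
    """Evaluate a polynomial (highest degree first) by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def derivative_by_factoring(coeffs, a):
    """Return q(a), where p(x) - p(a) = (x - a) * q(x)."""
    quotient, _ = divide_by_linear(coeffs, a)
    return evaluate(quotient, a)


# p(x) = 3x^2 at a = 2: the quotient is q(x) = 3x + 6.
slope = derivative_by_factoring([3, 0, 0], 2)
```

No limit is taken anywhere: for $p(x)=3x^2$ the quotient is $3x+3a$, and substituting $x=a$ gives $6a$.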
  • @fedja:---What definition(s) will you use?---Polynomials are just that, and they are Lipschitz, $|f(x)-f(a)|\le L|x-a|$, uniformly in $x$ and $a$. For any other modulus of continuity $m$ we take $|f(x)-f(a)|\le Lm(|x-a|)$. If we take the union of all these classes, we get the class of all the uniformly continuous functions. The fact that pointwise continuous functions on compact sets are uniformly continuous is a matter of compactness. I anticipate that you will say that the constant in the inequality will deteriorate near singularities, but we have already discussed that, right? – Misha Jan 27 '11 at 16:38
  • Do you think anybody besides you and me is reading this? If not, it's sort of a waste, do you want to e-mail me, and then we can decide where to continue? – Misha Jan 27 '11 at 16:51
  • Wait a bit... So, you give up on analytic and locally uniformly continuous. You are left with Polynomials, uniformly Lipschitz on the domain and uniformly continuous with modulus omega. Let's say that on this first run you just say that we can replace $|x-y|$ with some other similarly behaving $\omega(x-y)$ to be specified later. You see, I'm just trying to put one lecture together and see if your method can survive it. The next thing is arithmetic. The classical base survives all 4 arithmetic operations with natural prohibition to divide by 0. What will you say about division? – fedja Jan 27 '11 at 19:10
  • 8
    As to your second question, I believe that yes, this is being read, and, moreover, you, probably, have about as much attention now from wide mathematical audience to your teaching ideas as you are ever going to get. You certainly turned some people away (Pete, say) by the moment and what happens next depends on what you say, but I find public discussions more useful than private e-mails (full public record of speeches and independent arbitrage are big pluses. Besides, you can really win some people you don't know to your side). Of course, if you do not care, we can stop here. – fedja Jan 27 '11 at 19:21
  • @fedja:---you give up on analytic and locally uniformly continuous.--- No, converging power series are locally Lipschitz on their convergence intervals. "Locally" means that "on any closed subinterval," (or, more generally, on any compact subset). This can be done by algebra of power series and checking the convergence. Since I have explained what "locally" means, I don't give up on locally uniformly continuous either (well, "uniformly" is a bit redundant here, right?).---about division?--- $f/g$ is locally nice on the set where $g \ne 0$. "Locally" allows for deterioration of the constants. – Misha Jan 27 '11 at 21:55
  • fedja:---You certainly turned some people away (Pete, say)--- Oh, well, I probably understood him a bit better than he cared for... I actually e-mailed him my apologies with some references and the files he could not access. He hasn't replied yet, some people are touchy.... – Misha Jan 27 '11 at 22:05
  • Emerton wrote me in an e-mail: In my own education, I know that notions like Lipschitz and other concepts that focus on uniformity and rates of convergence came late, and I didn't appreciate them for a long time; on the other hand, in applications it seems that having an awareness of the role of uniform bounds, or quantitative estimates in general, is often very important, and what I like about your approach is that it brings these concepts out right away. – Misha Jan 27 '11 at 23:33
  • 8
    «I probably understood him a bit better than he cared for...» Well, that's surely going to attract good will! – Mariano Suárez-Álvarez Jan 27 '11 at 23:57
  • You seem to misunderstand me. At this moment, the students know neither power series, nor compact sets, nor convergence. All they know is what you have told in your introduction. The classical way is to tell the definition of continuity at a point (one basic notion) and then to demonstrate that it survives arithmetic and composition (so "normal" operations on nice functions result in nice functions). So far the classical lecture is smooth and consistent. You started with several classes out of which only 2 can be defined right away. Now you face division and composition. How to present those? – fedja Jan 28 '11 at 06:03
  • 3
    @Mariano Yeah, if I could only figure out what this particular phrase was supposed to mean... Nevermind, I'm not "touchy" :)

    @Misha It is all not about whether your suggestion has good sides or not. Nobody objects that something in this spirit would be nice (like it would be nice to float in the air over an emerald forest at dawn, supported above the ground just by your desires and not by some clumsy, noisy, and stinky machinery). The issue at stake is the technical possibility. The "classical" solution is to learn to live with and to operate the machinery and to enjoy the flight then.

    – fedja Jan 28 '11 at 06:25
  • Composition: Lip(Lip)=Lip, Hol(Lip) and Lip(Hol) is Hol with the same power, Hol(Hol) is Hol with a different power, UC(UC)=UC with a different modulus of continuity. Near singularities, your constants deteriorate. Division: constants deteriorate near zeroes of the divider, away from them you are fine. It is the same way with classical theory if you keep track of your epsilons and deltas. I (and Karcher) suggest doing general continuity later in the course, when Lip and maybe Hol (and maybe power series) are well understood. On compactness: students know about closed intervals, it's enough. – Misha Jan 28 '11 at 06:56
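The composition rules listed in the comment above come from chaining two estimates. As a sketch, assume (with symbols introduced here only for illustration) that $|f(u)-f(v)|\le A|u-v|^{\alpha}$ and $|g(x)-g(y)|\le B|x-y|^{\beta}$ with $0<\alpha,\beta\le 1$, uniformly on the relevant intervals. Then

$$|f(g(x))-f(g(y))| \;\le\; A\,|g(x)-g(y)|^{\alpha} \;\le\; A\,\bigl(B\,|x-y|^{\beta}\bigr)^{\alpha} \;=\; A\,B^{\alpha}\,|x-y|^{\alpha\beta}.$$

With $\alpha=\beta=1$ this is Lip(Lip)=Lip with constant $AB$; if exactly one exponent equals 1, the composition is Holder with the other exponent; if both are below 1, the exponent drops to $\alpha\beta$.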
  • Again, I am mostly concerned about introductory calculus for people who do not necessarily plan to become mathematicians. Polynomials and Lipschitz give you a good start, Holder helps if you insist on going over algebraic singularities, more refined generalities can be treated later in the course (when the students get more experience) or relegated to more advanced courses. Read Karcher's English summary. I am also concerned about high school students who are not ready for the abstractions, but would be fine with a more elementary, but still mathematical treatment. – Misha Jan 28 '11 at 07:14
  • 1
    All these small details you are concerned with can be ironed out. It can be done by the students in the problem sets that would be rather elementary, still would teach them how mathematics is done. The point of teaching mathematics is to give the students an opportunity to do it, not to give them all the details. If you like the ideas, you can make them work. I am a little annoyed that you would not reveal your identity. – Misha Jan 28 '11 at 07:31
  • 5
    @Misha Have you actually tried teaching this way? Do you really think that intro calc students who have "no desires to become mathematicians" will absorb any of this? I ask because I think it often happens that students completely ignore the theoretical underpinnings of the course and simply memorize formulas. If your students are doing this, I see no reason to present it your way over the standard way. What would be exciting is if you find that your approach somehow forces a greater percentage of the class to actually think about the mathematics. – Steven Gubkin Jan 28 '11 at 15:30
  • 1
    @Steven Unfortunately not at the college level. I have done some volunteer teaching of interested high school students at MIT, they seemed to be rather pleased with my approach to the subject. Trying to push them into figuring things out for themselves didn't work that well, though, although a few liked it. Mark Bridger and Hermann Karcher (see the main body of my answer) taught it to engineering students and reported good results. I know a tutor that used my approach to explain calculus to a business student with no brain for math, and he got A, so some evidence is there. – Misha Jan 28 '11 at 16:01
  • @Steven I gave a 2 hour lecture to high school students in China in 2009, pretty much following the talk at http://mathfoolery.com/talk-2004.pdf and they said it was too easy. It is true that many students just memorize the formulas. These students will have a better chance if you start by demonstrating these formulas for polynomials and then saying that the rules work in general, using them for problems of interest and then explaining some Lipschitz stuff, using polynomials as a motivation. Doesn't it look more reasonable than hitting them on the head with limits? – Misha Jan 28 '11 at 16:16
  • 6
    I just don't think that limits are that much of a barrier to most students. I personally think the greatest barrier is a system which encourages memorization over thought. No matter what content you teach (as long as what you say is true) getting students to think is the most essential thing. So I think the presentation of material is really kind of irrelevant - especially if students are not going on in mathematics. The goal should just be to get students to think mathematically. If you can do that with your approach - more power to you. But I don't think that everyone would be able to – Steven Gubkin Jan 28 '11 at 17:08
  • 6
    teach the subject your way, or that it would help for people to try. The goal of a lecturer is to guide students down a path to understanding, and as a lecturer you cannot use any understanding other than your own. So I think the most important thing is to let lecturers present the material in their own way, so that students see natural thought. – Steven Gubkin Jan 28 '11 at 17:10
  • @Steven Gubkin.---So I think the presentation of material is really kind of irrelevant - especially if students are not going on in mathematics.--- You are absolutely wrong, the simpler the material -- the more likely the students are to follow. Factoring $x-a$ out of $p(x)-p(a)$ is much easier to understand than limits. You are just trying to justify your inaction. – Misha Jan 28 '11 at 19:22
  • 4
    I am not trying to justify my inaction. Quite the contrary. In my own teaching, I am very careful to present things the way I actually understand them. For example, I am currently teaching a multivariable calculus class. The textbook (Stewart) does not present any information on linear transformations, but I believe very strongly that you cannot understand the derivative without first understanding linear maps. So I have written some handouts presenting this point of view to my students. In fact, I was trying to say that I admire your bravery for stepping outside the norm to teach in a way – Steven Gubkin Jan 28 '11 at 19:48
  • 2
    that makes sense to you. Clearly Stewart did not think that having linear transformations around was necessary, and I do not doubt that he does a very good job teaching the material out of his own book. But I cannot teach the material well the way that he presents it - I have to present it from my own perspective. I am trying to suggest that I would not be able to teach differentiation the way you do, because it does not speak to me or inspire me. But, as it clearly does inspire you, there is a good chance that you could teach students a lot this way. – Steven Gubkin Jan 28 '11 at 19:51
  • 3
    I think that a lot of instructors lack the bravery to step outside of the norm and get creative in their presentation of concepts. Some departments actively discourage this. I am somewhat saddened that you interpreted my praise as a "justification of my own inaction". Best of luck to you. – Steven Gubkin Jan 28 '11 at 19:54
  • 1
    btw - how do you talk about the total derivative of a multivariable map from your perspective? "Locally linear approximation" is really the only way I have ever thought about it. – Steven Gubkin Jan 28 '11 at 20:00
  • 1
    Might I also say that I think my approach is more geometric and your approach more algebraic. So I think the student's learning style is going to have a big impact on which one is "much easier to understand". – Steven Gubkin Jan 28 '11 at 20:05
  • 1
    @fedja:---It is all not about whether your suggestion has good sides or not...----Sorry, I missed that. I don't suggest to totally expunge limits and continuity from all the analysis. I suggest to start with lighter and more elementary tools. Taking your analogy, you can use a biplane or a hang glider when it's possible, you also don't have to drive a Mack truck to a grocery around the corner, you can walk or ride a bike. Also limits are the tools of mathematical theorizing, and uniform estimates are much more useful in practical calculations. Drop this one-size-fits-all ideology and enjoy. – Misha Jan 29 '11 at 07:26
  • 1
    @Steven: I am glad you are doing the right thing explaining some linear algebra. As for Stewart, it has a lot of nice problems in it, but the theory is awful. What's more, it is mostly irrelevant, there is a bad disconnect there. I'd suggest exposing the ideas the way you like and your students can understand, and then using Stewart as a problem book. That's what mostly happens in practice. It's like that exercise in the first edition of Lang's Algebra: "Take any book on homological algebra and prove all the theorems without looking at the proofs in the book." – Misha Jan 29 '11 at 07:52
  • @Steven:---how do you talk about the total derivative of a multivariable map--- The basic estimate works fine, you only have to replace the absolute value with some norm. That's what you could call "locally uniform linearity." You can also look at it as factoring, see the section on many variables from my paper referred to in my answer, it works out really nicely. A big part of calculus is the interplay between algebra and geometry. You often can use one to understand the other, but you need both of them. Don't take this learning style doctrine too rigidly. Good luck to you too. – Misha Jan 29 '11 at 08:33
  • Misha. If you claim that all things can be ironed out, iron out division and composition simultaneously so that everybody can see a clear presentation, not just patches that are more or less obvious. Then I'll move to the next concern. I believe that you'll have to introduce some version of epsilon-delta at this level already and this is the very beginning. I tried to make a consistent presentation in a similar spirit but failed, so I'm not buying your "details can be made fine" statement. Sorry for my disappearance for the weekend :). – fedja Feb 01 '11 at 16:33
  • @fedja: I thought I had explained already how to proceed in my comment on Jan 28 at 6:56. I don't intend to write a detailed treatise in comments. As you said on Jan 27 at 2:15, "epsilons and deltas are allowed "to deteriorate" (i.e., to be not uniform)." I agree, and you have to take this deterioration into account when you do numerical calculations. What's the big difference? Sometimes we have to work with specific moduli of continuity, and sometimes less specific classical notions are fine and even preferable. We can do plenty with Lipschitz, Hölder etc. What are you still arguing about? – Misha Feb 03 '11 at 17:21
  • I have been away for a few days because I was notified that my account had been suspended till 3/2/2011. Looks like it's back on now, at least I can add comments, although the "edit" option of my answer is not there. Meanwhile, Ursula has translated 3 (out of 10) sections of Karcher's lecture notes, I have edited them to be more readable. I'm still having trouble with pictures, as soon as I get them right, I will start putting them up somewhere. – Misha Feb 03 '11 at 17:48
  • @fedja: Please move on to your next concern, I think we're pretty much done with composition and division, the estimates just deteriorate when you get close to the singularities. If you disagree, what would you want me to clarify? You said you once tried to develop calculus/analysis in a similar way, but gave up. What were your difficulties? Maybe I can help you to deal with them. – Misha Feb 03 '11 at 20:23
  • 2
    ---We can do plenty with Lipschitz, Hölder etc. What are you still arguing about?--- I claim that we cannot consistently do division with the uniform Lipschitz/Hölder and I'll stick to it unless you show how to do it. Your own notes avoid the discussion of division altogether.

    ---What were your difficulties? Maybe I can help you to deal with them--- Isn't this funny? I'm telling you exactly what they were one by one and you just say "it can be done in general" and show only the parts that are obvious.

    – fedja Feb 04 '11 at 13:30
  • Fedja, the key word in your statement is "consistently." To keep your estimates for $1/g$ uniform you need $g$ to be bounded away from zero, and it is the same way in classical theory. Yes, $1/g$ is continuous at every point where $g$ is continuous and not zero, but this generality is illusory. When you pick your epsilon, your delta will go to zero near zeroes of $g$. We have already agreed on all that. You say it is all obvious. Good, let's move on to something not obvious. – Misha Feb 04 '11 at 18:10
  • @fedja, on Jan 26 at 4:41 you said: "The existence of extrema theorem is the basis for Rolle, Rolle is the basis for mean value and second order Taylor and those are bread and butter for all estimates from determining concavity to the Newton method." In my response on Jan 26 at 5:50 I presented a weaker form of Rolle's theorem, but there is a better answer. We don't need Rolle because we can derive the estimates based on it by using the monotonicity principle that says that a function with non-negative derivative is non-decreasing and that is proven directly in our (locally) uniform approach. – Misha Feb 05 '11 at 14:45
  • fedja, on Jan 26 at 4:41 you also said: "As to the IVT, I use bisection far more often than Newton when I need real roots of something." ---The bisection method works fine in the uniformly continuous theory, and the argument justifying it does not need IVT. The critique in Bridger's book says that you may have more and more trouble deciding the sign of $f(x)$ when $f(x)$ gets small, and end up never getting your answer to any good accuracy. In fact, if you disregard Bridger's objections (which is the classical way), IVT for continuous functions stays valid (with its proof). – Misha Feb 05 '11 at 15:11
  • ---When you pick your epsilon, your delta will go to zero near zeroes of $g$--- It seems like there is some impenetrable wall between us here. My division theorem is "if $g$ is continuous at each point, $1/g$ is continuous at each point where $g$ is not $0$" and that is OK because the definitions fit together. What is your statement here? If you resort to "$1/g$ is uniformly Lipschitz on every set where $g$ is uniformly bounded away from $0$", you'll have to introduce exhaustions, etc. when trying to say that $1/x$ is nice on $(0,1)$ and the whole thing becomes a mess. – fedja Feb 06 '11 at 01:00
  • In general, I want the following. There is a class of nice functions that is closed under standard arithmetic operations (+,-,*,/) and composition with the usual agreement about the domain of the result of each operation. "Continuous at each point of the domain" is such a class. "Uniformly Lipschitz/Holder" on the domain is not. – fedja Feb 06 '11 at 01:07
  • 2
    The trade here is between learning a harder definition once and having nothing to worry afterwards and learning an easy definition and getting many caveats later (like $x^2$ is defined on the entire line but not "good" there (but $x$ is and you cannot even multiply without worry, as it turns out), the division does not result in a "good" function on the natural domain, roots cannot be taken in Lipschitz category and the inverse function is out of control no matter what you do). IMHO, one hard definition is easier to fight through than it is to memorize 10 easy caveats. – fedja Feb 06 '11 at 01:14
  • And, yes, I want a consistent presentation, not just the one that conveys the ideas but fails to put the details straight. You say that my consistency is illusory because it fails to embrace uniformity, but I can also say that your uniformity is illusory because it fails to lead to a consistent set of definitions and statements. – fedja Feb 06 '11 at 01:23
  • At last, I do not see why uniformity is something that is always there in reality. When driving, the vehicle reacts to steering differently when the road is icy but you still can guide it on virtually every road if you choose your speed right depending on the road condition and turn sharpness. Somehow the students get the idea that your delta always exists but depends on both x and epsilon on the road (well, perhaps, some don't, but they get eliminated naturally). Why cannot they learn this in a math class? – fedja Feb 06 '11 at 01:28
  • @fedja: ---IMHO, one hard definition is easier to fight through than it is to memorize 10 easy caveats.--- Memorize? I thought we wanted to cultivate understanding, not memorization and regurgitation of definitions and theorems. Easier to memorize, but harder to comprehend. It looks like you prefer mathematics as some sort of monotheistic religion where everything is neat and clean, but the world (and even mathematics) is messy, and you better get used to it. – Misha Feb 06 '11 at 04:06
  • @fedja:--If you resort to "$1/g$ is uniformly Lipschitz on every set where $g$ is uniformly bounded away from $0$", etc., when trying to say that $1/x$ is nice on $(0,1)$ and the whole thing becomes a mess --- No, I am not saying that $1/x$ is nice on the whole $(0,1)$, it becomes a mess near $0$, I will have to use locally uniform notions here. Things become messy near singularities, and even you will have to resort to some tricks, like improper integrals, to handle it. – Misha Feb 06 '11 at 04:29
  • @fedja:---In general, I want the following. There is a class of nice functions that is closed under standard arithmetic operations (+,-,*,/) and composition with the usual agreement about the domain of the result of each operation.---Sorry, man, division is a messy operation, things get messy near singularities, nothing we can do about it, but I confront it earnestly by introducing the notion of locally nice, and you sweep it under the rug. – Misha Feb 06 '11 at 04:38
  • 1
    @fedja:---it fails to lead to a consistent set of definitions and statements.--- You state it as a proven fact, yet you haven't caught me with any inconsistency yet. I see mathematics as a messy bag of tricks to solve all kinds of messy problems, not as "a set of definitions and statements." When a definition doesn't fit, you adjust it. Trying to fit too much into one definition makes it so broad that it doesn't mean much any more – Misha Feb 06 '11 at 05:18
  • @fedja:--At last, I do not see why uniformity is something that is always there in reality. When driving, the vehicle reacts to steering differently when the road is icy...---Global no, but local yes. You go point by point and then use compactness as a crutch, I go interval by interval, and also sometimes have to use compactness as a crutch. But I don't need compactness right away, and you do, because your definitions of continuity and differentiability are too weak. – Misha Feb 06 '11 at 07:09
  • @fedja:--At last, I do not see why uniformity is something that is always there in reality. When driving, the vehicle reacts to steering differently when the road is icy...---Global no, but local yes. You go point by point and then use compactness as a crutch, I go interval by interval, and also sometimes have to use compactness as a crutch. But I don't need compactness right away, and you do, because your definitions of continuity and differentiability are so weak. Pointwise may be nice mathematical myths, locally uniform is closer to what we do in practice, especially on an elementary level. – Misha Feb 06 '11 at 07:59
  • 1
    ---Sorry, man, division is a messy operation, things get messy near singularities, nothing we can do about it--- OK, you finally admitted this with your approach. With the standard one, division is neat and clean. ---but I confront it earnestly by introducing the notion of locally nice, and you sweep it under the rug--- "Locally nice" is the same as epsilon-delta. And I do not "sweep it under the rug", on the contrary, I emphasize that delta depends on the point as well as on epsilon and make it clear in every proof what exactly determines it so one can easily trace uniformity if needed. – fedja Feb 06 '11 at 16:04
  • 2
    ---I see mathematics as a messy bag of tricks to solve all kinds of messy problems, not as "a set of definitions and statements."--- Here we totally disagree. I see mathematics as a well-organized toolbox to solve all kinds of messy problems. "Definitions and statements" are just labels on various compartments and if you screw up your labeling, you'll have a hard time finding the right tool. – fedja Feb 06 '11 at 16:08
  • ---Pointwise may be nice mathematical myths, locally uniform is closer to what we do in practice, especially on an elementary level--- OK, now it is "locally uniform", not just "uniform". That is a great step towards where we can possibly converge. How exactly do you suggest to define "local uniformity" to the students? – fedja Feb 06 '11 at 16:15
  • ---but the world (and even mathematics) is messy, and you better get used to it--- Yes, the world and even mathematics are messy in places but somehow I belong to the school that prefers to "fight the mess with order", not to "get used to the mess". Probably, I had some childhood problems and my parents and teachers told me a few wrong things that got stuck too well. It is too late to try to change that now :). – fedja Feb 06 '11 at 16:23
  • @fedja:---That is a great step towards where we can possibly converge. How exactly do you suggest to define "local uniformity" to the students?--- Well, the estimates may deteriorate near singularities and you stay away from them or adopt a more permissive modulus of continuity. In some intervals the constants may be better, and you may take advantage of that. It's just common sense. I am not into writing a Bourbaki-style treatise, it's your cup of tea. I go by problems and examples. BTW, why do you want to converge? – Misha Feb 06 '11 at 16:45
  • 1
    Have you read Rokhlin's lecture? I had an interesting e-mail discussion with Oleg Viro who took part in transcribing the tapes. He said, "The problem with calculus is that it occupies a strategic position in teaching of mathematics being quite ugly. It creates negative, foolish image of mathematicians. I do not know how to resolve the situation. A partial step would be turning calculus into a sort of mathematics where everything can be understood by the real students. In particular, eliminate limits that cannot be understood by a student with insufficient training in logic." – Misha Feb 06 '11 at 16:55
  • 1
    Also: "The notion of continuity is quite foreign in calculus. Monotone or piecewise monotone functions are much more at home. For them continuity is local surjectivity." And: "A strategic solution would be eliminating the calculus, but it is dangerous socially: mathematicians all would get unemployed. Should be replaced eventually but very cautiously. Meanwhile it may be made a bit more intelligible. What makes definitions of limits and continuity bad is chains of 3 quantifiers. There are standard ways of fighting with them. Introduce the notion of neighborhood and one quantifier is out." – Misha Feb 06 '11 at 17:08
  • @Fedja:---"Locally nice" is the same as epsilon-delta.--- Not really. You can develop all the uniform theory, and then go interval by interval if you have to. You don't start with "locally," you get to it gradually. By the way, my account is still suspended till 3/2/2011 and I can't edit my answer any more. Four chapters (out of 10) of Karcher's lecture notes are already translated into English, after ch. 5 is done, we'll put it on the web somewhere and you can read them. He's jumping to epsilons and deltas too early to my taste, but you may like it more for that. – Misha Feb 06 '11 at 17:25
  • It looks like you take the ideological part of mathematics way too seriously, and I don't. As Michael Atiyah said, "...the axiomatic era has tended to divide mathematics into special branches, each restricted to developing the consequences of a given set of axioms. Now I am not entirely against the axiomatic approach so long as it is regarded as a convenient temporary device to concentrate the mind, but it should not be given too high a status..." More on https://micromath.wordpress.com where among the 2 top posts one is put there at my suggestion, the other is a link to Rokhlin's lecture. – Misha Feb 06 '11 at 17:44
  • @fedja:---How exactly do you suggest to define "local uniformity" to the students?---I know what you are trying to push me into. You want me to say "local means in some, probably small, neighborhood of every point, and then we can get global estimates for any compact subset by using the finite sub-cover property." And then you would say: "Aha! it's the same as epsilon-delta! Aha! You need compactness too!" But this would be at the point of my exposition when the uniform theory is well developed and understood already. So it's not the same. – Misha Feb 09 '11 at 02:30
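
The division estimate argued over in the comments above can be checked numerically. If $g$ is Lipschitz with constant $L$ and $|g| \ge m > 0$ on a set, then $|1/g(x) - 1/g(y)| = |g(x) - g(y)| / (|g(x)|\,|g(y)|) \le (L/m^2)\,|x - y|$ on that set. Below is a minimal Python sketch, assuming $g(x) = x$ on $[a, 1]$ (so $L = 1$ and $m = a$). The helper names `lipschitz_estimate` and `reciprocal_bound` are made up for illustration and do not come from the thread or from any of the notes cited in it.

```python
def lipschitz_estimate(f, a, b, n=2000):
    """Largest difference quotient of f over adjacent points of a uniform grid on [a, b].

    This is only a sampled lower estimate of the true Lipschitz constant.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return max(
        abs(f(xs[i + 1]) - f(xs[i])) / (xs[i + 1] - xs[i])
        for i in range(n)
    )


def reciprocal(x):
    """The function 1/x, i.e. 1/g with the assumed choice g(x) = x."""
    return 1.0 / x


def reciprocal_bound(m, L=1.0):
    """Bound L / m**2 on the Lipschitz constant of 1/g when |g| >= m and g is L-Lipschitz."""
    return L / m**2


# Shrink the left endpoint toward the singularity at 0 and compare
# the sampled constant with the bound 1/a**2.
for a in (0.5, 0.1, 0.01):
    estimate = lipschitz_estimate(reciprocal, a, 1.0)
    print(f"a = {a}: sampled constant {estimate:.1f}, bound {reciprocal_bound(a):.1f}")
```

On each interval the sampled constant stays below $1/a^2$, and it grows without bound as $a \to 0$. That is the deterioration near the singularity that both sides of the exchange describe; the sketch says nothing about which definition is better to teach.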