24

Over the years, advances in machine learning have allowed us to communicate and interact with computers more and more semantically, using the same natural language we use with each other, e.g. Google, Siri, Watson, etc. On the other hand, proof automation and formalization have gained more and more steam, culminating, for example, in the recent effort by Gonthier et al. to encode the Feit-Thompson theorem, and this trend shows no sign of stopping now that homotopy type theory has become a hot research topic. It then seems only natural to ask what open problems or technical challenges still lie ahead before we arrive at a Google Books style of automatically extracting human proofs written on paper (or just in LaTeX) to formalized and checked proofs in something like Coq?

Ideally, answers regarding both the mathematical side and the more empirical machine learning side (or perhaps other viewpoints I've not considered here) are welcome, but I'm not quite sure whether the latter is on topic on MathOverflow. I should also think we want to avoid philosophical babbling here, e.g. what's the point of mathematics, etc.

  • 1
    I think this is a great topic of discussion, perhaps in a semester-long graduate course, and one that I am eager to hear answers to. I worry that it is too broad for MO, where there's a lot of emphasis placed on questions having a "correct" answer. But I hope to be mistaken. – Theo Johnson-Freyd Jul 06 '15 at 02:28
  • 1
    You say "extracting ... to" instead of "extracting ... from", so that it's not clear to me whether you're more interested in extracting human proofs from Coq code or extracting Coq code verifications from human proofs. I'm guessing the former (?). – Todd Trimble Jul 06 '15 at 03:34
  • 1
    I'm interested in both, but I intended the question to mean the latter. For example, how do we go from a paper book like Atiyah-MacDonald to a Coq file AtiyahMacDonald.v? Perhaps "conversion" is a better word here? "Converting human proofs written on paper (or just in latex) to formalized and checked proofs in something like Coq" – SorcererofDM Jul 06 '15 at 03:56
  • 3
    Since you mention "Google Books style" as a goal (whatever that may mean), the present quality of Google Translate doesn't inspire much confidence for anytime soon. :) – grghxy Jul 06 '15 at 04:56
  • 9
    Well, it's safe to say, the re-capchas are gonna be way more complicated... – Per Alexandersson Jul 06 '15 at 06:48
  • 2
    The problem is really a composition of two problems: (1) How difficult is it to formalize real mathematical proofs for a person who perfectly understands the proof and the proof assistant?, and (2) How difficult is it for a computer to parse a proof written for humans, essentially guessing the missing steps and filling in the gaps left to the reader? Question (1) is about logic and programming paradigms (in particular, about our still imperfect understanding of constructivism and the tacit uses of univalence). Question (2) is an AI question (probably the simplest AI question, not that this ... – darij grinberg Jul 06 '15 at 17:32
  • 1
    ... means much); in its purest form, it assumes that computers already have a good grasp of the forms of logical reasoning that mathematicians use and only have difficulties with the terseness of their writing. Counterintuitively, it appears to me that (2) is simpler than (1), given how good Coq's automation and SSReflect's reflection are at filling in straightforward steps. – darij grinberg Jul 06 '15 at 17:33
  • Thurston devoted a page or so to this question in "On proof and progress in mathematics." (http://arxiv.org/abs/math/9404236) I think what he wrote 20 years ago is still valid. – Alexander Woo Jul 06 '15 at 18:53
  • 4
    The problems in understanding mathematical English, even in formally written papers, are well described in the book The Language of Mathematics, by Mohan Ganesalingam, and many other examples are given on my website at http://www.abstractmath.org/MM/MMLangMath.htm – SixWingedSeraph Jul 07 '15 at 13:22
  • @SixWingedSeraph let me fix that: http://www.abstractmath.org/MM/MMLangMath.htm – David Roberts Jul 08 '15 at 05:08

3 Answers

23

I am going to answer the question as if you had asked about massive formalization of proofs, not about automatic extraction of formal proofs from existing informal proofs written in books, because the latter is a fairly hopeless task. One of the major lessons learned by Gonthier et al. was that published proofs rely on large amounts of implicit knowledge that is insufficiently recorded by current ways of writing mathematical texts.


[Added 2015-07-07] To address the comments, let me explain what the "missing knowledge" is. It is not about missing parts of a proof or missing details; we know how to deal with a lot of those through automation. Rather, the missing knowledge is more like what Andy Putman's comments suggest. For instance, the text might not explain the overall structure of the argument. Or, without saying so, it treats only a special case and never explains why that is sufficient. Or, it forgets to mention that all vector spaces appearing in a given construction are actually subspaces of an ambient space, and that it therefore makes sense to speak of their intersections and of inclusions of one in another. These things are typically obvious to an experienced mathematician, but they are not written anywhere in the text. And so they are hard for computers and graduate students to understand.
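
To illustrate the last point, here is a deliberately minimal Coq sketch (not taken from any real library, and ignoring the linear structure entirely) of how the ambient space has to be made an explicit parameter before intersections and inclusions of "subspaces" even typecheck:

    Section Ambient.
      Variable E : Type.                 (* the ambient space, made explicit *)

      Definition subspace := E -> Prop.  (* "subspaces" as predicates on E   *)

      Definition inter (U V : subspace) : subspace := fun x => U x /\ V x.
      Definition incl  (U V : subspace) : Prop     := forall x, U x -> V x.
    End Ambient.

On paper the parameter E is usually left implicit; in a formalization someone has to decide where it lives and thread it through every definition.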

One of the major points of the Gonthier et al. effort was to make sure that the formalization stayed very close to the actual text. The formalized proofs follow the reasoning in the text very closely, with as few extra details added as possible. It was difficult to do this, and it led to several new formalization techniques. Essentially, the proof assistants were adapted so that they could do the same sort of guesswork that a mathematician does when presented with insufficient information. Proof assistants have gotten much better at understanding what the user wants without having all the details spelled out for them.
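
One of the simplest mechanisms behind this kind of guesswork is ordinary type inference with implicit arguments (the real developments go much further, with canonical structures and reflection, but the flavour is the same). A generic Coq illustration, not code from the Feit-Thompson development:

    (* The type arguments A, B, C are never written by the user; Coq infers
       them, much as a reader infers information omitted from a paper. *)
    Definition compose {A B C : Type} (g : B -> C) (f : A -> B) : A -> C :=
      fun x => g (f x).

    Check compose S Nat.double.  (* all three type arguments inferred as nat *)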

People who promote formalization do not expect anyone to actually waste their time on boring details. Of course we want our proof assistants to be able to follow high-level arguments and fill in the details automatically, and so the community has worked hard to develop tools and techniques for doing this sort of thing, but much remains to be done. Dismissing proof assistants as if they were by necessity clumsy tools that hinder mathematical ideas is like dismissing lasers because they require a roomful of equipment to function. In those areas where proof assistants actually work well, their users will tell you that the first thing they do when they want to try a new idea is to type it into a proof assistant. This, for example, has been my experience with the Twelf system, which is designed for the purpose of formalizing formal systems (so it is a special-purpose tool).


Our experience with formalization of mathematics shows that formalization is possible, but that with the available technology it takes a lot of extra effort. It requires rethinking how to organize mathematics so that it is amenable to formalization. For a moderately large formalization effort the issues that arise are not mathematical but software-engineering ones: how to organize libraries so that they can be reused, what naming conventions to use, how to search existing libraries of theorems, etc.
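
To give one concrete example of the last point: in recent versions of Coq, searching a library for an existing lemma is done with the Search command, by pattern. A minimal illustration (output indicative only):

    Require Import Arith.

    Search (_ + 0 = _).
    (* among the results:  Nat.add_0_r : forall n : nat, n + 0 = n *)

Organizing and naming lemmas so that such searches actually find them is exactly the kind of software-engineering discipline a large library needs.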

Adoption of formalized mathematics is relatively slow because mathematicians have little reason to formalize their proofs. They can easily publish them unformalized, they feel that formalization just obscures matters (when in fact it forces one to really clean up after oneself), and they consider it generally not worth doing. And so formalization is left to a few enthusiasts and to computer scientists who need formal proofs in their work.

It is instructive to look at sub-communities that have adopted formalization as a standard. In theoretical computer science this has happened in the programming language community. There, many, many proofs proceed by consideration of a large number of cases, of which only a couple are usually interesting. This is the perfect situation for proof assistants, which can dispose of the boring cases while making sure nothing is missed. The first ones to formalize their proofs were the people who actually implemented the proof assistants. Then it was their students, then other people's students, and once the young people knew how to use the tools, the senior people joined the bandwagon (and many still prefer to use students). At any rate, at a conference on programming languages it has become the norm to say "and we formalized our proofs". To give you an idea of where we are now: students who are just beginning their PhDs in this area are often already accomplished proof-assistant hackers.
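
The following toy Coq example (purely illustrative, not taken from any real development) shows the pattern: a lemma about a three-constructor "language" where every case is routine and a single tactic line dispatches them all. In realistic programming-language proofs the same idiom handles dozens of cases at once.

    From Coq Require Import Lia.

    Inductive binop := Plus | Minus | Fst.

    Definition eval (b : binop) (m n : nat) : nat :=
      match b with
      | Plus  => m + n
      | Minus => m - n
      | Fst   => m
      end.

    (* Monotonicity in the first argument: three cases, none interesting. *)
    Lemma eval_monotone_l b m m' n : m <= m' -> eval b m n <= eval b m' n.
    Proof. intros; destruct b; simpl; lia. Qed.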

If we had to engineer massive adoption of formalization in mathematics, I would recommend two things:

  1. Better tools.
  2. Teach young people how to formalize mathematics.

There is a lot of room for improvement of the tools we use, and we can't teach old dogs new tricks.

Let me also explain why I think formalization is worth doing. It is not because we will know for sure that all that math is correct; for most published math it is irrelevant whether it is correct, because the most important purpose of publication is academic promotion and not dissemination of ideas (cynical but true; just think about why Elsevier is possible). The great potential of formalized mathematics is that it will give humans the ability to solve mathematical problems whose proofs are so complicated that no single human or small group of humans can comprehend or oversee them using traditional methods. Another great promise is that formalized mathematics can be understood and used by machines. I guess I am looking forward to becoming the Borg.

Andrej Bauer
  • 47,834
  • Regarding the part about Gonthier et al.: what is an example of insufficiently recorded mathematical knowledge? So you would say it's impossible for an AI to read all books, starting from basic addition all the way to higher math, and be able to formalize everything along the way? Where do you think this implicit knowledge comes from? E.g. 50% from interactions with peers (folklore, etc.), 30% from doing the math, and so on. – SorcererofDM Jul 06 '15 at 17:37
  • 1
    I fear I find this somewhat inconcrete. In my limited experience with Coq, I have encountered some rather specific roadblocks, such as lack of documentation (even installation on Linux isn't something I can reliably do), a steep learning curve, and annoyances such as the fact that every other set I need is really a setoid and I'd have to learn a whole new yoga to work with them (without really having much pre-existing knowledge to learn from, since everyone else seems to either avoid the need for setoids or assume univalence to get rid of it). I think there are far more people who are ... – darij grinberg Jul 06 '15 at 17:41
  • 1
    ... prevented from formalizing their proofs in Coq by such problems than those who genuinely find proof formalization to be beneath them and obscuring. – darij grinberg Jul 06 '15 at 17:42
  • 4
    I think that the words "implicit knowledge" might be misleading. It's not that there are folklore theorems that people are using. Rather, there are lots of routine arguments that people have seen over and over again during their education. When you write a paper, you don't spell these out, but just give the major steps. Experts can easily see what is going on. Students suffer a bit; this is why graduate students often spend days on single pages of papers. I should say that this is true even for good writing. If you give too many details, it can be hard to sort out where the "meat" – Andy Putman Jul 06 '15 at 18:22
  • 3
    of the argument is. I've gotten annoyed at papers plenty of times for giving too many routine details. But proof assistants can't fill in the missing parts by themselves. You have to spend way too much time on the boring stuff. That's why I for instance am not interested in spending time trying to formalize my proofs. – Andy Putman Jul 06 '15 at 18:23
  • @AndyPutman, I think the eventual goal for proof assistants is to do all the dirty laundry for you, and absolve you of all of these difficulties. The asymmetry between student readers and expert readers can also be solved by the ability of the proof assistant to expand automation (for example in Coq, the [red] tactic will do one step of reduction only, which can be helpful to the student). But of course to fully achieve that needs quite a bit of time. – SorcererofDM Jul 06 '15 at 18:59
  • 2
    I feel like there must be a diagonalization argument showing that there cannot be an algorithm to reduce all "routine" arguments. – Alexander Woo Jul 06 '15 at 19:30
  • 1
    @AlexanderWoo, sounds pretty plausible. But such a metatheorem would also constrain humans. We aren't asking to compute the halting set here, just doing better than what we did yesterday. – SorcererofDM Jul 06 '15 at 21:13
  • 1
    I see a potential danger in formalism: loss of creativity. Mathematicians aren't always aware of the strength of the proof system needed to make a formal argument; they often are aware of the words needed to attempt communicating ideas and something resembling a convincing argument. If people spend a lot of time trying to formalize a proof so that someone else can explain it to a machine, that will be time taken away from creating new ideas and new realms for mathematicians to explore. Gerhard "Stays Off The Formalistic Bandwagon" Paseman, 2015.07.06 – Gerhard Paseman Jul 07 '15 at 04:07
  • 1
    I see there is a lot of misunderstanding, I'll amend my answer to explain better what "implicit knowledge" is, and also address the feeling that formalization necessarily amounts to wasting time on details. Regarding the current state of tools and how cumbersome they are to use: my #1 suggestion for improvement was "better tools". – Andrej Bauer Jul 07 '15 at 14:26
  • 5
    I have asked people from Gonthier's team to write an answer. Let's hear it from the horse's mouth. – Andrej Bauer Jul 07 '15 at 14:59
  • Good answer! I would like to add that the mathematical and logic communities should value good tools from this perspective. I think it is very hard to get a PhD for an improved tool. For a computer scientist, a language like C is quite different from, for instance, Python. For a mathematician/logician they are both just variants on the Turing machine. With such a view the tools won't become better. – Lucas K. Jul 12 '15 at 15:00
22

I think Andrej's opinion is very accurate. Here are a few more comments based on my experience as a contributor to the Coq proof of the Feit-Thompson theorem (sorry it's a bit long).

This formal development comprises three kinds of libraries: some are for infrastructure purposes and no mathematician cares about the lemmas they prove (litanies of lemmas on booleans, lists or natural numbers, seemingly cryptic boilerplate code...), some contain constructions (permutation groups, rational numbers,...), and some contain meaningful theories (finite group theory, linear algebra, character theory,...). At least in this last category, which is by far the largest in size, the lengths of both statements and proof scripts are of the same order of magnitude as the LaTeX code of their paper counterparts. The statements are kept as readable as possible by a choice of LaTeX-like notations which mimic the corresponding type-setting source code. Proof scripts, on the other hand, look rather different, but long ones (for long proofs) are usually structured with many explicitly stated intermediate lemmas, which are as readable as the main one.

Besides a (dumb and merciless) proof checker, interactive proof assistants actually include various tools that are there to help with the gap issue. As already mentioned in this thread, comprehensive and well-organized libraries of formalized mathematics are a mandatory ingredient. It is also possible to automate the computational parts of proofs by implementing formal-proof-producing decision or normalization procedures (for instance computer-algebra-like routines for algebraic expressions). But this is not enough, and by the way the libraries of the Feit-Thompson theorem actually barely use this latter form of automation.
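
As an example of the kind of proof-producing normalization procedure meant here, Coq's stock ring tactic certifies polynomial identities by normalizing both sides and emitting a checkable proof term. This is generic standard-library Coq, not code from the Feit-Thompson libraries:

    From Coq Require Import ZArith.
    Open Scope Z_scope.

    Lemma square_expand (a b : Z) :
      (a + b) * (a + b) = a * a + 2 * a * b + b * b.
    Proof. intros; ring. Qed.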

One crucial point was to be able to input mathematical sentences, for instance the formal statements of would-be theorems, with as much facility as one writes them on paper. Otherwise you very soon just cannot read them (nor are you able to input them, in fact). But in that case you need the proof assistant to automatically and silently decorate the objects with the implicit information carried by standard mathematical notation. Once the appropriate infrastructure is set up (which requires in particular understanding the common notational practices of the authors in the field), the routine arguments mentioned by Andy Putman require close to zero input from the user. For instance, if you know that $K$ and $H$ are two subgroups of the same group, showing that $0 \lt \#(H \cap N(K))$ is just trivial (i.e. a two-character Coq proof script), because the system knows that the normalizer of a group and the intersection of two groups are themselves groups, and that the cardinality of a group is always positive since a group has at least one (neutral) element. Indeed, these bureaucratic steps can be implemented by a program chaining some identified lemmas for you behind the scenes. This either closes some branches in the proof or calculates the glue needed to make your current goal match the known result you want to use at this stage.
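
For readers who want to see roughly what this looks like, here is a sketch using the Mathematical Components (ssreflect) libraries. It follows the mathcomp naming conventions (cardG_gt0, the 'N(K) and :&: notations) but is not taken verbatim from the Feit-Thompson development, and the details may vary with the library version:

    From mathcomp Require Import all_ssreflect all_fingroup.
    Import GroupScope.

    Section Example.
    Variables (gT : finGroupType) (H K : {group gT}).

    (* Canonical structures let the system infer that H :&: 'N(K) is itself
       a group, so the generic lemma cardG_gt0 (a group has positive order,
       since it contains the identity) closes the goal at once. *)
    Lemma example : 0 < #|H :&: 'N(K)|.
    Proof. exact: cardG_gt0. Qed.

    End Example.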

Crafting this infrastructure was one of the fun and creative parts of the formalization work (it is not just about tagging some lemmas), and I do not really see how automated tools could significantly help here. There is still much to be done: we do not yet have, in any proof assistant on the market, a comprehensive set of compatible formal mathematical libraries covering a common core of graduate-level theories in algebra, geometry, analysis, etc. For instance, the libraries we wrote do not address real and complex analysis at all. Although I am nearly ignorant of AI and machine learning, this seems to me a prerequisite for having automated tools produce formal proofs from any kind of input description (script, paper, voice,...). Yet the advent of modern ways of interacting with proof assistants (to describe formal proofs, and to search existing content) is probably one of the major challenges to overcome in order for proof assistants to become as popular as computer algebra systems (or even type-setting environments) among researchers in mathematics.

  • Thanks for the insider view! Do you know if there is a beginner-friendly introduction to the use of these libraries these days? I must admit I am glad to hear that the "cryptic boilerplate code" appears cryptic even to its authors and not just to me; but I would be really happy to be able to understand at least the not-too-technical parts of it. Is there something that is less fast-paced than the 2009 "tutorial" (and newer; two of its examples don't compile anymore) and less technical than the Manual? – darij grinberg Jul 11 '15 at 18:12
  • Interesting: your answer made me realize I forgot the "seemingly" in my answer (now edited). For the time being, only a few tutorial-like references are available: An introduction to small scale reflection in Coq, by A.M. and G. Gonthier (2010, a bit outdated too); the material of a 2012 summer school; and a recent course by Ilya Sergey, for a computer-oriented audience. – Assia Mahboubi Jul 12 '15 at 09:25
  • The 2010 tutorial is rather nice, though I still cannot compile its Lemma edivnP. The "rewrite -{1}(subnK le_dm) -addSn addnA -mulSnr" instruction breaks, maybe because mulSnr has changed its statement, or rewrite has changed its behavior. I have tried reading Sergey but he was too terse for me at the important points... – darij grinberg Jul 12 '15 at 10:13
  • 1
    You're right, the 2010 tutorial is outdated (at least for the Coq scripts), sorry about that. We're now trying to write an updated and more comprehensive document. But it will probably take another few months to reach a readable state. Thus we should probably also try to keep the old version alive in the meantime... – Assia Mahboubi Jul 12 '15 at 14:07
  • "We're now trying to write an updated and more comprehensive document." That's wonderful news, thank you! – darij grinberg Jul 12 '15 at 14:18
5

There's a really nice survey article on this subject by Jeremy Avigad and John Harrison (both have had concrete experience in carrying out such formalizations "by hand"): Formally Verified Mathematics.

cody
  • 1,110