Preface

About this book

This is a book dealing with how to write careful, rigorous mathematical proofs. The book is paired with code in the computer formalization language Lean.

The book’s focus is on technique, not on theory-building: rather few theorems are proved and then referred back to later. Instead, the core of the book is in examples. Over two hundred problems appear with solutions as examples in the text, and several hundred more problems appear without solution as exercises for the reader. Each problem and each solution is presented both in standard mathematical prose and in Lean (the book should be read with a computer open to the corresponding file of Lean code), and since these two “languages” are almost equally foreign for students who are new to proofs, most solutions are annotated with informal commentary.

Why Lean?

People have been expressing mathematical arguments in words for thousands of years, and mathematical language is by now very standardized, with many conventions to let mathematicians communicate with each other efficiently and unambiguously. The primary goal of this book is to teach you to read and write standard mathematical English prose.

Computer systems called interactive theorem provers (also known as proof assistants or systems for formalization) offer an alternative way to express a mathematical argument. Lean, an open-source project developed at Microsoft Research and elsewhere since 2013, is one such system, but they have existed since the early days of computing.

Interactive theorem-proving systems such as Lean have become steadily easier to use over the years, but they are not yet actually easy to use: they can be fussy and unforgiving, just like any other computer programming language. So why does this book propose that you use such a system in your first encounter with proofs?

First: Like it says on the tin, writing proofs in an interactive theorem prover is interactive. At each point in your proof, you can see a visual representation of the proof state: what you know (your hypotheses) and what you are currently working towards (your goal(s)). If you are new to writing mathematical proofs, you may be surprised at how hard you find it to distinguish hypotheses from goals on paper after a few steps alternating between forward and backward reasoning, or how easily you lose track of one case in a case division. Lean’s live-updating visual representation of the proof state frees you from needing to keep this in your working memory.

Second: Computer formalization systems, fussy unforgiving programming languages that they are, throw syntax errors when you make a mistake. Feedback is instant and you can keep iterating until you have something that works. Writing a solution in Lean ensures that it is completely correct: no substitutions of an inequality under a minus sign, no divisions by zero, no terms dropped in the algebra. This is especially useful in doing proof-based mathematics: make a small error halfway through a calculus problem and the rest of the solution probably won’t change much, but make a small error halfway through a proof and the rest of the solution may be useless.

Finally: You interact with Lean using tactics which each perform a single step of a certain mode of reasoning. The tactics which are taught in this textbook (some custom-written for this book) each constitute a single permissible “atom” of reasoning in the mental world of the book. This makes objective something which in a prose-only textbook can be subjective: what constitutes a fully-detailed proof. What’s more, the “atoms” I offer you are designed to nudge you toward certain styles of mathematical argument structure 1 which are conventional in standard mathematical prose but which students are often slow to adopt.

This is a book about mathematics which uses Lean as a tool. It is designed so that the learning curve is flatter for the Lean than it is for the mathematics 2 – partly by a careful choice of exercises, and partly by writing in a Lean “dialect” of my own, with a limited Lean vocabulary 3 which is just flexible enough for the mathematical needs of the book. (A summary of the major differences can be found in the appendix Transitioning to mainstream Lean.) My hope is that most of your intellectual effort in solving the problems will be devoted to the mathematics, not to Lean’s implementation details or quirks of syntax.

Contents and prerequisites

You are ready to read this book if (1) you know high school algebra inside out and (2) you have experience learning to carry out complicated mathematical algorithms. I have in mind, as a typical reader, a first- or second-year university student who has just taken Calculus II. But no calculus is actually used in this book.

The main novelty of this text is the “bilingual” presentation, juxtaposing prose mathematics with Lean code. But the design choices enforced by this presentation have shaped the text in other ways.

Chapter 1 contains an unusually careful treatment of “calculational proofs”. These proofs are a natural starting point for the book because they translate easily to Lean, but they are also worthy of attention as a topic which students at this level often struggle with. 4

Chapter 2 and Chapter 4 constitute a slow march through the rules of natural deduction, solving problems about \(\mathbb{N}\), \(\mathbb{Z}\), \(\mathbb{Q}\) and \(\mathbb{R}\) which feature increasingly many of the logical connectives and quantifiers. The requirement to translate everything to Lean keeps the book strictly honest through these chapters. The typical intro-to-proof textbook does not have this guardrail, and will commonly make minor abuses here – for example, presenting a good example of a proof by cases which implicitly also uses a not-yet-covered proof technique such as filling a witness to an existential.

Logic is not taught explicitly until Chapter 5, by which stage readers will already be comfortable with the various logical connectives/quantifiers and with translating mathematical statements back and forth between words and symbols. This permits the logic chapter to be relatively brief, with a focus on the concept of logical equivalence (presented primarily using natural deduction, to link with Chapters 2 and 4, rather than using truth tables). 5

The other chapters of the book are more recognizable as the usual subject matter of a first course on proofs. Sufficient familiarity with Lean has been established, and the mathematical presentation is no longer constrained.

Chapter 3 covers the basic notions of elementary number theory. This chapter uses only a limited toolbox of reasoning techniques, to permit it to be placed as a break between Chapter 2 and Chapter 4. Number-theoretic definitions and theorems continue to appear as examples in subsequent chapters, and the presentation of the subject concludes in Chapter 7 with the big results of Greek mathematics: infinitude of primes, Euclid’s lemma, and the irrationality of the square root of two.

Chapter 6 covers induction. The treatment is fairly comprehensive, including induction and recursion relative to various nontrivial well-founded relations on \(\mathbb{Z}\), \(\mathbb{N}\times \mathbb{N}\) and \(\mathbb{Z}\times \mathbb{Z}\).

Finally, Chapters 8, 9 and 10 cover functions, sets, and relations, in that order – we take the type-theoretic point of view that functions are the primitive notion and sets and relations are defined as functions into \(\left[\operatorname{true}/\operatorname{false}\right]\).

Note for instructors

This book is based on lecture notes from a course I taught in Spring 2023 at Fordham University. There were 20 students in the course, mostly first- and second-year students, with a median background of Calculus II. Many but not all had also taken a first course in computer programming.

The course had 75-minute classes, twice a week for 13 weeks, and in this time covered about 80% of the material in this book. A typical class structure might look like this:

  • 25 minutes traditional blackboard lecture;

  • 5 minutes screenshare lecture doing the same problems in Lean;

  • 20 minutes working in Lean in pairs, instructor circulating;

  • 25 minutes traditional blackboard lecture, perhaps more theoretical than the first.

The homework assignments for the course are available on request. They are relatively short (5-7 problems per week), but students were required to submit almost all problems both in writing and in Lean. Most students required support in office hours or by email to complete the homework assignments.

The course also featured oral examinations at the 5- and 10-week marks. These were 20-minute one-on-one interviews assessing Lean fluency: students solved previously-unseen Lean exercises (different exercises for each student), explaining their reasoning aloud. The grade breakdown for the course was 25% homework, 20% oral examinations, and 55% traditional written examinations (a midterm and a final, both completely Lean-free).

Clearly, the combination of in-class work with instructor circulating, homework support in office hours and by email, and oral examinations adds up to a significant amount of time spent interacting with individual (or small groups of) students. The student:staff ratio 20:1 was sustainable. I suspect that to go beyond this ratio, one would need very strong students or an experienced and enthusiastic TA.

The students ran Lean in a cloud development environment, to avoid needing to install Lean on their own computers. I used Gitpod for this (an alternative is GitHub Codespaces) – see the README of the book’s code repository for brief instructions on how to start using Gitpod. The students’ Lean homework was automatically graded using a Gradescope auto-grader (an alternative is GitHub Classroom). The Lean community’s teaching advice webpage provides instructions and troubleshooting for setting up this kind of course infrastructure.

Acknowledgements

My heartfelt thanks to

  • Microsoft Research, for a grant which supported the writing of the book;

  • My department at Fordham, for letting me teach the experimental course which this book grew from;

  • The intrepid students in that course, Math 2001 L01 Spring 2023, for their enthusiasm and resourcefulness;

  • Matthew Hertz, for setting up the Sphinx infrastructure for the book and typesetting the first chapters;

  • The mathlib community, and particularly Mario Carneiro, Gabriel Ebner, Scott Morrison, Thomas Murrills and David Renshaw, for their work on the Lean 3 to Lean 4 port in fall 2022 and winter 2023 prioritizing the parts of the library I needed for the course;

  • Mario Carneiro (also) for marathon hacking sessions which produced the most interesting custom tactics used in the book;

  • Jeremy Avigad, Rob Lewis and Patrick Massot, for sharing technical infrastructure for Lean-based courses and for many conversations about the dream of teaching mathematics with Lean.

Footnotes

1

For example, the uses of calculation blocks for most algebraic reasoning, and a preference for forward over backward reasoning.

2

If you are looking for the reverse, Mathematics in Lean is the canonical introduction to mathematical Lean. But note that that book expects more mathematical experience than this one does: writing idiomatic Lean code, even to prove elementary statements, requires some mathematical maturity.

3

The algebraic-reasoning tactic vocabulary of ring, rw, numbers (AKA norm_num), rel (custom-written for this book but now in mathlib proper), extra (custom-written) and cancel (custom-written) suffices for pretty much all algebraic reasoning over the integers, including nonlinear inequalities. The training in this vernacular presented over the course of Sections 1.2 - 2.1 pays off later, by avoiding the need to invoke the endless lemmas such as mul_le_mul_of_nonneg_left, pow_pos or le_of_pow_le_pow by name. Other custom automation lightly streamlines work with induction principles, well-foundedness justifications, product types, and sets. In total, fewer than fifty lemmas are invoked by name in the book.

4

It’s in fact quite possible to get through the equality-heavy reasoning of many intro-to-proof courses without really mastering this mode of reasoning, but it’s nearly impossible to reason about inequalities without mastering calculational proofs, and students who don’t pick up the skill in an intro-to-proof course will find it come back to haunt them when they reach real analysis.

5

The expert reader may enjoy the problems in Section 5.2, which introduces classical reasoning; they are new (to my knowledge), and more elementary than the usual textbook examples.