Proof in Code traces efforts to digitally verify mathematical truths

Proof in Code traces efforts to digitally verify mathematical truths

A journalist chronicles the rise of computer programs that can solve difficult math problems

The cover of the book The Proof in the Code against a green background.

The proof in the code
Kevin Hartnett
Quanta Books, $30

In 2024, the International Mathematics Olympiad welcomed an unusual participant. Google Deepmind had put AlphaProof, a newly formed AI program, on questions for that year’s competition, albeit as an unofficial participant. In the competition, top math students from around the world solve six advanced math problems over two days. AlphaProof made headlines and reached the silver medal threshold by scoring 28 out of 42 points.

AlphaProof is a mathematical theorem solver, a system for proving mathematical statements. Just four years ago, training AI systems to automate mathematical reasoning was a daunting challenge. But Google Deepmind and other groups hoped their efforts would equip AI systems with broader reasoning skills that could one day be applied to real-world problems, using logic to potentially free AI tools from hallucinations, instances of invented information.

Many of these programs owe their success to the Lean Proof Assistant program. Leo de Moura launched Lean as a software code review tool in 2013 while he was a software engineer at Microsoft Research. But today, mathematicians and AI researchers are the biggest supporters of Lean. In his new book THE Evidence in the Codejournalist Kevin Hartnett traces this evolution.

Hartnett recounts de Moura’s persistence in developing software that had no immediate commercial benefit and the dogged determination of a small group of mathematicians who persuaded their field to adopt the program. Throughout the book, Hartnett introduces a host of characters from around the world who saw Lean as a new way to evaluate mathematical truths and played a role, large and small, in making Lean more user-friendly. Together, this makes for an inspiring story of collaboration.

In the early chapters, Hartnett peppers in explanations of the similarities between mathematics and coding, demonstrating how the use of Lean could so naturally be transplanted into mathematical research. “Both are written in exact syntax as a series of logical steps, each leading to the next,” Hartnett writes. “A gap in the logic of a proof is like a bug in the code of software.” A program executes when the code has the correct logic. A new mathematical theorem is the result of a correctly written proof.

Almost immediately after Lean launched, Jeremy Avigad of Carnegie Mellon University began implementing it for writing math tests. Lean and other proof assistance programs, also known as interactive theorem provers, can verify new mathematical proofs written by humans, which sometimes span hundreds of pages and can take months to review. Programs cannot come up with new proofs, but by helping to ensure that proofs are free of errors, they can enable mathematicians to establish new mathematical facts more quickly for use in newer proofs. Yet proof assistants were cumbersome software that required math to be written in a completely different way.

To work with proof assistants, mathematicians had to translate simple language problems into code and create libraries of coded definitions and theorems of basic mathematical concepts. For example, when Kevin Buzzard, a mathematics professor at Imperial College London, was writing problem sets to teach his undergraduate students how to work with Lean, he quickly ran into an unexpected obstacle. “Lean asked him to prove that 2 is not equal to 1,” Harnett writes. “It is a statement so clearly true that human beings, in normal conversation, would not waste a moment in justifying it.” But Lean asked him to prove the inequality before using it.

For a long time, there simply wasn’t enough math in Lean libraries to be useful to mathematicians. And it would be impossible to code more mathematics without more mathematicians using the program. Hartnett shares what it took for some mathematicians to popularize Lean. For example, in 2018, Buzzard and others set out to translate perfectoid spaces into Lean. Coding this sparkling new innovation in arithmetic geometry took them months of work and several thousand lines of code. And these efforts worked. By 2025, “tens of thousands of users in academia and technology were launching increasingly ambitious projects on top of Lean,” writes Hartnett. This included AI researchers, who found in Lean a vast library of advanced mathematics needed to train mathematically solving AI models such as AlphaProof.

De Moura and the mathematicians wanted to build a truth machine, “a computer program capable of providing a complete 100 percent guarantee that a logical chain is correct,” Hartnett writes. While for de Moura the truth he sought was knowing with certainty that the code of computer programs, like Microsoft Word, was correct and free of bugs, for mathematicians, a truth machine could make discoveries of mathematical proofs more rigorous, organized, and accurate.

By retracing this story, The proof in the code jumps between timelines, introducing characters and anecdotes without always clearly indicating their importance. This may be confusing for readers, but for the mathematically curious, the book adds a captivating texture to the history of Lean and its place in a recent new chapter in the history of mathematics.


Buy The proof in the code from Bookshop.org. Scientific news is affiliated with Bookshop.org and will earn commission on purchases made from links in this article.

Exit mobile version