✨ Visual Editor

close

palette Canvas & Background

Gradient:arrow_forward
Text Color:
135°

style Card Style

40px
16px

text_fields Typography

16px
Pietro Monticone
@PietroMonticone
AI is increasingly changing how we do mathematics.

Erdős Problem #650, open for over 60 years, was solved a few weeks ago through a collaboration between human mathematicians, an informal reasoning model (GPT 5.4 Pro @OpenAI) and a formal one (Aristotle @HarmonicMath). 🧵
Thread image
Pietro Monticone
@PietroMonticone
Let me tell you what happened and why I think it nicely illustrates how formal verification could significantly mitigate the risk of informal hallucination, whether from human or non-human agents.
Pietro Monticone
@PietroMonticone
Fix a positive integer m. Given any set A of m positive integers and any interval I of length 2·max(A), how many disjoint pairs (a,b) with a in A, b in I and a dividing b can you always find?

Call this f(m). Erdős asked: what is f(m)?
Thread image
Pietro Monticone
@PietroMonticone
In 1959, Erdős and Surányi proved f(m) ≥ √m.

About 20 years later, Erdős and Selfridge showed f(m) ≤ 2⌈√m⌉. Two bounds differing by a factor of two.

In 1995, Erdős asked whether the lower bound could be improved.

Nothing of note was proved, until this year.
Thread image
Pietro Monticone
@PietroMonticone
Prompted by Yixin He, Yanyang Li, and Quanyu Tang, GPT 5.4 Pro was able to rediscover the Erdős–Selfridge upper bound f(m) ≤ 2⌈√m⌉.

Inspired by this, Quanyu kept conversing with GPT 5.4 Pro, which claimed it could prove the exact formula f(m) = ⌈2√m⌉ for all m ≥ 4.
Pietro Monticone
@PietroMonticone
Thomas Bloom (@thomasfbloom) and Terence Tao reviewed the proof. They checked the main ideas, and provisionally concluded that the proof seemed to check out.

The final result: f(m) = min(m, ⌈2√m⌉) for all m.
Thread image
Pietro Monticone
@PietroMonticone
As is becoming standard practice, Wouter van Doorn submitted the informal proof to Aristotle to get it autoformalised in @leanprover and Aristotle succeeded.

So far, nothing seemed out of the ordinary.
Pietro Monticone
@PietroMonticone
However, while writing up the paper, the authors noticed that one step in the original informal argument was actually wrong.

You might think this should not be possible. After all, Aristotle had already formalised the proof. So it has to be correct, right?

Well, yes... and no!
Pietro Monticone
@PietroMonticone
Very roughly speaking, the proof splits I into halves Γ₋(S), Γ₊(S) and needs |S| ≤ |Γ₋(S)|·|Γ₊(S)|.

The natural injection φ(a) = (uₐ, uₐ+a) works when aₘ doesn't divide x, but when aₘ divides x it may not even be well-defined and naive fixes break injectivity.
Thread image
Thread image
Pietro Monticone
@PietroMonticone
To understand why a wrong definition of φ still led to a verified @leanprover proof, the authors had to closely look at the formalisation.

They discovered that, in fact, Aristotle had not used GPT's proposed φ in this case.
Pietro Monticone
@PietroMonticone
Aristotle had independently come up with a different function, a three-branch injection that handles all the edge cases, and used it in its formalisation.

Before anyone even realised there was an issue, Aristotle had already spotted the mistake and quietly taken care of it.
Thread image
Thread image
Pietro Monticone
@PietroMonticone
Aristotle did not merely check the proof.

It autonomously discovered a fix for a subtle edge case that expert human reviewers had skipped trusting that Aristotle's formalisation would take care of it.

That trust turned out to be well placed!
Pietro Monticone
@PietroMonticone
A thread like this one can only scratch the surface, but if you'd like to dive deeper, here are the two main references:

1. Informal @arxiv preprint: arxiv.org/abs/2603.28636

2. Formal @leanprover script: live.lean-lang.org/#project=mathl…
Pietro Monticone
@PietroMonticone
I also tend to see this story as a nice, concrete instance of @davidbessis's "System 3": the productive friction between informal intuition and formal rigour, except externalised across AI models rather than confined to a single mathematician's mind. davidbessis.substack.com/p/thinking-fas…
Generated by Thread Navigator
100%
view_carousel Carousel Studio NEW
Press + S to quick-export