Papers
Topics
Authors
Recent
Search
2000 character limit reached

Mathematics with large language models as provers and verifiers

Published 11 Oct 2025 in cs.CL, cs.AI, cs.LG, and cs.LO | (2510.12829v1)

Abstract: During 2024 and 2025 the discussion about the theorem-proving capabilities of LLMs started reporting interesting success stories, mostly to do with difficult exercises (such as problems from the International Mathematical Olympiad), but also with conjectures [Feldman & Karbasi, arXiv:2509.18383v1] formulated for the purpose of verifying whether the artificial intelligence could prove it. In this paper we report a theorem proving feat achieved by ChatGPT by using a protocol involving different prover and verifier instances of the gpt-5 model working collaboratively. To make sure that the produced proofs do not suffer from hallucinations, the final proof is formally verified by the lean proof assistant, and the conformance of premises and conclusion of the lean code is verified by a human. Our methodology was able to solve five out of six 2025 IMO problems, and close a third of the sixty-six number theory conjectures in [Cohen, Journal of Integer Sequences, 2025].

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (2)

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 6 tweets with 0 likes about this paper.