Proposing and solving olympiad geometry with guided tree search
Abstract: Mathematics olympiads are prestigious competitions, with problem proposing and solving highly honored. Building artificial intelligence that proposes and solves olympiads presents an unresolved challenge in automated theorem discovery and proving, especially in geometry for its combination of numerical and spatial elements. We introduce TongGeometry, a Euclidean geometry system supporting tree-search-based guided problem proposing and solving. The efficient geometry system establishes the most extensive repository of geometry theorems to date: within the same computational budget as the existing state-of-the-art, TongGeometry discovers 6.7 billion geometry theorems requiring auxiliary constructions, including 4.1 billion exhibiting geometric symmetry. Among them, 10 theorems were proposed to regional mathematical olympiads with 3 of TongGeometry's proposals selected in real competitions, earning spots in a national team qualifying exam or a top civil olympiad in China and the US. Guided by fine-tuned LLMs, TongGeometry solved all International Mathematical Olympiad geometry problems in IMO-AG-30, outperforming gold medalists for the first time. It also surpasses the existing state-of-the-art across a broader spectrum of olympiad-level problems. The full capabilities of the system can be utilized on a consumer-grade machine, making the model more accessible and fostering widespread democratization of its use. By analogy, unlike existing systems that merely solve problems like students, TongGeometry acts like a geometry coach, discovering, presenting, and proving theorems.
Explain it Like I'm 14
Overview
This paper introduces TongGeometry, a computer system that can both create and solve hard geometry problems like the ones seen in math olympiads. Think of it as a “geometry coach” that not only knows how to solve problems but can also come up with new, elegant ones and explain why they work.
What questions does the paper try to answer?
The paper explores a few main questions in simple terms:
- Can an AI reliably invent challenging and beautiful geometry problems, similar to those used in competitions?
- Can it also solve those problems with clear, step-by-step reasoning (not just guessing)?
- Can it do this fast and on a normal computer, without super expensive hardware?
- Can guiding the system with smart hints from LLMs (like advanced chatbots) make it better?
How does TongGeometry work?
TongGeometry uses ideas from both logic and AI, combined with a search method. Here are the key parts, explained with everyday analogies:
- Tree search: Imagine solving a maze. Each fork in the maze is a choice, and the system explores many paths. A “tree” is just a map of all those choices. TongGeometry searches this tree to find the right route to a proof.
- Proposing problems (backward tracing): Creating a good geometry problem is like designing a puzzle. TongGeometry starts with interesting goals (like “prove these two lengths are equal”) and works backward to figure out what setup and clues would make that goal challenging and elegant.
- Solving problems (forward chaining): Solving is like starting at the puzzle’s given facts and moving forward step by step, applying known rules until you reach the goal.
- Auxiliary constructions: In geometry, sometimes the trick is to add helpful points, lines, or circles that aren’t given at the start. These are “auxiliaries.” Think of them as extra tools you draw to make hidden relationships visible.
- Deductive engine: This is the rule-following brain. It applies geometry facts and theorems (like “If two angles are equal, then…”). TongGeometry uses a fast, human-readable rule system called a deductive database to build proofs.
- Replay buffer: Like saving bookmarks while exploring a maze. When the system finds a promising partial setup, it stores it and can revisit it later to build more complex problems.
- LLM guidance (policy and value models): TongGeometry fine-tunes two AI models to act like smart guides:
  - Policy model: Suggests “which direction to explore next” (like a coach pointing to a clever auxiliary to try).
  - Value model: Estimates “how close we are to finishing” (like a coach guessing how many steps remain to the solution).
This “actor-critic” style is similar to how top game-playing AIs combine move suggestions with estimates of future success.
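The forward-chaining and backward-tracing ideas in the list above can be illustrated with a minimal Python sketch. It is not TongGeometry's actual engine: the string encoding of facts, the three toy rules (two uses of the midline theorem and one corresponding-angles step), and the function names `forward_chain` and `trace_back` are all assumptions made for illustration.

```python
# Toy rule base (hypothetical encoding): if every premise is known, the
# conclusion becomes known. Facts are plain strings.
RULES = [
    # Midline theorem in triangle ABC.
    ({"midpoint(M,A,B)", "midpoint(N,A,C)"}, "parallel(MN,BC)"),
    # Corresponding angles: M lies on AB and MN is parallel to BC.
    ({"parallel(MN,BC)", "midpoint(M,A,B)"}, "eqangle(AMN,ABC)"),
    # Midline theorem again, for the midpoints of CB and CA.
    ({"midpoint(P,B,C)", "midpoint(N,A,C)"}, "parallel(PN,AB)"),
]


def forward_chain(premises):
    """Apply rules until nothing new appears; record how each fact was derived."""
    parents = {fact: None for fact in premises}  # None marks a given premise
    changed = True
    while changed:
        changed = False
        for needs, conclusion in RULES:
            if conclusion not in parents and needs.issubset(parents):
                parents[conclusion] = frozenset(needs)
                changed = True
    return parents


def trace_back(goal, parents):
    """Walk the derivations backward from the goal to the premises it uses."""
    used, stack = set(), [goal]
    while stack:
        fact = stack.pop()
        if parents[fact] is None:
            used.add(fact)  # reached a given premise
        else:
            stack.extend(parents[fact])
    return used


given = {"midpoint(M,A,B)", "midpoint(N,A,C)", "midpoint(P,B,C)"}
derived = forward_chain(given)
needed = trace_back("eqangle(AMN,ABC)", derived)
print(sorted(needed))  # ['midpoint(M,A,B)', 'midpoint(N,A,C)']
```

In this toy run, forward chaining derives every reachable fact, while tracing back from the angle equality shows that the midpoint P is not needed for that goal, so a problem statement built around it could leave P out.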
What did the researchers find, and why does it matter?
Here are the main results, explained plainly:
- A massive collection of geometry knowledge: TongGeometry discovered 6.7 billion theorems that need auxiliary constructions. Of these, 4.1 billion are symmetric (symmetry often makes problems elegant and competition-worthy). The authors describe this as the most extensive repository of geometry theorems to date, built within the same computational budget as the previous state-of-the-art system.
- Real olympiad impact: 10 of the discovered theorems were proposed to regional mathematical olympiads, and 3 of those proposals were selected for real competitions (a national team qualifying exam or a top civil olympiad in China and the US). That means experts judged them strong enough to be used in actual contests.
- Beating top human performance on a standard benchmark: On a set of 30 famous International Mathematical Olympiad (IMO) geometry problems, TongGeometry solved all 30. This outperformed the average IMO gold medalist and beat the previous top AI system. Even better, it did this on a consumer-grade machine within 38 minutes per problem.
- Efficiency and accessibility: Unlike earlier systems that required massive computing clusters, TongGeometry runs on a regular high-end desktop (32 CPU cores and one GPU). This makes advanced geometry AI more available to schools, coaches, and students.
- Better strategy for hard geometry: The value model was especially helpful on the hardest problems. Together with its rule engine and auxiliary construction skills, TongGeometry showed that combining logic with smart guidance works well.
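One way to picture how a value estimate can steer a search toward the hardest solutions is a best-first search that always expands the state rated closest to finished. The sketch below is only an assumption-laden illustration, not the paper's search procedure: `guided_search`, the toy `policy` and `value` functions, and the single-letter "auxiliary constructions" are all hypothetical stand-ins for the fine-tuned models and real geometric actions.

```python
import heapq
from itertools import count


def guided_search(start, is_solved, policy, value, max_expansions=100):
    """Best-first search: expand the state the value function rates closest to done."""
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(value(start), next(tie), start)]
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, state = heapq.heappop(frontier)
        if is_solved(state):
            return state
        for action in policy(state):  # the policy proposes auxiliaries to try
            child = state | {action}
            if child not in seen:
                seen.add(child)
                heapq.heappush(frontier, (value(child), next(tie), child))
    return None


# Toy setup (hypothetical): the proof needs auxiliaries N and P; X and Y are distractors.
needed = frozenset({"N", "P"})
candidates = ["X", "N", "Y", "P"]
policy = lambda state: [c for c in candidates if c not in state]
value = lambda state: len(needed - state)  # stand-in for "estimated steps remaining"

result = guided_search(frozenset(), lambda s: needed <= s, policy, value)
print(sorted(result))  # ['N', 'P']
```

Because the frontier is ordered by the value estimate, the toy search reaches the two useful auxiliaries without ever expanding the distractor branches; a learned value model plays the same role with a far less reliable signal.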
What does this mean for the future?
- A coach, not just a student: Most systems only solve problems. TongGeometry also invents them, judges their difficulty and elegance, and proves them clearly. That’s closer to how human experts teach and create math.
- Education and training: Because it’s efficient and can run on normal hardware, teachers and students could use it to explore geometry, get high-quality problems, and study clean, step-by-step proofs.
- Research in math and AI: This is a step toward automated discovery in mathematics. It shows that blending rule-based reasoning with learned guidance can crack very tough, creative tasks.
- Better tools for competitions: Organizers and coaches could use systems like TongGeometry to generate fresh, fair, and beautiful problems, and to check their solutions rigorously.
In short, TongGeometry demonstrates that AI can both create and solve competition-level geometry with speed, elegance, and reliability, bringing high-level mathematical thinking closer to everyone.