Papers
Topics
Authors
Recent
Search
2000 character limit reached

TheoremForge: Scaling up Formal Data Synthesis with Low-Budget Agentic Workflow

Published 24 Jan 2026 in cs.AI | (2601.17332v1)

Abstract: The high cost of agentic workflows in formal mathematics hinders large-scale data synthesis, exacerbating the scarcity of open-source corpora. To address this, we introduce \textbf{TheoremForge}, a cost-effective formal data synthesis pipeline that decomposes the formalization process into five sub-tasks, which are \textit{statement formalization}, \textit{proof generation}, \textit{premise selection}, \textit{proof correction} and \textit{proof sketching}. By implementing a \textit{Decoupled Extraction Strategy}, the workflow recovers valid training signals from globally failed trajectories, effectively utilizing wasted computation. Experiments on a 2,000-problem benchmark demonstrate that TheoremForge achieves a Verified Rate of 12.6\%, surpassing the 8.6\% baseline, at an average cost of only \textbf{\$0.481} per successful trajectory using Gemini-3-Flash. Crucially, our strategy increases data yield by \textbf{1.6$\times$} for proof generation compared to standard filtering. These results establish TheoremForge as a scalable framework for constructing a data flywheel to train future expert models. Our code is available \href{https://github.com/timechess/TheoremForge}{here}.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (2)

Collections

Sign up for free to add this paper to one or more collections.

GitHub

Tweets

Sign up for free to view the 1 tweet with 0 likes about this paper.