Formalizing the Prime-Field Singer Construction and Sidon Set Infrastructure in Lean 4

Published 5 May 2026 in math.CO | (2605.03274v1)

Abstract: Erdős Problem 30 asks for sharp asymptotics of the Sidon extremal function $h(N)$, and Singer's construction is the classical source of lower-bound examples matching the main term. We present a Lean 4 formalization of Singer's Sidon set construction for prime fields, together with reusable Sidon-set infrastructure for additive combinatorics. For every prime $p$, we prove the existence of a Sidon set modulo $p^2+p+1$ of cardinality $p+1$. The proof proceeds through a non-trivial algebraic chain: construction of the Galois field $\mathrm{GF}(p^3)$, analysis of the trace kernel as a 2-dimensional subspace, a geometric argument via subspace intersections establishing the multiplicative Sidon property in the quotient group, and a combinatorial bridge transferring this to modular integer arithmetic. Around this central result, we develop a reusable Sidon set library for additive combinatorics. It comprises interval Sidon sets, modular Sidon sets, the extremal function $h(N)$, Lindström's cross-difference inequality, a Johnson-route shift-incidence upper bound of the form $h(N) \leq \sqrt{N} + N^{1/4} + O(1)$, exact representation-function identities, and unconditional two-sided $h(N)=\Theta(\sqrt{N})$ bounds with exact floor-rounded finite statements for $N \geq 5$. We further formalize a conditional reduction: subpolynomial prime gaps together with a full subpolynomial upper-error hypothesis for $h(N)$ imply the Erdős Problem 30 estimate $h(N)=\sqrt{N}+O_\varepsilon(N^\varepsilon)$ for every $\varepsilon>0$. The core Singer/Sidon and transfer development comprises 6,382 lines of Lean 4 with zero active uses of sorry. We describe the mathematical lessons learned, focusing on how formalization clarifies the precise scope of classical arguments and forces explicit treatment of the algebraic-combinatorial interface.

Summary

  • The paper establishes a formal Lean 4 framework for Singer's Sidon set construction in prime fields.
  • It verifies algebraic constructions and exact sumset identities for Sidon sets, facilitating extremal combinatorics analysis.
  • The reusable infrastructure it develops supports future research on additive combinatorics and Erdős's Problem 30.

Formalization of the Prime-Field Singer Construction and Sidon Set Infrastructure in Lean 4

Overview and Motivation

The paper "Formalizing the Prime-Field Singer Construction and Sidon Set Infrastructure in Lean 4" (2605.03274) presents a mechanized formalization in Lean 4 of Singer's Sidon set construction for prime fields. The motivation is anchored in additive combinatorics, specifically Erdős's Problem 30 on the asymptotic growth of the extremal Sidon set function $h(N)$, defined as the maximal size of a subset $A \subseteq \{1, \ldots, N\}$ such that all pairwise sums $a+b$ ($a \le b$) are distinct.

This development formally verifies the algebraic construction, combinatorial transfer, and extremal properties of Sidon sets arising from finite field theory, and provides a reusable infrastructure for additive number theory in Lean. The formalization covers explicit algebraic constructions, modular and interval analogues, exact sumset and representation function identities, and conditional reductions connecting analytic number theory hypotheses with combinatorial extremal results.

Mathematical Context: Sidon Sets and Erdős Problem 30

A Sidon set, or $B_2$-set, is a set of integers in which no two distinct unordered pairs of elements have the same sum. The growth of $h(N)$, the maximal cardinality of a Sidon subset of $\{1,\ldots,N\}$, has been the subject of intense investigation. Classical results yield upper and lower bounds that agree in the main term $\sqrt{N}$ and differ only in lower-order terms:

  • Lindström's inequality: $h(N) \le \sqrt{N} + N^{1/4} + O(1)$.
  • Algebraic constructions using Singer difference sets yield $h(N) \ge \sqrt{N} - O(N^{\theta_{\mathrm{gap}}})$, where $\theta_{\mathrm{gap}}$ is derived from prime-gap estimates.

Erdős's Problem 30 asks whether $h(N) = \sqrt{N} + O_\varepsilon(N^\varepsilon)$ for all $\varepsilon > 0$. The formalization makes explicit the logical dependencies: unconditional upper bounds, lower bounds conditional on prime gap hypotheses, and the interface between algebraic group constructions and combinatorial incidence.
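To make the definitions concrete, here is a minimal Python sketch of the Sidon predicate and a brute-force evaluation of $h(N)$ for very small $N$. The function names `is_sidon` and `max_sidon_size` are illustrative choices and are not taken from the paper's Lean library; the brute-force search is only feasible for tiny inputs.

```python
from itertools import combinations

def is_sidon(values):
    """True if all sums a + b with a <= b are pairwise distinct."""
    elems = sorted(set(values))
    seen = set()
    for i, a in enumerate(elems):
        for b in elems[i:]:
            s = a + b
            if s in seen:
                return False
            seen.add(s)
    return True

def max_sidon_size(n):
    """Brute-force h(n): largest Sidon subset of {1, ..., n}. Small n only."""
    best = 0
    for k in range(1, n + 1):
        if any(is_sidon(c) for c in combinations(range(1, n + 1), k)):
            best = k
        else:
            break  # subsets of Sidon sets are Sidon, so no larger set exists
    return best

print(is_sidon([1, 2, 5, 7]))   # a Sidon set in {1, ..., 7}
print(is_sidon([1, 2, 3]))      # fails because 1 + 3 == 2 + 2
print(max_sidon_size(7))
```

The early exit in `max_sidon_size` relies on the fact that the Sidon property is inherited by subsets, so if no Sidon set of size $k$ exists then none of size $k+1$ does either.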

Formalization of Singer's Construction in Lean 4

Algebraic Foundations

The development establishes the classical Singer construction in the specific context of prime fields ($\mathbb{F}_p$), formalizing the creation of a Sidon set of size $p+1$ modulo $p^2+p+1$ for each prime $p$, using only verified Lean code and a leanprover-community Mathlib base.

Key algebraic components:

  • Construction of $\mathrm{GF}(p^3)$: The extension is realized via the GaloisField API for fixed primes, interpreting $\mathrm{GF}(p^3)$ as a 3-dimensional vector space over $\mathbb{F}_p$.
  • Trace Kernel Analysis: It is formally proven that the kernel of the trace map $\mathrm{GF}(p^3) \to \mathbb{F}_p$ is a 2-dimensional subspace, and explicit representatives of its projective lines are constructed.
  • Invariant Subspace Analysis: It is formally proven that multiplication by any element outside the base field acts irreducibly, ensuring the geometric structure required for Sidon set properties.
  • Grassmann Intersection: Theorems on intersection dimension of subspaces are explicitly invoked and checked.

Quotient-Group Sidon Property

By considering the quotient $\mathrm{GF}(p^3)^\times / \mathbb{F}_p^\times$, the construction produces a cyclic group of order $p^2+p+1$. The image of the nonzero trace-kernel elements forms a perfect $(v, k, 1)$-difference set, where $v = p^2+p+1$ and $k = p+1$, and is therefore a Sidon set. The paper formalizes that all nontrivial equalities of products in the quotient correspond to unordered pair equality, establishing the Sidon property in this context.
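The algebraic chain can be imitated numerically for small primes. The sketch below is an illustration only and is unrelated to the paper's Lean code: it models $\mathrm{GF}(p^3)$ as polynomials modulo a brute-force-found irreducible cubic, finds a generator $g$ of the multiplicative group by exhaustive search, and collects the exponents $i$ with $\mathrm{Tr}(g^i) = 0$ reduced modulo $p^2+p+1$. All function names are hypothetical, and the brute-force searches assume $p$ is a small prime.

```python
from itertools import product

def is_sidon_mod(values, modulus):
    """True if all sums a + b (a <= b) of the list are distinct modulo `modulus`."""
    sums = [(a + b) % modulus for i, a in enumerate(values) for b in values[i:]]
    return len(sums) == len(set(sums))

def singer_sidon_set(p):
    """Sketch: a Sidon set modulo p*p + p + 1 of size p + 1, for a small prime p."""
    # Pick c = (c0, c1, c2) so that f(x) = x^3 - c2*x^2 - c1*x - c0 has no root
    # in F_p. A cubic with no root is irreducible, so F_p[x]/(f) is GF(p^3).
    c = next(
        c for c in product(range(p), repeat=3)
        if all((r**3 - c[2] * r**2 - c[1] * r - c[0]) % p for r in range(p))
    )

    def mul(a, b):
        # Multiply two coefficient triples, reducing with x^3 = c0 + c1*x + c2*x^2.
        prod = [0] * 5
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                prod[i + j] = (prod[i + j] + ai * bj) % p
        for k in (4, 3):  # eliminate x^4, then x^3
            coef, prod[k] = prod[k], 0
            for t in range(3):
                prod[k - 3 + t] = (prod[k - 3 + t] + coef * c[t]) % p
        return tuple(prod[:3])

    def power(a, e):
        result = (1, 0, 0)
        while e:
            if e & 1:
                result = mul(result, a)
            a = mul(a, a)
            e >>= 1
        return result

    n = p**3 - 1        # order of the multiplicative group of GF(p^3)
    v = p * p + p + 1   # order of the quotient by the nonzero scalars of F_p
    one = (1, 0, 0)

    def order(g):
        e, acc = 1, g
        while acc != one:
            acc, e = mul(acc, g), e + 1
        return e

    # A generator of the multiplicative group, found by exhaustive search.
    g = next(a for a in product(range(p), repeat=3) if any(a) and order(a) == n)

    def trace(a):
        # Tr(a) = a + a^p + a^(p^2); the result lies in the base field F_p.
        conjugates = [a, power(a, p), power(a, p * p)]
        return tuple(sum(col) % p for col in zip(*conjugates))

    # Exponents i with Tr(g^i) = 0, reduced modulo v.
    residues, acc = set(), one
    for i in range(n):
        if trace(acc) == (0, 0, 0):
            residues.add(i % v)
        acc = mul(acc, g)
    return sorted(residues), v

d, v = singer_sidon_set(3)
print(len(d), v, is_sidon_mod(d, v))
```

The particular residues returned depend on which irreducible cubic and which generator the search finds first, so only the size $p+1$ and the Sidon property modulo $p^2+p+1$ should be relied on.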

Modular-to-Interval Transfer

A substantial component is the "combinatorial bridge": converting modular Sidon sets (in cyclic groups) into interval Sidon sets within $\{1,\ldots,N\}$, suitable for extremal function analysis. This leverages:

  • Construction of explicit injective projective representatives.
  • Cyclic isomorphism with the explicit calculation of indices in $\mathbb{Z}/(p^2+p+1)\mathbb{Z}$.
  • A cyclic window transfer lemma that, via averaging and distinct gap counts, can transfer modular Sidon sets to interval Sidon sets, with precise cardinality loss control.
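The window idea can be illustrated with a simplified Python sketch. It is not the paper's lemma, whose statement and loss control are more precise; it only shows the averaging principle. A translate of a Sidon set modulo $m$ is still Sidon modulo $m$, and its representatives in $[0, m)$ form a Sidon set of integers, so keeping the part that lands in a window of length $n$ gives a Sidon subset of $\{1,\ldots,n\}$. Summing over all $m$ shifts counts each element exactly $n$ times, so some shift keeps at least $|D| \cdot n / m$ elements. The name `best_window` is hypothetical, and the example uses the standard perfect difference set $\{0, 1, 3, 9\}$ modulo $13$ (the case $p = 3$).

```python
def best_window(residues, modulus, n):
    """Shift a Sidon set modulo `modulus`; keep the part landing in a window of length n.

    Returns a sorted subset of {1, ..., n}. Assumes 1 <= n <= modulus.
    """
    best = []
    for shift in range(modulus):
        window = sorted(
            (r + shift) % modulus + 1          # move from [0, n) to {1, ..., n}
            for r in residues
            if (r + shift) % modulus < n
        )
        if len(window) > len(best):
            best = window
    return best

# Perfect difference set modulo 13, transferred to {1, ..., 7}.
print(best_window([0, 1, 3, 9], 13, 7))
```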

Reusable Lean Infrastructure

The authors develop and verify a suite of reusable Sidon set infrastructure in Lean, including:

  • Definitions of Sidon, modular Sidon, and interval Sidon predicates.
  • Sumset and difference set cardinality exact formulas.
  • Representation function bounds and identities, together with additive energy lemmas.
  • Machinery for upper and lower bounds, exact transfer theorems, and conditional interfaces.
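As a concrete illustration of the kind of exact identity involved, the following sketch checks the standard facts that a Sidon set $A$ with $k$ elements has $|A+A| = k(k+1)/2$ and $|A-A| = k(k-1)+1$, and that each integer has at most two ordered representations as a sum of two elements of $A$. The function name `sumset_stats` is illustrative, and the paper's own lemma statements may be phrased differently.

```python
from collections import Counter

def sumset_stats(a):
    """Return (|A+A|, |A-A|, max ordered representation count) for a nonempty finite set A."""
    elems = sorted(set(a))
    sums = Counter(x + y for x in elems for y in elems)   # counts ordered pairs (x, y)
    diffs = {x - y for x in elems for y in elems}
    return len(sums), len(diffs), max(sums.values())

# Sidon set with k = 4: expect k(k+1)/2 = 10 sums and k(k-1)+1 = 13 differences.
print(sumset_stats([1, 2, 5, 7]))
# Not Sidon: 4 = 1 + 3 = 3 + 1 = 2 + 2 has three ordered representations.
print(sumset_stats([1, 2, 3]))
```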

Formalized Upper and Lower Bounds on $h(N)$

Unconditional formalized results include:

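  • Upper Bounds: Classical pair-difference and Lindström upper bounds are formalized with exact integer-rounded expressions:
    • a pair-difference bound of the form $h(N) \le \sqrt{2N} + O(1)$;
    • a Lindström-type bound of the form $h(N) \le \sqrt{N} + N^{1/4} + O(1)$.
  • Lower Bounds: Using Singer's construction and Bertrand's postulate, the formalization yields an unconditional lower bound of order $\sqrt{N}$, again in exact floor-rounded form.
  • Two-sided Bounds: For $N \ge 5$, the formalization gives $h(N) = \Theta(\sqrt{N})$ with exact floor-rounded finite statements.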

The precise integer form is essential, as floor and rounding effects are notationally and mathematically significant in the formal context.
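The role of integer rounding is visible already in the elementary pair-difference argument. If $A \subseteq \{1,\ldots,N\}$ is a Sidon set with $k$ elements, the $k(k-1)/2$ positive differences $a - b$ with $a > b$ are pairwise distinct and lie in $\{1,\ldots,N-1\}$, so $k(k-1)/2 \le N-1$. The sketch below computes the largest integer $k$ satisfying this inequality; the function name is hypothetical and the exact form of the bound formalized in the paper may differ.

```python
def pair_difference_bound(n):
    """Largest k with k*(k-1)/2 <= n - 1; an upper bound for h(n) when n >= 1."""
    k = 1
    while (k + 1) * k // 2 <= n - 1:   # does k + 1 still satisfy the inequality?
        k += 1
    return k

for n in (7, 12, 100):
    print(n, pair_difference_bound(n))
```

For $N = 7$ the bound equals 4 and is attained by $\{1, 2, 5, 7\}$, whereas for larger $N$ it overshoots the true value, which is where the sharper Lindström-type argument matters.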

Conditional Reduction and Dependency on Analytic Hypotheses

A core feature is the conditional reduction from prime-gap hypotheses to sharp Sidon extremal bounds:

  • Subpolynomial Prime Gap Hypothesis (weaker than Cramér's Conjecture): For every $\varepsilon > 0$, there is a prime in the interval $[x, x + x^\varepsilon]$ for all sufficiently large $x$.
  • Combined with the Singer family construction, this hypothesis yields the lower bound $h(N) \ge \sqrt{N} - O_\varepsilon(N^\varepsilon)$ for all $\varepsilon > 0$. Together with a subpolynomial upper-error hypothesis for $h(N)$, this gives the Erdős Problem 30 estimate conditionally.

All implications, bounds, and logical structures are explicitly mechanized, making arguments about necessity and sufficiency of hypotheses and transfer steps machine-checkable.
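The dependence on prime gaps can be seen in a simplified numerical sketch. A Sidon set modulo $p^2+p+1$ with $p+1$ elements, taken with representatives in $\{1,\ldots,p^2+p+1\}$, is a Sidon set of integers, so $h(N) \ge p+1$ whenever $p$ is prime and $p^2+p+1 \le N$. The code below picks the largest such prime; the distance between $\sqrt{N}$ and this prime is what the prime-gap hypothesis controls. The function names are hypothetical, and this is a cruder bound than the paper's transfer theorems.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def singer_lower_bound(n):
    """Lower bound for h(n): p + 1 for the largest prime p with p*p + p + 1 <= n."""
    p = isqrt(n)
    while p >= 2 and not (is_prime(p) and p * p + p + 1 <= n):
        p -= 1
    # For n < 7 no prime qualifies; any set with at most two elements is Sidon.
    return p + 1 if p >= 2 else min(n, 2)

for n in (30, 31, 57, 10_000):
    print(n, singer_lower_bound(n), isqrt(n))
```

Comparing $N = 30$ with $N = 31$ shows the effect of the threshold $p^2+p+1 \le N$: the bound jumps from 4 (using $p = 3$) to 6 (using $p = 5$).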

Lessons from Formalization

The formalization surfaces several structural insights not always explicit in informal mathematics:

  • Explicit Projective Representatives: Mechanization requires explicit basis selection and functional representatives for projective lines in the trace-kernel subspace, making precise what is typically implicit.
  • Intermediate Type Bridges: The modular Sidon predicate functions as a natural intermediate between algebraic group constructions and combinatorial sets of integers.
  • Non-constructivity in $h(N)$: The extremal function $h(N)$, though defined via a maximum over a finite set, is formalized non-constructively (via choice) and is not computable in Lean.
  • Explicit Floor Effects: Integer rounding and thresholds in transfer theorems carry real mathematical import in the formal proofs.

Implications and Future Directions

This Lean 4 formalization provides a reliable foundation for further research in formal additive combinatorics:

  • Theoretical Value: The decomposition of the problem into formalized algebraic, combinatorial, and analytic components clarifies the precise mathematical interfaces and hypotheses required at each stage of the proof.
  • Practical Value: The Sidon set infrastructure module in Lean is a reusable asset for future formalizations in extremal combinatorics, and supports efforts toward a machine-verified solution of Erdős-type extremal problems.
  • Extensions: Possible directions include optimizing the formalized lower bounds with stronger analytic number-theoretic input (Baker–Harman–Pintz bounds), generalizing the construction to prime powers ($q = p^k$) via field towers, and extending the theory to modular and Fourier-analytic methods for more general extremal and representation results.

Conclusion

The paper realizes a fully verified, flexible formal Lean 4 artifact for the prime-field Singer construction, associated Sidon set extremal combinatorics, and the machinery to transfer algebraic constructions into combinatorial statements relevant to Erdős's Problem 30. This work both enforces rigorous accounting of all classical dependencies and serves as a robust base for ongoing formalization in additive combinatorics and related domains.

Reference: "Formalizing the Prime-Field Singer Construction and Sidon Set Infrastructure in Lean 4" (2605.03274)
