Extremal descendant integrals on moduli spaces of curves: An inequality discovered and proved in collaboration with AI
Abstract: For the pure $ψ$-class intersection numbers $D(\textbf{e})=\langle τ_{e_1} \cdots τ_{e_n} \rangle_g$ on the moduli space $\overline{\mathcal{M}}_{g,n}$ of stable curves, we determine for which choices of $\textbf{e}=(e_1, \ldots, e_n)$ the value of $D(\textbf{e})$ becomes extremal. The intersection number is minimal for powers of a single $ψ$-class (i.e. all $e_i$ but one vanish), whereas maximal values are obtained for balanced vectors ($|e_i - e_j| \leq 1$ for all $i,j$). The proof uses the nefness of the $ψ$-classes combined with Khovanskii--Teissier log-concavity. Apart from the mathematical content, this paper is also meant as an experiment in collaborations between human mathematicians and AI models: the proof of the above result was found and formulated by the AI models GPT-5 and Gemini 3 Pro. Large parts of the paper were drafted by Claude Opus 4.5, and a part of the argument was formalized in Lean with the help of Claude Code and GPT-5.2. The paper aims for maximal transparency on the authorship of different sections and the employed AI tools (including prompts and conversation logs).
Explain it Like I'm 14
What is this paper about?
This paper looks at a special kind of number that comes from geometry, called a “descendant integral” (also called a “psi-class intersection number”). These numbers come from studying all possible shapes of curves (like loops) that have marked points on them. The main discovery is surprisingly simple: if you keep the total “amount” fixed and choose how to split it across the marked points, then:
- You get the smallest number when you put everything on one point.
- You get the biggest number when you split things as evenly as possible across all points.
The paper also highlights that this result was first guessed and then proved with the help of AI, and parts of the logic were checked by a proof assistant (Lean).
What question are they trying to answer?
Imagine you have a fixed total number (call it $d$) and $n$ buckets (the marked points). You decide how many “tickets” $e_i$ to put in each bucket, as long as $e_1 + \cdots + e_n = d$. For each such split, geometry gives you a number $D(\textbf{e})$.
The paper asks:
- For which splits is $D(\textbf{e})$ as big as possible?
- For which splits is it as small as possible?
They prove:
- Minimum: when all tickets are in one bucket and the others have zero.
- Maximum: when the tickets are as evenly spread as they can be (all within 1 of each other). This is called a balanced split.
How do they approach it? (With simple analogies)
First, a bit of friendly translation:
- “Moduli space of curves” = the grand catalog of all shapes of curves with labeled points.
- “Psi-classes” = geometric data attached to each marked point that you can think of as measuring “how the curve is allowed to wiggle near that point.”
- “Intersection number” or “descendant integral” = a way to combine these wiggle-measures across the points and get a single number that summarizes something about all curves in the catalog.
Key ideas in the proof:
- Think of moving tickets between just two buckets at a time and watching how the number changes. If you plot the values you get while sliding tickets from one to the other, you get a “hill-shaped” sequence: it increases up to the middle and then decreases. In math, this shape is called unimodality; here it follows from a stronger property called log-concavity, together with symmetry.
- The “hill-shaped” behavior comes from a powerful geometric inequality (the Khovanskii–Teissier inequality), which says these numbers behave nicely when you mix contributions from two points. You can think of it like: carefully blending two ingredients yields a smooth curve with a single peak, not many bumps.
- Symmetry also helps: swapping two marked points doesn’t change the result.
- From these two facts (hill-shape and symmetry), you can prove two “moves”:
- Balancing move: if one bucket has at least 2 more tickets than another, moving one ticket from the bigger to the smaller does not decrease $D(\textbf{e})$; it usually increases it. Repeating this makes the split more even and pushes $D$ upward toward its maximum.
- Concentrating move: to make $D(\textbf{e})$ small, you keep moving tickets toward the largest bucket. Repeating this concentrates everything into one bucket and pushes $D$ downward toward its minimum.
Finally, they use a known formula to compute the exact value of the minimum.
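The hill shape can be derived in a few lines. The following is a sketch reconstructed from the log-concavity, palindromicity, and positivity statements used in the paper; the ratio $r_t$ and the exact layout are introduced here only for illustration and are not taken from the paper's text.

```latex
% Two-point slice: freeze all exponents except e_i and e_j, and put q = e_i + e_j.
\begin{align*}
  S_t &:= D(\ldots, e_i = t, \ldots, e_j = q - t, \ldots), \qquad 0 \le t \le q, \\
  % Inputs: positivity, log-concavity (Khovanskii--Teissier), palindromicity (symmetry).
  S_t &> 0, \qquad S_t^2 \ge S_{t-1} S_{t+1} \ (1 \le t \le q-1), \qquad S_t = S_{q-t}, \\
  % Ratios are non-increasing by log-concavity, and palindromicity pairs them up.
  r_t &:= \frac{S_t}{S_{t-1}}, \qquad r_1 \ge r_2 \ge \cdots \ge r_q, \qquad r_{q+1-t} = \frac{1}{r_t}, \\
  % For t in the first half, t <= q+1-t, so r_t >= r_{q+1-t} = 1/r_t.
  t \le \tfrac{q+1}{2} &\;\Longrightarrow\; r_t^2 \ge 1 \;\Longrightarrow\; S_{t-1} \le S_t .
\end{align*}
```

Taking $t = e_j + 1$ with $e_i \geq e_j + 2$ gives $t \leq q/2$, so moving one ticket from bucket $i$ to bucket $j$ does not decrease $D$; this is the balancing move, and reading the same inequality backwards gives the concentrating move.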
What did they find?
- Minimum: $D(\textbf{e})$ is smallest when all tickets sit in one bucket and the rest are zero. Its exact value is $D(3g-3+n, 0, \ldots, 0) = \frac{1}{24^g \, g!}$, where $g$ is the genus.
You can think of this as the “single-point” case.
- Maximum: $D(\textbf{e})$ is largest when the tickets are split as evenly as possible, meaning all $e_i$ are equal or differ by at most 1. For example, if $d = 7$ and $n = 4$, then balanced choices are permutations of $(2, 2, 2, 1)$.
- In the simplest world (genus 0), there’s an exact formula showing this “even split is best” behavior very clearly. For higher-genus curves (more complicated shapes), there’s no simple universal formula, but the same “even is best” rule still holds thanks to the geometric inequality and symmetry.
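The genus-0 case can be checked by brute force. The sketch below assumes the standard genus-0 closed formula $\langle τ_{e_1} \cdots τ_{e_n} \rangle_0 = (n-3)!/(e_1! \cdots e_n!)$; the function names are illustrative and do not come from the paper or from any library.

```python
from fractions import Fraction
from math import factorial


def genus0_descendant(e):
    """Genus-0 value (n-3)! / (e_1! ... e_n!); requires sum(e) == len(e) - 3."""
    n = len(e)
    if n < 3 or sum(e) != n - 3:
        raise ValueError("need n >= 3 and sum(e) == n - 3")
    denom = 1
    for x in e:
        denom *= factorial(x)
    return Fraction(factorial(n - 3), denom)


def weak_compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in weak_compositions(total - first, parts - 1):
            yield (first,) + rest


def is_balanced(e):
    """All entries within 1 of each other."""
    return max(e) - min(e) <= 1


def is_concentrated(e):
    """All entries but (at most) one are zero."""
    return sorted(e)[:-1] == [0] * (len(e) - 1)


# Enumerate every split for n = 6 marked points (total d = n - 3 = 3).
n = 6
values = {e: genus0_descendant(e) for e in weak_compositions(n - 3, n)}
best, worst = max(values.values()), min(values.values())
maximizers = [e for e, v in values.items() if v == best]
minimizers = [e for e, v in values.items() if v == worst]
print("max", best, "attained only at balanced splits:", all(map(is_balanced, maximizers)))
print("min", worst, "attained only at concentrated splits:", all(map(is_concentrated, minimizers)))
```

This only illustrates the genus-0 statement; in higher genus no such closed formula is available and the paper's argument goes through the geometric inequality instead.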
Why this matters:
- These intersection numbers are central to deep theorems in geometry and mathematical physics (like the Witten–Kontsevich theorem), so knowing their extreme values and how they change when you move “weight” between points gives a new, clean understanding of their behavior.
- The result is simple to state but not obvious to prove for higher-genus curves, so it’s a nice conceptual discovery.
What’s the wider impact?
- For geometry: This gives a new “optimization viewpoint” on intersection numbers. It suggests similar “even-is-best” questions for other important counting problems in geometry (like Hodge integrals or Hurwitz numbers).
- For computation: It offers guidance on where to look for large or small values when exploring or testing algorithms.
- For AI in math: The result was conjectured, proved, and partially formalized with help from AI systems. The paper also proposes practical ways to credit AI contributions in mathematical writing. It shows that AI can help with nontrivial insights and that proofs can be partially checked by formal proof assistants, increasing confidence in the results.
- Open question: Is there a simple, general formula for the maximum value in the balanced case (like there is for the minimum)? In genus 0, yes; in higher genus, this remains a natural challenge.
Knowledge Gaps
Below is a concise list of concrete knowledge gaps, limitations, and open questions highlighted or implied by the paper. Each point is phrased to guide follow-up work.
- Explicit evaluation of the balanced maximum: Find a closed formula, recursion, or generating function for the balanced invariant when $3g-3+n=an+b$ (see equation (3)); in particular, in the regime , , reduce via the dilaton to determining the sequence and provide an explicit description.
- Strictness and plateau classification: Determine necessary and sufficient conditions for equality in the Khovanskii–Teissier (KT) step along two-point slices (when $S_t^2 = S_{t-1} S_{t+1}$), and classify all $(g,n)$ for which “plateaus” occur (i.e., when several vectors that are not permutations of one another attain the same maximal/minimal value).
- Uniqueness of extrema: Beyond symmetry, characterize when the maximizer/minimizer is unique; identify all local maxima/minima (are all local maxima permutations of balanced vectors and all local minima concentrated vectors?).
- Schur concavity and majorization: Elevate the balancing-step monotonicity to a full majorization result by proving that $D$ is Schur-concave on the set of weak compositions of $3g-3+n$ into $n$ parts (and determining when strict Schur concavity holds). Provide a precise statement and proof (or counterexamples) for the global majorization order.
- Quantitative stability bounds: Derive explicit multiplicative/additive bounds comparing $D(\textbf{e})$ to the balanced maximum in terms of the $L^1$-distance of $\textbf{e}$ from the set of balanced vectors; establish “robustness” inequalities quantifying how much $D$ improves per balancing move.
- Asymptotics of balanced invariants: Determine the asymptotic growth of in regimes such as (i) fixed , ; (ii) fixed ratio with ; (iii) jointly. Compare to the genus-0 multinomial formula and to KdV/Virasoro asymptotics.
- Structural properties of the multivariate generating polynomial: Study the polynomial (with ). Is Lorentzian/strongly log-concave/stable in the sense of Brändén–Huh? A positive answer would yield stronger inequalities and refined extremal structure.
- Alternative (integrable) proof strategy: Seek a proof of the extremal property using only Virasoro/KdV constraints or topological recursion, potentially yielding sharper inequalities or explicit formulae for balanced values, avoiding nefness/KT machinery.
- Full formal verification: Extend the Lean formalization to include the geometric input (intersection theory on Deligne–Mumford stacks, nefness of $ψ$-classes, KT inequalities (or mixed Hodge–Riemann) on stacks, and finite level-structure covers) to complete a fully machine-verified proof of Theorem 1.
- Positivity axiom justification: Provide a direct geometric proof (independent of recursive/analytic methods) of strict positivity of all pure $ψ$-integrals (or classify any exceptional zero cases if they exist). This would tighten the hypotheses of the abstract optimization theorem.
- Extension to mixed tautological integrals: Investigate whether analogous extremal results hold for intersections involving $κ$- and/or $λ$-classes or mixed $ψ$–$κ$–$λ$ products; determine when KT-type log-concavity (or its variants) still applies.
- Hassett (weighted) spaces and variations of stability: Generalize the extremal result to with weights , where the relevant divisor classes and constraints change; identify the appropriately “balanced” exponents relative to weights and prove maximality/minimality.
- Other enumerative families: Formulate and study analogous extremal questions for Hodge integrals, double Hurwitz numbers, and Gromov–Witten invariants of targets (e.g., ), identifying axioms (symmetry, log-concavity) under which balanced exponents (or analogues) maximize.
- Equality conditions in KT on $\overline{\mathcal{M}}_{g,n}$: Analyze when proportionality/equality cases of KT or mixed Hodge–Riemann can occur for $ψ_i, ψ_j$ restricted to complete intersections defined by the remaining $ψ$-classes; connect to geometric loci producing plateaus.
- Refined monotonicity under string/dilaton moves: Map how $D$ behaves under the addition/removal of $τ_0$ and $τ_1$ insertions while maintaining balancedness; identify monotonicity trends and bounds across as varies via string/dilaton equations.
- Algorithmic and computational advances: Develop fast algorithms leveraging symmetry, Virasoro recursions, and the extremal structure to compute balanced values for large ; produce extended tables beyond and assess numerical stability/precision.
- General abstract setting: Identify broader classes of symmetric, log-concave functionals on weak compositions (beyond descendant integrals) where the same extremal phenomenon holds; provide necessary and sufficient axioms ensuring balanced maxima and concentrated minima.
- Phenomenology at small $(g,n)$: Systematically catalog all instances of non-uniqueness and plateaus for small genera/markings; use these to conjecture general equality patterns and to test predictions about KT equality cases.
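The Schur-concavity question in the list above can at least be explored numerically in genus 0, where a closed formula exists. The sketch below assumes the standard genus-0 formula $(n-3)!/(e_1! \cdots e_n!)$ and uses hypothetical helper names; it says nothing about higher genus, where the question is open.

```python
from fractions import Fraction
from math import factorial


def genus0_descendant(e):
    """Genus-0 value (n-3)! / (e_1! ... e_n!); requires sum(e) == len(e) - 3."""
    n = len(e)
    assert n >= 3 and sum(e) == n - 3
    denom = 1
    for x in e:
        denom *= factorial(x)
    return Fraction(factorial(n - 3), denom)


def partitions(total, parts, largest=None):
    """Yield non-increasing tuples of `parts` nonnegative integers summing to `total`."""
    if largest is None:
        largest = total
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(min(total, largest), -1, -1):
        for rest in partitions(total - first, parts - 1, first):
            yield (first,) + rest


def majorizes(a, b):
    """True if a majorizes b: equal totals and dominating sorted partial sums."""
    if len(a) != len(b) or sum(a) != sum(b):
        return False
    partial_a = partial_b = 0
    for x, y in zip(sorted(a, reverse=True), sorted(b, reverse=True)):
        partial_a += x
        partial_b += y
        if partial_a < partial_b:
            return False
    return True


def schur_concavity_violations(n):
    """Pairs (a, b) with a majorizing b but D(a) > D(b); empty if Schur-concave."""
    # By symmetry it suffices to test sorted representatives.
    reps = list(partitions(n - 3, n))
    return [(a, b) for a in reps for b in reps
            if majorizes(a, b) and genus0_descendant(a) > genus0_descendant(b)]


print(schur_concavity_violations(7))
```

An empty list for a given `n` is only evidence for that case, not a proof; any non-empty output in a higher-genus variant of this test would be a counterexample worth recording.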
Practical Applications
Practical Applications Derived from the Paper
Below are practical, real-world applications that follow from the paper’s findings (extremizers of descendant integrals), its abstract optimization theorem and proof technique (symmetry + log-concavity ⇒ balanced maxima, concentrated minima), its formalization workflow (Lean blueprint), and its AI-assisted research methodology (IMProofBench, attribution practices).
Immediate Applications
- Academically tuned optimizers for enumerative geometry (academia, software)
- Use case: When searching over ψ-descendant invariants on $\overline{\mathcal{M}}_{g,n}$, the paper identifies where maxima (balanced exponents) and minima (fully concentrated) occur, immediately narrowing computational searches and bounding ranges.
- Tools/workflows: Add a “balance-first” optimization pass to Sage/admcycles scripts to prune search spaces; include automated checks that reduce to one-point invariants for minima using the string equation; unit tests built from the closed minimal value $1/(24^g \, g!)$.
- Assumptions/dependencies: The result is specific to ψ-intersection numbers; automated pruning is valid when the log-concavity and symmetry conditions are known to hold for the target invariant family.
- Generic discrete allocation heuristic under symmetry and log-concavity (software, operations research)
- Use case: For any objective D(e_1,…,e_n) over integer allocations with (i) symmetry in coordinates, (ii) strict positivity, (iii) discrete two-coordinate log-concavity, the paper’s balancing step provides a provably monotone hill-climbing procedure to the global maximizer (a balanced vector), and a concentrating step to a global minimizer.
- Tools/products: A small library that exposes “balance” and “concentrate” primitives for discrete allocations; drop-in heuristic for combinatorial optimizers when objective verification (symmetry + log-concavity) is available.
- Sectors and examples:
- Cloud/compute: allocate identical tasks across identical servers when per-server throughput is concave in load; the heuristic says: equalize loads to maximize total throughput.
- Manufacturing/operations: split work across identical stations with diminishing returns; balance to maximize output; concentrate to probe worst-case stress.
- A/B/n experimentation: allocate traffic evenly across identical variants if response metric is concave in exposure.
- Assumptions/dependencies: Assets/resources must be homogeneous (symmetry); performance must be (discretely) log-concave in each two-coordinate “slice” (diminishing returns). Heterogeneity or non-concavity invalidates the guarantee.
- Portfolio and budget splits across identical options (finance, policy)
- Use case: When assets/programs are interchangeable and the performance/utility is concave in the allocation to each (and symmetric), equal-weight allocations maximize the objective; conversely, the most concentrated allocation minimizes it.
- Tools/workflows: “Balanced-by-default” portfolio or grant allocation policies with an explicit concavity/symmetry checklist; quick sensitivity analyses via the concentrate/balance extremes as bounds.
- Assumptions/dependencies: Interchangeability of options and diminishing returns must be defensible; otherwise balance may not be optimal.
- Load balancing and autoscaling rules of thumb (software, energy)
- Use case: In identical-server clusters or identical energy storage units with concave yield functions, enforce balancing moves (shift 1 unit from heavy to light) to monotonically improve total yield.
- Tools/workflows: Kubernetes/HPC schedulers that embed a certified “balancing step” as a local improvement move under concavity checks; grid/distributed-storage controllers default to equalizing state-of-charge when value is concave in charge.
- Assumptions/dependencies: Homogeneous resources; verified concavity of performance curves.
- Fast feasibility bounds for other intersection-theoretic families (academia)
- Use case: For Hodge integrals, Hurwitz numbers, or other birational-geometric invariants that can be shown to satisfy symmetry and slice log-concavity (e.g., via Hodge–Riemann or Alexandrov–Fenchel-type inequalities), immediately infer that balanced exponents maximize and concentrated minimize. Useful for sanity checks and bounding search.
- Dependencies: Requires proving the same nef/log-concavity properties for the specific classes/invariants.
- Lean template for optimization-on-compositions proofs (academia, software)
- Use case: Reuse the formalized “Optimization Theorem” (symmetry + LC + positivity ⇒ balanced maxima, concentrated minima) for other discrete optimization theorems in combinatorics or geometry.
- Tools/workflows: Import the BalancedVectors Lean blueprint as a lemma template; plug in a domain-specific verification of S/LC/P; get machine-checked extremizers “for free.”
- Assumptions/dependencies: Must confirm S/LC/P in the new setting; mathlib coverage dictates ease of integration.
- AI research workflow and attribution practices (academia, publishing)
- Use case: Adopt the paper’s methodology section structure, prompt links, and LaTeX attribution macros to disclose AI involvement in discovery, proof, and writing.
- Tools/products: The attribution-macros.tex file; “Methods/AI-use” section template; integration into journal LaTeX classes.
- Assumptions/dependencies: Editorial policy alignment; minimal friction for authors.
- Benchmarking and regression tests (software, academia)
- Use case: Minimal one-point value $1/(24^g \, g!)$ and balanced-max characterization become regression tests for codebases that compute ψ-descendants.
- Tools/workflows: Continuous-integration checks in Sage/admcycles pipelines.
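The “balance” and “concentrate” primitives mentioned in the list above could look roughly like the following minimal sketch. The function names and the toy objective $1/(e_1! \cdots e_n!)$ (symmetric, positive, and log-concave along two-coordinate slices) are assumptions made for illustration; this is not code from the paper or from admcycles.

```python
from fractions import Fraction
from math import factorial


def toy_objective(e):
    """Hypothetical stand-in objective: 1 / (e_1! ... e_n!)."""
    denom = 1
    for x in e:
        denom *= factorial(x)
    return Fraction(1, denom)


def balance(e):
    """Move one unit from a largest to a smallest entry until the vector is balanced.

    Returns the list of visited vectors, starting with the input.
    """
    e = list(e)
    trace = [tuple(e)]
    while max(e) - min(e) >= 2:
        hi = e.index(max(e))
        lo = e.index(min(e))
        e[hi] -= 1
        e[lo] += 1
        trace.append(tuple(e))
    return trace


def concentrate(e):
    """Move one unit from the smallest positive entry to a largest entry
    until a single entry carries the whole total."""
    e = list(e)
    trace = [tuple(e)]
    while sum(1 for x in e if x > 0) > 1:
        hi = e.index(max(e))
        donors = [i for i, x in enumerate(e) if x > 0 and i != hi]
        lo = min(donors, key=lambda i: e[i])
        e[lo] -= 1
        e[hi] += 1
        trace.append(tuple(e))
    return trace


# Objective values along each trace: non-decreasing, then non-increasing.
up = balance((4, 0, 0, 1, 0, 0, 0, 0))
down = concentrate((2, 1, 1, 1, 0, 0, 0, 0))
print([str(toy_objective(e)) for e in up])
print([str(toy_objective(e)) for e in down])
```

The monotonicity along each trace is guaranteed only when the objective satisfies the symmetry, positivity, and slice log-concavity hypotheses; for other objectives the same moves are just a local-search heuristic.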
Long-Term Applications
- Certified “balance-or-concentrate” solvers for resource allocation (software, operations research, energy, cloud)
- Vision: A solver that (i) detects/learns symmetry and slice log-concavity of a black-box discrete objective, (ii) auto-derives extremizers (balanced maxima / concentrated minima), (iii) uses these as warm starts and bounds for global optimization.
- Dependencies: Robust, possibly data-driven, tests for discrete log-concavity; tooling for partial heterogeneity (near-symmetry) and uncertainty.
- Formalization of broader algebraic geometry in Lean (academia, formal methods)
- Vision: Extend machine-checked libraries to cover moduli of curves, nef cones, intersection theory, and Hodge–Riemann machinery, enabling push-button certificates for inequalities like Khovanskii–Teissier.
- Dependencies: Community effort to build mathlib coverage; efficient tactics for intersection theory.
- General certification frameworks for Schur-convex/-concave objectives (software, policy, finance)
- Vision: A practical certification toolkit that, under verifiable structural conditions (symmetry, LC/majorization), guarantees that “as equal as possible” allocations maximize a target metric. Useful in fairness-oriented policy design and aggregate risk management.
- Potential products: APIs that return both a certification report and a recommended balanced allocation; “fairness-by-concavity” audit modules.
- Dependencies: Domain-specific justification of concavity and interchangeability; careful handling of constraints and side-effects.
- Auto-discovery and proof loops for mathematics (academia, AI research)
- Vision: End-to-end pipelines combining evolutionary search (OpenEvolve-style), LLMs, CAS, and proof assistants to propose conjectures, generate proofs, and formally verify them.
- Dependencies: Stronger agentic tooling, improved math-aware LLMs, tighter CAS–Lean integrations, curated benchmarks (e.g., IMProofBench expansions).
- Extreme-value heuristics for complex scientific models with concavity structure (science, engineering)
- Vision: Identify model subspaces where outputs are symmetric and concave along “pairwise transfers,” then apply the balancing/concentrating principles to quickly locate best/worst cases, reducing the need for brute-force sweeps.
- Sectors: Epidemiology (identical subpopulations), materials design (identical units/components), reliability engineering (identical redundant systems).
- Dependencies: Structural validation of symmetry/diminishing returns; robustness to noise and model misspecification.
- Editorial standards and tooling for AI attribution (publishing, academia)
- Vision: Field-wide adoption of visual attribution for AI-authored content, standardized “Methods: AI use” sections, and archiving of prompts/artifacts for reproducibility.
- Dependencies: Journal policies; incentives for compliance.
- Extending extremal principles to new enumerative invariants (academia)
- Vision: Systematically test the balanced-max/concentrated-min paradigm for other intersection-theoretic sequences (Hodge integrals, double Hurwitz numbers), possibly revealing new inequalities or closed forms for balanced cases.
- Dependencies: Establishing nefness or mixed Hodge–Riemann-type inputs; scalable computational verification.
- Decision-support for equitable public resource allocation (policy)
- Vision: Build analytic modules that recommend equalized distributions when objective metrics are designed to be symmetric and concave, with transparent certificates. Provide sensitivity to heterogeneity and equity constraints.
- Dependencies: Clear, agreed-upon utility design; legal/ethical guidelines; interpretability tooling.
Notes on assumptions and dependencies (common across items):
- Symmetry/interchangeability: All “balanced is best” conclusions presuppose that options (servers, assets, programs) are indistinguishable in the objective.
- Diminishing returns/log-concavity: The objective must exhibit discrete log-concavity along two-coordinate transfers (a strong, testable diminishing-returns property).
- Positivity: The abstract theorem requires strictly positive objective values on the feasible set.
- Constraints: Additional constraints (capacity, indivisibilities beyond unit moves, heterogeneity) can break optimality of balance; use the balancing/concentrating outputs as bounds or warm starts rather than final answers when assumptions are only approximately met.
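Because these assumptions are described as testable, a brute-force check on small instances is one way to screen a candidate objective before relying on the balancing guarantee. The sketch below uses hypothetical names (`check_hypotheses`, `inverse_factorials`, `one_plus_squares`) and two toy objectives chosen for illustration; passing the check on small cases is evidence, not a proof.

```python
from fractions import Fraction
from math import factorial


def weak_compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in weak_compositions(total - first, parts - 1):
            yield (first,) + rest


def check_hypotheses(objective, total, parts):
    """Exhaustively test positivity, symmetry, and two-coordinate log-concavity."""
    comps = list(weak_compositions(total, parts))
    value = {e: objective(e) for e in comps}
    positive = all(v > 0 for v in value.values())
    # Symmetry: every vector has the same value as its sorted rearrangement.
    symmetric = all(value[e] == value[tuple(sorted(e))] for e in comps)
    log_concave = True
    for e in comps:
        for i in range(parts):
            for j in range(parts):
                if i == j or e[i] < 1 or e[j] < 1:
                    continue
                up = list(e)
                up[i] += 1
                up[j] -= 1
                down = list(e)
                down[i] -= 1
                down[j] += 1
                # Slice condition S_t^2 >= S_{t-1} * S_{t+1}.
                if value[e] ** 2 < value[tuple(up)] * value[tuple(down)]:
                    log_concave = False
    return {"positive": positive, "symmetric": symmetric, "log_concave": log_concave}


def inverse_factorials(e):
    """Toy objective satisfying all three hypotheses: 1 / (e_1! ... e_n!)."""
    denom = 1
    for x in e:
        denom *= factorial(x)
    return Fraction(1, denom)


def one_plus_squares(e):
    """Toy objective that is symmetric and positive but not log-concave."""
    return 1 + sum(x * x for x in e)


print(check_hypotheses(inverse_factorials, 4, 3))
print(check_hypotheses(one_plus_squares, 4, 3))
```

The second objective rewards concentration, so balancing moves would decrease it; the failed log-concavity flag is what signals that the “balanced is best” conclusion should not be applied.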
Glossary
- Alexandrov–Fenchel inequalities: Fundamental convex- and algebraic-geometry inequalities implying log-concavity for mixed intersections of nef classes. "The log-concavity in Step~2 is a special case of the Khovanskii--Teissier (or Alexandrov--Fenchel) inequalities for nef classes;"
- Balanced vector: An exponent vector with entries as equal as possible, differing by at most one. "A vector $\textbf{e}$ is balanced if $|e_i - e_j| \leq 1$ for all $i,j$."
- Complete intersection surface: A surface obtained as the common zero locus of divisors; used to reduce intersection inequalities. "by restricting to a general complete intersection surface and applying the Hodge index theorem,"
- Cotangent line bundle: The line bundle whose fiber is the cotangent space at a marked point on a curve; its first Chern class is the $ψ$-class. "For each marked point $i$, the cotangent line bundle at that marking defines the $ψ$-class $ψ_i$."
- Deligne–Mumford stack: A stacky generalization of a variety parameterizing families of curves with automorphisms; used here for moduli of curves. "The reduction from the Deligne--Mumford stack to a smooth projective variety can be made using a finite level-structure cover;"
- Descendant integrals: Intersection numbers of powers of $ψ$-classes on moduli spaces of curves. "The descendant integrals (or intersection numbers of $ψ$-classes)"
- Dilaton equation: A recursion relating descendant invariants with an insertion of $τ_1$ to ones with fewer marked points. "the dilaton equation allows to reduce all invariants to"
- Double Hurwitz numbers: Enumerative invariants counting branched covers of the sphere with specified ramification over two points. "such as Hodge integrals or double Hurwitz numbers."
- Finite flat cover: A finite, flat morphism used to pass from stacks to schemes while preserving intersection computations up to degree. "Let be a finite flat cover of degree by an irreducible complete scheme "
- Fundamental class: The canonical homology class representing the whole space, used in pushforward/pullback formulas. "together with the pushforward relation of the fundamental class,"
- Hodge index theorem: A signature statement on the intersection form on surfaces, yielding convexity/concavity properties. "by restricting to a general complete intersection surface and applying the Hodge index theorem,"
- Hodge integrals: Intersection numbers involving Hodge classes (Chern classes of the Hodge bundle) on moduli of curves. "such as Hodge integrals or double Hurwitz numbers."
- KdV hierarchy: An infinite system of integrable PDEs; the generating function of descendants is a tau-function for it. "which establishes that their generating function is a $τ$-function for the KdV hierarchy."
- Khovanskii–Teissier inequalities: Intersection-theoretic inequalities for nef divisor classes implying log-concavity of certain sequences. "Khovanskii--Teissier inequalities as described in Variant 1.6.2 of \cite{MR2095471} (equivalently, the mixed Hodge--Riemann bilinear relations) give the discrete log-concavity"
- Log-concavity: A property of sequences $S$ with $S_t^2 \geq S_{t-1} S_{t+1}$; here derived from intersection inequalities. "$S_t^2 \geq S_{t-1} S_{t+1} \qquad (1 \leq t \leq q - 1) \qquad \text{(log-concavity)}.$"
- Mixed Hodge–Riemann bilinear relations: General relations in Hodge theory equivalent to certain intersection inequalities. "(equivalently, the mixed Hodge--Riemann bilinear relations)"
- Moduli space of stable curves: The parameter space of isomorphism classes of stable genus-g curves with n marked points. "The moduli space $\overline{\mathcal{M}}_{g,n}$ of stable curves of genus $g$ with $n$ marked points has dimension $3g-3+n$."
- Nef (numerically effective): A positivity condition on divisor classes ensuring nonnegative intersection with every curve. "It is standard that each $ψ_i$ is nef on $\overline{\mathcal{M}}_{g,n}$."
- Orbifold variety: A space locally modeled on quotients by finite groups; appears when treating stacks like moduli spaces. "on a projective (orbifold) variety of dimension ,"
- Palindromicity: Symmetry of a sequence $S$ with $S_t = S_{q-t}$; here from permuting marked points. "$S_t = S_{q-t} \qquad \text{(palindromicity)}.$"
- Projection formula: A cohomological identity relating pushforward and cup product under a proper map. "Using the projection formula together with the pushforward relation"
- Psi-class: The first Chern class of the cotangent line bundle at a marked point on the moduli space. "defines the $ψ$-class $ψ_i$."
- Pullback: The operation of transporting cohomology classes along a map, used to compare intersections. "this property is preserved under pullback by ,"
- Pushforward: The operation pushing homology/cohomology classes along a proper map, preserving degrees. "together with the pushforward relation "
- String equation: A recursion removing $τ_0$ insertions from descendant integrals. "We apply the string equation"
- Symmetric group: The group of permutations of n elements, acting by relabeling marked points. "By the natural action of the symmetric group permuting the markings on $\overline{\mathcal{M}}_{g,n}$,"
- Tau-function: A generating function satisfying integrable hierarchy equations (e.g., KdV). "which establishes that their generating function is a $τ$-function for the KdV hierarchy."
- Unimodal sequence: A sequence that increases up to a point and then decreases; here derived from log-concavity and symmetry. "Then $S$ is unimodal with maximum at the center:"
- Weak composition: An n-tuple of nonnegative integers summing to a fixed total d. "Throughout this section, we work with weak compositions:"
- Witten–Kontsevich theorem: The result identifying the generating function of ψ-class intersections with the KdV tau-function. "These integrals play a central role in the Witten--Kontsevich theorem \cite{Witten1991, Kontsevich1992},"