Assess whether macro-set selection heuristics align MathLib data with A_n log-density predictions
Investigate whether alternative ways of selecting the macro set in MathLib yield a set for which the three measured relationships better agree with the A_n log-density regime predictions. Candidate approaches include filtering the candidate macro set by in-degree percentile, restricting it to definition-like elements (e.g., declarations whose resulting type is Sort), and related selection or optimization methods. The three relationships are log unwrapped length versus depth, wrapped length versus depth, and log unwrapped length versus wrapped length.
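One way to set up such a comparison is sketched below. It is an illustrative sketch, not the procedure used in the source paper. The `Decl` record and its fields (`in_degree`, `is_sort_valued`, `depth`, `wrapped_len`, `unwrapped_len`), the nearest-rank percentile cutoff, and the use of ordinary least-squares lines as the fitted form are all assumptions made for illustration. The sketch also assumes the per-declaration measurements are already available; a real pipeline would likely need to recompute wrapped length and depth for each candidate macro set and compare the fits against the functional forms the A_n log-density regime actually predicts.

```python
import math
from dataclasses import dataclass


@dataclass
class Decl:
    """Hypothetical record for one candidate macro (all fields are assumed)."""
    name: str
    in_degree: int        # how many other declarations reference this one
    is_sort_valued: bool  # True if the declaration's resulting type is Sort
    depth: int
    wrapped_len: int
    unwrapped_len: int


def percentile_cutoff(values, pct):
    """Nearest-rank percentile of `values` for pct in [0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def select_macro_set(decls, min_pct=0.0, sort_only=False):
    """Keep declarations at or above the in-degree percentile,
    optionally restricted to definition-like (Sort-valued) ones."""
    cutoff = percentile_cutoff([d.in_degree for d in decls], min_pct)
    return [
        d for d in decls
        if d.in_degree >= cutoff and (d.is_sort_valued or not sort_only)
    ]


def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are constant; slope is undefined")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_tot = sum((y - my) ** 2 for y in ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def three_relationships(macros):
    """Fit the three relationships named in the problem statement."""
    depth = [d.depth for d in macros]
    wrapped = [d.wrapped_len for d in macros]
    log_unwrapped = [math.log(d.unwrapped_len) for d in macros]
    return {
        "log_unwrapped_vs_depth": fit_line(depth, log_unwrapped),
        "wrapped_vs_depth": fit_line(depth, wrapped),
        "log_unwrapped_vs_wrapped": fit_line(wrapped, log_unwrapped),
    }
```

Sweeping `min_pct` and toggling `sort_only`, then comparing the fitted slopes and goodness of fit across candidate sets, gives a rough way to see which selection rule moves all three relationships in the same direction.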
References
Whether these or related approaches bring the three metrics into better agreement with the $A_n$ log-density predictions is an open question.
— Compression is all you need: Modeling Mathematics
(2603.20396 - Aksenov et al., 20 Mar 2026) in Subsubsection “Summary and Identifying the ‘Macro Set’” within Section 3