Mean estimation with covariates under synthetic contamination

Generalize the mean estimation framework to the setting where the target mean depends on a vector of covariates instead of remaining fixed across rounds. The goal is to develop unbiased estimators and characterize their variance under the iterative synthetic contamination process with parameter α.

Background

The current analysis treats the mean μ as fixed across rounds, focusing on how synthetic contamination affects the variance of estimators and the optimality of weighting strategies.

The authors identify the extension to covariate-dependent means (a regression-like setting) as an open direction. Addressing it requires new modeling and estimation techniques that remain compatible with iterative contamination by previous-round models.
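To make the regression-like setting concrete, the following Python sketch simulates one hypothetical formalization. Everything specific in it is an assumption rather than the authors' model: the linear mean x·θ, the Gaussian covariates and noise, the reading of α as the probability that a sample is synthetic (the paper's convention may differ), and the helper names `solve_linear`, `ols`, and `simulate`. Each round pools all data collected so far and refits by ordinary least squares, i.e. unweighted ERM.

```python
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, y):
    """Least-squares fit via the normal equations (X^T X) theta = X^T y."""
    d = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(d)] for i in range(d)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(d)]
    return solve_linear(XtX, Xty)


def simulate(theta_true, alpha, rounds, n_per_round, sigma=1.0, seed=0):
    """Return the per-round OLS estimates under an ASSUMED contamination model.

    Assumption (not from the paper): from round 2 onward, each sample is
    synthetic with probability alpha, meaning its response is generated from
    the previous round's estimate instead of theta_true.
    """
    rng = random.Random(seed)
    d = len(theta_true)
    X, y = [], []
    theta_hat = None  # no previous model exists in the first round
    history = []
    for _ in range(rounds):
        for _ in range(n_per_round):
            x = [rng.gauss(0.0, 1.0) for _ in range(d)]
            synthetic = theta_hat is not None and rng.random() < alpha
            theta_src = theta_hat if synthetic else theta_true
            mean = sum(a * b for a, b in zip(x, theta_src))
            X.append(x)
            y.append(mean + rng.gauss(0.0, sigma))
        theta_hat = ols(X, y)  # unweighted ERM on all pooled data
        history.append(theta_hat)
    return history
```

For example, `simulate([1.0, -2.0], alpha=0.3, rounds=5, n_per_round=200)` returns one estimate of θ per round. Repeating the call over many seeds and values of `alpha` gives an empirical view of how the estimator's spread changes with contamination; it does not by itself establish unbiasedness or optimal weighting, which is the open question.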

References

Interesting open problems for mean estimation include fully characterizing the minimum variance unbiased estimator, and allowing the mean to depend on a vector of covariates instead of remaining fixed in every round.

Learning from Synthetic Data: Limitations of ERM (2601.15468 - Amin et al., 21 Jan 2026) in Conclusion