Data-driven Error Estimation: Upper Bounding Multiple Errors without Class Complexity as Input

Published 7 May 2024 in cs.LG and stat.ML (arXiv:2405.04636v3)

Abstract: Constructing confidence intervals that are simultaneously valid across a class of estimates is central for tasks such as multiple mean estimation, bounding generalization error in machine learning, and adaptive experimental design. We frame this as an "error estimation problem," where the goal is to determine a high-probability upper bound on the maximum error for a class of estimates. We propose an entirely data-driven approach that derives such bounds for both finite and infinite class settings, naturally adapting to a potentially unknown correlation structure of random errors. Notably, our method does not require class complexity as an input, overcoming a major limitation of existing approaches such as union bounding and bounds based on Talagrand's inequality. In this paper, we present our simple yet general solution and demonstrate its flexibility through applications ranging from constructing multiple simultaneously valid confidence intervals to optimizing exploration in contextual bandit algorithms.
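To make the union-bounding baseline mentioned in the abstract concrete, here is a minimal sketch of the classical approach, not the paper's method. It assumes k estimates that are each a sample mean of n i.i.d. observations bounded in an interval of width `value_range`; the function names and the Uniform(0, 1) simulation are illustrative assumptions. Hoeffding's inequality combined with a union bound gives a radius that upper-bounds the maximum error with probability at least 1 - delta, but the class size k must be supplied as input and any correlation between the errors is ignored.

```python
import math
import random


def union_bound_radius(n: int, k: int, delta: float, value_range: float = 1.0) -> float:
    """Radius t such that max_j |mean_j - mu_j| <= t with probability >= 1 - delta.

    Derived from Hoeffding's inequality plus a union bound over k estimates:
    solve 2 * k * exp(-2 * n * t**2 / value_range**2) = delta for t.
    Note that the class size k is a required input.
    """
    return value_range * math.sqrt(math.log(2 * k / delta) / (2 * n))


def max_error(n: int, k: int, rng: random.Random) -> float:
    """Largest |sample mean - true mean| over k independent Uniform(0, 1) samples.

    Illustrative simulation only; the true mean of Uniform(0, 1) is 0.5.
    """
    worst = 0.0
    for _ in range(k):
        mean = sum(rng.random() for _ in range(n)) / n
        worst = max(worst, abs(mean - 0.5))
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    n, k, delta = 200, 50, 0.05
    print("union-bound radius:", union_bound_radius(n, k, delta))
    print("observed maximum error:", max_error(n, k, rng))
```

The radius grows with log(k) whether or not the k errors are correlated, which is the looseness the abstract attributes to union bounding; a data-driven bound of the kind the paper proposes is meant to avoid needing k at all.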
