Best Arm Identification with Resource Constraints

Published 29 Feb 2024 in cs.LG (arXiv:2402.19090v2)

Abstract: Motivated by the cost heterogeneity of experimentation across different alternatives, we study the Best Arm Identification with Resource Constraints (BAIwRC) problem. The agent aims to identify the best arm under resource constraints, where resources are consumed with each arm pull. We make two novel contributions. First, we design and analyze the Successive Halving with Resource Rationing algorithm (SH-RR), which achieves a near-optimal non-asymptotic rate of convergence in terms of the probability of successfully identifying an optimal arm. Second, we identify a difference in convergence rates between the cases of deterministic and stochastic resource consumption.
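The abstract does not spell out the details of SH-RR, but the high-level idea of successive halving under a resource budget can be sketched as follows. This is a generic, hypothetical illustration, not the paper's algorithm: the function names (`pull`, `successive_halving_rr`), the even per-round budget split, and the assumption of deterministic per-pull costs are all ours.

```python
import math

def successive_halving_rr(pull, costs, n_arms, budget):
    """Generic successive-halving sketch under a resource budget.

    pull(i) returns a (possibly stochastic) reward for arm i, and
    costs[i] is the deterministic resource consumed per pull of arm i.
    The total budget is rationed evenly across the ~log2(n_arms)
    halving rounds, loosely mirroring the "resource rationing" idea.
    """
    arms = list(range(n_arms))
    n_rounds = max(1, math.ceil(math.log2(n_arms)))
    per_round = budget / n_rounds
    means = {i: 0.0 for i in arms}
    while len(arms) > 1:
        spent = 0.0
        counts = {i: 0 for i in arms}
        sums = {i: 0.0 for i in arms}
        # Pull surviving arms round-robin until this round's ration is spent.
        while spent + min(costs[i] for i in arms) <= per_round:
            for i in arms:
                if spent + costs[i] > per_round:
                    continue  # this arm is too expensive for the remaining ration
                sums[i] += pull(i)
                counts[i] += 1
                spent += costs[i]
        for i in arms:
            if counts[i]:
                means[i] = sums[i] / counts[i]
        # Keep the empirically better half of the arms.
        arms = sorted(arms, key=lambda i: means[i], reverse=True)
        arms = arms[: max(1, len(arms) // 2)]
    return arms[0]
```

For example, with four equal-cost arms whose rewards are simply their true means, the sketch eliminates the two worst arms in the first round and returns the best arm after the second.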


Authors (2)
