EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization
Abstract: Conventional methods for Bayesian optimization (BO) primarily involve one-step optimal decisions (e.g., maximizing expected improvement of the next step). To avoid myopic behavior, multi-step lookahead BO algorithms such as rollout strategies treat BO as a sequential decision-making problem, i.e., a stochastic dynamic programming (SDP) problem, and have demonstrated promising results in recent years. However, owing to the curse of dimensionality, most of these methods make significant approximations or suffer from scalability issues, e.g., being limited to two-step lookahead. This paper presents a novel reinforcement learning (RL)-based framework for multi-step lookahead BO in high-dimensional black-box optimization problems. The proposed method enhances the scalability and decision-making quality of multi-step lookahead BO by efficiently solving the SDP of the BO process in a near-optimal manner using RL. We first introduce an Attention-DeepSets encoder to represent the state of knowledge to the RL agent and employ off-policy learning to accelerate its initial training. We then propose a multi-task fine-tuning procedure based on end-to-end (encoder-RL) on-policy learning. We evaluate the proposed method, EARL-BO (Encoder Augmented RL for Bayesian Optimization), on both synthetic benchmark functions and real-world hyperparameter optimization problems, demonstrating significantly improved performance compared to existing multi-step lookahead and high-dimensional BO methods.
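For readers unfamiliar with the one-step criterion the abstract contrasts against, the following is a minimal sketch of closed-form expected improvement (EI) under a Gaussian posterior, written for maximization. It is not code from the paper: the function name `expected_improvement`, the candidate labels, and the posterior means and standard deviations are all hypothetical values chosen for illustration, and a real BO loop would obtain them from a fitted surrogate such as a Gaussian process.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization, given posterior mean and std at one point.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mu - best) * cdf + sigma * pdf


# Assumed posterior (mean, std) at two candidate points; values are made up.
candidates = {"a": (0.9, 0.05), "b": (0.7, 0.6)}
best_so_far = 1.0

scores = {
    name: expected_improvement(mu, sigma, best_so_far)
    for name, (mu, sigma) in candidates.items()
}
# The myopic rule picks the single candidate with the highest one-step EI.
next_point = max(scores, key=scores.get)
```

In this toy setting the rule selects the more uncertain candidate `b`, since its larger standard deviation outweighs its lower mean. The criterion only scores the immediate next evaluation; it does not account for how that evaluation changes later decisions, which is the limitation multi-step lookahead methods aim to address.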