Optimistic Online Non-stochastic Control via FTRL
Abstract: This paper brings the concept of "optimism" to the new and promising framework of online Non-stochastic Control (NSC). Namely, we study how NSC can benefit from a prediction oracle of unknown quality that forecasts future costs. The problem is first reduced to optimistic learning with delayed feedback, which is handled through the Optimistic Follow the Regularized Leader (OFTRL) algorithmic family. This reduction enables the design of OptFTRL-C, the first Disturbance Action Controller (DAC) with optimistic policy regret bounds. These new bounds are commensurate with the oracle's accuracy, ranging from $\mathcal{O}(1)$ for perfect predictions to the order-optimal $\mathcal{O}(\sqrt{T})$ even when all predictions fail. By addressing the challenge of incorporating untrusted predictions into online control, this work contributes to the advancement of the NSC framework and paves the way toward effective and robust learning-based controllers.
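To make the optimistic-learning ingredient concrete, the following is a minimal sketch of a generic OFTRL update with a Euclidean regularizer and a gradient-prediction hint. It is not the paper's OptFTRL-C controller (which additionally handles delayed feedback and DAC parameterization); the function name, the ball constraint, and the step size `eta` are illustrative assumptions.

```python
import numpy as np

def oftrl_step(grads, prediction, eta, radius=1.0):
    """One Optimistic FTRL step with a Euclidean regularizer:
    x_{t+1} = argmin_{||x|| <= radius} <g_{1:t} + g_hat, x> + ||x||^2 / (2*eta),
    where g_hat is the oracle's prediction of the next gradient.
    With perfect predictions the per-round linearized loss is anticipated
    exactly, which is what drives the O(1) regime in the regret bound."""
    g = prediction + sum(grads)   # accumulated past gradients plus the optimistic hint
    x = -eta * g                  # unconstrained minimizer of the regularized objective
    norm = np.linalg.norm(x)
    if norm > radius:             # project back onto the Euclidean ball
        x *= radius / norm
    return x
```

When the hint `prediction` is the zero vector, this reduces to plain FTRL; the quality of the hint only shifts where the iterate lands, so a bad oracle degrades gracefully rather than breaking the method.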