Online POMDP Planning with Anytime Deterministic Optimality Guarantees

Published 3 Oct 2023 in cs.AI and cs.RO (arXiv:2310.01791v4)

Abstract: Decision-making under uncertainty is a critical aspect of many practical autonomous systems, which must act on incomplete information. Partially Observable Markov Decision Processes (POMDPs) offer a mathematically principled framework for formulating such decision-making problems. However, finding an optimal solution to a POMDP is generally intractable. In recent years, there has been significant progress in scaling approximate online tree search solvers from small to moderately sized problems. Such approximate solvers, however, typically offer only probabilistic or asymptotic guarantees with respect to the optimal solution. In this paper, we derive a deterministic relationship for discrete POMDPs between an approximate solution and the optimal one. We show that, at any point during the search, we can derive bounds relating the current solution to the optimal one. We further show that our derivations both open an avenue for a new set of algorithms and can be attached to existing algorithms of a suitable structure, endowing them with deterministic guarantees at marginal computational overhead. In return, we not only certify the solution quality, but also demonstrate that making a decision based on the deterministic guarantee can yield superior performance compared to the original algorithm without the certification.
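To make the anytime-certification idea concrete, here is a minimal illustrative sketch (not the paper's actual algorithm, whose derivations are POMDP-specific): assume a partially expanded search tree has produced, for each candidate action, a deterministic lower and upper bound on its value. Acting on the bound means selecting the action with the highest certified lower bound; the gap to the largest upper bound then deterministically bounds the suboptimality of that choice at any interruption time. The `bounds` dictionary and function name are hypothetical placeholders.

```python
# Illustrative sketch: anytime decision-making from deterministic value
# bounds. Each action maps to an interval (lower, upper) that is assumed
# to contain its true value after partial tree expansion.

def certified_action(bounds):
    """Pick the action with the highest lower bound and return a
    deterministic optimality gap.

    bounds: dict mapping action -> (lower, upper) value bounds.
    The optimal value cannot exceed the largest upper bound, so the
    chosen action is within `gap` of optimal whenever the intervals
    are valid.
    """
    best_action = max(bounds, key=lambda a: bounds[a][0])
    lower = bounds[best_action][0]
    gap = max(u for (_, u) in bounds.values()) - lower
    return best_action, gap

# Example: three actions with interval estimates from a partial search.
bounds = {"left": (1.0, 4.0), "stay": (2.5, 3.0), "right": (0.5, 5.0)}
action, gap = certified_action(bounds)
# "stay" maximizes the lower bound; its suboptimality is at most
# 5.0 - 2.5 = 2.5 regardless of where the true values fall.
```

Note the contrast with picking the action of highest point estimate: the certified choice may differ, which is one way acting on the deterministic guarantee can change (and, per the paper's experiments, improve) the resulting decision.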
