On the convergence of optimistic policy iteration for stochastic shortest path problem
Published 27 Aug 2018 in cs.LG and stat.ML | (1808.08763v2)
Abstract: In this paper, we prove convergence results for a special case of the optimistic policy iteration algorithm for the stochastic shortest path problem. We consider both Monte Carlo and $TD(\lambda)$ methods for the policy evaluation step, under the condition that the termination state is eventually reached almost surely.