
An Asymptotically Optimal Strategy for Constrained Multi-armed Bandit Problems

Published 3 May 2018 in math.OC, cs.LG, and stat.ML | (arXiv:1805.01237v1)

Abstract: For the stochastic multi-armed bandit (MAB) problem under a constrained model that generalizes the classical one, we show that asymptotic optimality is achievable by a simple strategy extended from the $\epsilon_t$-greedy strategy. We provide a finite-time lower bound on the probability of correct selection of an optimal near-feasible arm that holds for all time steps. Under some conditions, the bound approaches one as time $t$ goes to infinity. A particular example sequence of $\{\epsilon_t\}$ having an asymptotic convergence rate in the order of $(1-\frac{1}{t})^4$, holding from a sufficiently large $t$, is also discussed.
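The abstract describes an $\epsilon_t$-greedy strategy for a constrained bandit: explore uniformly with a probability $\epsilon_t$ that decays over time, and otherwise exploit the empirically best arm among those estimated to be feasible. The sketch below is an illustrative simplification, not the paper's exact algorithm: the decay schedule $\epsilon_t = \min(1, c_0/t)$, the cost threshold `tau`, and the feasibility rule (empirical mean cost at most `tau`) are all assumptions chosen for the example.

```python
import random

def epsilon_t_greedy(arms, horizon, c0=5.0, tau=0.5, rng=None):
    """Illustrative epsilon_t-greedy for a constrained bandit (a sketch,
    not the paper's algorithm).

    arms: list of (reward_sampler, cost_sampler) pairs, each a callable
          taking an rng and returning a sample.
    An arm is treated as 'feasible' when its empirical mean cost is at
    most tau; exploration probability is epsilon_t = min(1, c0 / t).
    Returns the index of the empirically best feasible arm at the horizon.
    """
    rng = rng or random.Random(0)
    k = len(arms)
    counts = [0] * k
    mean_r = [0.0] * k  # empirical mean rewards
    mean_c = [0.0] * k  # empirical mean costs

    # Pull each arm once to initialize the estimates.
    for i, (r_fn, c_fn) in enumerate(arms):
        counts[i] = 1
        mean_r[i] = r_fn(rng)
        mean_c[i] = c_fn(rng)

    for t in range(k + 1, horizon + 1):
        eps = min(1.0, c0 / t)
        # Fall back to all arms if no arm currently looks feasible.
        feasible = [i for i in range(k) if mean_c[i] <= tau] or list(range(k))
        if rng.random() < eps:
            i = rng.randrange(k)                          # explore uniformly
        else:
            i = max(feasible, key=lambda j: mean_r[j])    # exploit best feasible arm
        r, cost = arms[i][0](rng), arms[i][1](rng)
        counts[i] += 1
        mean_r[i] += (r - mean_r[i]) / counts[i]          # incremental mean updates
        mean_c[i] += (cost - mean_c[i]) / counts[i]

    feasible = [i for i in range(k) if mean_c[i] <= tau] or list(range(k))
    return max(feasible, key=lambda j: mean_r[j])
```

With a decaying $\epsilon_t$, the probability of selecting the best feasible arm tends to one, which is the kind of finite-time correct-selection guarantee the paper quantifies.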

Citations (8)


Authors (1)
