2000 character limit reached
An optimal algorithm for bandit convex optimization
Published 14 Mar 2016 in cs.LG and cs.DS | (1603.04350v2)
Abstract: We consider the problem of online convex optimization against an arbitrary adversary with bandit feedback, known as bandit convex optimization. We give the first $\tilde{O}(\sqrt{T})$-regret algorithm for this setting based on a novel application of the ellipsoid method to online learning. This bound is known to be tight up to logarithmic factors. Our analysis introduces new tools in discrete convex geometry.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.