Papers
Topics
Authors
Recent
Search
2000 character limit reached

Sequential Learning of the Pareto Front for Multi-objective Bandits

Published 29 Jan 2025 in stat.ML and cs.LG | (2501.17513v1)

Abstract: We study the problem of sequential learning of the Pareto front in multi-objective multi-armed bandits. An agent is faced with K possible arms to pull. At each turn she picks one, and receives a vector-valued reward. When she thinks she has enough information to identify the Pareto front of the different arm means, she stops the game and gives an answer. We are interested in designing algorithms such that the answer given is correct with probability at least 1-$\delta$. Our main contribution is an efficient implementation of an algorithm achieving the optimal sample complexity when the risk $\delta$ is small. With K arms in d dimensions p of which are in the Pareto set, the algorithm runs in time O(Kpd) per round.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 1 tweet with 14 likes about this paper.