Survey Bandits with Regret Guarantees

Published 23 Feb 2020 in cs.LG, econ.EM, and stat.ML | (2002.09814v1)

Abstract: We consider a variant of the contextual bandit problem. In standard contextual bandits, when a user arrives we observe the user's complete feature vector and then assign a treatment (arm) to that user. In many applications, such as healthcare, collecting features from users can be costly. To address this issue, we propose algorithms that avoid needless feature collection while maintaining strong regret guarantees.