
A Behavioral Model for Exploration vs. Exploitation: Theoretical Framework and Experimental Evidence

Published 3 Jul 2022 in math.OC | (2207.01028v3)

Abstract: How do people navigate the exploration-exploitation (EE) trade-off when making repeated choices with unknown rewards? We study this question through the lens of multi-armed bandit problems and introduce a novel behavioral model, Quantal Choice with Adaptive Reduction of Exploration (QCARE). It generalizes Thompson Sampling, allowing for a principled way to quantify the EE trade-off and reflect human decision-making patterns. The model adaptively reduces exploration as information accumulates, with the reduction rate serving as a parameter to quantify the EE trade-off dynamics. We theoretically analyze how varying reduction rates influence decision quality, shedding light on the effects of "over-exploration" and "under-exploration." Empirically, we validate QCARE through experiments collecting behavioral data from human participants. QCARE not only captures critical behavioral patterns in the EE trade-off but also outperforms alternative models in predictive power. Our analysis reveals a behavioral tendency toward over-exploration.
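
For readers unfamiliar with the baseline that the abstract says QCARE generalizes, the following is a minimal sketch of standard Thompson Sampling on a Bernoulli multi-armed bandit. It is not the paper's QCARE model, whose choice rule and reduction-rate parameter are not specified in the abstract. The function name `thompson_sampling`, the uniform Beta(1, 1) priors, and the example reward probabilities are assumptions made for illustration only.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson Sampling and return pull counts per arm.

    Illustrative baseline only: this is the textbook algorithm, not QCARE.
    `true_means` are hypothetical success probabilities unknown to the agent.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # observed rewards of 1 per arm
    failures = [0] * k   # observed rewards of 0 per arm
    pulls = [0] * k

    for _ in range(horizon):
        # Draw one plausible mean per arm from its Beta posterior
        # (assumed uniform Beta(1, 1) prior on every arm).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(k)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=lambda i: samples[i])

        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.1, 0.9], horizon=1000))
```

In this sketch, exploration shrinks on its own as the posteriors concentrate: early samples are widely spread, so weaker arms are still tried, while later samples cluster near the observed means. The abstract describes QCARE as making the rate of that reduction an explicit parameter.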

Authors (3)
