Papers
Topics
Authors
Recent
Search
2000 character limit reached

Disentangling Exploration of Large Language Models by Optimal Exploitation

Published 15 Jan 2025 in cs.LG, cs.AI, and cs.CL | (2501.08925v2)

Abstract: Exploration is a crucial skill for self-improvement and open-ended problem-solving. However, it remains unclear if LLMs can effectively explore the state-space within an unknown environment. This work isolates exploration as the sole objective, tasking the agent with delivering information that enhances future returns. Within this framework, we argue that measuring agent returns is not sufficient for a fair evaluation and decompose missing rewards into exploration and exploitation components based on the optimal achievable return. Comprehensive experiments with various models reveal that most struggle to sufficiently explore the state-space and weak exploration is insufficient. We observe a positive correlation between parameter count and exploration performance, with larger models demonstrating superior capabilities. Furthermore, we show that our decomposition provides insights into differences in behaviors driven by prompt engineering, offering a valuable tool for refining performance in exploratory tasks.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.