Applicability of LLM-guided action curricula beyond Blackjack

Determine whether the LLM-guided action-subset curriculum, which progressively introduces Blackjack actions in order of complexity, transfers to other reinforcement learning domains, particularly environments with continuous action spaces or settings where actions cannot be partitioned into meaningful subsets of increasing complexity.

Background

The paper introduces an LLM-guided curriculum that structures learning by progressively enabling subsets of Blackjack actions (Hit/Stand, then Double Down, Split, etc.), with advancement governed by adaptive success thresholds set by Google Gemini 2.0 Flash. This approach yields substantial performance and efficiency gains for DQN and tabular agents in an 8-deck Blackjack environment.
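The staging mechanism described above can be sketched as a simple action-masking loop. This is a minimal illustration, not the paper's implementation: the stage definitions, function names, and the heuristic standing in for the Gemini 2.0 Flash threshold call are all assumptions.

```python
# Sketch of a staged action-subset curriculum with action masking.
# Stage order follows the paper's description (Hit/Stand first, then
# Double Down, then Split); everything else is illustrative.

STAGES = [
    {"actions": {"hit", "stand"}},                      # stage 0: core actions
    {"actions": {"hit", "stand", "double"}},            # stage 1: add Double Down
    {"actions": {"hit", "stand", "double", "split"}},   # stage 2: add Split
]

ALL_ACTIONS = ["hit", "stand", "double", "split"]

def action_mask(stage: int) -> list[bool]:
    """Boolean mask over the full action set: True if enabled at this stage."""
    enabled = STAGES[stage]["actions"]
    return [a in enabled for a in ALL_ACTIONS]

def adaptive_threshold(stage: int) -> float:
    """Stand-in for the LLM call that sets the advancement threshold.
    In the paper this role is played by Gemini 2.0 Flash; here a fixed
    heuristic keeps the sketch self-contained and runnable."""
    return 0.40 + 0.02 * stage

def maybe_advance(stage: int, recent_success_rate: float) -> int:
    """Advance to the next stage once the agent beats the threshold."""
    if stage + 1 < len(STAGES) and recent_success_rate >= adaptive_threshold(stage):
        return stage + 1
    return stage

stage = 0
print(action_mask(stage))                    # [True, True, False, False]
stage = maybe_advance(stage, 0.45)           # 0.45 >= 0.40, so advance
print(stage)                                 # 1
print(action_mask(stage))                    # [True, True, True, False]
```

The agent's policy would sample only from actions whose mask entry is True (e.g. by setting masked logits to negative infinity), which is what makes this approach specific to discrete, separable action sets.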

Because the method relies on masking and staging discrete, separable actions, the authors note that it may not directly translate to domains with continuous action spaces or where actions cannot be partitioned into meaningful subsets. The generality of this action-based curriculum beyond Blackjack remains unresolved.

References

"However, the applicability of this specific method to other domains is an open question."

Learning to Play Blackjack: A Curriculum Learning Perspective (2604.00076 - Alasti et al., 31 Mar 2026), Subsection 'Limitations and Future Work'