Adaptive Primitive Selection for Reflective Context Learning

Develop an adaptive selection mechanism for Reflective Context Learning (RCL), a context-space optimization framework that improves an LLM agent's behavior by updating a structured playbook through reflection and mutation. The mechanism should dynamically choose which optimization primitives to activate (batching, grouped rollouts, improved credit assignment, auxiliary losses, failure replay, and optimizer state/momentum) based on the current training phase or observed task properties, with the goal of reducing manual configuration while maintaining or improving performance.
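To make the task statement concrete, one possible shape for such a mechanism is sketched below: a selector that maps observable training signals to a subset of active primitives. All names here (`TrainingState`, its fields, the threshold values) are illustrative assumptions, not anything specified by the paper; only the primitive names come from the list above.

```python
from dataclasses import dataclass
from typing import Protocol

# Primitive names taken from the task statement; the tuple itself is an assumption.
PRIMITIVES = ("batching", "grouped_rollouts", "credit_assignment",
              "auxiliary_losses", "failure_replay", "optimizer_momentum")

@dataclass
class TrainingState:
    """Observable signals a selector might condition on (illustrative)."""
    phase: int                  # index of the current training phase
    recent_success_rate: float  # rolling task success rate in [0, 1]
    playbook_size: int          # number of entries in the RCL playbook

class PrimitiveSelector(Protocol):
    def select(self, state: TrainingState) -> set[str]:
        """Return the subset of PRIMITIVES to activate for the next phase."""
        ...

class ThresholdSelector:
    """Toy rule-based selector: emphasize failure replay while performance is
    low, then switch toward momentum and credit assignment once the agent is
    succeeding more often. The 0.5 threshold is an arbitrary illustration."""
    def select(self, state: TrainingState) -> set[str]:
        active = {"batching"}  # assume batching is cheap enough to keep on
        if state.recent_success_rate < 0.5:
            active.add("failure_replay")
        else:
            active |= {"optimizer_momentum", "credit_assignment"}
        return active
```

A rule-based selector like this is the simplest baseline; the open question in the paper is whether the schedule can be learned rather than hand-written.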

Background

The paper introduces Reflective Context Learning (RCL), a unified framework for context-space optimization where an agent’s structured playbook is iteratively improved through reflection and mutation. The authors study classical optimization primitives adapted to context space, including batching, grouped rollouts, improved credit assignment, auxiliary losses, failure replay, and optimizer state/momentum.

They observe that the effectiveness of these primitives is regime-dependent and that their composition is not additive, which motivates mechanisms that can automatically select and configure primitives during training. In the conclusion, they explicitly identify adaptive primitive selection as an open direction for reducing manual configuration.
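Because primitive effectiveness is regime-dependent, one natural framing is a multi-armed bandit: treat each primitive as an arm, run a training phase with it active, and feed back a scalar improvement signal. The sketch below uses the standard UCB1 rule as an illustrative controller under that framing; the reward model is a stand-in and nothing here is the paper's method.

```python
import math
import random

# Primitive names from the paper's list; treating each as a bandit arm is an assumption.
PRIMITIVES = ["batching", "grouped_rollouts", "credit_assignment",
              "auxiliary_losses", "failure_replay", "optimizer_momentum"]

class UCB1Selector:
    """UCB1 bandit that picks one primitive to activate per training phase.

    Reward can be any scalar improvement signal, e.g. the change in validation
    success rate after the phase. Purely illustrative."""

    def __init__(self, arms, c=1.4):
        self.arms = list(arms)
        self.c = c                                   # exploration coefficient
        self.counts = {a: 0 for a in self.arms}      # pulls per arm
        self.values = {a: 0.0 for a in self.arms}    # running mean reward
        self.total = 0                               # total pulls

    def select(self):
        # Play every arm once before applying the UCB rule.
        for a in self.arms:
            if self.counts[a] == 0:
                return a
        return max(self.arms, key=lambda a: self.values[a]
                   + self.c * math.sqrt(math.log(self.total) / self.counts[a]))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.total += 1
        # Incremental running mean of observed rewards for this arm.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Simulated usage: pretend failure_replay yields the highest reward in this regime.
random.seed(0)
selector = UCB1Selector(PRIMITIVES)
for _ in range(200):
    arm = selector.select()
    reward = random.gauss(0.6 if arm == "failure_replay" else 0.4, 0.1)
    selector.update(arm, reward)
```

After 200 simulated phases the controller concentrates its pulls on the arm with the highest mean reward while still exploring the others, which is the behavior an adaptive selector would need: shifting primitive usage as the observed regime changes. Handling non-additive compositions would require richer arms (subsets of primitives) or a contextual bandit conditioned on training-phase features.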

References

Several directions remain open. Adaptive primitive selection — choosing which primitives to activate based on the current training phase or task properties — could reduce the need for manual configuration.

Reflective Context Learning: Studying the Optimization Primitives of Context Space (arXiv:2604.03189, Vassilyev et al., 3 Apr 2026), Section 6 (Conclusion)