Adaptive Primitive Selection for Reflective Context Learning
Develop an adaptive selection mechanism for Reflective Context Learning (RCL), a context-space optimization framework that updates a structured playbook via reflection and mutation to improve an LLM agent's behavior. The mechanism would dynamically choose which optimization primitives to activate (batching, grouped rollouts, improved credit assignment, auxiliary losses, failure replay, and optimizer state/momentum) based on the current training phase or observed task properties. The goal is to reduce manual configuration while maintaining or improving performance.
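One way such a selector could be prototyped is as an epsilon-greedy bandit over a handful of candidate primitive configurations, with separate statistics per training phase. The sketch below is an illustration under stated assumptions, not part of RCL: the class `PrimitiveSelector`, the phase labels, the candidate sets, and the `fake_score` function are all hypothetical, and the reward would in practice come from whatever evaluation metric the training loop already reports.

```python
import random
from collections import defaultdict

# Primitive names taken from the proposal text; identifiers are illustrative.
PRIMITIVES = frozenset({
    "batching", "grouped_rollouts", "credit_assignment",
    "auxiliary_losses", "failure_replay", "optimizer_momentum",
})


class PrimitiveSelector:
    """Epsilon-greedy choice among candidate primitive sets, per phase.

    Hypothetical helper: not an RCL API.
    """

    def __init__(self, candidates, epsilon=0.1, seed=0):
        self.candidates = [frozenset(c) for c in candidates]
        if not self.candidates:
            raise ValueError("need at least one candidate configuration")
        unknown = set().union(*self.candidates) - PRIMITIVES
        if unknown:
            raise ValueError(f"unknown primitives: {sorted(unknown)}")
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # (phase, configuration) -> [sum of rewards, number of trials]
        self.stats = defaultdict(lambda: [0.0, 0])

    def select(self, phase):
        # Explore: with probability epsilon, pick a random configuration.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.candidates)
        # Try every configuration once in this phase before exploiting.
        for config in self.candidates:
            if self.stats[(phase, config)][1] == 0:
                return config
        # Exploit: pick the configuration with the best mean reward so far.
        def mean(config):
            total, count = self.stats[(phase, config)]
            return total / count
        return max(self.candidates, key=mean)

    def update(self, phase, config, reward):
        entry = self.stats[(phase, config)]
        entry[0] += reward
        entry[1] += 1


# Usage with a stand-in reward (assumption: more primitives score higher).
def fake_score(config):
    return float(len(config))

selector = PrimitiveSelector(
    candidates=[set(), {"batching"}, {"batching", "failure_replay"}],
    epsilon=0.0,  # purely greedy here so the run is deterministic
)
for _ in range(5):
    chosen = selector.select("early")
    selector.update("early", chosen, fake_score(chosen))
best_early = selector.select("early")
```

Treating whole configurations as arms, rather than toggling each primitive independently, avoids having to split a single joint reward across primitives, at the cost of only searching the configurations listed up front. Keying statistics by phase lets the preferred configuration differ between, say, early and late training.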
References
Several directions remain open. Adaptive primitive selection --- choosing which primitives to activate based on the current training phase or task properties --- could reduce the need for manual configuration.