Principled Demonstration Construction and Selection for CBRL
Develop principled methods for constructing and selecting the few-shot demonstrations that Context Bootstrapped Reinforcement Learning (CBRL) prepends to training prompts, with the aim of improving alignment between demonstrations and training instances and enhancing performance across tasks. Investigate learned retrieval mechanisms as a way to automate this process.
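To make the selection problem concrete, the sketch below shows one simple baseline: score each demonstration in a pool by lexical similarity to the training prompt, keep the top k, and prepend them. This is an illustrative assumption, not the method from the CBRL paper. The names `select_demonstrations` and `build_prompt`, the `question`/`answer` fields, and the `Q:`/`A:` layout are hypothetical, and the fixed bag-of-words cosine score stands in for the learned retriever that the open problem calls for.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_demonstrations(prompt, pool, k=2):
    """Return the k pool entries whose questions are most similar to the prompt.

    Hypothetical baseline: a fixed lexical score, not a learned retriever.
    Ties are broken by original pool order.
    """
    query = Counter(tokenize(prompt))
    scored = [
        (cosine(query, Counter(tokenize(demo["question"]))), index)
        for index, demo in enumerate(pool)
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pool[index] for _, index in scored[:k]]


def build_prompt(prompt, demos):
    """Prepend the selected demonstrations to the training prompt."""
    blocks = [f"Q: {d['question']}\nA: {d['answer']}" for d in demos]
    blocks.append(f"Q: {prompt}\nA:")
    return "\n\n".join(blocks)


# Usage with a made-up demonstration pool.
pool = [
    {"question": "Sort the list [3, 1, 2] in ascending order", "answer": "[1, 2, 3]"},
    {"question": "What is 12 plus 30?", "answer": "42"},
    {"question": "Reverse the list [1, 2, 3]", "answer": "[3, 2, 1]"},
]
task = "Sort the list [9, 4, 7] in descending order"
print(build_prompt(task, select_demonstrations(task, pool, k=1)))
```

A learned retriever would replace `cosine` with a trained scoring function, for example one fit to predict which demonstrations raise the reward on a given training instance. How to define and train that scorer is the open question here.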
References
Second, developing principled methods for constructing and selecting demonstrations remains an open challenge; learned retrieval mechanisms could automate this process while improving alignment between examples and training instances.
— Context Bootstrapped Reinforcement Learning
(2603.18953 - Agashe et al., 19 Mar 2026) in Section: Future Work