Optimal ordering of the safety curriculum during pretraining
Determine the optimal temporal ordering for introducing safety pretraining interventions (specifically contextualized rephrasing, refusal training, and the metadata-annotated examples that enable SafeBeam) within the pretraining token budget of a large language model.
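One way to make the search space concrete is to enumerate candidate schedules, each assigning every intervention a distinct start point in the token budget. The sketch below is illustrative only: the function name `candidate_schedules`, the grid of start fractions, and the token total are hypothetical choices rather than values from the paper, and the code only lists candidates without evaluating them.

```python
from itertools import combinations, permutations

# Intervention names taken from the problem statement; identifiers are illustrative.
INTERVENTIONS = (
    "contextualized_rephrasing",
    "refusal_training",
    "metadata_annotated_examples",
)


def candidate_schedules(total_tokens, start_fractions):
    """List schedules mapping each intervention to the token count at which it starts.

    Assumption: interventions start at strictly increasing points drawn from a
    coarse grid of budget fractions, so every schedule encodes a strict ordering.
    """
    schedules = []
    # Pick three distinct start points (in increasing order) from the grid...
    for starts in combinations(sorted(start_fractions), len(INTERVENTIONS)):
        # ...and assign them to the interventions in every possible order.
        for order in permutations(INTERVENTIONS):
            schedules.append(
                {name: round(frac * total_tokens) for name, frac in zip(order, starts)}
            )
    return schedules


# Hypothetical budget and grid, chosen only to keep the example small.
schedules = candidate_schedules(
    total_tokens=1_000_000, start_fractions=[0.0, 0.2, 0.6, 0.9]
)
print(len(schedules))
```

Even this coarse grid yields a number of candidates that grows quickly with grid resolution, and each candidate would need its own pretraining run to evaluate, which is one reason exhaustive search over orderings is impractical at scale.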
References
Finding an optimal 'ordering' for a curriculum remains an interesting open question.
— When Should We Introduce Safety Interventions During Pretraining?
(2601.07087 - Sam et al., 11 Jan 2026) in Additional Discussion, Appendix