Scaling INFUSION to frontier models

Determine whether INFUSION, an influence-function-guided framework for editing training data to induce targeted parameter shifts, can scale to frontier models and shape behavior effectively in state-of-the-art, large-scale architectures.

Background

INFUSION proposes using influence functions to identify and perturb a small subset of training documents to induce targeted changes in model behavior. The framework shows strong effects in image classification (CIFAR-10) and measurable but attenuated effects in transformer and small LLM settings.
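
To make the influence-scoring step concrete, here is a minimal sketch of the classic first-order influence estimate, score_i = -grad L_test(w)^T H^{-1} grad L_i(w), on a toy ridge regression where the Hessian can be inverted exactly. This is not the INFUSION implementation: the model, the function names (`fit_ridge`, `influence_scores`), the `l2` value, the synthetic data, and the top-k selection are all assumptions made for illustration, and the perturbation of the selected documents is not shown.

```python
import numpy as np


def fit_ridge(X, y, l2):
    """Minimise mean(0.5 * (X @ w - y)**2) + 0.5 * l2 * ||w||^2 in closed form."""
    n, d = X.shape
    hessian = X.T @ X / n + l2 * np.eye(d)  # exact Hessian of the objective
    w = np.linalg.solve(hessian, X.T @ y / n)
    return w, hessian


def influence_scores(X, y, x_test, y_test, l2=1e-2):
    """Estimate d(test loss)/d(eps_i), where eps_i upweights training point i."""
    w, hessian = fit_ridge(X, y, l2)
    train_grads = (X @ w - y)[:, None] * X      # per-example gradients, shape (n, d)
    test_grad = (x_test @ w - y_test) * x_test  # gradient of the test loss, shape (d,)
    # score_i = -test_grad^T H^{-1} train_grad_i (H is symmetric)
    return -train_grads @ np.linalg.solve(hessian, test_grad)


# Toy data (hypothetical): 200 training points, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
x_test, y_test = rng.normal(size=5), 0.0

scores = influence_scores(X, y, x_test, y_test)
# Candidate points to edit: the k with the largest estimated effect on the test loss.
top_k = np.argsort(-np.abs(scores))[:10]
print(top_k)
```

The exact `np.linalg.solve` on the Hessian is only feasible for tiny models like this one. With billions of parameters the inverse-Hessian-vector product has to be approximated, which is one place where fidelity can be lost as scale grows.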

However, the method’s effectiveness weakens as model and dataset size grow, and influence approximations become less faithful at scale. This raises the question of whether the influence-guided approach remains usable on frontier, state-of-the-art models, which have far higher capacity and training complexity.

References

Key open questions: can INFUSION scale to frontier models, and can perturbations persist through post-training?

Infusion: Shaping Model Behavior by Editing Training Data via Influence Functions (2602.09987 - Rosser et al., 10 Feb 2026) in Section 7, Discussion: Defenses and future work