Develop training procedures for contextual expertise leveraging without sacrificing robustness
Develop training procedures for large language models used in self-organizing multi-agent teams that enable contextual expertise leveraging (appropriate deference to identified expert agents during deliberation) while preserving robustness to adversarial team members and manipulative inputs. The goal is to achieve strong synergy, matching or exceeding the best individual agent, without losing the consensus-seeking protection against adversarial influence observed in current RLHF-aligned models.
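The tension between deference and robustness can be illustrated with a toy answer-aggregation sketch. This is purely illustrative: the function names, the expertise weights, and the weighted-vote scheme are assumptions for exposition, not a mechanism from the cited paper.

```python
from collections import Counter

def majority_vote(answers):
    # Consensus-style aggregation: every agent's answer counts equally.
    # This is the behavior that protects against a single adversary
    # but also suppresses a lone expert.
    return Counter(answers).most_common(1)[0][0]

def expertise_weighted_vote(answers, weights):
    # Deference-style aggregation: answers are weighted by a
    # (hypothetical) per-agent expertise estimate, so a recognized
    # expert can override the majority.
    totals = {}
    for ans, w in zip(answers, weights):
        totals[ans] = totals.get(ans, 0.0) + w
    return max(totals, key=totals.get)

# Toy team: agent 0 is the expert and holds the correct answer "B";
# three non-experts converge on "A".
answers = ["B", "A", "A", "A"]
expertise = [0.9, 0.2, 0.2, 0.2]  # hypothetical expertise estimates

print(majority_vote(answers))                        # -> "A": consensus overrules the expert
print(expertise_weighted_vote(answers, expertise))   # -> "B": expert is deferred to

# The failure mode the problem statement warns about: an adversary
# that inflates its claimed expertise hijacks the weighted scheme.
adv_answers = ["C", "A", "A", "A"]
adv_weights = [5.0, 0.2, 0.2, 0.2]  # adversarially inflated weight
print(expertise_weighted_vote(adv_answers, adv_weights))  # -> "C"
```

The open problem is precisely that neither static rule suffices: the training procedure would need to teach models when the expertise signal is trustworthy, so the team gets the second outcome without becoming vulnerable to the third.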
References
While this provides robustness to adversarial input, developing training procedures that enable contextual expertise leveraging without sacrificing robustness remains an open challenge.
— Multi-Agent Teams Hold Experts Back
(2602.01011 - Pappu et al., 1 Feb 2026) in Limitations and Conclusion (final paragraph)