Web-scale parallel onboarding and budgeted routing calibration for MAS

Develop a Web-scale, cost-budgeted protocol for onboarding new agents continuously and in parallel as the agent pool of a large language model-based multi-agent system expands. The protocol should include mechanisms to warm up newly added agents and to calibrate router decisions so that end-to-end performance does not degrade during expansion.

Background

The paper addresses the performance collapse seen in LLM-based multi-agent systems when new agents are added without expansion-aware familiarization and conservative router updates. MonoScale proposes agent-conditioned warm-up tasks and trust-region memory updates so that performance is monotonically non-decreasing during sequential expansion.

While experiments demonstrate stable scaling up to 10 agents, the authors note that realistic Agentic Web settings involve thousands to millions of agents with retrieval-driven routing. Scaling continuous onboarding and calibration to this Web-scale regime under a controllable cost budget, while preventing performance degradation, remains unresolved.
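To make the shape of the open problem concrete, the following is a minimal sketch of a budgeted onboarding loop with a conservative acceptance gate. It is not MonoScale's algorithm: the `Router` class, the `onboard_batch` function, the flat `warmup_cost`, and the caller-supplied `evaluate` function are all hypothetical names introduced for illustration, and candidates are processed one at a time, whereas a Web-scale protocol would have to run warm-up and evaluation in parallel.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Router:
    # Hypothetical router state: agent name -> routing weight.
    weights: Dict[str, float] = field(default_factory=dict)


def onboard_batch(
    router: Router,
    candidates: List[str],
    evaluate: Callable[[Dict[str, float]], float],
    warmup_cost: float,
    budget: float,
) -> Tuple[List[str], float, float]:
    """Admit candidates until the budget is exhausted.

    A router update is kept only if the validation score does not
    drop below the current baseline (assumed acceptance rule).
    """
    baseline = evaluate(router.weights)
    spent = 0.0
    admitted: List[str] = []
    for name in candidates:
        # Stop before exceeding the cost budget.
        if spent + warmup_cost > budget:
            break
        spent += warmup_cost  # warm-up is paid even if the agent is rejected
        trial = dict(router.weights)
        trial[name] = 1.0
        score = evaluate(trial)
        if score >= baseline:
            # Conservative update: commit only non-degrading changes.
            router.weights = trial
            baseline = score
            admitted.append(name)
    return admitted, spent, baseline
```

Under these assumptions the committed score never decreases, but the sketch leaves open the hard parts named above: how to choose warm-up tasks per agent, how to evaluate cheaply at Web scale, and how to reconcile updates made concurrently.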

References

How to continuously and in parallel onboard and expand the agent set at Web scale with a controllable cost budget—while warming up and calibrating routing to avoid degradation during expansion—remains an open challenge.

MonoScale: Scaling Multi-Agent System with Monotonic Improvement  (2601.23219 - Shao et al., 30 Jan 2026) in Section 6 (Discussion)