Performance of TALM on Large, Interconnected Codebases
Determine how well TALM (Tree-Structured Multi-Agent Framework with Long-Term Memory) performs, and how robust it is, on system-level software engineering projects whose large, interconnected codebases span multiple classes, modules, or packages. Codebases of this scale are not represented in the HumanEval, BigCodeBench, or ClassEval benchmarks used in the study.
References
Such large-scale contexts were not available in existing benchmarks, leaving open the question of how TALM would perform on truly large and interconnected codebases.
— TALM: Dynamic Tree-Structured Multi-Agent Framework with Long-Term Memory for Scalable Code Generation
(2510.23010 - Shen et al., 27 Oct 2025) in Section: Limitations