Appropriateness of tool-schema-based skill learning for dialogue benchmarks
Determine whether experience learning centered on tool-schema-based skills is the most appropriate formulation for user-centric, dialogue-focused benchmarks such as τ²-Bench.
References
More broadly, for user-centric benchmarks of this type (e.g., dialogue benchmarks), it remains an open question whether experience learning centered around tool-schema-based skills is the most appropriate formulation.
Source: SkillX: Automatically Constructing Skill Knowledge Bases for Agents (2604.04804, Wang et al., 6 Apr 2026), in "Ablation Study on Three Components of AutoSkills"