Inducing Programmatic Skills for Agentic Tasks

Published 9 Apr 2025 in cs.CL | (2504.06821v2)

Abstract: To succeed in common digital tasks such as web navigation, agents must carry out a variety of specialized tasks such as searching for products or planning a travel route. To tackle these tasks, agents can bootstrap themselves by learning task-specific skills online through interaction with the web environment. In this work, we demonstrate that programs are an effective representation for skills. We propose agent skill induction (ASI), which allows agents to adapt themselves by inducing, verifying, and utilizing program-based skills on the fly. We start with an evaluation on the WebArena agent benchmark and show that ASI outperforms the static baseline agent and its text-skill counterpart by 23.5% and 11.3% in success rate, mainly thanks to the programmatic verification guarantee during the induction phase. ASI also improves efficiency by reducing 10.7-15.3% of the steps over baselines, by composing primitive actions (e.g., click) into higher-level skills (e.g., search product). We then highlight the efficacy of ASI in remaining efficient and accurate under scaled-up web activities. Finally, we examine the generalizability of induced skills when transferring between websites, and find that ASI can effectively reuse common skills, while also updating incompatible skills to versatile website changes.


Summary

The paper demonstrates that inducing verifiable programmatic skills significantly improves agent performance, with up to 23.5% higher success rates over static approaches.
The ASI method leverages executable program representations to ensure skill correctness and composability, enabling efficient adaptation across diverse web tasks.
Experimental results on the WebArena benchmark validate ASI’s effectiveness in reducing task steps, supporting scaled-up activities, and generalizing across websites.

Inducing Programmatic Skills for Agentic Tasks

This paper addresses the challenge of enabling agents to perform specialized digital tasks, such as web navigation, by inducing programmatic skills that adapt to various environments. The proposed method, Agent Skill Induction (ASI), demonstrates improved success rates and efficiency compared to existing approaches, particularly by utilizing executable programs as skill representations.

Overview of ASI

ASI lets an agent learn and apply skills online, while it interacts with web tasks. Skills are represented as executable programs, so each candidate skill can be verified by execution during the induction phase. Compared with text-based skill representations, this makes skills checkable for correctness and composable. On WebArena, ASI improves success rate by 23.5% over the static baseline agent and by 11.3% over its text-skill counterpart.

Figure 1: Inducing programmatic skills and rewriting the trajectory from an episode.

The ASI framework operates by first generating action trajectories from natural language queries. It then induces higher-level programmatic skills, such as search_product(name), through a verification process that ensures their functional validity. These verified skills are integrated into the agent's action space, enabling more efficient task resolution in future interactions.
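The induce, verify, and integrate loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `ToyPage`, its element ids, `verify_skill`, `induce`, and `broken_skill` are hypothetical names, and the body of `search_product` is an assumed composition of `click` and `fill` primitives. A candidate skill is admitted to the action space only if it executes without error on a fresh page and a success check passes.

```python
from typing import Callable, Dict, List, Tuple


class ToyPage:
    """Hypothetical stand-in for a web page; it only records what was done."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.fields: Dict[str, str] = {}
        self.results: List[str] = []

    def click(self, element_id: str) -> None:
        self.log.append(f"click({element_id})")
        if element_id == "search_button":
            query = self.fields.get("search_box", "")
            self.results = [f"result for {query}"] if query else []

    def fill(self, element_id: str, text: str) -> None:
        self.log.append(f"fill({element_id}, {text})")
        self.fields[element_id] = text


def search_product(page: ToyPage, name: str) -> None:
    """Candidate skill: three primitive actions wrapped in one call (assumed body)."""
    page.click("search_box")
    page.fill("search_box", name)
    page.click("search_button")


def broken_skill(page: ToyPage, name: str) -> None:
    """Candidate that calls a primitive ToyPage does not have."""
    page.hover("search_box")  # raises AttributeError when executed


def verify_skill(
    skill: Callable[..., None],
    args: Tuple[str, ...],
    check: Callable[[ToyPage], bool],
) -> bool:
    """Run the candidate on a fresh page; accept only if it runs and the check passes."""
    page = ToyPage()
    try:
        skill(page, *args)
    except Exception:
        return False
    return check(page)


def induce(
    action_space: Dict[str, Callable[..., None]],
    skill: Callable[..., None],
    args: Tuple[str, ...],
    check: Callable[[ToyPage], bool],
) -> bool:
    """Add the skill to the action space only after it has been verified."""
    if verify_skill(skill, args, check):
        action_space[skill.__name__] = skill
        return True
    return False


def found_laptop(page: ToyPage) -> bool:
    return page.results == ["result for laptop"]


action_space: Dict[str, Callable[..., None]] = {}
accepted = induce(action_space, search_product, ("laptop",), found_laptop)
rejected = induce(action_space, broken_skill, ("laptop",), found_laptop)
```

In this sketch `search_product` is accepted and `broken_skill` is rejected, so only the verified skill becomes callable in later episodes. A text-based skill has no comparable execution check, which is the contrast the summary draws.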

Experimental Evaluation

WebArena Benchmark

ASI is evaluated on the WebArena benchmark, which covers a variety of web navigation tasks across different domains. ASI outperforms both the static baseline agent and the adaptive text-skill agent. It also reduces the number of steps by 10.7-15.3% relative to these baselines, because sequences of primitive actions (e.g., click) are abstracted into single programmatic calls (e.g., search product).

Scaled-Up Activities

In scenarios involving extended task sequences, ASI maintains efficiency by reducing the steps required to complete tasks. The gain is most visible in tasks that repeat the same procedure many times, where program-based skills offer clear advantages over text-based memory augmentation methods.

Figure 2: Example scaled-up task of updating multiple addresses on shopping website.
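To make the step-saving argument concrete, the sketch below counts agent-level steps for a repetitive address-update task, once with primitives only and once with a reusable skill. Everything here is an illustrative assumption: the four primitive names, the `update_address` skill, the `StepCounter` bookkeeping, and the sample addresses are hypothetical, and the resulting counts are toy numbers, not results from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed primitive sequence needed to update one address (hypothetical).
PRIMITIVES_PER_UPDATE = ("click_edit", "fill_street", "fill_city", "click_save")


@dataclass
class StepCounter:
    agent_steps: int = 0      # decisions the agent has to emit
    primitive_calls: int = 0  # low-level actions actually executed


def run_primitive(action: str, counter: StepCounter) -> None:
    """Pretend to execute one low-level action."""
    counter.primitive_calls += 1


def update_with_primitives(addresses: List[Dict[str, str]], counter: StepCounter) -> None:
    """Baseline: the agent emits every primitive as its own step."""
    for _ in addresses:
        for action in PRIMITIVES_PER_UPDATE:
            counter.agent_steps += 1
            run_primitive(action, counter)


def update_address(address: Dict[str, str], counter: StepCounter) -> None:
    """Hypothetical induced skill: one call that runs the whole primitive sequence."""
    for action in PRIMITIVES_PER_UPDATE:
        run_primitive(action, counter)


def update_with_skill(addresses: List[Dict[str, str]], counter: StepCounter) -> None:
    """With the skill, the agent emits one step per address."""
    for address in addresses:
        counter.agent_steps += 1
        update_address(address, counter)


addresses = [
    {"street": "1 Main St", "city": "Springfield"},
    {"street": "2 Oak Ave", "city": "Shelbyville"},
    {"street": "3 Elm Rd", "city": "Capital City"},
]

baseline, with_skill = StepCounter(), StepCounter()
update_with_primitives(addresses, baseline)
update_with_skill(addresses, with_skill)
```

Both runs execute the same number of primitive actions; what changes is how many separate decisions the agent has to make, and that gap grows with the number of repetitions.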

Cross-Website Generalization

ASI's skills effectively transfer across websites within similar domains, though some skills require adaptation to new webpage designs. The programmatic structure of skills enables ASI to quickly refine or create new skills, demonstrating flexibility and robustness in diverse web environments.
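One way to picture the reuse-or-update behaviour is to re-run each existing skill on the new website and keep only those that still execute. The sketch below is an assumption-heavy illustration rather than the paper's procedure: `Site`, the element ids, and the skills `open_search`, `open_cart`, and `open_cart_v2` are hypothetical, and "compatibility" is reduced to whether the elements a skill touches exist on the new site.

```python
from typing import Callable, Dict, List, Set, Tuple


class Site:
    """Hypothetical website that only knows which element ids it contains."""

    def __init__(self, name: str, element_ids: Set[str]) -> None:
        self.name = name
        self.element_ids = element_ids
        self.log: List[str] = []

    def click(self, element_id: str) -> None:
        if element_id not in self.element_ids:
            raise KeyError(f"{self.name} has no element {element_id!r}")
        self.log.append(f"click({element_id})")


def open_search(site: Site) -> None:
    site.click("search_box")


def open_cart(site: Site) -> None:
    site.click("cart_icon")  # element id assumed to exist on the original site only


def open_cart_v2(site: Site) -> None:
    site.click("basket_link")  # updated skill for the new site's layout


def runs_on(skill: Callable[[Site], None], site: Site) -> bool:
    """A skill counts as compatible if it executes without a missing element."""
    try:
        skill(site)
    except KeyError:
        return False
    return True


def transfer(
    library: Dict[str, Callable[[Site], None]],
    make_site: Callable[[], Site],
) -> Tuple[List[str], List[str]]:
    """Split the library into skills that can be reused and skills that need updating."""
    reused: List[str] = []
    incompatible: List[str] = []
    for name, skill in library.items():
        target = reused if runs_on(skill, make_site()) else incompatible
        target.append(name)
    return reused, incompatible


def make_new_site() -> Site:
    return Site("new_shop", {"search_box", "basket_link"})


library: Dict[str, Callable[[Site], None]] = {
    "open_search": open_search,
    "open_cart": open_cart,
}
reused, incompatible = transfer(library, make_new_site)

# Replace the incompatible skill only after the updated version runs on the new site.
if runs_on(open_cart_v2, make_new_site()):
    library["open_cart"] = open_cart_v2
```

Here `open_search` transfers unchanged, while `open_cart` fails on the new layout and is replaced by a re-verified version. Because skills are programs, the incompatibility shows up as an execution failure instead of going unnoticed.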

Implications and Future Work

The research highlights the potential of programmatic skills to improve agent efficiency and success across varied digital tasks. Future work could explore the optimal granularity of skills, the stability of skill libraries as they evolve online, and further comparisons to human expert benchmarks. Overall, ASI is a step towards adaptive agent design.

Conclusion

ASI significantly improves web agent performance through the induction of verifiable programmatic skills, showcasing greater efficiency and adaptability in both standard and scaled-up web tasks. Its ability to generalize skills across different websites underscores the potential of programmatic representations in developing autonomous digital agents.

This research opens avenues for further exploration into the dynamics of skill acquisition and application, promising advances in the efficiency and versatility of AI agents.