Automated Marketplace-Scale Vetting of Agent Skills

Develop automated vetting pipelines for Agent Skills that combine static analysis of bundled scripts, semantic analysis of natural language instructions, dynamic analysis in sandboxed environments, and provenance verification of dependencies, while maintaining low false positive rates suitable for marketplace-scale review.
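
As a rough illustration of how such a pipeline might be composed, the sketch below chains independent analysis stages over a Skill and collects their findings. Everything here is an assumption for illustration: the `Skill` and `Finding` types, the stage functions, and the simple pattern and phrase lists are hypothetical and do not come from the paper. The dynamic (sandboxed) stage is omitted because it needs an isolation mechanism that a short sketch cannot provide.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Skill:
    """Hypothetical in-memory view of a Skill bundle (not a real format)."""
    instructions: str
    scripts: Dict[str, str] = field(default_factory=dict)       # filename -> source
    dependencies: Dict[str, str] = field(default_factory=dict)  # name -> pinned digest, "" if unpinned


@dataclass
class Finding:
    stage: str
    detail: str


def static_stage(skill: Skill) -> List[Finding]:
    # Toy static check: flag a few dangerous calls in bundled scripts.
    pattern = re.compile(r"\b(eval|exec|os\.system)\s*\(")
    return [
        Finding("static", f"{name}: call to {match.group(1)}")
        for name, source in skill.scripts.items()
        for match in pattern.finditer(source)
    ]


def semantic_stage(skill: Skill) -> List[Finding]:
    # Placeholder for semantic analysis: a phrase list stands in for a
    # real model of the natural language instructions.
    phrases = ("ignore previous instructions", "do not tell the user")
    text = skill.instructions.lower()
    return [Finding("semantic", f"suspicious phrase: {p!r}") for p in phrases if p in text]


def provenance_stage(skill: Skill) -> List[Finding]:
    # Toy provenance check: every declared dependency must be pinned to a digest.
    return [
        Finding("provenance", f"unpinned dependency: {name}")
        for name, digest in skill.dependencies.items()
        if not digest
    ]


Stage = Callable[[Skill], List[Finding]]
STAGES: List[Stage] = [static_stage, semantic_stage, provenance_stage]


def vet(skill: Skill, stages: List[Stage] = STAGES) -> List[Finding]:
    """Run each stage in order and return the combined findings."""
    findings: List[Finding] = []
    for stage in stages:
        findings.extend(stage(skill))
    return findings
```

Keeping each stage as a plain function with the same signature means a sandboxed dynamic stage could later be appended to `STAGES` without changing `vet`.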

Background

The paper argues that manual marketplace review will not scale with community contribution volumes. Existing scanners provide heuristic checks but do not robustly analyze natural language instructions or detect novel attack techniques.

A comprehensive automated vetting pipeline must integrate all of these analyses while keeping its false positive rate low enough that legitimate Skills are not blocked.
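
One way to make the false positive requirement concrete is to calibrate the flagging threshold against Skills already known to be benign. The sketch below assumes the pipeline emits a numeric risk score per Skill and flags a Skill when its score is at or above a threshold. Both the scoring scheme and the function names are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def false_positive_rate(benign_scores: Sequence[float], threshold: float) -> float:
    """Fraction of known-benign Skills that would be flagged at this threshold."""
    if not benign_scores:
        raise ValueError("need at least one benign score")
    flagged = sum(1 for score in benign_scores if score >= threshold)
    return flagged / len(benign_scores)


def calibrate_threshold(benign_scores: Sequence[float], target_fpr: float) -> float:
    """Return the lowest observed score usable as a threshold without
    exceeding target_fpr on the benign set.

    Returns infinity (flag nothing) if no observed score satisfies the target.
    """
    for candidate in sorted(set(benign_scores)):
        if false_positive_rate(benign_scores, candidate) <= target_fpr:
            return candidate
    return float("inf")
```

For example, with benign scores `[0.1, 0.2, 0.3, 0.4, 0.9]` and a target of 0.2, the calibrated threshold is 0.9, since only one of the five benign Skills scores that high. A threshold chosen this way bounds false positives only on the calibration set, so it says nothing about how many malicious Skills slip through.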

References

Developing automated vetting pipelines that combine static analysis of bundled scripts, semantic analysis of natural language instructions, dynamic analysis of Skill behavior in sandboxed environments, and provenance verification of declared dependencies, while maintaining low false positive rates to avoid blocking legitimate Skills, is an open engineering and research challenge.

Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis (2604.02837 - Li et al., 3 Apr 2026) in Section 7.2, Open Challenges (C6: Automated Skill Vetting)