StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models

Published 14 Apr 2025 in cs.CR and cs.AI | (2504.09841v1)

Abstract: The proliferation of autonomous agents powered by LLMs has revolutionized popular business applications dealing with tabular data, i.e., tabular agents. Although LLMs are observed to be vulnerable against prompt injection attacks from external data sources, tabular agents impose strict data formats and predefined rules on the attacker's payload, which are ineffective unless the agent navigates multiple layers of structural data to incorporate the payload. To address the challenge, we present a novel attack termed StruPhantom which specifically targets black-box LLM-powered tabular agents. Our attack designs an evolutionary optimization procedure which continually refines attack payloads via the proposed constrained Monte Carlo Tree Search augmented by an off-topic evaluator. StruPhantom helps systematically explore and exploit the weaknesses of target applications to achieve goal hijacking. Our evaluation validates the effectiveness of StruPhantom across various LLM-based agents, including those on real-world platforms, and attack scenarios. Our attack achieves over 50% higher success rates than baselines in enforcing the application's response to contain phishing links or malicious codes.

Abstract PDF Upgrade to Chat

Summary

The paper introduces StruPhantom, which employs evolutionary optimization to enhance indirect prompt injection attacks on LLM-driven tabular agents.
It utilizes a constrained Monte Carlo Tree Search and an off-topic evaluator to iteratively refine attack payloads across structured data formats.
Experimental results demonstrate significant improvements in attack success rates, highlighting critical security vulnerabilities in LLM-based applications.

StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by LLMs

The paper "StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by LLMs" explores a novel approach to circumvent security vulnerabilities within LLM-powered tabular agents. It presents a method named StruPhantom, designed to exploit indirect prompt injection (IPI) attacks through evolutionary optimization techniques.

Introduction to StruPhantom

StruPhantom targets LLM-integrated applications dealing with structured data such as CSV, JSON, and XML. These tabular agents, while enhancing automation in data analysis, pose unique challenges due to their strict format requirements which traditionally hinder prompt injection attacks (Figure 1).

Figure 1: Indirect prompt injection attacks on LLM-based agents with structural inputs (i.e., tabular agents).

The research introduces an automatic optimization procedure utilizing a constrained Monte Carlo Tree Search (MCTS) coupled with an off-topic evaluator. By iteratively refining attack payloads, attackers can exploit agent vulnerabilities to enforce responses containing unauthorized behaviors like phishing or malicious code.

Methodology and Implementation

StruPhantom's methodology involves several key components:

Optimization Process:
- Utilizes MCTS to continuously enhance attack templates.
- Incorporates an off-topic evaluator to maintain the integrity and relevance of the attack throughout optimization.
Attack Vector Construction:
- Builds upon initial manually crafted templates equipped to navigate structured input complexities.
- Implements strategies including Generation, Crossover, Expansion, Shortening, and Rephrasing.
Evaluation Framework:
- Deploys a shadow agent based on the ReAct paradigm facilitating a detailed inspection of agent response mechanisms.
  Figure 2: A schematic diagram of the StruPhantom workflow.

Experimental Validation

Experiments validate StruPhantom across CSV, XLSX, XML, and JSON formats. Key findings include:

Success Rates: Achieves attack success rates (ASR) significantly surpassing baseline methods, with some scenarios seeing over 50% improvement in ASR.
Adaptability: Consistent adaptability demonstrated across diverse structural input types and real-world platforms.
Figure 3: Improvements in the attack success rate of different schemes over the optimization iterations.

Figure 4: Snapshots on a successful attack with Website template on a tabular agent application on ByteDance's Doubao platform (The application is crafted by the authors for ethical reasons).

Figure 5: Snapshots on a successful attack on a tabular agent application on ByteDance's Coze platform (The application is crafted by the authors for ethical reasons).

Implications and Future Directions

The implications of this research highlight critical security vulnerabilities inherent in current LLM-based tabular agents. StruPhantom underscores the necessity for improved defense mechanisms against IPI attacks. Proposed countermeasures include enhanced input validation, adopting interpretability techniques for auditing, and segregating input and output processes to mitigate injection risks.

The research invites future exploration into robust guardrails for LLM applications and comprehensive evaluation frameworks to safeguard against evolving adversarial strategies.

Conclusion

The paper's findings reveal substantial gaps in safeguarding LLM-powered tabular agents, emphasizing the strategic advantage of iterative optimization in crafting potent IPI attacks. StruPhantom's approach not only advances the understanding of structural input vulnerabilities but also propels the need for adaptive security solutions in AI systems processing complex data structures.

Markdown Report Issue