Detecting Prompt Injection and Evaluation Manipulation in LLM-as-a-Judge
Develop methods to detect prompt injection and other evaluation manipulation attacks that bias the decisions of LLM-as-a-Judge systems.
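One common baseline in this space is lexical screening of candidate outputs for judge-directed instructions before they reach the evaluator. The sketch below is a minimal illustration of that idea, not a method from the paper; the pattern list and function names are hypothetical, and any serious detector would need semantic, model-based checks beyond surface patterns.

```python
import re

# Hypothetical heuristic sketch: flag candidate responses containing
# phrases commonly used to steer an LLM judge. The pattern list is
# illustrative only, not taken from the cited SoK.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"you are (now|no longer)",
    r"(rate|score|grade) (this|it) (a |as )?(10|perfect|highest)",
    r"as the judge,? you must",
]

def flag_injection(candidate_text: str) -> list[str]:
    """Return the injection patterns matched in the text (case-insensitive)."""
    text = candidate_text.lower()
    return [p for p in INJECTION_PATTERNS if re.search(p, text)]

def is_suspicious(candidate_text: str) -> bool:
    """True if any known injection pattern appears in the candidate output."""
    return bool(flag_injection(candidate_text))
```

Such a filter only catches verbatim attack phrasing; paraphrased or encoded injections evade it, which is part of why detection remains an open problem.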
References
The open research problem in this context is: design methods for detecting prompt injection and evaluation manipulation attacks.
— Security in LLM-as-a-Judge: A Comprehensive SoK
(2603.29403 - Masoud et al., 31 Mar 2026) in Section 7.1, Vulnerability to Adversarial Prompt Manipulation (Challenges and Open Problems)