Adversarially Robust LLM-as-a-Judge Evaluation Systems
Develop LLM-as-a-Judge-based evaluation systems that prevent adversarial attacks, ensuring that judgments are not manipulated by prompt injection or maliciously crafted responses and that automated assessment pipelines remain secure.
References
The open research problems in this context are: Create evaluation systems based on LLM to prevent adversarial attacks.
— Security in LLM-as-a-Judge: A Comprehensive SoK
(2603.29403 - Masoud et al., 31 Mar 2026) in Section 7.1, Vulnerability to Adversarial Prompt Manipulation (Challenges and Open Problems)