Fine-Tuning Agents and Overseers for Coordination and Detection
Determine whether fine‑tuning large language models can reliably teach communicating models to learn and use shared coordination schemes and, in parallel, improve overseer models’ detection capabilities within the CASE framework.
References
We hypothesize that models can be fine-tuned to learn coordination schemes and similarly, overseer models can be improved in terms of detection capabilities. With the release of CASE, we leave this as an open problem to study.
— Secret Collusion among Generative AI Agents: Multi-Agent Deception via Steganography
(2402.07510 - Motwani et al., 2024) in Appendix, Section "LLM Fine-Tuning"