Generalization of Among Us LLM-Agent Findings Across Models and Training Paradigms

Determine whether the speech-act distributions and deception strategies measured for Llama 3.2 agents in the text-based Among Us multi-agent simulation generalize to other large language model architectures and to alternative training paradigms.

Background

The study analyzes 1,100 text-based Among Us games played by autonomous Llama 3.2 agents. It reports that dialogue is dominated by directives, that impostors slightly increase their use of representative acts, and that deceptive language appears mainly as equivocation rather than falsification. These findings are derived from a single underlying model architecture and from specific prompting and evaluation pipelines.

Because only Llama 3.2 was used for gameplay and Gemini was used for linguistic labeling, it remains an open question whether the same communicative and deceptive profiles would be reproduced by different model families (e.g., GPT, Gemini, Mistral) or by agents trained under alternative paradigms (e.g., reinforcement learning, different safety-tuning strategies). Establishing cross-model and cross-training generalization is necessary to assess the robustness and external validity of the reported behavioral patterns.
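One way such a cross-model comparison could be quantified is sketched below, under stated assumptions. It compares the speech-act distributions of two agent populations using the Jensen-Shannon divergence. The five category names follow Searle's taxonomy and are an assumption about the labeling scheme; the function names and the counts in the usage example are hypothetical placeholders, not data or methods from the study.

```python
import math

# Assumed label set (Searle's five speech-act classes); the study's exact
# labeling scheme may differ.
SPEECH_ACTS = ["directive", "representative", "commissive",
               "expressive", "declaration"]


def normalize(counts):
    """Turn a {speech_act: count} mapping into a probability vector."""
    total = sum(counts.get(act, 0) for act in SPEECH_ACTS)
    if total == 0:
        raise ValueError("no labeled utterances")
    return [counts.get(act, 0) / total for act in SPEECH_ACTS]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical distributions,
    1 for distributions with no overlapping support."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0
        # because y is the mixture m.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    # Hypothetical placeholder counts, for illustration only.
    model_a = {"directive": 600, "representative": 300, "expressive": 100}
    model_b = {"directive": 450, "representative": 400, "commissive": 150}
    print(js_divergence(normalize(model_a), normalize(model_b)))
```

A divergence near zero between, say, Llama 3.2 agents and agents from another model family would be consistent with the directive-dominated profile generalizing; a large value would suggest the profile is model-specific. A summary statistic of this kind does not replace per-category significance tests, and it says nothing about the equivocation versus falsification split unless that is encoded as its own distribution.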

References

For this, our experiments used only a single underlying model architecture (Llama 3.2), so it is unknown if other LLM models or training paradigms would offer the same results.

— Deception and Communication in Autonomous Multi-Agent Systems: An Experimental Study with Among Us (2603.26635 - Milkowski et al., 27 Mar 2026) in Subsection 'Limitations and Future Work' under 'Conclusions'
