How memory can affect collective and cooperative behaviors in an LLM-Based Social Particle Swarm

Published 14 Apr 2026 in cs.AI, cs.CL, cs.GT, and cs.MA | (2604.12250v1)

Abstract: This study examines how model-specific characteristics of LLM agents, including internal alignment, shape the effect of memory on their collective and cooperative dynamics in a multi-agent system. To this end, we extend the Social Particle Swarm (SPS) model, in which agents move in a two-dimensional space and play the Prisoner's Dilemma with neighboring agents, by replacing its rule-based agents with LLM agents endowed with Big Five personality scores and varying memory lengths. Using Gemini-2.0-Flash, we find that memory length is a critical parameter governing collective behavior: even a minimal memory drastically suppressed cooperation, transitioning the system from stable cooperative clusters through cyclical formation and collapse of clusters to a state of scattered defection as memory length increased. Big Five personality traits correlated with agent behaviors in partial agreement with findings from experiments with human participants, supporting the validity of the model. Comparative experiments using Gemma 3:4b revealed the opposite trend: longer memory promoted cooperation, accompanied by the formation of dense cooperative clusters. Sentiment analysis of agents' reasoning texts showed that Gemini interprets memory increasingly negatively as its length grows, while Gemma interprets it less negatively, and that this difference persists in the early phase of experiments before the macro-level dynamics converge. These results suggest that model-specific characteristics of LLMs, potentially including alignment, play a fundamental role in determining emergent social behavior in Generative Agent-Based Modeling, and provide a micro-level cognitive account of the contradictions found in prior work on memory and cooperation.

Authors (3)

Summary

The paper demonstrates that memory length drastically alters cooperative dynamics in LLM-based agent swarms, with heavily aligned models showing cooperation collapse and lightly aligned models enhancing cooperation.
The study employs generative agent-based modeling using Gemini 2.0 Flash and Gemma 3:4b, integrating Big Five personality traits and dynamic memory slices to simulate iterated Prisoner’s Dilemma interactions.
Findings suggest that model-specific biases in how memory is interpreted (increasingly negatively by Gemini, less negatively by Gemma) fundamentally shape emergent social structures, with significant implications for designing autonomous AI collectives.

Introduction

This study investigates how memory length and model-specific characteristics, including alignment, influence emergent collective and cooperative behavior in LLM-controlled agents embedded in a Social Particle Swarm (SPS) system. By leveraging generative agent-based modeling (GABM) with LLMs instead of rule-based agents, the work directly interrogates how LLM-dependent cognitive schemas—personality, history interpretation, and alignment-imposed heuristics—drive macrodynamics such as cluster formation and long-term cooperation within iterated Prisoner's Dilemma frameworks.

Model and Methodology

The study augments the SPS model by instantiating each agent as an LLM session, using either Google Gemini 2.0 Flash (a heavily safety-aligned commercial model) or Gemma 3:4b (an open-weight, lightly aligned model). Each agent is parameterized by Big Five personality traits, a two-dimensional spatial position, and a memory of controlled length $L_m$ that holds recent opponent-specific interaction histories. Agents interact with neighbors within a fixed radius and, at each timestep, select both a movement vector and a cooperate/defect strategy, informed by prompt-injected state, neighborhood context, personality, and dynamic memory slices. The prompts are designed so that strategies and social movements emerge only from the LLM's internalized reasoning, with no hard-coded behavioral rules beyond the environmental structure.
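The agent setup described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `Agent`, the helpers `neighbors` and `build_prompt`, the field names, and the prompt wording are all hypothetical, and the LLM call itself is left out.

```python
import math
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent state; names are illustrative, not the paper's."""
    agent_id: int
    x: float
    y: float
    big_five: dict          # e.g. {"agreeableness": 0.8, ...}
    memory_length: int      # L_m, the number of remembered rounds per opponent
    memory: dict = field(default_factory=dict)  # opponent id -> deque of rounds

    def remember(self, opponent_id, my_move, their_move):
        """Store one round; only the last L_m rounds per opponent are kept."""
        if self.memory_length == 0:
            return  # L_m = 0 means the agent keeps no history at all
        history = self.memory.setdefault(
            opponent_id, deque(maxlen=self.memory_length)
        )
        history.append((my_move, their_move))

    def memory_slice(self, opponent_id):
        """Return the remembered rounds with one opponent, oldest first."""
        return list(self.memory.get(opponent_id, ()))


def neighbors(agent, agents, radius):
    """Agents within a fixed Euclidean radius (assumes no wrap-around)."""
    return [
        other for other in agents
        if other.agent_id != agent.agent_id
        and math.hypot(other.x - agent.x, other.y - agent.y) <= radius
    ]


def build_prompt(agent, agents, radius):
    """Assemble a plain-text prompt; the wording is a placeholder."""
    lines = [
        f"You are agent {agent.agent_id} at ({agent.x:.1f}, {agent.y:.1f}).",
        f"Personality (Big Five): {agent.big_five}",
    ]
    for other in neighbors(agent, agents, radius):
        past = agent.memory_slice(other.agent_id) or "no remembered rounds"
        lines.append(f"Neighbor agent {other.agent_id}: {past}")
    lines.append("Choose a movement vector and either cooperate or defect.")
    return "\n".join(lines)
```

The bounded `deque` is one simple way to keep only the most recent $L_m$ rounds per opponent. No strategy rule appears anywhere in the sketch, which matches the stated design goal that behavior should come from the model's reasoning alone.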

Empirical Findings: Gemini 2.0 Flash

A central empirical result is that cooperation among Gemini 2.0 Flash agents falls monotonically as memory length increases. Specifically, as $L_m$ increases from 0 to 3, the mean cooperation rate collapses from 0.899 to 0.0776, and neighbor clustering evaporates. The dynamics fall into three regimes that mirror the classes of the prior SPS taxonomy:

$L_m = 0$: Rapid self-organization into durable cooperative clusters (Class B dynamics).
$L_m = 1$: Oscillatory cycles of emergence and collapse of cooperation (Class C dynamics), with maximal volatility in social ties and behavior.
$L_m \geq 2$: Accelerating fragmentation to isolated defectors (Class A dynamics), eliminating productive interactions.
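The two quantities used to tell these regimes apart, cooperation rate and neighbor count, can be computed as in the following sketch. The function names and the plain Euclidean distance (no periodic boundary) are assumptions made for illustration; the paper's exact definitions may differ.

```python
import math


def cooperation_rate(moves):
    """Fraction of moves that are 'C' in a list of 'C'/'D' strings."""
    if not moves:
        return 0.0
    return sum(move == "C" for move in moves) / len(moves)


def neighbor_counts(positions, radius):
    """Number of other agents within `radius` of each (x, y) position."""
    counts = []
    for i, (xi, yi) in enumerate(positions):
        count = sum(
            1
            for j, (xj, yj) in enumerate(positions)
            if j != i and math.hypot(xj - xi, yj - yi) <= radius
        )
        counts.append(count)
    return counts
```

Tracked over time, a high and stable cooperation rate with large neighbor counts would correspond to durable clusters, oscillating values to the cyclical regime, and values near zero to scattered defection.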

The mechanism is elucidated via sentiment analysis of LLM-generated agent rationales: as memory length increases, Gemini agents' interpretations of memory become increasingly negative, shifting from trust-building to risk aversion and withdrawal. This suggests that Gemini's training, potentially including its alignment, biases it toward reading a longer memory as accumulating evidence for caution and punishment, which manifests at scale as a systemic breakdown of cooperation.
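The aggregation step behind such a comparison can be sketched as follows. The sketch assumes that each reasoning text has already been given a numeric sentiment score by some external analyzer (not specified here), and the function name `mean_sentiment_by_memory` is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_sentiment_by_memory(records):
    """Average sentiment per memory length.

    `records` is an iterable of (memory_length, sentiment_score) pairs,
    where the score is assumed to come from an external sentiment analyzer.
    """
    grouped = defaultdict(list)
    for memory_length, score in records:
        grouped[memory_length].append(score)
    return {lm: mean(scores) for lm, scores in sorted(grouped.items())}
```

A mean that falls as $L_m$ grows would indicate the increasingly negative reading of memory described above.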

Empirical Findings: Gemma 3:4b

Gemma 3:4b exhibits the opposite qualitative pattern: increasing memory length promotes cooperation and facilitates dense, persistent cooperative clusters, in direct contrast to the Gemini regime. For $L_m = 3$, the system achieves a cooperation rate of 0.766 and median neighbor counts above 22.8. Sentiment analysis of memory-referential reasoning texts indicates that Gemma agents interpret memory less negatively as $L_m$ grows, using history primarily for reciprocity and trust establishment rather than punitive retreat. These model-specific divergences in memory interpretation supply a cognitively grounded explanation for the contradictory findings in prior work on memory length and cooperation.

Personality Trait Correlations

Behavioral analyses indicate that Big Five personality traits modulate agent strategies, with statistically significant correlations that partially parallel findings from human participants:

Agreeableness: Strong positive correlation with cooperation and cluster stability; negative correlation with movement, consistent with localization in stable, pro-social groups.
Extraversion: High positive correlation with movement, capturing exploratory tendencies.
Neuroticism: In Gemini, correlates negatively with cooperation, consistent with threat sensitivity and retreat. This differs from human data, where neuroticism drives behavioral volatility rather than spatial withdrawal.

These results support the ecological validity of LLM-based agent modeling, with nuanced divergences reflective of LLM-specific cognitive biases.
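A trait-behavior correlation of the kind reported above can be computed with a Pearson coefficient, as in this sketch. The choice of Pearson (rather than, say, a rank correlation) is an assumption, since the summary does not name the statistic used, and the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)
```

Here `xs` would hold one trait score per agent (for example agreeableness) and `ys` the matching per-agent behavior measure (for example cooperation rate or distance moved).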

Theoretical and Practical Implications

The results indicate that the effect of memory on collective behavior is not fixed by game-theoretic parameters or the environment alone, but depends critically on the internal cognitive model of the agent LLM. The underlying LLM's alignment, fine-tuning regime, architecture, and corpus composition may act as hidden yet fundamental axes governing major regime shifts in emergent social structure.

These findings have direct methodological implications:

GABM observations cannot be assumed to reflect agent-agnostic rules; rather, LLM-specific inductive biases may override classical evolutionary dynamics, even when environmental constraints are isomorphic.
Alignment-induced behavioral tendencies can manifest at scale in multi-agent systems, potentially destabilizing or promoting cooperation in unpredictable ways.

In real-world deployment, such sensitivities will affect the reliability and predictability of synthetic agent collectives, ranging from algorithmic governance and social simulations to distributed AI systems. The social ontology of LLM-driven populations cannot be separated from their model-specific priors and interpretative machinery.

Future Research Directions

Several key avenues emerge for further inquiry:

Extension to richer memory representations (qualitative, non-episodic, or impressionistic memory).
Incorporation of explicit reasoning phases as mediators between memory interpretation and action.
Broad comparative analyses across diverse LLMs using LLM-as-a-Judge frameworks to robustly characterize model-resolved reasoning schemas.
Deeper study of the interaction between alignment, pretraining, and macro-level emergent phenomena, potentially through causal interventions or transparency toolkits.

Conclusion

This study demonstrates that the effect of agent memory on cooperation and collective dynamics in LLM-based multi-agent systems depends on the interpretative framework of each LLM. Opposite system-wide behaviors can occur under identical settings purely because of model-specific cognitive biases, which potentially arise from alignment regimes and other properties of the underlying models. This underscores the necessity for rigorous model characterization and validation when designing societies of autonomous AI agents, as both the micro-cognitive content and the macro-social order fundamentally depend on the properties of the generative agents involved.