Cross-Conversation Memory Sharing
- Cross-Conversation Memory Sharing encompasses architectures and algorithms that enable persistent, structured knowledge transfer across separate dialogue sessions via dynamic memory models.
- Methodologies include techniques such as dynamic bipartite graphs, nonlinear high-dimensional indexing, and layered cognitive architectures to ensure robust retrieval and policy compliance.
- Empirical evidence shows these methods improve accuracy, reduce resource calls, and enhance multi-step reasoning in multi-agent, multi-session tasks.
Cross-Conversation Memory Sharing concerns the architecture, algorithms, and policies that enable persistent, structured, and effective knowledge transfer across separate conversational episodes—whether between sessions, users, agents, or entire dialogue networks. This capability is increasingly critical for LLM agents tasked with sustained reasoning, multi-user collaboration, or complex task decomposition, and is realized via a diverse range of memory models, indexing strategies, and policy enforcement schemes. Approaches span distributed memory stores with formal access controls, cognitively inspired narrative schemata, multi-agent frameworks with tiered sharing, and nonlinear geometric retrieval modules. The overarching research agenda is to guarantee that agents recall, reuse, and update relevant knowledge from disparate conversations while ensuring interpretability, policy compliance, and efficiency.
1. Formal Models and Memory Architectures
Formalisms for cross-conversation memory sharing differ substantially depending on scale, trust environment, and primary use case.
Dynamic Bipartite Graphs with Policy Enforcement:
Collaborative Memory (Rezazadeh et al., 23 May 2025) operates on a bipartite access graph where users (U), agents (A), and resources (R) are interconnected through time-evolving edge sets. Memory is split into private and shared fragments, each with immutable provenance (agents, resources, users, timestamps). The candidate memory set for an agent at time t comprises exactly those fragments whose read policy is satisfied under the current graph.
Read and write policies can be global, per-user, per-agent, and time-varying.
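As a rough illustration of this scheme, the sketch below enforces a time-stamped user–agent–resource check before any fragment read. The class and method names (`Fragment`, `AccessGraph`, `can_read`) are hypothetical, not the paper's API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fragment:
    """A memory fragment with immutable provenance."""
    content: str
    user: str
    agent: str
    resource: str
    timestamp: int

class AccessGraph:
    """Time-evolving bipartite edges: user-agent and agent-resource."""
    def __init__(self):
        self.user_agent = set()      # (user, agent, granted_at)
        self.agent_resource = set()  # (agent, resource, granted_at)

    def grant(self, user, agent, resource, t):
        self.user_agent.add((user, agent, t))
        self.agent_resource.add((agent, resource, t))

    def can_read(self, user, agent, fragment, t):
        # A read is permitted only if both edges exist at or before time t.
        ua = any(u == user and a == agent and ts <= t
                 for (u, a, ts) in self.user_agent)
        ar = any(a == agent and r == fragment.resource and ts <= t
                 for (a, r, ts) in self.agent_resource)
        return ua and ar

g = AccessGraph()
g.grant("alice", "agent-1", "wiki", t=0)
frag = Fragment("cached fact", "alice", "agent-1", "wiki", timestamp=0)
```

An unauthorized user, or a grant issued only after time t, fails the check, mirroring the per-user, per-agent, time-varying policies described above.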
Nonlinear High-Dimensional Indexing:
Wormhole Memory (Wang, 24 Jan 2025) encodes each memory in a multi-axis “Rubik’s cube” tensor indexed by (i, j, k), where i is the dialogue ID, j the turn, and k the topic partition. Retrieval operates on a composite embedding of these axes, with locality-sensitive hashing providing sublinear lookup across conversation boundaries.
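A toy version of such multi-axis indexing might look as follows; a crude sign-based bucket hash stands in for real locality-sensitive hashing, and all names are illustrative rather than from the paper:

```python
import math
from collections import defaultdict

class MultiAxisMemory:
    """Memories keyed by (dialogue_id, turn, topic), bucketed by geometry."""
    def __init__(self):
        self.buckets = defaultdict(list)  # bucket -> [(key, vector)]

    def _bucket(self, vec):
        # Crude sign-based hash as a stand-in for LSH.
        return tuple(v >= 0 for v in vec[:3])

    def write(self, dialogue_id, turn, topic, vec):
        key = (dialogue_id, turn, topic)
        self.buckets[self._bucket(vec)].append((key, vec))

    def retrieve(self, query, k=1):
        # Search only the query's bucket: sublinear in total memory size.
        cands = self.buckets[self._bucket(query)]
        def cos(a, b):
            num = sum(x * y for x, y in zip(a, b))
            den = (math.sqrt(sum(x * x for x in a))
                   * math.sqrt(sum(y * y for y in b)))
            return num / den if den else 0.0
        return sorted(cands, key=lambda kv: -cos(query, kv[1]))[:k]

mem = MultiAxisMemory()
mem.write("d1", 0, "travel", [1.0, 0.2, 0.1])
mem.write("d2", 3, "travel", [0.9, 0.3, 0.2])
```

Because buckets are keyed by embedding geometry rather than dialogue ID, a query can surface entries from any prior conversation.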
Narrative Schemata and Thread Cards:
TraceMem (Shu et al., 10 Feb 2026) encodes episodes as clusterable experience traces, hierarchically clustered first by topic then by temporal thread, and stored as user memory cards. Each card links back to detailed traces by thread ID, naturally spanning multiple sessions.
Egocentric and Multi-Partner Memory:
EMMA (Jang et al., 2024) maintains per-speaker egocentric memory slots, updating and linking these via classification and retrieval encoders to ensure continuity and avoid contradiction in settings with dynamic partner changes.
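The per-partner separation can be caricatured in a few lines. The real system uses trained classification and retrieval encoders; this hypothetical sketch substitutes plain dictionaries to show why egocentric slots avoid cross-partner contradictions:

```python
from collections import defaultdict

class EgocentricMemory:
    """Per-speaker memory slots with optional links between slots."""
    def __init__(self):
        self.slots = defaultdict(list)  # speaker -> list of facts
        self.links = {}                 # slot id -> linked slot id

    def remember(self, speaker, fact, linked_to=None):
        slot_id = (speaker, len(self.slots[speaker]))
        self.slots[speaker].append(fact)
        if linked_to is not None:
            # Linked slots support chain retrieval across related memories.
            self.links[slot_id] = linked_to
        return slot_id

    def recall(self, speaker):
        # Recall is scoped to one partner, so facts about different
        # partners never mix when the conversation partner changes.
        return list(self.slots[speaker])

em = EgocentricMemory()
tea = em.remember("alice", "prefers green tea")
em.remember("bob", "prefers coffee", linked_to=tea)
```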
Blending and Refinement:
CREEM (Kim et al., 2024) implements a pipeline where past memories are retrieved, blended with current context, and refined to prune redundancy or contradiction, yielding a dynamically evolving memory across sessions.
Layered Cognitive Architectures:
CogMem (Zhang et al., 16 Dec 2025) differentiates Long-Term Memory (LTM), Direct Access (DA) memory, and a per-turn Focus of Attention (FoA), integrating strategic, session, and contextual information through bounded stores and attention-based retrieval.
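One way to picture the bounded tiers is a cascade of fixed-size buffers; the promotion rules below are an illustrative simplification, not CogMem's actual mechanism:

```python
from collections import deque

class LayeredMemory:
    """Three-tier store: unbounded LTM, bounded DA, small per-turn FoA."""
    def __init__(self, foa_size=3, da_size=10):
        self.foa = deque(maxlen=foa_size)  # per-turn focus of attention
        self.da = deque(maxlen=da_size)    # session-level direct access
        self.ltm = []                      # long-term store

    def observe(self, item):
        # New items enter the focus; items falling out of FoA cascade
        # into DA, and items falling out of DA settle into LTM.
        if len(self.foa) == self.foa.maxlen:
            evicted = self.foa[0]
            if len(self.da) == self.da.maxlen:
                self.ltm.append(self.da[0])
            self.da.append(evicted)
        self.foa.append(item)

    def context(self):
        # Bounded context handed to the model each turn.
        return list(self.da) + list(self.foa)

m = LayeredMemory(foa_size=3, da_size=10)
for i in range(20):
    m.observe(i)
```

Each turn, only the bounded `context()` (DA plus FoA) reaches the model, which keeps per-turn token growth flat even as total history grows.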
2. Algorithms and Memory Operations
Insertion, Update, and Deletion:
Update rules typically combine policy checks with vector-level blending:
- Collaborative Memory enforces auditability, only allowing a fragment update if the corresponding write permission holds at the time of the operation.
- Wormhole Memory applies a momentum-and-decay update of the general form m ← λm + (1 − λ)v, where m is the stored vector, v the new candidate, and λ the momentum coefficient.
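Assuming the update is this familiar exponential-moving-average form, it can be written directly; the coefficient value 0.9 is illustrative, not taken from the paper:

```python
def update_memory(stored, candidate, momentum=0.9):
    """Blend stored vector m with candidate v: m' = lam*m + (1 - lam)*v.

    The default momentum of 0.9 is an illustrative choice.
    """
    lam = momentum
    return [lam * m + (1 - lam) * v for m, v in zip(stored, candidate)]

m = update_memory([1.0, 0.0], [0.0, 1.0])  # ≈ [0.9, 0.1]
```

High momentum preserves the stored memory against noisy candidates, while the decay term lets repeated evidence gradually overwrite stale content.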
Retrieval Mechanisms:
- Wormhole Memory retrieves cross-dialogue memory by multi-axis LSH and aggregates via softmax over cosine similarities.
- TraceMem’s agentic search first selects cards by user or topic, then retrieves relevant experience traces by embedding proximity.
- EMMA uses dual BERT encoders and retrieval by top similarity, plus chain retrieval via linked memory slots.
- CREEM retrieves the top-k memories for a given query, blends them with the current dialogue, generates new insights via LLMs, and prunes outdated slots through entailment-based classification.
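Several of these readout schemes reduce to top-k similarity search followed by a weighted blend. A minimal, dependency-free sketch of softmax-over-cosine aggregation (function names are ours, not from any of the papers):

```python
import math

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = (math.sqrt(sum(x * x for x in a))
           * math.sqrt(sum(y * y for y in b)))
    return num / den if den else 0.0

def read_memory(query, memories, k=2):
    """Return a softmax-weighted blend of the k most similar memory vectors."""
    scored = sorted(memories, key=lambda m: -cosine(query, m))[:k]
    sims = [cosine(query, m) for m in scored]
    z = sum(math.exp(s) for s in sims)
    weights = [math.exp(s) / z for s in sims]
    dim = len(query)
    return [sum(w * m[i] for w, m in zip(weights, scored))
            for i in range(dim)]

memories = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
blended = read_memory([1.0, 0.0], memories, k=2)
```

In the actual systems the memories are learned embeddings and k (and any temperature) are tuned; everything here is fixed for clarity.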
Share/Transfer:
- Memory Sandbox (Huang et al., 2023) enables manual drag-and-drop of memory objects across different canvases, storing a new reference in the target session.
- DRMN (Ji et al., 2021) retrieves top-k semantically similar conversations for each new generation and iteratively reads them via attention mechanisms, encoding essential logic into the current session’s memory.
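Memory-Sandbox-style sharing amounts to storing a reference, not a copy, of a memory object in the target session, so provenance and later edits stay consistent. A hypothetical sketch (class and field names are ours):

```python
class MemoryObject:
    """A shareable memory with UI-level provenance and visibility."""
    def __init__(self, text, source_session):
        self.text = text
        self.source_session = source_session  # provenance
        self.visible = True  # controls inclusion in LLM context

class Session:
    def __init__(self, name):
        self.name = name
        self.memories = []

    def add(self, text):
        obj = MemoryObject(text, self.name)
        self.memories.append(obj)
        return obj

    def share_to(self, obj, target):
        # Store a reference, not a copy: edits stay in sync everywhere.
        target.memories.append(obj)

    def context(self):
        return [m.text for m in self.memories if m.visible]

a, b = Session("A"), Session("B")
obj = a.add("user prefers metric units")
a.share_to(obj, b)
```

Toggling `visible` on the shared object removes it from both sessions' LLM context at once, since both hold the same reference.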
3. Access Control, Provenance, and Auditability
Robust access control and provenance attributes are foundational for safe memory sharing in multi-user and multi-agent environments.
- Collaborative Memory requires that no fragment is ever read or written unless the current agent- and resource-permission constraints hold, verifiable via saved graph snapshots and immutable provenance records (Rezazadeh et al., 23 May 2025).
- Wormhole Memory enforces cross-dialogue access via explicit barriers and exception traps, permitting “wormhole” sharing only when authorized (Wang, 24 Jan 2025).
- Memory Sandbox tracks provenance at the UI layer (source conversation per memory object), though does not enforce policy constraints at runtime (Huang et al., 2023).
4. Empirical Evidence and Benchmarks
Empirical evaluations span synthetic scenarios, realistic long-term tasks, and human ratings.
| System | Key Metric/Result | Testbed/Setting |
|---|---|---|
| Collaborative Memory | Accuracy > 0.90, up to 61% fewer resource calls | Multi-user MultiHop-RAG, SciQAG |
| TraceMem | MultiHop: 0.9220, Temporal: 0.8660 (SOTA) | LoCoMo benchmark (Shu et al., 10 Feb 2026) |
| Wormhole Memory | F1: mean 0.9213, retrieval acc: 0.937 | CoQA; 0% cross-dialogue retrieval for baselines |
| EMMA (MISC dataset) | Consistency ≈4.9, Engagingness ≈4.63, Memorability ≈4.6 (1–5 scale) | Human eval on 8.5K episodes (Jang et al., 2024) |
| CogMem | Reasoning acc: 0.93 (full), bounded token growth | TurnBench-MultiStep (extended reasoning) |
| CREEM | Reduces contradiction, improves QA accuracy/memorability in multi-session dialog | Custom multi-session dialogue tasks (Kim et al., 2024) |
| DRMN | BLEU: 28.96 (CDD), 23.42 (JDDC); 34–41% drop without shared memory | Large-scale legal/e-commerce datasets (Ji et al., 2021) |
Ablation studies in TraceMem and DRMN confirm that disabling cross-session sharing or narrative/thematic consolidation results in marked decreases in multi-hop, temporal, and overall accuracy.
5. Practical Implementations and Workflows
Practical memory sharing systems vary by degree of automation, transparency, and integration:
- Collaborative Memory enforces end-to-end auditing with formal provenance and stores every write/read operation for compliance.
- TraceMem structures interaction histories into searchable narrative threads, exposing precise retrieval justifications.
- Memory Sandbox foregrounds user agency—objects are shared or edited manually, with visibility flags controlling LLM context inclusion (Huang et al., 2023).
- EMMA enables seamless retrieval and updating of egocentric memories, crucial for handling dynamic conversation partner changes in real-world, multi-session settings.
- DRMN performs retrieval, memory read, and integration implicitly in the neural architecture, focusing on latent transfer from analogous tasks without surface-level exposure to the user (Ji et al., 2021).
6. Limitations, Open Problems, and Prospects
Documented constraints and open challenges include:
- Scale and Robustness: Collaborative Memory notes the gap between synthetic/small-scale scenarios and enterprise-grade concurrency or role churn (Rezazadeh et al., 23 May 2025). Wormhole Memory’s evaluation is performed in Python simulation rather than transformer-embedded production use (Wang, 24 Jan 2025).
- Forgetting and Reconsolidation: TraceMem, CREEM, and CogMem all identify the challenge of controlled forgetting and memory rewriting (“reconsolidation”) at recall time (Shu et al., 10 Feb 2026, Kim et al., 2024, Zhang et al., 16 Dec 2025).
- Policy Lapses and Hallucinations: Even with strict policies, unpredictable LLM outputs may cause policy breaches (Rezazadeh et al., 23 May 2025).
- Representation and Prioritization: TraceMem and CogMem consider richer memory representations (e.g., knowledge graphs, hierarchical trees) and adaptive, personality-conditioned prioritization as future work (Zhang et al., 16 Dec 2025, Shu et al., 10 Feb 2026).
- Retrieval Efficiency: Nonlinear retrieval (WMM) and hierarchical compression (CogMem) provide bounded runtime, but may require multi-tiered infrastructure as memory grows (Wang, 24 Jan 2025, Zhang et al., 16 Dec 2025).
Further extensions include integration of multimodal data, formal verification of access invariants, and scaling to real-time, high-throughput, many-agent deployments.
7. Historical Context and Comparative Perspectives
Interest in cross-conversation memory grew out of recognition that single-session context is insufficient for high-quality, consistent AI-driven interaction. Early systems (e.g., DRMN (Ji et al., 2021)) focused on leveraging similar conversations for improved utterance generation. Successive work delineated explicit private/shared partitioning, dynamic access graphs, and cognitive narrative models. Recent architectures, exemplified by Collaborative Memory and CogMem, unify provenance, auditability, structured retrieval, and dynamic adaptation, setting the direction for future research and real-world LLM agent systems.