Temporal Memory Tree (TMT)
- Temporal Memory Tree (TMT) is a hierarchical structure that organizes raw conversation data into progressively abstract representations for efficient memory retrieval.
- It employs semantic-guided consolidation with LLM-generated summaries and strict temporal containment to maintain coherent dialogue context.
- Empirical evaluations demonstrate TMT's improved recall accuracy and significant reduction in token usage compared to baseline memory frameworks.
A Temporal Memory Tree (TMT) is a formal structure introduced within the TiMem memory framework to support hierarchical, temporally coherent memory consolidation for long-horizon conversational agents whose interaction histories exceed the finite context window limitations of LLMs. TMTs enable systematic transformation of raw conversational observations into progressively abstracted representations—such as distilled persona-level summaries—while supporting efficient, complexity-aware memory retrieval. The TMT design is characterized by strict formal constraints, semantic-guided consolidation of memories via LLMs without fine-tuning, and algorithmic recall procedures balancing precision and efficiency (Li et al., 6 Jan 2026).
1. Formal Architecture of the Temporal Memory Tree
A TMT is defined as a rooted, level-indexed tree T = (M, E, τ, σ) with the following components:
- M = M_1 ∪ … ∪ M_L: a collection of memory nodes partitioned into abstraction levels M_1, …, M_L.
- E: directed edges (m_p, m_c), where m_p ∈ M_i is parent to m_c ∈ M_{i-1} and ℓ(m_p) = ℓ(m_c) + 1.
- τ: assigns a closed time interval τ(m) = [t_start(m), t_end(m)] to each node, with τ(m_c) ⊆ τ(m_p) for every (m_p, m_c) ∈ E.
- σ: assigns each node a semantic summary σ(m) (an LLM-generated text string) and a fixed-dimensional embedding vector.
The tree satisfies the following constraints:
- Temporal Containment: Each parent’s interval contains those of its children, i.e., τ(m_c) ⊆ τ(m_p) for every edge.
- Progressive Consolidation: The number of nodes per level is non-increasing, i.e., |M_i| ≤ |M_{i-1}| for i ≥ 2.
- Hierarchy Level Marking: ℓ(m) = i for every m ∈ M_i.
Raw dialogue turns o_t with timestamp t are grouped into base-level segments (e.g., each user–assistant exchange). Corresponding leaf nodes have τ(m) = [t, t] and σ(m) produced by segment-level consolidation.
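The formal structure above can be sketched as a small data model. This is an illustrative assumption, not the TiMem implementation; the class and field names (`MemoryNode`, `level`, `interval`, `summary`, `embedding`, `children`) are hypothetical stand-ins for ℓ(m), τ(m), and σ(m).

```python
from dataclasses import dataclass, field

@dataclass
class MemoryNode:
    """Hypothetical TMT node: level ℓ(m), interval τ(m), summary/embedding σ(m)."""
    level: int                      # hierarchy level; leaves are level 1
    interval: tuple                 # τ(m) = (t_start, t_end), a closed interval
    summary: str                    # σ(m): LLM-generated text
    embedding: list                 # σ(m): fixed-dimensional vector
    children: list = field(default_factory=list)

def satisfies_temporal_containment(node: MemoryNode) -> bool:
    """Check recursively that each child's interval lies inside its parent's."""
    start, end = node.interval
    return all(
        start <= child.interval[0]
        and child.interval[1] <= end
        and satisfies_temporal_containment(child)
        for child in node.children
    )
```

For example, a level-2 node with interval (0, 2) validly contains a leaf with interval (1, 1), while a parent whose interval ends before the child's would fail the check.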
2. Semantic-Guided Memory Consolidation
At the core of TMT's memory abstraction is the semantic-guided consolidation operator at each level i, which produces a new node m_new = Φ_i(C_i, H_i, I_i), where:
- C_i: child memories from level i-1 with intervals within a grouping window g.
- H_i: the w_i most recent nodes at level i for contextualization.
- I_i: a human-designed instruction describing the abstraction goal at level i (e.g., factual summary, pattern extraction, persona distillation).
The consolidation process for each window g at level i proceeds as:
- Gather C_i and history H_i.
- Format an LLM prompt using I_i, passing the texts of C_i and H_i.
- The LLM returns a summary and its encoding (using a fixed encoder such as Qwen3-Embedding).
- Create a new node m_new at level i with τ(m_new) = g, σ(m_new) set to the summary and embedding, and edges linking m_new to all c ∈ C_i.
No further fine-tuning is required beyond the initial LLM and embedding model setups.
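The consolidation step can be sketched as follows, with nodes as plain dicts and the LLM and encoder injected as callables. The prompt layout and field names here are assumptions for illustration, not the paper's template.

```python
from typing import Callable

def consolidate(children: list, history: list, instruction: str,
                llm: Callable[[str], str],
                encode: Callable[[str], list]) -> dict:
    """Sketch of Φ_i(C_i, H_i, I_i). Nodes are dicts with keys
    'level', 'interval', 'summary', 'embedding', 'children'."""
    # Single LLM call: instruction I_i plus the texts of H_i and C_i.
    prompt = "\n".join(
        [instruction]
        + [f"[context] {h['summary']}" for h in history]
        + [f"[child] {c['summary']}" for c in children]
    )
    text = llm(prompt)
    return {
        "level": children[0]["level"] + 1,
        # The new interval spans all children, preserving temporal containment.
        "interval": (min(c["interval"][0] for c in children),
                     max(c["interval"][1] for c in children)),
        "summary": text,
        "embedding": encode(text),
        "children": list(children),
    }
```

Because the LLM and encoder are passed in, no component here requires fine-tuning, matching the framework's setup.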
3. Complexity-Aware Memory Recall
Memory recall from a TMT is dynamically tailored to query complexity:
- Query Classification: A recall planner classifies the input query q, producing labels (c, K), where c is a query-complexity class and K is a keyword set extracted for retrieval.
- Leaf Activation: Each leaf node m ∈ M_1 is scored by a relevance function s(m, q; K) combining embedding similarity to q with keyword matches against K; the top-K_1 scoring nodes are selected as Ω_1, with K_1 fixed in TiMem.
- Hierarchical Propagation: For each leaf m ∈ Ω_1, its ancestors at levels in S(c) (as determined by the planner-defined retrieval strategy) are gathered. The candidate pool is Ω_c = Ω_1 ∪ {ancestors of Ω_1 at levels in S(c)}.
- Recall Gating: A filtering function (implemented as a single LLM call over the candidate texts) selects the relevant candidates: Ω_p = gating_LLM(Ω_c, q, c).
- Final Ordering: Retained memories are sorted by hierarchy level and recency, i.e., by the key (ℓ(m), |t_q - t_end(m)|), yielding the final recall set Ω_p.
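The recall pipeline can be sketched in a few lines. The concrete scoring function below (cosine similarity plus a keyword-overlap bonus) is an assumption standing in for s(m, q; K), and the gating LLM is replaced by an injected predicate; node fields are hypothetical dict keys.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recall(leaves, q_emb, keywords, k1, levels, keep, t_q):
    """Sketch of complexity-aware recall. leaves: level-1 dicts with
    'embedding', 'summary', 'level', 'interval', 'parent'.
    levels: the set S(c) of ancestor levels to gather.
    keep: predicate standing in for the gating LLM."""
    def score(m):  # stand-in for s(m, q; K)
        overlap = sum(w in m["summary"] for w in keywords)
        return cosine(m["embedding"], q_emb) + overlap

    # Leaf activation: top-k1 leaves form Ω_1.
    omega1 = sorted(leaves, key=score, reverse=True)[:k1]
    # Hierarchical propagation: walk up to ancestors at levels in S(c).
    pool = list(omega1)
    for m in omega1:
        p = m.get("parent")
        while p is not None:
            if p["level"] in levels and p not in pool:
                pool.append(p)
            p = p.get("parent")
    # Recall gating, then final ordering by (level, recency w.r.t. t_q).
    kept = [m for m in pool if keep(m)]
    return sorted(kept, key=lambda m: (m["level"], abs(t_q - m["interval"][1])))
```

Ordering by (ℓ(m), |t_q - t_end(m)|) surfaces fine-grained, recent memories first while still retaining the higher-level summaries in the result.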
4. Core Algorithms and Pseudocode
Key routines are expressed as follows:
```
def INSERT_SEGMENT(o_t):                 # o_t = raw turn at time t
    create leaf m with τ(m) = [t, t]
    σ(m) = LLM_consolidate_level_1(o_t, history_1, I_1)
    add m to M_1

def SCHEDULE_CONSOLIDATION(level_i, window_g):
    C_i = {m in M_{i-1} | τ(m) ⊆ g}
    H_i = most recent w_i memories in M_i
    m_new = Φ_i(C_i, H_i, I_i)           # LLM call
    τ(m_new) = g
    σ(m_new) = text + embedding
    link m_new as parent to each c in C_i

def RECALL(q):
    (c, K) = planner(q)                  # LLM call
    Ω_1 = TopK1_{m in M_1} s(m, q; K)
    Ω_c = Ω_1 ∪ {ancestors of Ω_1 at levels in S(c)}
    Ω_p = gating_LLM(Ω_c, q, c)          # LLM filter
    return sort(Ω_p by (ℓ(m), |t_q - t_end(m)|))
```
This operational structure performs segment insertion, scheduled hierarchical consolidation, and complexity-aware recall with rigorous temporal alignment.
5. Quantitative Results and Evaluation Methodology
TMT is primarily evaluated within the TiMem framework using datasets and metrics as summarized below.
| Dataset | Questions | Task Categories/Types |
|---|---|---|
| LoCoMo | 1,540 | 4 |
| LongMemEval-S | 500 | 6 |
Evaluation uses:
- Accuracy (LLJ): Fraction of answers judged correct by an LLM judge.
- F1/ROUGE-L: Compared at token level between generated and gold answers (on LoCoMo).
- Recalled Context Length: Average number of tokens recalled per query.
- Latency: 50th/95th percentiles for end-to-end recall time.
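The token-level F1 used for the LoCoMo comparison can be made concrete with a short function; whitespace tokenization and lowercasing are simplifying assumptions, not necessarily the evaluation's exact preprocessing.

```python
from collections import Counter

def token_f1(pred: str, gold: str) -> float:
    """Token-level F1 between a generated answer and a gold answer."""
    p, g = pred.lower().split(), gold.lower().split()
    common = Counter(p) & Counter(g)   # multiset intersection of tokens
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)
```

For instance, `token_f1("the cat sat", "the cat")` gives precision 2/3 and recall 1, hence F1 = 0.8.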
Reported results for TiMem using TMT:
- LoCoMo: 75.30% ± 0.16 (vs. best baseline 69.24%)
- LongMemEval-S: 76.88% ± 0.30 (vs. best baseline 68.68%)
- Context reduction on LoCoMo: 52.2% fewer tokens than baseline on recalled context.
6. Manifold Analysis and Emergent Persona Structure
TMT's progressive memory abstraction yields distinct effects in manifold space, assessed via UMAP and clustering diagnostics. For LoCoMo (10-user, real-data):
- Silhouette Score: 0.093 (level 1) to 0.574 (level 5), roughly a six-fold improvement
- Intrinsic Dimensionality: across levels
- Separation Ratio:
For LongMemEval-S (single persona, synthetic):
- Spread (variance): Shrinks by 50% from L1 to L5
- Intrinsic Dimension:
- Radius95: Contracts by 44%
These observations indicate that, on genuine multi-user data, hierarchical consolidation amplifies user-specific features, resulting in well-separated persona clusters; on synthetic/homogeneous data, the primary effect is noise reduction and template convergence.
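The silhouette diagnostic behind these cluster-separation numbers can be sketched with a stdlib-only implementation; here clusters are plain lists of embedding vectors, an illustrative stand-in for the paper's UMAP-based analysis pipeline rather than its actual code.

```python
import math

def _dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def silhouette(clusters):
    """Mean silhouette score over all points; clusters is a list of
    clusters, each a list of vectors. Requires at least two clusters."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for x in cluster:
            # a: mean distance to the other points in x's own cluster
            same = [_dist(x, y) for y in cluster if y is not x]
            a = sum(same) / len(same) if same else 0.0
            # b: smallest mean distance to any other cluster
            b = min(
                sum(_dist(x, y) for y in other) / len(other)
                for cj, other in enumerate(clusters) if cj != ci
            )
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)
```

Well-separated clusters score near 1, overlapping ones near 0; under this convention, the reported rise from 0.093 to 0.574 across levels reflects increasingly distinct persona clusters.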
7. Significance and Implications
The TMT constitutes a foundational mechanism for temporally contiguous, hierarchically abstracted memory structures in long-horizon conversational agents. The combination of temporal containment, multi-level semantic consolidation, and complexity-aware recall produces improved accuracy and substantial reductions in recalled memory size relative to prior frameworks. This approach treats temporal continuity as an organizing principle, enabling stable personalization and robust scaling beyond the single-context window regime of present-day LLMs (Li et al., 6 Jan 2026). A plausible implication is applicability to broader domains requiring temporally structured, multi-level summary representations, suggesting cross-disciplinary utility in sequence modeling and lifelong learning.