Hierarchical Trees in RAPTOR

Updated 7 January 2026
  • Hierarchical trees are structured models that recursively abstract and summarize data using techniques like GMM clustering and GPT-based summarization for efficient retrieval.
  • In RAPTOR, documents are split into chunks, embedded with SBERT, and organized into a multi-level tree that enables precise, context-aware question answering.
  • The HO-Tree in ST-Raptor extends these ideas to semi-structured tables by integrating meta and body trees to maintain structural integrity and improve answer accuracy.

Hierarchical trees in retrieval and question answering constitute a structured approach for organizing, abstracting, and querying complex information spaces, particularly in the RAPTOR framework for text documents (Sarthi et al., 2024) and the HO-Tree representation for semi-structured tables in ST-Raptor (Tang et al., 25 Aug 2025). These models use recursive or orthogonal tree structures to abstract, summarize, and support context-sensitive reasoning over large, heterogeneous corpora and tables, improving both accuracy and efficiency for retrieval-augmented LLMs.

1. RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

The RAPTOR model applies hierarchical tree construction to lengthy documents, facilitating multi-range retrieval for LLMs. Let $D$ be a document of $T$ tokens. RAPTOR begins by segmenting $D$ into $N$ contiguous, sentence-respecting text “chunks” of at most 100 tokens. Each chunk $t_j$ is embedded using SBERT: $e_0(t_j)=\varphi(t_j)\in\mathbb{R}^d$, with $d=768$.
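The chunking step can be illustrated with a minimal sketch. It assumes whitespace-separated words as a stand-in for tokenizer tokens and a simple punctuation-based sentence splitter; `chunk_document` is a hypothetical helper name, not part of any RAPTOR release.

```python
import re


def chunk_document(text: str, max_tokens: int = 100) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_tokens tokens."""
    # Assumption: whitespace-separated words approximate tokenizer tokens.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())
        # Close the current chunk if this sentence would overflow it.
        if current and count + n > max_tokens:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


example = chunk_document("One two three. Four five six. Seven eight nine.", max_tokens=6)
```

Because sentences are never split, a single sentence longer than `max_tokens` ends up as its own oversized chunk in this sketch; a production splitter would need a policy for that case.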

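At each recursive level $\ell$, embeddings $e_{\ell}(n_{\ell,i})$ corresponding to nodes $n_{\ell,i}$ are clustered using a Gaussian Mixture Model (GMM). To address high dimensionality, UMAP-based dimensionality reduction may optionally reduce $d$ to $d' \ll d$. The GMM fits $K_\ell$ Gaussians to maximize

$$\mathcal{L}_\ell = \sum_{i=1}^{N_\ell} \log \sum_{k=1}^{K_\ell} \pi_k \, \mathcal{N}\big(e_\ell(n_{\ell,i}) \mid \mu_k, \Sigma_k\big)$$

with $K_\ell$ chosen to minimize BIC:

$$\mathrm{BIC}(K_\ell) = p \,\log N_\ell - 2\,\mathcal{L}_\ell$$

where $N_\ell$ is the number of nodes at level $\ell$ and $p$ is the number of free parameters of the fitted mixture. Nodes are assigned soft cluster memberships via posteriors $\gamma_{ik} = P(k \mid e_\ell(n_{\ell,i}))$, with cluster sets $C_k = \{\, n_{\ell,i} : \gamma_{ik} \ge \tau \,\}$ for a threshold $\tau$.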
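The model-selection and soft-assignment steps above can be sketched in plain Python. The sketch takes already-computed log-likelihoods and posterior rows as inputs rather than fitting a GMM; the parameter count assumes full covariance matrices, which is an assumption of this example and not something the text specifies. All function names are illustrative.

```python
import math


def gmm_num_params(k: int, d: int) -> int:
    """Free parameters of a k-component, full-covariance GMM in d dimensions."""
    # (k - 1) mixing weights + k*d means + k*d*(d+1)/2 covariance entries.
    return (k - 1) + k * d + k * d * (d + 1) // 2


def bic(log_likelihood: float, num_params: int, num_points: int) -> float:
    """Bayesian Information Criterion: lower is better."""
    return num_params * math.log(num_points) - 2.0 * log_likelihood


def select_k(log_likelihoods: dict[int, float], d: int, num_points: int) -> int:
    """Pick the number of components whose fitted model minimizes BIC."""
    return min(
        log_likelihoods,
        key=lambda k: bic(log_likelihoods[k], gmm_num_params(k, d), num_points),
    )


def soft_clusters(posteriors: list[list[float]], tau: float) -> list[list[int]]:
    """Cluster k holds every node index whose posterior for k is at least tau."""
    num_clusters = len(posteriors[0])
    return [
        [i for i, row in enumerate(posteriors) if row[k] >= tau]
        for k in range(num_clusters)
    ]


best_k = select_k({1: -300.0, 2: -200.0, 3: -199.0}, d=2, num_points=100)
clusters = soft_clusters([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]], tau=0.4)
```

In the example, the third component barely improves the likelihood, so the BIC penalty favors two components; the second node has posterior 0.5 for both clusters and therefore appears in both, which is the point of soft assignment.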

Child texts in each cluster $C_k$ are concatenated and summarized using GPT-3.5-turbo, generating new summary nodes $n_{\ell+1,k}$. Each summary is embedded with the same encoder: $e_{\ell+1}(n_{\ell+1,k})=\varphi(n_{\ell+1,k})$. This process recurses until a single root node (level $L$) remains or clusters become too small to split. The resulting hierarchical summary tree enables retrieval at multiple abstraction levels.
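The recursion can be sketched with pluggable clustering and summarization callbacks. Here `pair_cluster` and `join_summarize` are toy stand-ins (hypothetical names) for the GMM clustering and LLM summarization steps, and embeddings are omitted for brevity. One design choice of this sketch, not stated in the source: when clustering no longer shrinks a level, the remaining nodes are merged under a single root so the function always returns one tree.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    text: str
    level: int
    children: list["Node"] = field(default_factory=list)


Clusterer = Callable[[list[Node]], list[list[Node]]]
Summarizer = Callable[[list[str]], str]


def build_tree(chunks: list[str], cluster: Clusterer, summarize: Summarizer) -> Node:
    """Cluster and summarize level by level until one root remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    nodes = [Node(text=c, level=0) for c in chunks]
    level = 0
    while len(nodes) > 1:
        groups = [g for g in cluster(nodes) if g]
        if not groups or len(groups) >= len(nodes):
            # Clustering no longer reduces the level: merge everything into a root.
            groups = [nodes]
        level += 1
        nodes = [Node(summarize([n.text for n in g]), level, list(g)) for g in groups]
    return nodes[0]


def pair_cluster(nodes: list[Node]) -> list[list[Node]]:
    """Toy clusterer (assumption): group consecutive nodes in pairs."""
    return [nodes[i:i + 2] for i in range(0, len(nodes), 2)]


def join_summarize(texts: list[str]) -> str:
    """Toy summarizer (assumption): concatenate instead of calling an LLM."""
    return " + ".join(texts)


root = build_tree(["a", "b", "c", "d"], pair_cluster, join_summarize)
```

With four chunks and pairwise grouping, the result has two level-1 summaries under a level-2 root.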

2. Formal Tree Structure, Inference, and Query Algorithms

Formally, the tree comprises $L+1$ levels indexed $\ell = 0, \dots, L$. Level-0 nodes correspond to original text chunks, with each higher level built from GMM-clustered, abstractive summaries. A node $n_{\ell,i}$ at level $\ell \ge 1$ has children $\mathrm{ch}(n_{\ell,i}) \subseteq \{ n_{\ell-1,j} \}$.

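Querying in RAPTOR proceeds by embedding the query $q$ to $e_q = \varphi(q)$. Each node $n$ with embedding $e(n)$ is scored by cosine similarity:

$$s(q, n) = \frac{e_q \cdot e(n)}{\lVert e_q \rVert \, \lVert e(n) \rVert}$$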

Two retrieval modes are defined:

  • Tree traversal: Starting at the root (level $\ell = L$), the top-$k$ nodes by score are selected; their children are then scored at each lower level, again selecting the top-$k$ at each step. The context for the LLM is the union of texts from selected nodes across all levels.
  • Collapsed-tree retrieval: All nodes across all levels are pooled; top nodes are chosen by score in descending order until a global token budget is reached.
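Both retrieval modes can be sketched over a small in-memory tree. This is an illustrative sketch only: embeddings are plain Python lists, token cost is approximated by whitespace word count, and the collapsed mode stops at the first node that would exceed the budget (one reading of "until a global token budget is reached"). The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    embedding: list[float]
    children: list["Node"] = field(default_factory=list)


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def tree_traversal(root: Node, query: list[float], k: int) -> list[Node]:
    """Keep the top-k nodes per level, descending only into their children."""
    selected: list[Node] = []
    frontier = [root]
    while frontier:
        top = sorted(frontier, key=lambda n: cosine(query, n.embedding), reverse=True)[:k]
        selected.extend(top)
        # Soft clustering can give a node several parents, so de-duplicate by identity.
        frontier = list({id(c): c for n in top for c in n.children}.values())
    return selected


def all_nodes(root: Node) -> list[Node]:
    seen: set[int] = set()
    out: list[Node] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            out.append(node)
            stack.extend(node.children)
    return out


def collapsed_tree(root: Node, query: list[float], token_budget: int) -> list[Node]:
    """Rank every node in the tree together and fill a global token budget."""
    ranked = sorted(all_nodes(root), key=lambda n: cosine(query, n.embedding), reverse=True)
    picked: list[Node] = []
    used = 0
    for node in ranked:
        cost = len(node.text.split())  # assumption: words approximate tokens
        if used + cost > token_budget:
            break
        picked.append(node)
        used += cost
    return picked


leaf_a = Node("alpha beta", [1.0, 0.0])
leaf_b = Node("gamma delta", [0.0, 1.0])
tree = Node("summary of both", [1.0, 1.0], [leaf_a, leaf_b])
query_embedding = [1.0, 0.0]
traversal_texts = [n.text for n in tree_traversal(tree, query_embedding, k=1)]
collapsed_texts = [n.text for n in collapsed_tree(tree, query_embedding, token_budget=5)]
```

In the toy tree, traversal returns the root summary plus the best-matching leaf, while the collapsed mode ranks the matching leaf ahead of the summary because it scores all nodes on one list.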

Empirical evidence indicates collapsed-tree retrieval achieves higher answer accuracy, while tree traversal provides a deterministic per-level quota and lower computational overhead when $kL \ll M$ (with $k$ the per-level quota, $L$ the number of levels, and $M$ the total number of tree nodes), especially in large document scenarios (Sarthi et al., 2024).

3. Computational Complexity and Trade-Offs

The RAPTOR build time consists of $O(N \cdot c_\varphi)$ for chunk embedding (where $c_\varphi$ is the per-chunk encoder cost), $O(N_\ell K_\ell d)$ per level for clustering, and a summarization overhead proportional to LLM-invocation token counts per level. Empirically, wall-clock and token costs scale linearly with document length $T$.

During retrieval:

  • Flat retrieval (baseline, e.g., DPR/BM25): $O(Nd + N \log N)$ for scoring and sorting.
  • Tree traversal: $O(kLd)$, typically much less than $O(Nd)$ for small $k$ and moderate $L$.
  • Collapsed-tree retrieval: $O(Md)$ or $O(Md + M \log M)$ with $M = \sum_\ell N_\ell$ to score and sort all nodes.

FAISS or approximate $k$-NN search can accelerate all approaches.
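To make the retrieval-cost comparison concrete, the following sketch counts similarity evaluations on an idealized complete tree. Real RAPTOR trees come from GMM clustering and are not balanced, and the branching factor appears explicitly here even though an asymptotic expression may treat it as a constant; both are assumptions of this worked example.

```python
def balanced_level_sizes(branching: int, depth: int) -> list[int]:
    """Node counts per level, root first, for a complete tree (an idealization)."""
    return [branching ** i for i in range(depth + 1)]


def traversal_evaluations(level_sizes: list[int], k: int, branching: int) -> int:
    """Nodes scored when only children of the top-k nodes are examined per level."""
    total = level_sizes[0]                # score the root level
    selected = min(k, level_sizes[0])
    for size in level_sizes[1:]:
        candidates = min(selected * branching, size)
        total += candidates
        selected = min(k, candidates)
    return total


def collapsed_evaluations(level_sizes: list[int]) -> int:
    """Collapsed-tree retrieval scores every node at every level."""
    return sum(level_sizes)


sizes = balanced_level_sizes(branching=4, depth=3)   # [1, 4, 16, 64]
traversal_cost = traversal_evaluations(sizes, k=2, branching=4)
collapsed_cost = collapsed_evaluations(sizes)
flat_cost = sizes[-1]                                 # flat retrieval scores leaves only
```

For this tree, traversal with a quota of two scores 21 nodes, flat retrieval scores the 64 leaves, and the collapsed mode scores all 85 nodes, which is consistent with the qualitative ordering described above.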

Key trade-offs are summarized in the following table:

| Retrieval mode | Granularity | Speed | Accuracy (empirical) |
|---|---|---|---|
| Collapsed-tree | Flexible | Moderate | Highest |
| Tree traversal | Fixed per level | Fast (when $kL \ll M$) | Lower (but scalable) |

Collapsed-tree retrieval is most accurate; tree traversal is optimal when deterministic quotas and speed are required (Sarthi et al., 2024).

4. Hierarchical Trees for Semi-Structured Tables: HO-Tree in ST-Raptor

ST-Raptor generalizes hierarchical tree frameworks to semi-structured tables, formulating the Hierarchical Orthogonal Tree (HO-Tree) representation (Tang et al., 25 Aug 2025). For a table $\mathcal{T}$, the HO-Tree is a triple

$$\mathrm{HO}(\mathcal{T}) = (\mathcal{M}, \mathcal{B}, \rho)$$

where:

  • $\mathcal{M}$: Meta-Tree, representing headers and their hierarchical containment.
  • $\mathcal{B}$: Body-Tree, representing content cells as paths (rows) in a row-oriented trie.
  • $\rho$: a pointer from each meta-tree leaf (resolved header) to a body-tree level.

This design encodes both hierarchical header structure and side-by-side orthogonal table sections, accommodating multi-row/column spans, arbitrary merged cells, and recursive subtables.

Algorithmically, HO-Tree construction proceeds by meta-information detection (via VLMs and embedding-based header identification), recursive table partitioning according to merged-cell and header orientation principles, and depth-first construction, producing a forest of HO-Trees under a synthetic root as needed. Each cell is processed $O(1)$ times (with $n$ the total number of cells), and embedding dominates the cost at $O(n \cdot c_{\mathrm{emb}})$, where $c_{\mathrm{emb}}$ is the per-cell embedding cost.
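A minimal data-structure sketch of the meta-tree, body-tree, and leaf-to-level pointer follows. It assumes headers have already been detected and the table already partitioned, so it covers only the final assembly step; `MetaNode`, `BodyNode`, and `build_ho_tree` are hypothetical names, and the pointer is modeled simply as an integer depth in the body trie.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MetaNode:
    header: str
    children: list["MetaNode"] = field(default_factory=list)
    body_level: Optional[int] = None  # pointer into the body tree, set on leaves


@dataclass
class BodyNode:
    value: str
    children: list["BodyNode"] = field(default_factory=list)


def leaf_headers(node: MetaNode) -> list[MetaNode]:
    """Depth-first, left-to-right list of leaf headers (the resolved columns)."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaf_headers(child)]


def insert_row(root: BodyNode, row: list[str]) -> None:
    """Insert one row as a root-to-leaf path, sharing common prefixes (a trie)."""
    node = root
    for value in row:
        match = next((c for c in node.children if c.value == value), None)
        if match is None:
            match = BodyNode(value)
            node.children.append(match)
        node = match


def build_ho_tree(meta_root: MetaNode, rows: list[list[str]]) -> tuple[MetaNode, BodyNode]:
    """Link each leaf header to a body-tree depth and load the rows into the trie."""
    for level, leaf in enumerate(leaf_headers(meta_root), start=1):
        leaf.body_level = level
    body_root = BodyNode("<root>")
    for row in rows:
        insert_row(body_root, row)
    return meta_root, body_root


meta = MetaNode("Employees", [
    MetaNode("Dept"),
    MetaNode("Staff", [MetaNode("Name"), MetaNode("Age")]),
])
rows = [["Sales", "Ann", "31"], ["Sales", "Bob", "45"], ["R&D", "Cy", "29"]]
meta_tree, body_tree = build_ho_tree(meta, rows)
```

The nested "Staff" header resolves to two leaf columns, and the two "Sales" rows share a single body-tree node at depth 1, which mirrors how a merged cell spans several rows.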

5. Operations and Pipelines over Hierarchical Trees

ST-Raptor exposes a formal language of atomic tree operations, composable into complex pipelines for LLM-guided question answering. The basic operations include:

  • $\mathrm{CHL}(v)$: Child subtree retrieval for meta-node $v$.
  • $\mathrm{FAT}(v)$: Ancestor subtree retrieval for meta-node $v$.
  • $\mathrm{EXT}(v, u)$: Value extraction, returning all body-nodes at meta-node $v$’s associated level with row-ancestor value $u$.
  • $\mathrm{Cond}(D, p)$: Data filtering by predicate $p$.
  • $\mathrm{Math}(D, f)$: Numeric aggregation.
  • $\mathrm{Cmp}(D_1, D_2)$: Set comparison.
  • $\mathrm{Foreach}(D, f)$: Map operation.
  • $\mathrm{Align}(\cdot)$: Parameter alignment.
  • $\mathrm{Rea}(Q, D)$: LLM-based reasoning on data $D$ for query $Q$.
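How such atomic operations compose into a pipeline can be sketched as follows. The sketch represents the body tree as a list of root-to-leaf paths and covers only value extraction, filtering, aggregation, comparison, and mapping; parameter alignment and LLM-based reasoning are left out because they require a model. The Python function names are illustrative and are not ST-Raptor's API.

```python
from typing import Callable, Optional, Sequence

Row = tuple[str, ...]  # one root-to-leaf path in the body tree


def extract(rows: Sequence[Row], level: int, ancestor: Optional[str] = None) -> list[str]:
    """Values at a body-tree level, optionally limited to rows under an ancestor value."""
    return [r[level] for r in rows if ancestor is None or ancestor in r[:level]]


def cond(data: Sequence, predicate: Callable[[object], bool]) -> list:
    """Data filtering by predicate."""
    return [x for x in data if predicate(x)]


def math_op(data: Sequence[float], func: Callable[[Sequence[float]], float]) -> float:
    """Numeric aggregation with a caller-supplied function."""
    return func(data)


def foreach(data: Sequence, func: Callable) -> list:
    """Map operation."""
    return [func(x) for x in data]


def cmp_op(left: Sequence, right: Sequence) -> dict[str, set]:
    """Set comparison of two operation results."""
    a, b = set(left), set(right)
    return {"only_left": a - b, "both": a & b, "only_right": b - a}


table: list[Row] = [("Sales", "Ann", "31"), ("Sales", "Bob", "45"), ("R&D", "Cy", "29")]

# Pipeline for "What is the average age of Sales staff older than 30?"
ages = foreach(extract(table, level=2, ancestor="Sales"), int)
over_30 = cond(ages, lambda age: age > 30)
average = math_op(over_30, lambda xs: sum(xs) / len(xs))
overlap = cmp_op(extract(table, 0), ["Sales", "HR"])
```

Each step consumes the previous step's output, so a generated plan is just an ordered list of such calls; in the example the pipeline yields 38.0 as the average age.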

Given a natural language question $Q$, ST-Raptor (1) decomposes $Q$ into sub-questions using few-shot prompting and retrieved exemplars, (2) generates atomic operation statements for each, and (3) sequentially executes these over the HO-Tree, invoking forward and backward verification mechanisms to ensure correctness and stability of answers (Tang et al., 25 Aug 2025).

6. Empirical Results and Impact

Controlled experiments confirm the efficacy of the hierarchical tree approach. RAPTOR achieves state-of-the-art results on multi-step reasoning question answering tasks. Notably:

  • On QuALITY (5k-token passages), DPR+GPT-4 yields 60.4% accuracy, RAPTOR+GPT-4 62.4% (+2.0 pp); on the QuALITY-HARD subset, performance improves from 54.7% to 56.6%.
  • On QASPER, RAPTOR+GPT-4 obtains 55.7 F1 (DPR+GPT-4: 53.0 F1).
  • On NarrativeQA, RAPTOR+UnifiedQA improves ROUGE-L, BLEU-1, and METEOR metrics by 1–0.7 pp over strong baselines.

Coupling RAPTOR retrieval with GPT-4 yields an absolute 20-point lift in QuALITY accuracy over the previous SOTA (62.3% to 82.6%), demonstrating RAPTOR’s utility for multi-step, thematic reasoning over long contexts (Sarthi et al., 2024).

ST-Raptor, leveraging the HO-Tree, attains up to 20% higher answer accuracy than nine other baselines in the semi-structured table setting, as measured on the SSTQA dataset with 764 questions across 102 real-world tables (Tang et al., 25 Aug 2025).

7. Significance and Generalizations

Hierarchical trees in RAPTOR and ST-Raptor provide an explicit, recursive abstraction of the underlying information space, decoupling summary granularity and query expressivity from fixed chunking strategies. This enables robust retrieval and reasoning, supporting complex question decomposition, abstraction, and compositional generalization. The models natively accommodate multi-modal, recursive, and thematic content, and their modular tree operations and layouts offer principled mechanisms for aligning LLM inference with the original data structure. This suggests that recursive trees may remain foundational for scalable, interpretable retrieval-augmented language modeling in both textual and tabular domains, particularly where multi-step reasoning and layout complexity are central.

Further developments may refine dynamic tree construction, LLM-guided pipeline generation, and cross-modal schema induction, leveraging the demonstrated empirical effectiveness and formal flexibility of hierarchical tree representations (Sarthi et al., 2024, Tang et al., 25 Aug 2025).
