Dual-Branch HNSW Approach

Updated 10 February 2026
  • Dual-branch HNSW approach is a framework combining dual-tower embedding models and parallel HNSW graphs to enhance ANN search performance.
  • It employs LID-based branch assignment and skip-bridges to reduce search layers and build time while boosting recall.
  • Empirical evaluations reveal recall improvements up to 30% and build time reductions of 20% across NLP and CV domains.

The dual-branch HNSW approach refers to a family of extensions to Hierarchical Navigable Small World (HNSW) graphs designed for efficient Approximate Nearest Neighbor (ANN) search, encompassing both representation learning with dual-tower (two-branch) models for generating embeddings (Li et al., 2023) and graph structural innovations via parallelized HNSW graphs with LID-driven skip connections (Nguyen et al., 23 Jan 2025). These techniques share the core principle of leveraging architectural duality—either in embedding model structure or in HNSW graph organization—to enhance recall, accelerate construction, and exploit sparsity or neighborhood geometry.

1. Dual-Branch Two-Tower Models and Embedding Sparsity

The original “dual-branch” concept arises in representation learning, specifically in two-tower architectures for embedding queries and items independently. Each tower transforms heterogeneous features (categoricals, numerics, text) into learned embeddings. Importantly, both the cosine two-tower and chi-square two-tower models tie weights across towers; the final embedding map $\phi$ is shared up to the last layer. The distinction lies in the normalization: cosine towers apply $u \leftarrow u/\|u\|_2$, chi-square towers apply $u \leftarrow u/\sum_{i=1}^d u_i$. The latter yields nonnegative, L1-normalized vectors, with sparsity driven by ReLU activations. Empirical results show that, for embedding size $d=1024$, the fraction of nonzero coordinates may be as low as 2.8% (chi-square tower) or 5.4% (cosine tower) (Li et al., 2023). This effect is direct: ReLU induces exact zeros, and L1 normalization amplifies the nonnegative, lasso-like effect, substantially reducing effective storage and similarity-computation cost.
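The two normalization heads can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: a random Gaussian vector stands in for the tower's last pre-activation layer, so ReLU zeroes only about half the coordinates here; the 2.8%/5.4% density figures above come from trained towers.

```python
import random

def relu(v):
    # ReLU produces exact zeros, which is what drives sparsity
    return [max(0.0, x) for x in v]

def l2_normalize(v):
    # cosine-tower head: u <- u / ||u||_2
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v] if norm > 0 else v

def l1_normalize(v):
    # chi-square-tower head: u <- u / sum_i u_i (assumes nonnegative u)
    s = sum(v)
    return [x / s for x in v] if s > 0 else v

def sparsity(v):
    # fraction of exactly-zero coordinates
    return sum(1 for x in v if x == 0.0) / len(v)

random.seed(0)
d = 1024
u = [random.gauss(0.0, 1.0) for _ in range(d)]  # stand-in for the tower output

chi_embed = l1_normalize(relu(u))  # nonnegative, sums to 1
cos_embed = l2_normalize(relu(u))  # unit L2 norm

print(f"sparsity after ReLU: {sparsity(chi_embed):.2f}")
print(f"chi-square embedding L1 sum: {sum(chi_embed):.4f}")
```

The chi-square head's output is a probability-like vector, which is what makes chi-square similarity and SignCRP binarization (Section 5) applicable downstream.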

2. Dual-Branch HNSW++ Graph Structure with LID-Driven Optimization

In the context of HNSW graph search structures, the dual-branch methodology is realized as HNSW++, which constructs two independent, random HNSW graphs ("branches") over the data, merging only at the base layer. Assignment to branches and layers is based on Local Intrinsic Dimensionality (LID): points with higher LID—typically at cluster boundaries—are prioritized for higher layers and distributed to ensure each branch covers roughly half the points. An explicit skip-bridging mechanism enables points with LID above a threshold $T$ and within radius $\epsilon$ of the query to “jump” directly to layer 0, bypassing intervening layers (Nguyen et al., 23 Jan 2025).

Key architectural differences from vanilla HNSW:

  • Two parallel HNSW graphs are built, each with approximately $n/2$ points.
  • Search initiates at the top layer of both branches, proceeding in parallel.
  • Branches merge only at layer 0 for final neighbor selection.
  • Skip bridges controlled by LID introduce non-greedy, direct routes to the dense base layer.

This structure addresses HNSW’s local optima problem and cluster disconnections by providing multi-directional traversal and improved inter-cluster connectivity.
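A minimal sketch of the LID-driven assignment idea follows. The exact assignment rules in HNSW++ are more involved; here we simply rank points by a given LID score, alternate branches so each holds roughly half the points, and map higher LID ranks to higher maximum layers — all assumptions for illustration.

```python
def assign_branches_and_layers(lid_scores, num_layers=4):
    """Toy LID-driven assignment: higher-LID points reach higher layers,
    and points alternate between the two branches (~n/2 each)."""
    n = len(lid_scores)
    # rank points by LID, highest first
    order = sorted(range(n), key=lambda i: -lid_scores[i])
    branch_of, max_layer_of = {}, {}
    for rank, idx in enumerate(order):
        branch_of[idx] = rank % 2  # alternate branches for balance
        # top LID ranks are eligible for higher layers
        frac = rank / n
        max_layer_of[idx] = max(0, int((1.0 - frac) * (num_layers - 1)))
    return branch_of, max_layer_of

lids = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1, 0.6, 0.3]
branches, layers = assign_branches_and_layers(lids)
print(branches)  # each branch gets half the points
print(layers)    # the highest-LID point (index 0) gets the top layer
```

The balance constraint is what gives each branch its own top-to-bottom entry path, so the two searches start from different regions of the data.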

3. Local Intrinsic Dimensionality (LID) and Graph Construction

LID estimation is central to dual-branch assignment and skip-bridge creation. For a point $x$, the maximum-likelihood LID estimator [Levina & Bickel 2004] is

$$\mathrm{LID}(x) = \left( \frac{1}{k-1}\sum_{i=1}^{k-1}\ln\frac{d_k(x)}{d_i(x)} \right)^{-1}$$

where $d_i(x)$ is the distance from $x$ to its $i$-th nearest neighbor. High-LID points, identified as lying in sparse or transitional regions, are accorded bridge-building privileges and allocated to higher layers to serve as connectors across clusters. This facilitates broader graph navigation and mitigates search failures due to premature local convergence.
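The estimator above is straightforward to implement from a point's sorted neighbor distances. The sketch below checks it on synthetic distances whose cumulative distribution mimics a 2-dimensional manifold (an assumption made purely to have a known ground truth).

```python
import math
import random

def lid_mle(dists, k):
    """Maximum-likelihood LID estimate (Levina & Bickel 2004) from a
    point's distances to its k nearest neighbors."""
    d = sorted(dists)[:k]
    dk = d[-1]  # distance to the k-th neighbor
    s = sum(math.log(dk / di) for di in d[:-1] if di > 0)
    return (k - 1) / s if s > 0 else float("inf")

# Synthetic check: near a point on an m-dimensional manifold, the neighbor
# count within radius r grows like r^m, so distances can be simulated by
# inverse-CDF sampling u^(1/m) with u uniform in (0, 1).
random.seed(1)
m, k = 2, 100
dists = [random.random() ** (1.0 / m) for _ in range(k)]
print(f"estimated LID: {lid_mle(dists, k):.2f}")  # close to m = 2
```

In HNSW++ this estimate is computed per point during construction, which is also why LID cost can dominate build time at very high dimension (Section 7).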

4. Search, Insertion Algorithms, and Complexity

Precise pseudocode from (Nguyen et al., 23 Jan 2025) formalizes layer/branch assignment, insertion, and search. Layer and branch allocation exploits normalized LID scores to ensure dense and boundary points are appropriately distributed. Search operates in parallel across both branches:

  • Each branch runs classic HNSW search, supporting $O(\log N)$ complexity.
  • Skip-bridges reduce the expected number of traversed layers: $L_\text{eff} = L_\text{total}\cdot(1 - P_\text{skip})$.
  • Final result sets are merged at layer 0 for kNN retrieval.
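The parallel-search-then-merge pattern can be sketched as follows. The per-branch search here is a brute-force scan over toy 1-D data so the example is self-contained; in HNSW++ each branch runs the actual layered graph search.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def search_branch(index, query, k):
    """Stand-in for HNSW search within one branch; a real branch search
    is O(log N), this self-contained sketch just scans the branch."""
    return heapq.nsmallest(k, ((abs(x - query), x) for x in index))

def dual_branch_knn(branch_a, branch_b, query, k):
    # Run both branches in parallel, then merge candidates at "layer 0"
    with ThreadPoolExecutor(max_workers=2) as pool:
        fa = pool.submit(search_branch, branch_a, query, k)
        fb = pool.submit(search_branch, branch_b, query, k)
    merged = fa.result() + fb.result()
    return [x for _, x in heapq.nsmallest(k, merged)]

# toy 1-D data split across two branches (~n/2 points each)
a, b = [1.0, 4.0, 9.0, 16.0], [2.0, 5.0, 10.0, 17.0]
print(dual_branch_knn(a, b, query=4.5, k=3))  # [4.0, 5.0, 2.0]
```

Because each branch holds only half the points, its candidate set is cheaper to produce, and the final merge restores full coverage of the data.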

Construction complexity remains $O(N \log N)$, with build speed improved by roughly 20% owing to halved search spaces and skip-bridge acceleration.

5. Integration with Sparse and Binary Embeddings

Sparse embeddings from dual-tower models—especially with ReLU + L1 normalization—yield significant memory and compute advantages when indexed with HNSW-style graphs. For $d=1024$ and 2.8% nonzero coordinates, per-vector storage in CSR format is roughly 24× smaller than dense storage, and similarity computation enjoys a theoretical 35× reduction in multiply-adds, with an empirically observed ~3× speedup in practical scenarios, mainly ad-targeting (Li et al., 2023). Further, Sign-Cauchy Random Projections (SignCRP) binarize embeddings, producing $k$-bit codes whose Hamming similarity offers unbiased estimation of chi-square similarity, enabling an additional 10–20× resource reduction with minimal recall/accuracy loss.
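The SignCRP idea can be illustrated in a few lines: each code bit is the sign of the projection onto a vector of i.i.d. standard-Cauchy entries, and codes for similar inputs agree on most bits. Everything below is a toy sketch (the Cauchy samples come from inverse-CDF sampling, and the vectors are made-up chi-square-tower-style outputs), not the paper's implementation.

```python
import math
import random

def sign_cauchy_projection(v, num_bits, seed=0):
    """Each bit is sign(<c_j, v>) with c_j ~ i.i.d. standard Cauchy,
    sampled via the inverse CDF tan(pi * (u - 1/2)). Using the same seed
    for every vector means all vectors share one projection matrix."""
    rng = random.Random(seed)
    bits = []
    for _ in range(num_bits):
        dot = sum(math.tan(math.pi * (rng.random() - 0.5)) * x for x in v)
        bits.append(1 if dot >= 0 else 0)
    return bits

def hamming_sim(a, b):
    # fraction of agreeing bits
    return sum(x == y for x, y in zip(a, b)) / len(a)

# made-up nonnegative, L1-normalized vectors (chi-square tower outputs)
x = [0.4, 0.3, 0.2, 0.1, 0.0, 0.0]
y = [0.35, 0.35, 0.15, 0.15, 0.0, 0.0]   # close to x
z = [0.0, 0.0, 0.1, 0.1, 0.4, 0.4]       # far from x
bx, by, bz = (sign_cauchy_projection(v, 256) for v in (x, y, z))
print(hamming_sim(bx, by), hamming_sim(bx, bz))  # similar pair scores higher
```

Hamming similarity on the bit codes then becomes the cheap proxy for chi-square similarity at query time, at a fraction of the storage of the original embeddings.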

6. Empirical Evaluation and Ablation Studies

Comprehensive experiments on six datasets covering NLP and CV domains demonstrate HNSW++ dual-branch gains. On CV data, recall@10 improves by up to 30% (e.g., SIFT: 0.65→0.85 recall at 10k QPS); for NLP (GLOVE), recall increases by 18% (0.60→0.78 at 8k QPS). Construction time is reduced by ~20%, positioning HNSW++ within 9% build time of FAISS IVFPQ, while maintaining query speed. Ablation studies isolate the effect of innovations:

| Variant | Recall@10 ↑ | Accuracy@10 ↑ | Build time ↓ | Query time Δ |
|---|---|---|---|---|
| Basic | 100% | 100% | 100% | 100% |
| Multi-Branch | +12% | +11% | −18% | +1.5% |
| LID-Based | +20% | +18% | −20% | +0.5% |
| HNSW++ (full) | +28% | +26% | −20% | +1.0% |

The largest recall/accuracy gains result from LID-driven assignment, enhanced by dual-branch organization and skip-bridges (Nguyen et al., 23 Jan 2025). Practical throughput on large indexes (>300k points) is on the order of 1–2 ms/query, with hundreds to thousands of QPS per core possible when leveraging sparsity (Li et al., 2023).

7. Limitations, Open Questions, and Future Work

The efficacy of the dual-branch approach depends on careful threshold selection for LID-based skips; empirical findings suggest optimal values around $T = 0.6$–$0.8$ across datasets, with recall relatively insensitive to $T$ below high thresholds. In regimes of extreme dimensionality ($d \gg 1000$), LID computation dominates build time, highlighting the desirability of constant-cost LID estimators and adaptive “ef” settings. Branch merging is currently implemented only at layer 0; dynamic or multi-way merging strategies present avenues for further robustness and scalability. A plausible implication is that refining LID-driven heuristics and further multiplexing branches could yield additional robustness against graph fragmentation and local minima. No significant query-time or recall trade-offs have been observed empirically for HNSW++ (Nguyen et al., 23 Jan 2025).


For detailed implementation, evaluation, and theoretical underpinning, refer to (Li et al., 2023) for dual-branch embedding models integrated with sparse and binary projection techniques, and (Nguyen et al., 23 Jan 2025) for structural dual-branch HNSW graph methods, LID-based optimization, and experimental validation.
