
Constant-Distortion Hyperbolic Embeddings

Updated 29 January 2026
  • Constant-distortion hyperbolic embeddings are mathematical constructions that map graphs, trees, or metric spaces into hyperbolic space while preserving pairwise distances within a controlled distortion.
  • They leverage hyperbolic geometry’s exponential volume growth and negative curvature to efficiently represent hierarchical data, achieving near-perfect fidelity in low dimensions.
  • The approach rigorously quantifies tradeoffs among distortion, coordinate precision, and embedding dimensions, guiding both combinatorial constructions and practical implementations.

Constant-distortion hyperbolic embeddings are mathematical constructions that map graphs, trees, or general metric spaces into hyperbolic space, preserving pairwise distances up to a small (controlled) distortion. Such embeddings are particularly powerful for representing data with hierarchical structure, owing to the exponential volume growth and negative curvature of hyperbolic geometry. Theoretical developments demonstrate that, for trees and other hierarchical data, hyperbolic embeddings can achieve very high fidelity with remarkably low dimensions—a property not generally available for Euclidean embeddings. These embeddings admit a tradeoff between distortion, the number of required dimensions, and numerical precision, with combinatorial algorithms capable of embedding trees with distortion arbitrarily close to one, given sufficient precision and/or embedding dimension (Sa et al., 2018).

1. Combinatorial Hyperbolic Tree Embeddings

A principal advance in constant-distortion hyperbolic embeddings is the development of a simple, combinatorial, non-iterative construction for embedding arbitrary trees into hyperbolic space with provably small distortion. The method generalizes Sarkar’s algorithm for embedding rooted trees into the Poincaré disk (ℍ₂), where the root is placed at the origin, and child nodes are recursively embedded at a fixed hyperbolic distance τ from their parent, spaced uniformly around the parent. This approach is described as follows:

  • For a parent–child pair, the parent's position is mapped to the origin via an isometry, the scale parameter τ is set to control the radial displacement, and children are placed on a circle of radius $r = \tanh(\tau/2)$.
  • Children are evenly arranged in angle and mapped back to the parent coordinate system via an inverse isometry.
  • Each edge is thus mapped to exactly the same hyperbolic distance, and the process recurses over the entire tree with time complexity $O(|V|)$, where $V$ is the set of nodes.
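The recursion can be sketched directly with complex arithmetic in the Poincaré disk. The helper names and `children` tree encoding below are illustrative, and reserving a single angular slot pointing back at the parent is a simplification of Sarkar's exact cone-splitting:

```python
import cmath
import math

def mobius_to_origin(p, z):
    # Möbius isometry of the unit disk sending p to 0
    return (z - p) / (1 - p.conjugate() * z)

def mobius_from_origin(p, w):
    # inverse isometry: sends 0 back to p
    return (w + p) / (1 + p.conjugate() * w)

def sarkar_embed(children, tau=1.0):
    """children: dict node -> list of child nodes, with the root labeled 0.
    Returns dict node -> complex coordinate in the Poincaré disk."""
    r = math.tanh(tau / 2)            # Euclidean radius at hyperbolic distance tau
    pos = {0: 0j}
    stack = [(0, None)]               # (node, parent)
    while stack:
        v, parent = stack.pop()
        kids = children.get(v, [])
        if parent is None:
            base, slots = 0.0, max(len(kids), 1)
        else:
            # direction of the parent as seen from v mapped to the origin;
            # one angular slot is reserved for it so children avoid that cone
            base = cmath.phase(mobius_to_origin(pos[v], pos[parent]))
            slots = len(kids) + 1
        for i, c in enumerate(kids, start=1):
            theta = base + 2 * math.pi * i / slots
            pos[c] = mobius_from_origin(pos[v], r * cmath.exp(1j * theta))
            stack.append((c, v))
    return pos
```

Because `mobius_from_origin` is an isometry sending the origin to the parent, every edge lands at hyperbolic length exactly τ.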

The worst-case distortion $D_{wc}$ of such an embedding can be made arbitrarily close to 1 by increasing τ, at the cost of higher numerical precision. Formally, if τ is set to $((1+\epsilon)/\epsilon)\,2\ln(\Delta_{max}\cdot(\pi/2))$ for desired distortion $1+\epsilon$ (where $\Delta_{max}$ is the tree's maximum degree), then $D_{wc} \leq 1+\epsilon$ (Sa et al., 2018).
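Plugging numbers into this bound (a direct transcription; the function name is mine) shows how quickly the required scale, and hence the precision budget, grows:

```python
import math

def tau_for_distortion(eps, d_max):
    # scale factor tau guaranteeing worst-case distortion <= 1 + eps
    # for a tree of maximum degree d_max (Sa et al., 2018)
    return ((1 + eps) / eps) * 2 * math.log(d_max * math.pi / 2)

print(tau_for_distortion(0.1, 4))   # ~40.4: even a 10% distortion target needs a large scale
```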

This algorithm generalizes to higher-dimensional hyperbolic spaces $\mathbb{H}_r$ by using code-based arrangements of points on high-dimensional spheres, exploiting their increased packing efficiency.

2. Precision, Dimensionality, and Distortion Tradeoffs

A core theoretical contribution is the rigorous quantification of tradeoffs among embedding distortion, coordinate precision, and dimension. In the Poincaré disk, coordinates must reside within the unit ball; representing a point at hyperbolic distance $d$ from the origin requires $k \gtrsim d$ bits, as the norm approaches $1 - 2e^{-d}$ for large $d$. For a tree of depth $\ell$ and maximum degree $\Delta_{max}$, achieving distortion at most $1+\epsilon$ demands:

  • At least $\Omega((\ell/\epsilon)\cdot\log \Delta_{max})$ bits of coordinate precision for $r=2$ (Theorem 2.1).
  • In general $\mathbb{H}_r$, the required bits per coordinate decrease inversely with $r$ up to $r \leq \log \Delta_{max}+1$, after which $O(\ell/\epsilon)$ bits suffice.

A summary of these scaling relationships is provided in the following table:

| Space | Bits per coordinate | Parameters |
| --- | --- | --- |
| $\mathbb{H}_2$ | $\Omega((\ell/\epsilon)\log \Delta_{max})$ | tree depth $\ell$, max degree $\Delta_{max}$, distortion $1+\epsilon$ |
| $\mathbb{H}_r$ (small $r$) | $O((\ell/(r\epsilon))\log \Delta_{max})$ | $r \leq \log \Delta_{max}+1$ |
| $\mathbb{H}_r$ (large $r$) | $O(\ell/\epsilon)$ | $r > \log \Delta_{max}+1$ |

This quantification allows embedding designers to select dimensions and required precision according to application constraints. A key implication is that hierarchical structures, such as trees derived from synonym or type hierarchies (e.g., WordNet), can be embedded in hyperbolic space with near-exact fidelity in very low dimensions using the combinatorial method, outperforming previous approaches both in distortion and representational efficiency (Sa et al., 2018).
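The per-coordinate precision floor is easy to observe directly: a point at hyperbolic distance $d$ from the origin has Euclidean norm $\tanh(d/2) \approx 1 - 2e^{-d}$, so once $2e^{-d}$ drops below machine epsilon a double-precision float can no longer separate the point from the boundary. The demonstration below is mine, not from the paper:

```python
import math

def poincare_norm(d):
    # Euclidean norm of a point at hyperbolic distance d from the origin:
    # tanh(d/2) = 1 - 2e^{-d} + O(e^{-2d}) for large d
    return math.tanh(d / 2)

# float64 carries ~53 mantissa bits, so representing 1 - 2e^{-d} needs on the
# order of d bits; well before d = 80 the gap to the boundary underflows to 0.
for d in (10, 30, 80):
    print(d, 1.0 - poincare_norm(d))
```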

3. Hyperbolic Multidimensional Scaling (h-MDS)

To embed general metric spaces—not only trees—into hyperbolic space, a method termed hyperbolic multidimensional scaling (h-MDS) has been introduced. h-MDS operates in the hyperboloid model, representing points in $\mathbb{R}^{r+1}$ subject to the Lorentzian quadratic constraints $x^T Q x = 1$ and $x_0 > 0$, with $Q = \mathrm{diag}(1, -1, \ldots, -1)$. The algorithm comprises:

  • Forming the matrix $Y_{ij} = \cosh d_{ij}$, where $d_{ij}$ is the hyperbolic distance between data points.
  • Performing principal component analysis (PCA) on $-Y$ (since $Y$ has both positive and negative eigenvalues), retaining the top $r$ positive eigenvalues to reconstruct the spatial coordinates.
  • Recovering the $x_0$ component from the norm of the projected coordinates and, optionally, mapping hyperboloid points to the Poincaré ball via $p = \vec{x}/(1 + x_0)$.
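The steps above can be sketched compactly in NumPy. The function names and synthetic setup are mine; the sketch assumes, as the paper establishes, that an isometric configuration satisfying the centering condition exists, so the positive eigenpairs of $-Y$ recover the spatial coordinates:

```python
import numpy as np

def h_mds(D, r):
    """Recover n points on the hyperboloid model in R^{r+1} from an
    n x n hyperbolic distance matrix D (sketch of h-MDS)."""
    Y = np.cosh(D)                        # Y_ij = cosh d_ij = x_i^T Q x_j
    w, V = np.linalg.eigh(-Y)             # symmetric eigendecomposition of -Y
    top = np.argsort(w)[::-1][:r]         # keep the top-r (positive) eigenpairs
    X = V[:, top] * np.sqrt(np.maximum(w[top], 0))  # spatial coordinates, n x r
    x0 = np.sqrt(1 + np.sum(X**2, axis=1))          # from the constraint x^T Q x = 1
    return np.column_stack([x0, X])       # hyperboloid points, one per row

def hyperboloid_dist(u, v):
    # d(u, v) = arccosh(u^T Q v) with Q = diag(1, -1, ..., -1)
    return np.arccosh(max(u[0] * v[0] - np.dot(u[1:], v[1:]), 1.0))
```

Given an exact distance matrix, the recovered configuration reproduces all pairwise distances up to isometry; mapping each row to the Poincaré ball via $p = \vec{x}/(1+x_0)$ is then a one-liner.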

Exact recovery (up to isometry) of the point configuration is possible when the full distance matrix is provided. The procedure relies on the notion of "pseudo-Euclidean centering," where the data are centered such that $\sum_i x_{0,i}\vec{x}_i = 0$. Local convexity results for geodesic dimension reduction show that, in local neighborhoods, gradient descent is guaranteed to converge to the best lower-dimensional hyperbolic submanifold.

4. Stability and Perturbation Analysis

The sensitivity of the h-MDS algorithm to noise in the input distances is governed by standard PCA perturbation theory. For true distances $H$ and observed distances $H + \Delta H$, the deviation in the embedding obeys

$$D_E(X, \hat{X}) = O\left(\frac{n^2}{\lambda_{min}}\,\sinh^2(\|H\|_\infty)\,\|\Delta H\|_\infty^2\right)$$

where $D_E$ is the squared Euclidean (Frobenius-norm) error, $n$ is the number of points, and $\lambda_{min}$ is the smallest nonzero eigenvalue of $XX^T$. The scaling with $\sinh^2(\|H\|_\infty)$ implies that recovery is robust for small-magnitude hyperbolic distances (where $\sinh$ behaves almost linearly), but large input distances amplify noise significantly. This result places explicit limits on the numerical stability of constant-distortion embeddings for large-diameter graphs or graphs with noisy distances (Sa et al., 2018).
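The $\sinh$ amplification is visible at first order: a perturbation $\delta$ in a distance $d$ moves $\cosh d$ (the quantity h-MDS decomposes) by roughly $\sinh(d)\,\delta$. A quick numeric check (the function name is mine):

```python
import math

def cosh_error(d, delta):
    # change in cosh(d) caused by a distance perturbation delta;
    # to first order this equals sinh(d) * delta
    return abs(math.cosh(d + delta) - math.cosh(d))

print(cosh_error(2.0, 0.01))    # modest amplification at small distances
print(cosh_error(10.0, 0.01))   # thousands of times larger at d = 10
```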

5. Practical Implementation Strategies

Practical deployment of constant-distortion hyperbolic embeddings, including for large datasets or when only partial distance information is available, benefits from several computational heuristics:

  • Stochastic gradient descent (SGD) can be used with a squared-distance loss in the Poincaré ball, typically with a learnable or fixed scale parameter and optional weighting to emphasize local structure.
  • To circumvent the singular gradient of $\operatorname{acosh}$ at coinciding points, optimizers use $d_H^2(x,y)$ as the loss, for which gradients remain bounded.
  • Riemannian gradients are implemented by rescaling Euclidean gradients via the metric tensor $g_x = (2/(1-\|x\|^2))^2 I$, yielding updates scaled by $(1-\|x\|^2)^2/4$.
  • Gradient clipping and explicit re-projection onto the ball are applied to prevent numerical overflows and NaNs.
  • For incomplete distance matrices, low-rank matrix completion of $\cosh(D)$ followed by h-MDS often suffices to recover a high-fidelity embedding; empirically, as little as 5–10% of the pairwise distances are adequate in many cases.

These implementation principles support scalable, robust hyperbolic embedding pipelines and are available in a PyTorch-based framework.
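The first three heuristics combine into a minimal NumPy sketch; a central-difference gradient stands in for autodiff, and the names, learning rate, and projection threshold are illustrative rather than the paper's PyTorch implementation:

```python
import numpy as np

def poincare_sqdist(x, y):
    # squared hyperbolic distance in the Poincaré ball; squaring removes
    # the singular acosh gradient at x == y
    num = np.sum((x - y) ** 2)
    den = (1 - np.sum(x ** 2)) * (1 - np.sum(y ** 2))
    return np.arccosh(1 + 2 * num / den) ** 2

def num_grad(x, y, h=1e-6):
    # central-difference Euclidean gradient (stand-in for autodiff)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (poincare_sqdist(x + e, y) - poincare_sqdist(x - e, y)) / (2 * h)
    return g

def rsgd_step(x, g, lr=0.05, max_norm=0.999):
    # Riemannian SGD: rescale by the inverse metric (1 - |x|^2)^2 / 4,
    # then re-project into the ball to avoid overflow at the boundary
    x = x - lr * ((1 - np.sum(x ** 2)) ** 2 / 4) * g
    n = np.linalg.norm(x)
    return x * (max_norm / n) if n >= max_norm else x

# pull x toward y by descending the squared-distance loss
x, y = np.array([0.5, 0.0]), np.array([-0.3, 0.4])
for _ in range(200):
    x = rsgd_step(x, num_grad(x, y))
```

In a real pipeline the loop runs over sampled pairs with target distances, but the update rule is the same.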

6. Empirical Performance and Applications

Empirical evaluation of these embeddings on hierarchical data structures, such as WordNet, demonstrates that combinatorial embeddings in two-dimensional hyperbolic space can achieve a mean average precision (MAP) of 0.989, compared to 0.87 for a 200-dimensional embedding by Nickel et al. This stark empirical superiority, coupled with precise theoretical bounds, underscores the value of hyperbolic space for representing taxonomies, type hierarchies, and other data with latent hierarchical structure. A plausible implication is that for a broad class of applications such as natural language processing or knowledge graph representation, hyperbolic embeddings offer a unique and efficient tool for high-fidelity, low-dimensional geometric representation (Sa et al., 2018).

References

1. Sala, F., De Sa, C., Gu, A., and Ré, C. (2018). "Representation Tradeoffs for Hyperbolic Embeddings." In Proceedings of the 35th International Conference on Machine Learning (ICML).
