Riemannian Autoencoder (RAE) Overview
- Riemannian Autoencoders are latent variable models that embed data-driven Riemannian geometry into the latent space to enhance generative and manifold modeling tasks.
- They replace standard Euclidean assumptions with metrics induced by decoder Jacobians or score-based flows, enabling meaningful geodesic computations and robust interpolation.
- Practical applications include improved density estimation, motion planning, and manifold recovery, with strong theoretical guarantees underlying their geometric structure.
A Riemannian Autoencoder (RAE) is a latent variable model that explicitly equips the latent space with a data-driven Riemannian geometry, yielding principled generative, representation, and manifold modeling capabilities. RAE frameworks systematically replace the standard Euclidean assumptions of classical VAEs with geometric structures induced either by the decoder Jacobian (pullback metric), anisotropic score-based flows, or extrinsic manifold constraints. This approach enables meaningful geodesic computations on latent codes, robust regularization for manifold learning, and improved interpolation and density modeling.
1. Riemannian Latent Space Structure
The defining feature of RAE models is the construction of a Riemannian metric on the latent space, typically denoted $G(z)$, which is induced via a smooth immersion realized by the decoder network. For a generative map $f(z) = \big(\mu(z), \sigma(z)\big)$ with diagonal covariance, the latent metric is given by

$$G(z) = J_\mu(z)^\top J_\mu(z) + J_\sigma(z)^\top J_\sigma(z),$$

where $J_\mu(z)$, $J_\sigma(z)$ are the Jacobians of the mean and variance decoder branches, respectively (Kalatzis et al., 2020). This metric yields, for tangent vectors $u, v \in T_z\mathcal{Z}$,

$$\langle u, v \rangle_z = u^\top G(z)\, v,$$

and induces local volume elements $\sqrt{\det G(z)}\,\mathrm{d}z$, geodesic lengths, and logarithm/exponential maps on the latent manifold.
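As a concrete illustration, the pullback metric can be assembled directly from decoder Jacobians. The sketch below uses hypothetical toy mean/variance branches (not from any cited paper) and finite-difference Jacobians; in a real RAE the branches would be neural networks and the Jacobians would come from autodiff.

```python
import numpy as np

def numerical_jacobian(f, z, eps=1e-6):
    """Central finite-difference Jacobian of f: R^d -> R^D at z."""
    d = len(z)
    out = f(z)
    J = np.zeros((len(out), d))
    for i in range(d):
        dz = np.zeros(d)
        dz[i] = eps
        J[:, i] = (f(z + dz) - f(z - dz)) / (2 * eps)
    return J

# Toy decoder branches (illustrative): latent R^2 -> data R^3.
def mu(z):      # mean branch
    return np.array([z[0], z[1], z[0]**2 + z[1]**2])

def sigma(z):   # diagonal std-dev branch
    return 0.1 * np.array([1.0 + z[0]**2, 1.0, 1.0 + z[1]**2])

def pullback_metric(z):
    """G(z) = J_mu(z)^T J_mu(z) + J_sigma(z)^T J_sigma(z)."""
    J_mu = numerical_jacobian(mu, z)
    J_sig = numerical_jacobian(sigma, z)
    return J_mu.T @ J_mu + J_sig.T @ J_sig

z = np.array([0.5, -0.3])
G = pullback_metric(z)
# G is symmetric positive definite, hence a valid Riemannian metric tensor.
assert np.allclose(G, G.T) and np.all(np.linalg.eigvalsh(G) > 0)
```

Because `mu` here is a smooth immersion (its Jacobian contains an identity block), positive definiteness of `G` is guaranteed even where the variance branch is flat.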
In alternative constructions, the metric may be learned via score-based pullback formulations, e.g.

$$g_z(u, v) = \big(J_\varphi(z)\,u\big)^\top \Sigma^{-1} \big(J_\varphi(z)\,v\big),$$

where $\varphi$ is a learned normalizing flow (a diffeomorphism) and $\Sigma$ is an anisotropic covariance (Diepeveen et al., 2024). This generalizes the RAE approach to a broad family of flow-based models and enables closed-form geodesic computation.
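A minimal numerical sketch of this construction, assuming a hand-written invertible map as a stand-in for a trained flow (all names here are illustrative): the metric is the pullback of the $\Sigma^{-1}$-weighted Euclidean inner product through $\varphi$, so metric norms agree with Euclidean norms of whitened flow-coordinate velocities.

```python
import numpy as np

# Hypothetical smooth invertible "flow" on R^2 (stand-in for a learned normalizing flow).
def phi(z):
    return np.array([z[0], z[1] + np.tanh(z[0])])

def jac_phi(z):
    return np.array([[1.0, 0.0],
                     [1.0 - np.tanh(z[0])**2, 1.0]])

Sigma_inv = np.diag([1.0, 4.0])  # inverse of an (assumed) anisotropic covariance

def metric(z):
    """Pullback metric G(z) = J_phi(z)^T Sigma^{-1} J_phi(z)."""
    J = jac_phi(z)
    return J.T @ Sigma_inv @ J

def inner(z, u, v):
    """g_z(u, v) = u^T G(z) v."""
    return u @ metric(z) @ v

z = np.array([0.3, -1.0])
u = np.array([1.0, 0.5])
# Metric squared norm of u == Euclidean squared norm of Sigma^{-1/2} J_phi(z) u.
lhs = inner(z, u, u)
rhs = np.sum((np.sqrt(np.diag(Sigma_inv)) * (jac_phi(z) @ u))**2)
assert np.isclose(lhs, rhs)
```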
2. Variational Objectives and Priors
RAEs modify the standard VAE evidence lower bound (ELBO) to account for the Riemannian geometry. Notably, Kalatzis et al. (Kalatzis et al., 2020) introduce a Riemannian Brownian motion (heat kernel) prior of the approximate form

$$p(z \mid \mu, t) \approx (2\pi t)^{-d/2}\,\rho(z)^{-1/2}\exp\!\left(-\frac{d_G(z, \mu)^2}{2t}\right),$$

where $\rho(z)$ is a volume factor ratio, $t$ is the scalar diffusion time (variance), $d_G(z, \mu)$ is the geodesic distance under the latent metric, and the density is defined on the manifold without intractable normalization constants. Both the prior and the variational family are heat-kernel forms, with centers and diffusion times output by the encoder.
The resulting ELBO is

$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big),$$

with the KL divergence estimated via Monte Carlo over explicit forms for log-densities involving geodesic distances and local volume elements.
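The Monte Carlo KL estimator only needs pointwise log-densities, which is why it extends to heat-kernel forms whose normalizers are intractable. A minimal sketch of the estimator, using 1-D Gaussians as stand-ins (so the estimate can be checked against a closed form):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in densities: in an RAE, log q and log p would involve geodesic
# distances and volume elements; here both are 1-D Gaussians for checkability.
mu_q, s_q = 0.0, 1.0
mu_p, s_p = 1.0, 2.0

def log_gauss(x, mu, s):
    return -0.5 * np.log(2 * np.pi * s**2) - (x - mu)**2 / (2 * s**2)

# KL(q || p) ~= E_{z ~ q}[log q(z) - log p(z)], estimated by sampling from q.
z = rng.normal(mu_q, s_q, size=200_000)
kl_mc = np.mean(log_gauss(z, mu_q, s_q) - log_gauss(z, mu_p, s_p))

# Gaussian closed form, used here only to sanity-check the estimator.
kl_exact = np.log(s_p / s_q) + (s_q**2 + (mu_q - mu_p)**2) / (2 * s_p**2) - 0.5
assert abs(kl_mc - kl_exact) < 1e-2
```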
Other RAEs may append regularization terms enforcing pullback metric concentration (e.g. Frobenius norm penalties (Diepeveen et al., 26 Jan 2026)) or isometry (low distortion/bending (Braunsmann et al., 2022)), and in some cases integrate the metric into the generative process via score-based flows or SPD manifold constraints.
3. Geodesic Computation and Interpolation
RAEs admit principled geodesic computations on the data-adaptive latent manifold by leveraging the explicit metric. Given two points $z_0, z_1 \in \mathcal{Z}$, the minimizing path $\gamma : [0, 1] \to \mathcal{Z}$ with $\gamma(0) = z_0$, $\gamma(1) = z_1$ extremizes the energy functional

$$E[\gamma] = \frac{1}{2}\int_0^1 \dot{\gamma}(t)^\top G(\gamma(t))\,\dot{\gamma}(t)\,\mathrm{d}t.$$

The Euler–Lagrange equations yield a second-order ODE involving the Christoffel symbols of $G$. In practice, most RAEs discretize latent space, construct weighted graphs (edge weights via metric-averaged distances), and apply Dijkstra/A* algorithms for shortest paths, optionally refining solutions with cubic splines or neural curve fitting (Beik-Mohammadi et al., 2021, Shamsolmoali et al., 2023).
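The graph-based procedure can be sketched in a few lines. The metric below is a hypothetical toy tensor (inflating lengths away from the origin), and the grid plus 8-neighbor edges are a minimal stand-in for the discretizations used in practice; edge weights average the metric at the two endpoints.

```python
import heapq
import numpy as np

def metric(z):
    """Toy latent metric (assumed): distances expand away from the origin."""
    return (1.0 + 4.0 * z @ z) * np.eye(2)

def edge_weight(a, b):
    """Approximate geodesic length of segment a->b under the averaged metric."""
    G = 0.5 * (metric(a) + metric(b))
    d = b - a
    return np.sqrt(d @ G @ d)

# Grid graph over [-1, 1]^2 with 8-connected neighbors.
n = 21
xs = np.linspace(-1, 1, n)
def pos(u):
    return np.array([xs[u[0]], xs[u[1]]])

def neighbors(u):
    i, j = u
    for di, dj in [(1,0), (-1,0), (0,1), (0,-1), (1,1), (-1,-1), (1,-1), (-1,1)]:
        if 0 <= i + di < n and 0 <= j + dj < n:
            yield (i + di, j + dj)

def dijkstra(src, dst):
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, np.inf):
            continue
        for v in neighbors(u):
            nd = d + edge_weight(pos(u), pos(v))
            if nd < dist.get(v, np.inf):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return np.inf

geo = dijkstra((0, 0), (n - 1, n - 1))                 # corner (-1,-1) to (1,1)
straight = edge_weight(pos((0, 0)), pos((n - 1, n - 1)))
# The graph geodesic routes through the low-metric region near the origin,
# so it is much shorter than the single metric-weighted chord.
assert geo <= straight
```

A refinement step (cubic splines or a neural curve) would then smooth the polyline returned by the graph search.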
In score-based pullback models, geodesics take closed form: for quadratic log-densities and diffeomorphic flows $\varphi$,

$$\gamma(t) = \varphi^{-1}\big((1 - t)\,\varphi(z_0) + t\,\varphi(z_1)\big),$$

yielding straight-line paths in flow coordinates (Diepeveen et al., 2024).
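This closed form is easy to exercise: a minimal sketch, assuming a hand-written invertible map in place of a trained flow. Endpoints are recovered exactly, while intermediate points generally differ from Euclidean interpolation because the path is straight only in flow coordinates.

```python
import numpy as np

# Hypothetical invertible flow and its exact inverse (stand-in for a trained flow).
def phi(z):
    return np.array([z[0], z[1] - z[0]**3])

def phi_inv(w):
    return np.array([w[0], w[1] + w[0]**3])

def geodesic(z0, z1, t):
    """Closed-form geodesic: straight line in flow coordinates, mapped back."""
    w = (1 - t) * phi(z0) + t * phi(z1)
    return phi_inv(w)

z0 = np.array([0.0, 0.0])
z1 = np.array([1.0, 0.0])
assert np.allclose(geodesic(z0, z1, 0.0), z0)
assert np.allclose(geodesic(z0, z1, 1.0), z1)
mid = geodesic(z0, z1, 0.5)   # bends away from the Euclidean midpoint (0.5, 0)
```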
Interpolation networks further regularize the geodesic curve (insertion, curvature, length) and outperform linear mixing, maintaining path fidelity in high-density regions (Shamsolmoali et al., 2023).
4. Encoder-Decoder Architectures and Training
RAEs employ encoder networks mapping data into latent representations, with either Gaussian or heat-kernel variational families. Decoders are neural immersions from latent codes to data space, with explicit branching for mean and variance (and, in pose models, quaternion orientation (Beik-Mohammadi et al., 2021)). In SPD-manifold learning (e.g., DreamNet), encoder–decoder blocks are constructed as BiMap (SPD → SPD), ReEig (eigenvalue rectification), and LogEig (SPD logarithm), stacked with residual connections (Wang et al., 2022).
Training objectives combine standard reconstruction loss (and likelihood if variational) with geometric regularization:
- Pullback metric concentration penalties (Frobenius/low-rank norm (Diepeveen et al., 26 Jan 2026))
- Isometry/distortion and curvature penalties on random pairs (low-bending (Braunsmann et al., 2022))
- KL-divergence to the Riemannian Brownian motion prior (Kalatzis et al., 2020)
- Explicit classifier branches for supervised metric learning (DreamNet (Wang et al., 2022))
Optimization proceeds by stochastic gradient methods (Adam, Riemannian SGD for orthogonal weights), backpropagating through geodesic solvers, volume elements, and determinant computations.
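How such composite objectives fit together can be shown with a deliberately tiny example: a linear toy decoder (so its pullback metric is the constant matrix $W^\top W$) with a reconstruction term plus a Frobenius isometry penalty. All names and the least-squares "encoder" are illustrative, not from the cited papers; real RAEs optimize neural networks with autodiff.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear decoder W: R^2 -> R^4; its pullback metric is simply W^T W.
W = rng.normal(size=(4, 2))
X = rng.normal(size=(32, 4))          # a batch of synthetic "data"
Z = X @ np.linalg.pinv(W).T           # toy "encoder": least-squares latent codes

def loss(W, X, Z, lam=0.1):
    recon = np.mean(np.sum((X - Z @ W.T)**2, axis=1))   # reconstruction term
    G = W.T @ W                                          # pullback metric (constant here)
    iso = np.sum((G - np.eye(2))**2)                     # Frobenius isometry penalty
    return recon + lam * iso

total = loss(W, X, Z)
# The regularizer is nonnegative, so it can only add to the reconstruction loss.
assert loss(W, X, Z, lam=0.0) <= total
```

With `lam` swept from 0 upward, the optimum trades reconstruction fidelity against flatness of the latent geometry, mirroring the regularized objectives above.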
5. Practical Applications and Model Capacity
RAEs have demonstrated marked improvements in generative modeling, motion planning, and manifold learning:
- Riemannian Brownian motion priors yield improved ELBO values and more faithful samples, even with a single additional scalar parameter controlling heat-kernel width; higher F₁ scores are obtained in low-dimensional classification (Kalatzis et al., 2020).
- Geodesic planning enables robot trajectories that naturally avoid obstacles, interpolate between multiple demonstrated modes, and adapt online via ambient metric reshaping without retraining; the underlying latent geometry is crucial for multibranch skills and reactive motion (Beik-Mohammadi et al., 2021, Beik-Mohammadi et al., 2022).
- Flow-based RAEs recover smooth, low-dimensional manifold charts even from corrupted data, with bi-Lipschitz guarantees and sharp generative priors suitable for inverse problems (Diepeveen et al., 26 Jan 2026).
- In SPD-manifold learning, stacked RAEs with residual block identity constraints prevent degradation with depth and yield improved generalization and accuracy in visual classification benchmarks (Wang et al., 2022).
- Statistical latent modeling via RDA-INR achieves robust and resolution-independent mean-variance analysis in LDDMM shape spaces (Dummer et al., 2023).
6. Theoretical Guarantees and Geometric Insights
Recent work establishes explicit bi-Lipschitz guarantees for decoder mappings and error bounds for intrinsic dimension estimation (Diepeveen et al., 26 Jan 2026, Diepeveen et al., 2024). Score-based pullback RAEs can extract global charts, compute closed-form geodesics, and accurately estimate data manifold geometry, supported by isometry regularization during flow training.
Isometric regularization and extrinsic flatness constraints yield latent mappings where Euclidean distances closely approximate geodesic distances on the data manifold, substantially aiding clustering, interpolation, and downstream analysis (Braunsmann et al., 2022).
Riemannian geodesic interpolation supports semantically meaningful transitions and realistic sample generation, mitigating the mode collapse and interpolation artifacts endemic to classic VAEs (Chadebec et al., 2020, Shamsolmoali et al., 2023).
7. Experimental Evaluation and Empirical Impact
RAEs consistently outperform Euclidean VAEs and classical linear/geodesic subspace methods across diverse tasks:
| Model/Task | Capacity/Accuracy Gain | Key Geometric Features |
|---|---|---|
| R-VAE w/ BM Prior (Kalatzis et al., 2020) | Improved ELBO, higher F₁ in low dims | Prior mass concentrated on data manifold |
| VTAE (Shamsolmoali et al., 2023) | Bits-per-dim, recon PSNR, user preference | Geodesic interp, STF, metric flatness |
| AmbientFlow RAE (Diepeveen et al., 26 Jan 2026) | Manifold recovered from corrupted data | Bi-Lipschitz chart, flow pullback geometry |
| DreamNet SPD SRAE (Wang et al., 2022) | Generalization at increased depth | SPD/affine-invariant metric, residual blocks |
| RDA-INR (Dummer et al., 2023) | Robust resolution-indep. recon | LDDMM Riemannian metric, INR parameterization |
| RHVAE (Chadebec et al., 2020) | Clustering/interpolation, low-data regime | Hamiltonian flow, learned metric |
Empirical results demonstrate that incorporating geometric structure fundamentally enhances modeling capacity—manifested in lower negative log-likelihood, improved interpolation fidelity, and more reliable clustering—even when explicit curvature constraints are relaxed or the regularizer is low-rank (Kalatzis et al., 2020, Diepeveen et al., 26 Jan 2026, Braunsmann et al., 2022).
8. Extensions: Manifold-Valued Data and Implicit Manifolds
RAEs have been generalized to manifold-valued data, including SPD matrix learning, pose/motion manifolds, and diffeomorphism groups (Miolane et al., 2019, Hartwig et al., 10 Oct 2025, Dummer et al., 2023). Weighted Riemannian submanifold frameworks evaluate submanifold-to-submanifold error (e.g., via Wasserstein distances) and establish latent codes respecting the underlying manifold geometry (Miolane et al., 2019).
Implicit manifold methods define latent manifolds via approximate projection operators and denoising networks, supporting robust geodesic shooting and exponential maps independent of the underlying autoencoder (Hartwig et al., 10 Oct 2025).
In summary, the Riemannian Autoencoder paradigm subsumes classical Euclidean latent modeling, integrating explicit Riemannian metrics into the autoencoding framework, and facilitating manifold-consistent probabilistic inference, generative modeling, and geometric skill generation with strong theoretical and empirical guarantees.