Anisotropic Heatmap Regression

Updated 11 February 2026

Anisotropic Heatmap Regression is a framework that approximates spatially varying functions by summing anisotropic multivariate splats, enhancing local adaptivity and interpretability.
The method employs parameterized anisotropic Gaussian bumps with positive-definite covariance matrices, optimized via Wasserstein–Fisher–Rao gradient flows for precise function fitting.
Reported results on 1D and 2D tasks show lower error than baselines such as Haar-wavelet interpolation, multilayer perceptrons, and Kolmogorov–Arnold networks, while the fitted splats remain geometrically interpretable.

Anisotropic heatmap regression is a regression framework in which the target is a spatially varying function (commonly a heatmap) and the predictor is modeled as a sum of anisotropic multivariate “splats”: parametric bump functions such as anisotropic Gaussians, each with its own orientation and scale controlled by a positive-definite covariance matrix. This approach, formalized in the Splat Regression Model (SRM) and optimized via gradient flows in the Wasserstein–Fisher–Rao (WFR) metric geometry, yields locally adaptive, interpretable, and highly expressive function approximations that are particularly effective in low-dimensional settings (Daniels et al., 18 Nov 2025).

1. Anisotropic Splat Primitives

Let $x \in \mathbb{R}^d$ denote a point of the input domain. The basic primitive is the anisotropic splat function $\varphi(x;\mu,\Sigma)$, defined as the push-forward of an isotropic “mother” density $\rho(z)$ (often the standard Gaussian) under the affine map $z \mapsto \Sigma^{1/2}z + \mu$:

$\varphi(x;\mu,\Sigma) = |\det \Sigma|^{-1/2} \,\rho\big(\Sigma^{-1/2}(x-\mu)\big)$

For the Gaussian mother density,

$\rho(z) = (2\pi)^{-d/2} \exp(-\|z\|^2/2)$

leading to the explicit multivariate Gaussian form

$\varphi(x;\mu,\Sigma) = (2\pi)^{-d/2}|\Sigma|^{-1/2}\exp\big(-\tfrac12(x-\mu)^\top \Sigma^{-1}(x-\mu)\big)$

Here, $\mu \in \mathbb{R}^d$ is the center and $\Sigma \in \mathbb{S}_+^d$ is a positive-definite covariance matrix encoding both local scale (via its eigenvalues) and orientation (via its eigenvectors). Writing $\Sigma = A A^\top$ for a full-rank $A$, or using the spectral decomposition $\Sigma = R \Lambda R^\top$ with $R \in SO(d)$ and a diagonal $\Lambda$ with positive entries, lets each splat adapt locally to anisotropy.
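The explicit Gaussian form can be evaluated directly in 2D. The following is a minimal sketch using only the Python standard library; the function names `covariance_from_angle` and `gaussian_splat`, and the storage of $\Sigma$ as the triple of its distinct entries, are illustrative assumptions rather than anything prescribed by the SRM paper.

```python
import math

def covariance_from_angle(theta, lam1, lam2):
    """Entries (s11, s12, s22) of Sigma = R diag(lam1, lam2) R^T for a 2D rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    s11 = lam1 * c * c + lam2 * s * s
    s12 = (lam1 - lam2) * c * s
    s22 = lam1 * s * s + lam2 * c * c
    return s11, s12, s22

def gaussian_splat(x, mu, sigma):
    """Evaluate the 2D anisotropic Gaussian splat phi(x; mu, Sigma)."""
    s11, s12, s22 = sigma
    det = s11 * s22 - s12 * s12
    if s11 <= 0 or det <= 0:
        raise ValueError("Sigma must be positive definite")
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^T Sigma^{-1} (x - mu) via the closed-form 2x2 inverse
    quad = (s22 * dx * dx - 2.0 * s12 * dx * dy + s11 * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

# A splat stretched along the 45-degree direction (illustrative values).
mu = (0.5, 0.5)
theta = math.pi / 4
sigma = covariance_from_angle(theta, 0.04, 0.0025)
along = (mu[0] + 0.1 * math.cos(theta), mu[1] + 0.1 * math.sin(theta))
across = (mu[0] - 0.1 * math.sin(theta), mu[1] + 0.1 * math.cos(theta))
print(gaussian_splat(along, mu, sigma), gaussian_splat(across, mu, sigma))
```

At equal distance from the center, the value along the major eigenvector is larger than the value across it, which is the anisotropy that the eigenvalues of $\Sigma$ encode.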

2. Splat Regression Model Architecture

The Splat Regression Model approximates a target mapping $f^\star\!:\mathbb{R}^d \to \mathbb{R}^p$ by forming a finite mixture of $k$ anisotropic splats,

$f(x) = \sum_{i=1}^k w_i\,\varphi(x;\mu_i,\Sigma_i)$

with

$w_i \in \mathbb{R}^p$ : amplitude (output weight) vector for the $i$ -th splat,
$\mu_i \in \mathbb{R}^d$ : center,
$\Sigma_i \in \mathbb{S}_+^d$ : anisotropy matrix.

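Each splat carries $p$ parameters in $w_i$, $d$ in $\mu_i$, and $d^2$ in an unconstrained factor $A_i$ with $\Sigma_i = A_iA_i^\top$ (or $d(d{+}1)/2$ for a symmetric parameterization), for a total of $k\,(p+d+d^2)$ parameters in the unconstrained case.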
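As an illustration of the model definition and the parameter count, here is a minimal 2D sketch in plain Python. The names `splat`, `srm_forward`, and `param_count`, and the representation of each splat as a `(w, mu, A)` tuple with $\Sigma = AA^\top$, are assumptions made for this example.

```python
import math

def splat(x, mu, A):
    """2D Gaussian splat with Sigma = A A^T; A is ((a11, a12), (a21, a22))."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # z = A^{-1} (x - mu), using the closed-form 2x2 inverse
    z0 = (a22 * dx - a12 * dy) / det
    z1 = (-a21 * dx + a11 * dy) / det
    # |det Sigma|^{-1/2} equals 1 / |det A|
    return math.exp(-0.5 * (z0 * z0 + z1 * z1)) / (2.0 * math.pi * abs(det))

def srm_forward(x, splats):
    """f(x) = sum_i w_i * phi(x; mu_i, Sigma_i); each w_i is a length-p list."""
    p = len(splats[0][0])
    out = [0.0] * p
    for w, mu, A in splats:
        phi = splat(x, mu, A)
        for j in range(p):
            out[j] += w[j] * phi
    return out

def param_count(k, d, p, symmetric=False):
    """Total parameters for k splats: p + d + d^2 each, or p + d + d(d+1)/2 if symmetric."""
    per_splat = p + d + (d * (d + 1) // 2 if symmetric else d * d)
    return k * per_splat

# Two splats with p = 2 outputs (illustrative values).
model = [
    ([2.0, -1.0], (0.3, 0.3), ((0.20, 0.00), (0.00, 0.05))),
    ([0.5, 0.5], (0.7, 0.6), ((0.10, 0.05), (0.00, 0.10))),
]
print(srm_forward((0.4, 0.35), model))
print(param_count(k=100, d=2, p=1))
```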

3. Parameterization and Anisotropy Constraints

Positive-definiteness of $\Sigma_i$ is enforced via parameterizations such as:

$A_i$ unconstrained full-rank (yielding $\Sigma_i=A_iA_i^\top$ ),
Cholesky decomposition: $\Sigma_i=L_iL_i^\top$ (with $L_i$ lower-triangular),
Eigen-decomposition: $\Sigma_i=R_i\operatorname{diag}(\lambda_{i1},\ldots,\lambda_{id})R_i^\top$ with all $\lambda_{ij}>0$ .

In the eigen-parameterization, unconstrained log-eigenvalues set the scale, while rotation matrices $R_i \in SO(d)$ (parameterizable via the Lie algebra of skew-symmetric matrices) encode orientation. This structure enables each splat to locally stretch and align with elongated features in the underlying data.
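The Cholesky and eigen-decomposition parameterizations can be sketched in 2D as maps from unconstrained real numbers to a positive-definite $\Sigma$. The exponential on the Cholesky diagonal is a common choice assumed here for illustration (the source only requires $L_i$ to be lower-triangular), and all function names are hypothetical.

```python
import math

def sigma_from_cholesky(l11_raw, l21, l22_raw):
    """Sigma = L L^T with L = [[l11, 0], [l21, l22]]; exp keeps the diagonal positive."""
    l11, l22 = math.exp(l11_raw), math.exp(l22_raw)
    return ((l11 * l11, l11 * l21),
            (l11 * l21, l21 * l21 + l22 * l22))

def sigma_from_eigen(theta, log_lam1, log_lam2):
    """Sigma = R diag(lam1, lam2) R^T with R a rotation by theta and lam = exp(log_lam)."""
    lam1, lam2 = math.exp(log_lam1), math.exp(log_lam2)
    c, s = math.cos(theta), math.sin(theta)
    s11 = lam1 * c * c + lam2 * s * s
    s12 = (lam1 - lam2) * c * s
    s22 = lam1 * s * s + lam2 * c * c
    return ((s11, s12), (s12, s22))

def is_positive_definite(S):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return S[0][0] > 0 and S[0][0] * S[1][1] - S[0][1] * S[1][0] > 0

# Any real inputs give a valid covariance, so an optimizer can move freely.
S_chol = sigma_from_cholesky(-1.0, 3.0, 0.5)
S_eig = sigma_from_eigen(0.7, -2.0, 1.0)
print(is_positive_definite(S_chol), is_positive_definite(S_eig))
```

In 2D a single angle parameterizes $SO(2)$; in higher dimensions the rotation would instead be built from a skew-symmetric matrix, which this sketch does not cover.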

4. Wasserstein–Fisher–Rao Gradient Flow Optimization

Optimization proceeds in the non-Euclidean space of mixing measures, lifting the splat parameters $\{w_i,\mu_i,\Sigma_i\}_{i=1}^k$ to an atomic measure

$\mu\in \mathcal{P}(\mathbb{R}^p\times \rho(\mathbb{R}^d))$

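with each atom $(w_i, \rho_{A_i, b_i})$, where $\rho_{A_i, b_i}$ is the splat with center $b_i$ (written $\mu_i$ in Section 2) and factor $A_i$ such that $\Sigma_i = A_iA_i^\top$; from here on, $\mu$ denotes the mixing measure rather than a center. The population loss,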

$F(f_\mu) = \mathbb{E}_{x\sim \pi}\big[L(f_\mu(x), y(x))\big]$

(where $L$ is, e.g., the squared error), is minimized via the Wasserstein–Fisher–Rao gradient flow, which decomposes tangent directions into a mass-teleportation (Fisher–Rao) component and a transport (Wasserstein) component. The gradients are:

Fisher–Rao (mass) gradient:

$\delta^{FR}F(\mu)(w_i,A_i,b_i) = \int \langle \delta F(x),w_i \rangle\,\rho_{A_i, b_i}(x)\, \pi(dx) - \mathbb{E}_j \int \langle \delta F(x),w_j \rangle\,\rho_{A_j, b_j}(x)\, \pi(dx)$

Wasserstein (parameter) gradients:

$\begin{aligned} \partial_{w}F &= \int \delta F(x)\, \rho_{A_i, b_i}(x)\, \pi(dx) \\ \partial_{A}F &= -\int \langle \delta F(x), w_i \rangle \left[ I + \nabla_x \log \rho_{A_i, b_i}(x)\, (x{-}b_i)^\top \right] A_i^{-\top}\, \rho_{A_i, b_i}(x)\, \pi(dx) \\ \partial_{b}F &= -\int \langle \delta F(x), w_i \rangle \nabla_x \log \rho_{A_i, b_i}(x)\, \rho_{A_i, b_i}(x)\, \pi(dx) \end{aligned}$

where $\delta F(x) = \partial L / \partial f$ at $f_\mu$ . In practice, stochastic gradient steps or particle birth-death schemes are applied over minibatches of $x \sim \pi$ .
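A quick way to sanity-check parameter gradients of this kind is to compare them with finite differences. The sketch below does this for a single 1D Gaussian splat ($k=p=d=1$) under a mean squared-error loss, with the analytic gradients obtained by differentiating $\rho_{a,b}(x) = |a|^{-1}\rho((x-b)/a)$ directly. The sample points, target values, and function names are illustrative assumptions.

```python
import math

def splat_1d(x, a, b):
    """1D Gaussian splat rho_{a,b}(x) = |a|^{-1} rho((x - b) / a)."""
    z = (x - b) / a
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * abs(a))

def loss(params, xs, ys):
    """Mean of 0.5 * (w * rho_{a,b}(x) - y)^2 over the sample."""
    w, a, b = params
    return sum(0.5 * (w * splat_1d(x, a, b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def analytic_grads(params, xs, ys):
    """Gradients of the loss with respect to (w, a, b)."""
    w, a, b = params
    gw = ga = gb = 0.0
    for x, y in zip(xs, ys):
        phi = splat_1d(x, a, b)
        delta = w * phi - y              # delta F(x) for squared error
        score = -(x - b) / (a * a)       # d/dx log rho_{a,b}(x)
        gw += delta * phi
        ga += -delta * w * (1.0 + score * (x - b)) / a * phi
        gb += -delta * w * score * phi
    n = len(xs)
    return gw / n, ga / n, gb / n

def finite_diff_grads(params, xs, ys, eps=1e-6):
    """Central finite differences, one parameter at a time."""
    grads = []
    for i in range(3):
        hi, lo = list(params), list(params)
        hi[i] += eps
        lo[i] -= eps
        grads.append((loss(hi, xs, ys) - loss(lo, xs, ys)) / (2.0 * eps))
    return tuple(grads)

xs = [i / 20 for i in range(21)]
ys = [math.sin(3.0 * x) for x in xs]
params = (0.5, 0.2, 0.4)  # (w, a, b)
print(analytic_grads(params, xs, ys))
print(finite_diff_grads(params, xs, ys))
```

The two printed triples should agree to several digits; a mismatch in one component usually points to a sign or factor error in the corresponding derivative.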

5. Workflow for Anisotropic Heatmap Regression

When the regression target is a heatmap $y:[0,1]^2\to \mathbb{R}$ sampled on a 2D grid, the workflow comprises:

a) Select a mother splat $\rho$ (typically the 2D standard Gaussian).
b) Initialize $k$ splats as small isotropic Gaussians on a grid of centers $b_i$, with initial $w_i\approx 0$ and $A_i\approx \alpha I$.
c) Specify the loss: $F=\tfrac12\sum_{x \in \mathrm{grid}}(f_\mu(x) - y(x))^2$.
d) Compute the error $\delta F(x)=f_\mu(x)-y(x)$ and form the above gradients via Monte-Carlo minibatching.
e) Update $\{w_i, b_i, A_i\}$ via Adam or SGD on the combined WFR gradients.
f) Read off the learned covariance $\Sigma_i=A_iA_i^\top$, which encodes local anisotropic scaling: its large eigenvalues elongate the splat along the corresponding eigenvectors, aligning it with elongated heatmap features.
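Steps (a) to (e) can be sketched end to end on a small grid. This is a simplified illustration, not the authors' implementation: it uses full-batch gradient descent with a backtracking step size in place of Adam or minibatched SGD, applies only the parameter (Wasserstein) gradients with no birth-death step, and fits a synthetic diagonal ridge chosen here as the target. All names and constants are assumptions.

```python
import math

def splat(x, A, b):
    """Return (phi, z) for a 2D Gaussian splat with Sigma = A A^T, z = A^{-1}(x - b)."""
    a11, a12, a21, a22 = A
    det = a11 * a22 - a12 * a21
    dx, dy = x[0] - b[0], x[1] - b[1]
    z0 = (a22 * dx - a12 * dy) / det
    z1 = (-a21 * dx + a11 * dy) / det
    phi = math.exp(-0.5 * (z0 * z0 + z1 * z1)) / (2.0 * math.pi * abs(det))
    return phi, (z0, z1)

# Each splat is stored flat as [w, b0, b1, a11, a12, a21, a22].
def predict(splats, x):
    return sum(s[0] * splat(x, s[3:7], s[1:3])[0] for s in splats)

def loss(splats, grid, ys):
    # Step (c): F = 0.5 * sum over grid points of (f(x) - y(x))^2
    return 0.5 * sum((predict(splats, x) - y) ** 2 for x, y in zip(grid, ys))

def grads(splats, grid, ys):
    out = [[0.0] * 7 for _ in splats]
    for x, y in zip(grid, ys):
        delta = predict(splats, x) - y  # step (d): delta F(x) = f(x) - y(x)
        for s, g in zip(splats, out):
            w, (a11, a12, a21, a22) = s[0], s[3:7]
            det = a11 * a22 - a12 * a21
            phi, (z0, z1) = splat(x, s[3:7], s[1:3])
            c = delta * w * phi / det
            m00, m01, m11 = z0 * z0 - 1.0, z0 * z1, z1 * z1 - 1.0
            g[0] += delta * phi                    # dF/dw
            g[1] += c * (a22 * z0 - a21 * z1)      # dF/db = delta * w * phi * A^{-T} z
            g[2] += c * (-a12 * z0 + a11 * z1)
            g[3] += c * (a22 * m00 - a21 * m01)    # dF/dA = delta * w * phi * A^{-T} (z z^T - I)
            g[4] += c * (a22 * m01 - a21 * m11)
            g[5] += c * (-a12 * m00 + a11 * m01)
            g[6] += c * (-a12 * m01 + a11 * m11)
    return out

def fit(splats, grid, ys, steps=60, lr=1e-3):
    """Step (e), simplified: gradient descent that halves lr until the loss decreases."""
    cur = loss(splats, grid, ys)
    for _ in range(steps):
        g = grads(splats, grid, ys)
        while lr > 1e-12:
            trial = [[p - lr * gp for p, gp in zip(s, gs)] for s, gs in zip(splats, g)]
            ok = all(abs(t[3] * t[6] - t[4] * t[5]) > 1e-6 for t in trial)  # keep A invertible
            new = loss(trial, grid, ys) if ok else float("inf")
            if new < cur:
                splats, cur, lr = trial, new, lr * 1.5
                break
            lr *= 0.5
    return splats, cur

def target(x):
    """Synthetic heatmap: a ridge elongated along the diagonal of the unit square."""
    u, v = x[0] - x[1], x[0] + x[1] - 1.0
    return math.exp(-u * u / (2.0 * 0.1 ** 2) - v * v / (2.0 * 0.4 ** 2))

n = 12
grid = [(i / (n - 1), j / (n - 1)) for i in range(n) for j in range(n)]
ys = [target(x) for x in grid]
alpha = 0.15  # step (b): small isotropic splats on a 2x2 grid of centers, w = 0
init = [[0.0, bx, by, alpha, 0.0, 0.0, alpha] for bx in (0.25, 0.75) for by in (0.25, 0.75)]
initial_loss = loss(init, grid, ys)
fitted, final_loss = fit(init, grid, ys)
print(initial_loss, final_loss)
```

The backtracking rule only guarantees that the loss never increases from one accepted step to the next; a practical fit would use more splats, more iterations, and a held-out set.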

Performance is tracked via held-out mean-squared error, and qualitative assessment is aided by visualizing ellipses $\{(x-b_i)^\top\Sigma_i^{-1}(x-b_i)=c\}$ to examine alignment with heatmap structures.
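For the ellipse visualization in 2D, the semi-axes and orientation of the level set $(x-b_i)^\top\Sigma_i^{-1}(x-b_i)=c$ follow from the eigen-decomposition of $\Sigma_i$, which has a closed form for $2\times 2$ matrices. A minimal sketch, with the function name `ellipse_from_covariance` chosen here for illustration:

```python
import math

def ellipse_from_covariance(sigma, c=1.0):
    """Semi-axes and major-axis angle of {(x-b)^T Sigma^{-1} (x-b) = c} for symmetric 2x2 Sigma."""
    (s11, s12), (_, s22) = sigma
    half_tr = 0.5 * (s11 + s22)
    det = s11 * s22 - s12 * s12
    # Eigenvalues of a symmetric 2x2 matrix: half_tr +/- sqrt(half_tr^2 - det)
    root = math.sqrt(max(half_tr * half_tr - det, 0.0))
    lam_major, lam_minor = half_tr + root, half_tr - root
    # Angle of the eigenvector belonging to the larger eigenvalue
    angle = 0.5 * math.atan2(2.0 * s12, s11 - s22)
    return math.sqrt(c * lam_major), math.sqrt(c * lam_minor), angle

# Sigma with eigenvalues 0.04 and 0.0025, rotated by 45 degrees (illustrative values).
sigma = ((0.02125, 0.01875), (0.01875, 0.02125))
major, minor, angle = ellipse_from_covariance(sigma)
print(major, minor, math.degrees(angle))
```

The returned triple can be passed to any plotting routine that draws an ellipse from a center, two semi-axis lengths, and a rotation angle.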

6. Empirical Performance and Comparative Analysis

Empirical results indicate that anisotropic splat models offer substantial benefits on low-dimensional approximation and regression tasks:

In a 1D multiscale interpolation problem, a $k=30$ splat model learns an adaptive interpolation grid, outperforming Haar-wavelet interpolation and matching Chebyshev methods on nonuniform domains.
On a 2D regression task with $f(x,y)=\sin(3\pi \sqrt{x})\cos(3\pi y)$ and $k=100$ anisotropic splats, models achieve an order of magnitude lower error than comparably sized multilayer perceptrons (MLPs) or Kolmogorov–Arnold networks by leveraging local orientational adaptation.
On physics-informed regression (e.g., Allen–Cahn equation interfaces on $[0,1]^2$ ), anisotropic splat models fit boundary layers and curved interfaces more accurately and with fewer parameters than isotropic radial basis function (RBF) methods or standard physics-informed neural networks (PINNs) (Daniels et al., 18 Nov 2025).

A plausible implication is that the learned anisotropic parameters confer model capacity that remains interpretable and resistant to over-parameterization in low dimensions.

7. Interpretability, Adaptivity, and Applications

Anisotropic heatmap regression via Splat Regression Models yields weighted sums of ellipsoidal bump functions, with learnable centers, amplitudes, and anisotropy matrices. WFR-gradient-based end-to-end learning preserves interpretability: each splat models a localized structure with explicit geometric meaning in $\Sigma_i$ . Visualization of splats as ellipses elucidates how the model aligns and adapts to structured regions of the data, especially in cases exhibiting elongated, curved, or otherwise anisotropic phenomena. This approach enables flexible and accurate solutions to diverse approximation, estimation, and inverse problems where local adaptivity and geometric structure are paramount (Daniels et al., 18 Nov 2025).


References

Daniels et al., Splat Regression Models (18 Nov 2025)
