
Procrustes Tangent Space Analysis

Updated 2 February 2026
  • Procrustes tangent space is a linearized Euclidean approximation of the nonlinear shape manifold, enabling standard linear methods for advanced shape analysis.
  • The approach converts complex alignment and registration challenges into tractable linear problems by leveraging Generalized Procrustes Analysis and dimensionality adjustments.
  • Statistical considerations, including RMSE scaling and spatial autocorrelation, are integrated to optimize computational efficiency and ensure reliable inference.

The Procrustes tangent space is a linearized, Euclidean approximation of the highly nonlinear shape manifold arising from the removal of translation, rotation, and scale in landmark-based shape analysis. Central to geometric morphometrics, statistical shape analysis, and optimal transport, this construction underpins algorithms for shape registration, mean shape computation, and multivariate modeling in both finite- and infinite-dimensional settings. Through first-order approximation of the manifold near a consensus or mean shape, the Procrustes tangent space allows high-dimensional shape variation to be represented and analyzed using standard linear methods, with explicit metric, statistical, and computational constraints (Courtenay, 26 Jan 2026).

1. Geometry of Shape Manifolds and Procrustes Alignment

Let $X_i \in \mathbb{R}^{p \times k}$, for $i = 1, \ldots, n$, denote raw configurations of $p$ homologous landmarks in $k$ dimensions. The removal of translation, scale, and rotation by Generalized Procrustes Analysis (GPA) yields aligned preshapes $X_i^\star$ lying on the nonlinear Kendall shape manifold $\Sigma_{p,k}$. The mean configuration or "consensus shape" $\bar{X}$ is the Fréchet mean of these superimposed samples.

Around $\bar{X}$, a first-order approximation of $\Sigma_{p,k}$ defines the Procrustes tangent space $T_{\bar{X}}\Sigma_{p,k}$, a central construction for approximating nonlinear shape variation by linear methods (Courtenay, 26 Jan 2026).
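The pipeline above (center, rescale, iteratively rotate to a consensus, then take residuals from the consensus as first-order tangent coordinates) can be sketched in a few lines. A minimal illustration assuming numpy only, not a production GPA implementation (e.g., reflections are not handled):

```python
# Minimal GPA sketch: remove translation and scale, then iterate rotation
# alignment to a consensus; tangent coordinates are residuals from the consensus.
import numpy as np

def gpa(configs, n_iter=10):
    """configs: (n, p, k) array of landmark configurations."""
    X = configs - configs.mean(axis=1, keepdims=True)      # remove translation
    X /= np.linalg.norm(X, axis=(1, 2), keepdims=True)     # remove scale (unit centroid size)
    mean = X[0]
    for _ in range(n_iter):
        aligned = []
        for Xi in X:
            U, _, Vt = np.linalg.svd(Xi.T @ mean)          # optimal rotation (Kabsch)
            aligned.append(Xi @ U @ Vt)
        X = np.stack(aligned)
        mean = X.mean(axis=0)
        mean /= np.linalg.norm(mean)                       # keep consensus on the preshape sphere
    return X, mean

rng = np.random.default_rng(0)
base = rng.normal(size=(6, 2))                             # p=6 landmarks, k=2
configs = base + 0.01 * rng.normal(size=(20, 6, 2))        # n=20 noisy copies
aligned, consensus = gpa(configs)
# First-order tangent coordinates: residuals from the consensus shape,
# vectorized into the kp-dimensional ambient space
tangent = (aligned - consensus).reshape(20, -1)
print(tangent.shape)   # (20, 12)
```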

2. Dimension and Degrees of Freedom in the Tangent Space

The dimension $q$ of the Procrustes tangent space at the mean is given by

$$q = kp - k - \frac{k(k-1)}{2} - 1 = kp - \left[\frac{k(k+1)}{2} + 1\right]$$

where

  • $kp$ counts all raw coordinates,
  • $k$ accounts for translations (one per spatial dimension),
  • $\frac{k(k-1)}{2}$ for independent rotations,
  • $1$ for uniform scaling.

The result is a $q$-dimensional Euclidean approximation of shape variation. For large $p$ (many landmarks), $q$ grows linearly in $p$ and the subtractions become negligible, but for moderate $p$, proper accounting for nonshape degrees of freedom is essential (Courtenay, 26 Jan 2026).
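The dimension count translates directly into code:

```python
# Tangent-space dimension q = kp - k - k(k-1)/2 - 1, as derived above.
def tangent_dim(p, k):
    """Degrees of freedom after removing translation, rotation, and scale."""
    return k * p - k - k * (k - 1) // 2 - 1

print(tangent_dim(10, 2))  # 2D, 10 landmarks: 20 - 2 - 1 - 1 = 16
print(tangent_dim(10, 3))  # 3D, 10 landmarks: 30 - 3 - 3 - 1 = 23
```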

3. Statistical Error Scaling and the “Diagonal” in Shape Analysis

Under an isotropic noise model in tangent space, where each coordinate is iid with variance $\sigma^2$, the expected root-mean-squared error (RMSE) of a model using all $q$ dimensions satisfies

$$\mathrm{RMSE} \approx \sqrt{\frac{q \sigma^2}{n}} \propto \sqrt{\frac{q}{n}}$$

Holding RMSE fixed yields $q \propto n$, so $p \approx n/(k+1)$ for large $p$, revealing the origin of the "diagonal" in plots of sample size $n$ versus landmark count $p$: reliable estimation is only possible beneath this boundary (Courtenay, 26 Jan 2026). In log-space,

$$\log(\mathrm{RMSE}) = \tfrac{1}{2}\left[\log q - \log n\right] + \text{const}$$

so isocontours of constant error exhibit unit slope, explaining observed transitions in empirical error.

An important subtlety arises from the unit-norm preshape constraint, which raises the effective per-landmark degrees of freedom to $k+1$ and yields a diagonal slope of $1/(k+1)$, matching simulation results for 2D ($1/3$) and 3D ($1/4$) shapes (Courtenay, 26 Jan 2026).
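The $\sqrt{q/n}$ scaling is easy to verify by Monte Carlo. A toy check under the isotropic model, using the error of the sample mean as the simplest $q$-dimensional estimator:

```python
# Monte Carlo check of RMSE ∝ sqrt(q/n) under iid isotropic noise.
import numpy as np
rng = np.random.default_rng(1)

def mean_rmse(q, n, sigma=1.0, trials=2000):
    # Error of the sample mean of n iid N(0, sigma^2) draws in q dimensions
    X = rng.normal(scale=sigma, size=(trials, n, q))
    err = np.linalg.norm(X.mean(axis=1), axis=1)   # per-trial error norm
    return err.mean()

r1 = mean_rmse(q=16, n=50)
r2 = mean_rmse(q=32, n=50)    # doubling q should scale error by ~sqrt(2)
r3 = mean_rmse(q=16, n=100)   # doubling n should scale error by ~1/sqrt(2)
print(r2 / r1, r1 / r3)       # both ratios ≈ sqrt(2) ≈ 1.41
```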

4. Spatial Autocorrelation and Effective Tangent-Space Dimension

In biological and other real data, landmarks are not independent: spatial proximity induces covariance (spatial autocorrelation), so the actual covariance matrix in tangent space is typically structured and not proportional to the identity. The effective dimension, $df_{\rm eff}$, may be much smaller than the full $q$, since a few principal directions capture most of the variance.

Models exploiting spatial structure (e.g., convolutional regressors that respect landmark adjacency) can achieve improved RMSE, as

$$\mathrm{RMSE} \approx \sqrt{\frac{df_{\rm eff}}{n}}$$

thus transcending naïve isotropic bounds. Failing to account for autocorrelation leads to performance degradation, as shown empirically with both vector-wise and convolutional regression in the context of geometric morphometrics (Courtenay, 26 Jan 2026).
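One common proxy for effective dimension is the participation ratio of the covariance spectrum, $(\sum_i \lambda_i)^2 / \sum_i \lambda_i^2$; the paper's exact definition of $df_{\rm eff}$ may differ, so treat this as an illustrative stand-in. It shows how autocorrelation collapses the effective dimension far below $q$:

```python
# Participation-ratio proxy for effective dimension of a tangent-space covariance.
import numpy as np

def effective_dim(cov):
    """(sum λ)^2 / sum λ^2 -- one common effective-dimension proxy
    (an assumption here, not necessarily the paper's df_eff)."""
    lam = np.linalg.eigvalsh(cov)
    return lam.sum() ** 2 / (lam ** 2).sum()

q = 40
iso = np.eye(q)                                    # isotropic: all q directions count
idx = np.arange(q)
ac = 0.9 ** np.abs(idx[:, None] - idx[None, :])    # covariance decaying with landmark distance
# Isotropic noise yields the full q; strong autocorrelation collapses it
print(effective_dim(iso), effective_dim(ac))
```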

5. Tangent Space in Riemannian and Wasserstein Settings

The tangent-space formalism generalizes to Riemannian shape spaces and optimal transport. In the 2-Wasserstein space of Borel probability measures $\mathcal{P}_2(\mathbb{R}^d)$, the tangent space at a regular (absolutely continuous) measure $\mu$ is

$$T_\mu \mathcal{P}_2(\mathbb{R}^d) = \overline{\{\nabla \varphi : \varphi \in C_c^\infty(\mathbb{R}^d)\}}^{L^2(\mu)}$$

identifying tangent vectors with gradient fields, i.e., displacement fields from optimal transport maps (Zemel et al., 2017). The Riemannian structure admits logarithm and exponential maps:

$$\log_\mu(\nu) = T_\mu^\nu - \mathrm{id}, \qquad \exp_\mu(v) = (\mathrm{id} + v)\#\mu$$

where $T_\mu^\nu$ is the unique optimal transport map from $\mu$ to $\nu$ (Brenier's theorem), and $\#\mu$ denotes the pushforward.

This framework supports a Procrustes-style algorithm for Fréchet means, alternating between registering measures via pairwise optimal couplings (Procrustes registration) and averaging displacement fields in TμP2T_\mu\mathcal{P}_2, with proven convergence and statistical consistency (Zemel et al., 2017).
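In one dimension the optimal map is $T = G^{-1} \circ F$ (composition of CDF and quantile function), so for empirical measures with equally many atoms it simply pairs sorted samples, and the register-average-retract iteration converges in a single step to quantile averaging. A toy sketch under that 1D assumption:

```python
# 1D Wasserstein toy: log/exp maps via sorted atoms, one Procrustes-style
# iteration recovers the known quantile-averaging barycenter.
import numpy as np
rng = np.random.default_rng(3)

samples = [np.sort(rng.normal(loc=m, scale=s, size=200))
           for m, s in [(0, 1), (2, 0.5), (-1, 2)]]   # three empirical measures

mu = samples[0].copy()                     # reference measure (atoms of mu)
log_maps = [nu - mu for nu in samples]     # log_mu(nu) = T_mu^nu - id on the atoms
v_bar = np.mean(log_maps, axis=0)          # average displacement field in T_mu P2
barycenter = mu + v_bar                    # exp_mu(v_bar) = (id + v_bar)#mu

# Matches direct quantile averaging, the known 1D Wasserstein barycenter
direct = np.mean(samples, axis=0)
print(np.allclose(barycenter, direct))     # True
```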

6. Computational Implications and Algorithmic Frameworks

The Procrustes tangent space enables reduction of high-dimensional nonlinear shape analysis to tractable linear regression and machine learning pipelines. After GPA alignment, the $n \times q$ matrix of vectorized, centered Procrustes coordinates is treated as Euclidean data in $\mathbb{R}^q$ (Courtenay, 26 Jan 2026).

For the orthogonal Procrustes problem, the tangent space at $Y \in O(n)$ is

$$T_Y O(n) = \{Y A : A \in \mathrm{Skew}(n)\}$$

where $\mathrm{Skew}(n)$ is the space of skew-symmetric $n \times n$ matrices (Gutman et al., 2019). Orthogonal projection of a matrix $W$ onto $T_Y O(n)$ is implemented as $\Pi_{T_Y O(n)}(W) = Y\,\mathrm{skew}(Y^\top W)$, where $\mathrm{skew}(M) = (M - M^\top)/2$.
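The projection formula is a two-liner; the sketch below also verifies its defining properties (the projected matrix lies in the tangent space, and projecting twice changes nothing):

```python
# Projection onto the tangent space of O(n): Π(W) = Y skew(Yᵀ W).
import numpy as np
rng = np.random.default_rng(4)

def skew(M):
    return (M - M.T) / 2

def project_tangent(Y, W):
    return Y @ skew(Y.T @ W)

n = 5
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))   # random orthogonal Y
W = rng.normal(size=(n, n))
V = project_tangent(Q, W)
# V ∈ T_Q O(n): QᵀV is skew-symmetric, and the projection is idempotent
print(np.allclose(Q.T @ V, -(Q.T @ V).T), np.allclose(project_tangent(Q, V), V))
# True True
```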

Tangent Subspace Descent (TSD) algorithms decompose $\mathrm{Skew}(n)$ into "Givens" subspaces and perform blockwise gradient steps, yielding computational cost $O(n)$ per step versus $O(n^3)$ for full-gradient methods. This approach ensures global convergence provided that gap-ensuring and randomized-norm conditions are satisfied (Gutman et al., 2019).
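The cheapness of a Givens block step comes from the retraction: moving along a single generator $E_{ij} = e_i e_j^\top - e_j e_i^\top$ multiplies $Y$ by a planar rotation that touches only columns $i$ and $j$. A sketch, not the paper's exact algorithm, using the Procrustes objective $f(Y) = \|AY - B\|_F^2$ as a test problem:

```python
# One Givens-subspace descent step: gradient component along E_ij, then an
# O(n) planar-rotation retraction touching only columns i and j of Y.
import numpy as np
rng = np.random.default_rng(5)

def givens_step(Y, A, B, i, j, lr=0.01):
    G = 2 * A.T @ (A @ Y - B)             # Euclidean gradient of f at Y
    S = Y.T @ G                           # pull back to the Lie algebra
    theta = -lr * (S[i, j] - S[j, i])     # component along the generator E_ij
    c, s = np.cos(theta), np.sin(theta)
    Y = Y.copy()
    Y[:, [i, j]] = Y[:, [i, j]] @ np.array([[c, s], [-s, c]])  # planar rotation
    return Y

n = 4
A, B = rng.normal(size=(6, n)), rng.normal(size=(6, n))
Y0, _ = np.linalg.qr(rng.normal(size=(n, n)))
f = lambda Y: np.linalg.norm(A @ Y - B) ** 2

Y = Y0.copy()
for _ in range(200):                      # cycle through all Givens planes
    for i in range(n):
        for j in range(i + 1, n):
            Y = givens_step(Y, A, B, i, j)
# Orthogonality is preserved by construction, and the objective decreases
print(np.allclose(Y.T @ Y, np.eye(n)), f(Y) < f(Y0))
```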

For Fréchet means in Wasserstein space, the tangent-bundle/Procrustes strategy allows use of gradient descent:

  • Registration via log-maps,
  • Averaging in the tangent space,
  • Retraction (pushforward by the averaged map) to yield the next iterate.

Consistency and convergence are established for both finite-sample and population-level regimes under regularity and bandwidth conditions (Zemel et al., 2017).

7. Statistical and Methodological Constraints

Analysis in Procrustes tangent space is subject to several critical limitations:

  • Dimensionality/sample-size tradeoff: Reliable error control requires $n \gtrsim (k+1)p$, or, equivalently, $p/n \lesssim 1/(k+1)$; exceeding this leads to overfitting and instability along the "diagonal" boundary (Courtenay, 26 Jan 2026).
  • Local validity: The tangent-space approximation holds only for configurations close to the mean; larger deformations invoke nonlinearities that a single-point linearization cannot capture.
  • Non-independence: Procrustes coordinates inherit global and local dependencies due to GPA; treating them as independent features violates underlying constraints.
  • Data leakage: Applying GPA to mixed train/test samples prior to splitting causes information leakage, shrinking observed RMSE artificially and violating standard validation assumptions. Proper alignment requires aligning test data to the training set exclusively (Courtenay, 26 Jan 2026).

Together, these shape-space constraints enforce fundamental limits for inference, estimation, and prediction in shape analysis.
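The leakage point in particular has a simple operational form: estimate the consensus from the training set alone, then register each test configuration to that fixed consensus with an ordinary (per-specimen) Procrustes fit. A minimal sketch assuming numpy, with a single alignment pass standing in for full iterative GPA for brevity:

```python
# Leakage-free alignment: consensus from training data only; test specimens
# are registered to the training consensus, never to a pooled alignment.
import numpy as np
rng = np.random.default_rng(6)

def align_to_consensus(X, consensus):
    """Ordinary Procrustes fit of one configuration to a fixed consensus."""
    X = X - X.mean(axis=0)                      # remove translation
    X = X / np.linalg.norm(X)                   # remove scale
    U, _, Vt = np.linalg.svd(X.T @ consensus)   # remove rotation (Kabsch)
    return X @ U @ Vt

base = rng.normal(size=(8, 2))                  # p=8 landmarks, k=2
train = base + 0.02 * rng.normal(size=(30, 8, 2))
test = base + 0.02 * rng.normal(size=(10, 8, 2))

m = train.mean(axis=0)                          # consensus estimate: training set only
m = m - m.mean(axis=0)
consensus = m / np.linalg.norm(m)

train_aligned = np.stack([align_to_consensus(X, consensus) for X in train])
test_aligned = np.stack([align_to_consensus(X, consensus) for X in test])
print(train_aligned.shape, test_aligned.shape)  # (30, 8, 2) (10, 8, 2)
```

No test information enters the consensus, so validation RMSE measured on `test_aligned` is not artificially shrunk.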


References:

  • "On Procrustes Contamination in Machine Learning Applications of Geometric Morphometrics" (Courtenay, 26 Jan 2026)
  • "Coordinate Descent Without Coordinates: Tangent Subspace Descent on Riemannian Manifolds" (Gutman et al., 2019)
  • "Fréchet Means and Procrustes Analysis in Wasserstein Space" (Zemel et al., 2017)
