
Locality Preserving Loss in Representation Learning

Updated 28 January 2026
  • Locality Preserving Loss (LPL) is a regularization approach that preserves local geometric and affinity structures by leveraging graph Laplacian formulations.
  • It extends to linear mappings, kernel methods, and deep learning architectures, enhancing latent space representations and unsupervised clustering performance.
  • Empirical studies show that LPL improves manifold alignment and topological consistency, making it valuable for both dimensionality reduction and embedding alignment.

Locality Preserving Loss (LPL) refers to a class of regularization and objective functions that explicitly promote the preservation of local geometric or affinity structure in mapping, embedding, and representation learning tasks. LPL has its earliest mathematical foundations in graph Laplacian-based methods such as Laplacian Eigenmaps and Locality Preserving Projection (LPP), but has since been adapted to a variety of deep learning and manifold learning contexts, including deep autoencoders, variational autoencoders, cross-manifold alignment, and representation learning for high-dimensional data.

1. Foundational Formulation: Graph-Based Locality Preserving Loss

The original instantiation of LPL arises from Laplacian-based dimensionality reduction and spectral learning frameworks. Given a dataset $\{x_1, \ldots, x_n\} \subset \mathbb{R}^d$, a sparse nearest-neighbor graph (constructed via an $\epsilon$-ball or $k$-NN rule) with adjacency (affinity) matrix $S \in \mathbb{R}^{n \times n}$ is built, where a prototypical choice is $S_{ij} = \exp\left( -\|x_i - x_j\|^2 / 2\sigma^2 \right)$ if $j$ is a neighbor of $i$, and $0$ otherwise. The corresponding degree matrix $D$ and unnormalized Laplacian $L = D - S$ are then defined, with $L$ positive semi-definite and its rows summing to zero.
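As a concrete illustration (this snippet, including the function names `knn_heat_affinity` and `graph_laplacian`, is my own sketch, not code from the cited papers), the heat-kernel $k$-NN affinity and the unnormalized Laplacian can be built in NumPy:

```python
import numpy as np

def knn_heat_affinity(X, k=5, sigma=1.0):
    """Symmetric k-NN affinity with heat-kernel weights (a prototypical choice)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]  # k nearest neighbors, skipping self
        S[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    return np.maximum(S, S.T)  # symmetrize the directed k-NN graph

def graph_laplacian(S):
    """Degree matrix D and unnormalized Laplacian L = D - S."""
    D = np.diag(S.sum(axis=1))
    return D - S, D

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
S = knn_heat_affinity(X, k=5, sigma=1.0)
L, D = graph_laplacian(S)
# Rows of L sum to zero and L is positive semi-definite, as stated above
assert np.allclose(L.sum(axis=1), 0)
assert np.min(np.linalg.eigvalsh(L)) > -1e-9
```

The final assertions check the two structural properties of $L$ mentioned in the text.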

The Locality Preserving Loss is

$$\mathcal{L}_{\mathrm{LPL}} = \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n S_{ij} \| y_i - y_j \|^2 = \operatorname{tr}(Y^\top L Y)$$

where $y_i \in \mathbb{R}^p$ are the low-dimensional codes and $Y$ is the $n \times p$ matrix stacking them as rows. Minimization is subject to a normalization constraint ($Y^\top Y = I_p$ or $Y^\top D Y = I_p$) to avoid the trivial solution. The solution is equivalently the Laplacian eigenmap: the data are embedded according to the $p$ nontrivial smallest eigenvectors of $L$ (Ghojogh et al., 2021).
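The generalized eigenproblem $L y = \lambda D y$ can be reduced to a standard symmetric one via $D^{-1/2} L D^{-1/2}$. The sketch below (an illustration under my own conventions, not reference code) does this and numerically verifies the quadratic-form identity $\frac{1}{2}\sum_{ij} S_{ij}\|y_i - y_j\|^2 = \operatorname{tr}(Y^\top L Y)$:

```python
import numpy as np

def laplacian_eigenmap(S, p=2):
    """Embed via the p smallest nontrivial generalized eigenvectors of L y = lam * D y."""
    D = S.sum(axis=1)
    L = np.diag(D) - S
    Dm12 = np.diag(1.0 / np.sqrt(D))
    # Equivalent symmetric problem: D^{-1/2} L D^{-1/2} v = lam * v, with y = D^{-1/2} v
    vals, vecs = np.linalg.eigh(Dm12 @ L @ Dm12)
    return Dm12 @ vecs[:, 1:p + 1]  # drop the trivial constant eigenvector

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))
d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
S = np.exp(-d2)           # dense heat-kernel affinity for simplicity
np.fill_diagonal(S, 0)
Y = laplacian_eigenmap(S, p=2)
L = np.diag(S.sum(1)) - S
# Quadratic-form identity: 0.5 * sum_ij S_ij ||y_i - y_j||^2 == tr(Y^T L Y)
lhs = 0.5 * sum(S[i, j] * np.sum((Y[i] - Y[j]) ** 2)
                for i in range(30) for j in range(30))
assert np.isclose(lhs, np.trace(Y.T @ L @ Y))
```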

2. Linear and Kernel Extensions: Locality Preserving Projection

LPL extends directly to linear mappings, yielding Locality Preserving Projection (LPP):

  • Linear case: $y_i = U^\top x_i$ with $U \in \mathbb{R}^{d \times p}$, so $Y = U^\top X$. The loss becomes $\operatorname{tr}(U^\top X L X^\top U)$, subject to $U^\top X D X^\top U = I_p$. The solution is given by the generalized eigenproblem $(X L X^\top) U = (X D X^\top) U \Lambda$.
  • Kernel case: with a feature map $\Phi: \mathbb{R}^d \to \mathcal{H}$ and Gram matrix $K = \Phi(X)^\top \Phi(X)$, one solves $(K L K)\,\Theta = (K D K)\,\Theta\,\Lambda$ for $\Theta$, with the embedding $Y = \Theta^\top K$ (Ghojogh et al., 2021).

Out-of-sample extensions differ: linear LPP maps a new point as $y(x) = U^\top x$, while kernel LPP computes $y(x) = \Theta^\top k_t$ with $k_t = [k(x_i, x)]_i$.
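The linear case above can be sketched as follows (an illustration only; the reduction of the generalized eigenproblem via $B^{-1/2}$ and the function name `lpp` are my own choices, not the survey's code). Here $X$ is $d \times n$ with samples as columns, matching the $X L X^\top$ form:

```python
import numpy as np

def lpp(X, S, p=2):
    """Linear LPP: solve (X L X^T) U = (X D X^T) U Lambda for the p smallest eigenpairs."""
    D = np.diag(S.sum(axis=1))
    L = D - S
    A = X @ L @ X.T   # d x d
    B = X @ D @ X.T   # d x d, positive definite for full-row-rank X
    # Reduce to a standard symmetric problem via the inverse square root of B
    w, V = np.linalg.eigh(B)
    Bm12 = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    vals, vecs = np.linalg.eigh(Bm12 @ A @ Bm12)
    return Bm12 @ vecs[:, :p]  # smallest-eigenvalue directions

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 40))  # d = 5 features, n = 40 samples (as columns)
d2 = ((X.T[:, None] - X.T[None]) ** 2).sum(-1)
S = np.exp(-d2)
np.fill_diagonal(S, 0)
U = lpp(X, S, p=2)
# The normalization constraint U^T (X D X^T) U = I_p holds by construction
D = np.diag(S.sum(1))
assert np.allclose(U.T @ X @ D @ X.T @ U, np.eye(2), atol=1e-6)
```

New points are then embedded with $y(x) = U^\top x$, i.e. `U.T @ x_new`.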

3. Locality Preserving Loss in Deep Learning and Autoencoders

Recent methods generalize LPL to deep representation learning, integrating it with autoencoder frameworks:

$$L_{\mathrm{locality}} = \sum_{i=1}^n \sum_{j=1}^n \| z_i - z_j \|^2 a_{ij}$$

where $z_i$ are latent encodings and $a_{ij}$ are affinities constructed from pretrained (autoencoder) latents. The prior affinity matrix $\tilde{A}$ is built column by column by minimizing $\sum_i \| \tilde{z}_i - \tilde{z}_j \|^2 a_{ij} + \lambda \sum_i a_{ij}^2$ subject to $a_{ij} \ge 0$ and $\sum_i a_{ij} = 1$, yielding a sparse $k$-NN structure. LPL is incorporated into the end-to-end fine-tuning loss as

$$L(\Theta_e, W, \Theta_d) = L_{\mathrm{reconstruction}} + \alpha L_{\mathrm{affinity}} + \gamma L_{\mathrm{locality}}$$

with $\gamma$ weighting the LPL term. Empirical ablations demonstrate substantially improved unsupervised clustering performance (ACC gains of 10–15%) when LPL is included.
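A minimal NumPy sketch of the locality term (illustrative only; the papers train this term with minibatch gradient descent inside the autoencoder objective, and `locality_loss` is my own name):

```python
import numpy as np

def locality_loss(Z, A):
    """L_locality = sum_ij a_ij * ||z_i - z_j||^2, computed in a vectorized form.
    In the combined objective this term is weighted by gamma."""
    d2 = ((Z[:, None] - Z[None]) ** 2).sum(-1)  # pairwise squared latent distances
    return float((A * d2).sum())

rng = np.random.default_rng(3)
Z = rng.normal(size=(20, 4))                    # latent codes
A = np.abs(rng.normal(size=(20, 20)))           # stand-in for the prior affinities
A = (A + A.T) / 2
np.fill_diagonal(A, 0)
l1 = locality_loss(Z, A)
l2 = locality_loss(0.5 * Z, A)
# The loss is quadratic in the embedding scale: shrinking Z by 1/2 quarters it
assert np.isclose(l2, 0.25 * l1)
```

The scaling check illustrates why a normalization or a reconstruction term is needed alongside LPL: the term alone is trivially minimized by collapsing all latents to a point.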

  • In (Chen et al., 2022), LPL is formulated via a continuous $k$-NN graph (CkNN), considering both data-space and latent-space graphs. The loss is

$$L_{\mathrm{LPL}}(\phi; X_b) = \sum_{i < j} \left( W^X_{ij} + W^Z_{ij} \right) \left[ d_X(x_i, x_j) - \gamma\, d_Z(z_i, z_j) \right]^2$$

where $W^X$ and $W^Z$ are adjacency matrices on the data and latent spaces, and $\gamma$ is a learned scaling parameter. The algorithm treats LPL as the primary objective, with reconstruction as a constraint, and extends to hierarchical VAEs.
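The CkNN adjacency rule ($W_{ij} = 1$ iff $d^2(x_i, x_j) \leq \delta^2 r_i r_j$, with $r_i$ the distance from $x_i$ to its $k$-th nearest neighbour) can be sketched as follows; the function name and defaults are assumptions, not the authors' code:

```python
import numpy as np

def cknn_graph(X, k=5, delta=1.0):
    """Continuous k-NN: connect i, j iff d(x_i, x_j)^2 <= delta^2 * r_i * r_j,
    where r_i is the distance from x_i to its k-th nearest neighbour."""
    d = np.sqrt(((X[:, None] - X[None]) ** 2).sum(-1))
    r = np.sort(d, axis=1)[:, k]  # column 0 is the zero self-distance
    W = (d ** 2 <= delta ** 2 * np.outer(r, r)).astype(float)
    np.fill_diagonal(W, 0)
    return W

rng = np.random.default_rng(4)
X = rng.normal(size=(40, 3))
W = cknn_graph(X, k=5, delta=1.0)
# The criterion is symmetric in i and j, so the graph is undirected by construction
assert np.allclose(W, W.T)
```

The local scales $r_i r_j$ make the threshold density-adaptive: sparse regions get proportionally larger connection radii than dense ones.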

4. Locality Preserving Loss in Embedding Alignment

(Ganesan et al., 2020) introduces an LPL for supervised or semi-supervised alignment of vector space manifolds (e.g., cross-lingual embeddings):

  • For source embeddings $M^s$ and target embeddings $M^t$, with paired anchors $V^p$, a mapping $f_{\theta}$ is trained to minimize an alignment MSE together with

$$\mathcal{L}_{\mathrm{lpl}}(\theta, W) = \sum_{(m^s_i, m^t_i) \in V^p} \Big\| f(m^s_i; \theta) - \sum_{m^s_j \in N_k(m^s_i)} W_{ij}\, f(m^s_j; \theta) \Big\|^2$$

where $W_{ij}$ are locally linear reconstruction weights (as in Locally Linear Embedding) for reconstructing $m^s_i$ from its neighbors. The total objective combines MSE (for alignment), LPL (for locality preservation), an LLE term (for learning $W$), and an orthogonality regularizer (for stability in linear mappings). LPL is empirically shown to improve alignment, particularly under limited supervision, by increasing effective training-sample utilization and providing graph-Laplacian-like smoothness regularization.
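The LLE weights $W_{ij}$ solve a small constrained least-squares problem per point: minimize $\|x - \sum_j w_j n_j\|^2$ subject to $\sum_j w_j = 1$. A standard sketch (the regularization constant and function name are my own assumptions):

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-3):
    """Locally linear reconstruction weights for x from its neighbours."""
    G = (neighbors - x) @ (neighbors - x).T          # local Gram matrix
    G += reg * np.trace(G) * np.eye(len(neighbors))  # regularize (G may be singular)
    w = np.linalg.solve(G, np.ones(len(neighbors)))
    return w / w.sum()                               # enforce sum-to-one constraint

rng = np.random.default_rng(5)
x = rng.normal(size=3)
N = rng.normal(size=(4, 3))  # 4 neighbours in R^3
w = lle_weights(x, N)
assert np.isclose(w.sum(), 1.0)  # weights sum to one by construction
```

These per-point weights, stacked into $W$, are what the LPL term above asks $f_\theta$ to preserve after mapping.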

5. Graph Construction: Affinity and Topology Preservation

Across all applications, the construction of the affinity/adjacency structure (whether via classic $k$-NN, a heat kernel, or CkNN) is fundamental:

| Paper | Graph Construction | Affinity Matrix |
| --- | --- | --- |
| (Ghojogh et al., 2021) | $\epsilon$-ball or $k$-NN | $S_{ij} = \exp\left( -\lVert x_i - x_j \rVert^2 / 2\sigma^2 \right)$ or binary $\{0, 1\}$ |
| (Chen et al., 2019) | $k$-NN on pretrained latents | $a_{ij}$ via a local quadratic program, $\sum_i a_{ij} = 1$ |
| (Chen et al., 2022) | CkNN (density-adaptive $k$-NN) | $W_{ij} = 1$ iff $d^2(x_i, x_j) \leq \delta^2 r_i r_j$ |

The CkNN affords spectral convergence to the Laplace–Beltrami operator, ensuring that the induced graph accurately reflects the intrinsic topology of the underlying data manifold, including homological features such as connected components and cycles (Chen et al., 2022).

6. Theoretical Motivation and Guarantees

The theoretical grounding of LPL is rooted in spectral graph theory and manifold learning:

  • The LPL objective is equivalent to minimizing a quadratic form in the graph Laplacian, penalizing embeddings that separate graph neighbors.
  • In deep learning extensions, LPL acts as a regularizer that aligns the learned manifold structure with a precomputed local geometry, or ensures that encoder/decoder mappings do not collapse or distort local metric neighborhoods.
  • In the CkNN setting, the adjacency graph is guaranteed (in large-sample limits) to yield a Laplacian converging to the manifold's Laplace–Beltrami operator, underpinning homological/topological consistency (Chen et al., 2022).
  • In alignment contexts, LPL effectively expands the annotated training set by manifold-based interpolation, reducing overfitting and encouraging locally smooth mappings (Ganesan et al., 2020).

7. Practical Implementation and Empirical Impact

Algorithmic strategies differ per application:

  • Laplacian eigenmaps and LPP involve solving (generalized) eigenproblems at $O(n^3)$ or $O(d^3)$ cost, but these can be handled efficiently for sparse Laplacian matrices (Ghojogh et al., 2021).
  • Deep autoencoder training with LPL integrates local graph construction (potentially in minibatch), gradient-based optimization, and, in CkNN, adaptive neighborhood thresholds (Chen et al., 2022).
  • Affinity matrices may be fixed (built from pretrained representations) or dynamic (rebuilt per iteration/batch).
  • Hyperparameters such as $k$ (neighborhood size), $\sigma$ (kernel width), $\gamma$ (relative LPL weight), and $\delta$ (CkNN scale) are routinely cross-validated; improper choices can induce graph disconnectivity or wash out locality (Ghojogh et al., 2021).

Empirically, LPL consistently improves the preservation of local geometric structure in the latent space, as assessed by trustworthiness, continuity, MRRE, clustering accuracy, or alignment benchmarks, with pronounced gains in data-scarce or high-complexity regimes (Chen et al., 2019, Ganesan et al., 2020, Chen et al., 2022).


References

  • "Laplacian-Based Dimensionality Reduction Including Spectral Clustering, Laplacian Eigenmap, Locality Preserving Projection, Graph Embedding, and Diffusion Map: Tutorial and Survey" (Ghojogh et al., 2021).
  • "Generative approach to unsupervised deep local learning" (Chen et al., 2019).
  • "Locality Preserving Loss: Neighbors that Live together, Align together" (Ganesan et al., 2020).
  • "Local Distance Preserving Auto-encoders using Continuous k-Nearest Neighbours Graphs" (Chen et al., 2022).
