
Unnormalized Spectral Clustering

Updated 7 January 2026
  • Unnormalized spectral clustering is a graph-based algorithm that segments data by analyzing the eigenstructure of the Laplacian matrix derived from the similarity graph.
  • The method constructs a similarity graph, computes eigenvectors of L = D - W, and applies k-means on the spectral embedding to recover clusters.
  • Despite its clear linear-algebraic foundations and theoretical guarantees, the approach can be sensitive to degree heterogeneity compared to normalized methods.

Unnormalized spectral clustering is a graph-based algorithmic framework for partitioning data into clusters by leveraging the eigenstructure of the unnormalized graph Laplacian matrix. It directly relaxes combinatorial graph-cut objectives and embeds data points into a low-dimensional spectral space where geometric separation reflects underlying cluster structure. The method emphasizes the topology of the constructed similarity graph, using linear algebraic relaxations for computational feasibility, and has well-documented theoretical and practical characteristics in both general and model-based data regimes.

1. Definition of the Unnormalized Laplacian and Graph Construction

Given data points $x_1, \dots, x_n \in \mathbb{R}^d$ and a nonnegative, symmetric similarity function $s(x_i, x_j)$, an undirected similarity graph $G=(V,E)$ is constructed with $V = \{v_1, \dots, v_n\}$ and edge weights $w_{ij}=s(x_i, x_j) \geq 0$, where $w_{ij}=w_{ji}$ and $w_{ii}=0$ (0711.0189). Several sparsification schemes are common, such as $k$-nearest neighbor, $\varepsilon$-radius, or fully connected Gaussian-weighted graphs. The adjacency matrix $W=(w_{ij})$ and the diagonal degree matrix $D$ with $D_{ii}=\sum_j w_{ij}$ are then defined. The unnormalized graph Laplacian is

L = D - W.

This matrix is symmetric and positive semidefinite, and satisfies $Lf = 0$ for constant vectors $f$ (0711.0189). The fundamental quadratic form is

f^\top L f = \frac{1}{2} \sum_{i,j=1}^n w_{ij} (f_i - f_j)^2,

which encodes the connectivity structure of $G$.
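The identity above is easy to check numerically. The sketch below builds $L = D - W$ from a small hypothetical weight matrix and verifies both the quadratic-form identity and that constant vectors lie in the kernel of $L$:

```python
import numpy as np

# Toy symmetric similarity matrix with zero diagonal (hypothetical weights).
W = np.array([[0.0, 0.8, 0.1],
              [0.8, 0.0, 0.2],
              [0.1, 0.2, 0.0]])
D = np.diag(W.sum(axis=1))        # degree matrix, D_ii = sum_j w_ij
L = D - W                         # unnormalized graph Laplacian

f = np.array([1.0, -2.0, 0.5])    # arbitrary test vector

# Quadratic form two ways: f^T L f and (1/2) sum_ij w_ij (f_i - f_j)^2.
lhs = f @ L @ f
rhs = 0.5 * np.sum(W * (f[:, None] - f[None, :])**2)
assert np.isclose(lhs, rhs)

# Constant vectors lie in the kernel of L.
assert np.allclose(L @ np.ones(3), 0.0)
```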

2. Algorithmic Workflow and Spectral Relaxation

The unnormalized spectral clustering algorithm proceeds as follows (0711.0189):

  1. Graph Construction: Compute $W$ using the selected similarity function and sparsification scheme.
  2. Degree and Laplacian: Form $D$ and $L = D - W$.
  3. Spectral Decomposition: Solve $L u_\ell = \lambda_\ell u_\ell$ for eigenpairs $(\lambda_\ell, u_\ell)$ and extract the $k$ eigenvectors with the smallest eigenvalues.
  4. Spectral Embedding: Represent each data point $i$ by the $i$-th row of the $n \times k$ eigenvector matrix $U$.
  5. Clustering Assignment: Run $k$-means in $\mathbb{R}^k$ on the embedded points.
  6. Cluster Recovery: Assign points to clusters according to the $k$-means output.

The algorithm is a relaxation of the RatioCut objective: minimizing $\sum_j \frac{\mathrm{Cut}(A_j, \bar{A}_j)}{|A_j|}$ over partitions is NP-hard, but relaxing the cluster indicator vectors to real vectors with orthogonality and norm constraints reduces the problem to computing the bottom $k$ eigenvectors of $L$ (0711.0189).
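The six steps above can be sketched end to end. This is a minimal illustrative implementation, not the canonical one: the Gaussian similarity, the bandwidth `sigma`, and the deterministic farthest-point initialization for Lloyd's $k$-means iterations are all assumptions made here for reproducibility.

```python
import numpy as np

def unnormalized_spectral_clustering(X, k, sigma=1.0):
    """Steps 1-6: Gaussian similarity graph, L = D - W,
    bottom-k eigenvectors, k-means on the rows of the embedding."""
    # Step 1: fully connected Gaussian-weighted graph (illustrative choice).
    sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    W = np.exp(-sq / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)              # enforce w_ii = 0
    # Step 2: degree matrix and unnormalized Laplacian.
    L = np.diag(W.sum(axis=1)) - W
    # Steps 3-4: bottom-k eigenvectors form the n x k embedding U.
    eigvals, eigvecs = np.linalg.eigh(L)  # eigh returns ascending eigenvalues
    U = eigvecs[:, :k]
    # Steps 5-6: minimal k-means (Lloyd) with farthest-point initialization.
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c)**2).sum(-1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(100):
        labels = np.argmin(((U[:, None] - centers[None])**2).sum(-1), axis=1)
        new = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# Two well-separated blobs should be recovered exactly.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(5, 0.1, (10, 2))])
labels = unnormalized_spectral_clustering(X, k=2)
assert len(set(labels[:10])) == 1 and len(set(labels[10:])) == 1
assert labels[0] != labels[10]
```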

3. Theoretical Guarantees and Consistency

The spectral properties of $L$ directly encode cluster structure: the multiplicity of the zero eigenvalue equals the number of connected components of the graph, with the corresponding eigenvectors acting as component indicators (0711.0189). For i.i.d. samples $x_1,\dots,x_n$ from an underlying measure $\rho$ on $D\subset\mathbb{R}^d$, with the similarity graph constructed using an appropriate kernel $\eta$ and connectivity radius $\varepsilon_n \to 0$, the following holds (Trillos et al., 2015):

  • Eigenvalue Convergence: For each fixed $k$,

\frac{2\lambda_k^{(n)}}{n \varepsilon_n^2} \to \sigma_\eta \lambda_k,

where $\lambda_k$ is the $k$-th eigenvalue of a continuum differential operator and $\sigma_\eta$ is a constant depending on the kernel.

  • Eigenvector Convergence: Unit $D$-norm eigenvectors $u_k^{(n)}$ of $L$ converge, in the $TL^2$ topology, to continuum eigenfunctions $u_k$.
  • Cluster Consistency: If $k$-means is run on the embedding given by the first $k$ eigenvectors, the resulting clusters converge (weakly, in measure) to the continuum partition induced by $(u_1,\dots,u_k)_{\#}\rho$, under assumptions on graph connectivity and the scaling of $\varepsilon_n$.

A $\Gamma$-convergence analysis establishes that discrete graph Dirichlet energies converge to the continuum Dirichlet energy, directly linking spectral clustering on finite data to the underlying population structure. Explicit scaling conditions on $\varepsilon_n$ ensure the spectral limits are meaningful: $\varepsilon_n \approx (\log n / n)^{1/d}$ is sufficient, and the method remains consistent up to the connectivity threshold (Trillos et al., 2015).
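The finite-sample fact underlying these guarantees, that the multiplicity of the zero eigenvalue of $L$ equals the number of connected components, can be checked directly on a toy graph:

```python
import numpy as np

# Block-structured W: two connected components {0,1} and {2,3,4}.
W = np.zeros((5, 5))
W[0, 1] = W[1, 0] = 1.0
W[2, 3] = W[3, 2] = 1.0
W[3, 4] = W[4, 3] = 1.0
L = np.diag(W.sum(axis=1)) - W

eigvals = np.linalg.eigvalsh(L)
# Multiplicity of eigenvalue 0 equals the number of connected components.
num_zero = int(np.sum(np.abs(eigvals) < 1e-10))
assert num_zero == 2
```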

4. Model-Selection, Parameterization, and Practical Implementation

Unnormalized spectral clustering requires careful graph construction and parameter tuning. For geometric data, a topological approach (Rieser, 2015) constructs a one-parameter family of graphs $\{G_r\}_{r \geq 0}$ by thresholding ambient distances at scale $r$. The correct $r$ is selected using two data-driven criteria:

  • Average Relative Neighborhood Volume: $R_{G_r} = \frac{1}{|Z|} \sum_v \frac{|N(v)|}{|C(v)|}$, minimized over $r$.
  • Average Relative Entropy: $H_{r,1}$ averages over nodes the Kullback–Leibler divergence between the heat-diffused distribution at time $t=1$ and the steady state within components, maximized over $r$.

Cluster assignment is then obtained by extracting the kernel of $L_{\hat r}$ and assigning points by projection onto the space of 0-eigenvectors (Rieser, 2015). Computationally, for $n$ data points and $m$ candidate $r$ values, the complexity is $O(mn^3)$ in the worst case, though iterative eigensolvers and sparse-matrix methods reduce practical costs.
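The kernel-projection assignment at a fixed scale $r$ can be sketched as follows. This omits the two selection criteria above and simply fixes $r$ by hand; the unweighted thresholded edges and the row-grouping trick are assumptions of this sketch, since points in the same component have identical rows in any orthonormal basis of the 0-eigenspace.

```python
import numpy as np

def laplacian_at_scale(X, r):
    """Unnormalized Laplacian of the graph G_r obtained by
    thresholding pairwise distances at scale r."""
    d = np.sqrt(np.sum((X[:, None] - X[None, :])**2, axis=-1))
    W = ((d <= r) & (d > 0)).astype(float)   # unweighted edges within radius r
    return np.diag(W.sum(axis=1)) - W

def cluster_by_kernel(X, r, tol=1e-10):
    """Assign labels from the kernel (0-eigenspace) of L_r."""
    L = laplacian_at_scale(X, r)
    eigvals, eigvecs = np.linalg.eigh(L)
    K = eigvecs[:, np.abs(eigvals) < tol]    # basis of the 0-eigenspace
    # Rows of K are constant on each connected component, so grouping
    # (rounded) rows recovers the component labels.
    rows = [tuple(np.round(row, 6)) for row in K]
    uniq = {row: j for j, row in enumerate(dict.fromkeys(rows))}
    return np.array([uniq[row] for row in rows])

# Two line segments separated by a gap much larger than r.
X = np.vstack([np.linspace(0, 1, 5), np.zeros(5)]).T
X = np.vstack([X, X + np.array([5.0, 0.0])])
labels = cluster_by_kernel(X, r=0.3)
assert len(set(labels)) == 2
```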

5. Comparison to Normalized Spectral Clustering

Unnormalized and normalized spectral clustering share the foundational use of graph-Laplacian eigenvectors but differ in normalization and objective. The unnormalized Laplacian $L=D-W$ relaxes RatioCut, which depends on cluster cardinalities $|A_j|$, while the normalized variants ($L_{\text{sym}} = D^{-1/2} L D^{-1/2}$, $L_{\text{rw}} = I - D^{-1} W$) target the Normalized Cut objective, which accounts for cluster volumes $\operatorname{vol}(A_j)$. Several key differences are documented (0711.0189, Sarkar et al., 2013):

  • Consistency: Unnormalized spectral clustering may fail to be statistically consistent on large graphs unless all degrees $d_i$ are bounded away from zero and the portion of the spectrum used remains well below $\min_i d_i$. Eigenvectors corresponding to higher eigenvalues can become localized and uninformative.
  • Degree Sensitivity: Unnormalized algorithms are sensitive to degree heterogeneity. If degrees vary widely or some vertices have small degree, eigenvectors can behave pathologically.
  • Empirical Performance: Both normalized and unnormalized methods achieve the same asymptotic rate of convergence for misclassification in stochastic blockmodels, but normalization consistently shrinks within-cluster spread in the spectral embedding by a constant factor, leading to lower error in finite samples and on real-data link-prediction tasks. On co-authorship and political blog datasets, normalized clustering attained lower misclassification rates; after preprocessing, normalized SC misclassified 4% of blogs versus 37% for unnormalized SC (Sarkar et al., 2013).
  • When Unnormalized Clustering Fails: Pathological cases exist (e.g., cockroach graphs) where unnormalized spectral clustering produces suboptimal partitions while normalized methods succeed.
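The three Laplacians above can be compared side by side on one weighted graph. This sketch (with a hypothetical weight matrix) checks two standard facts: $L_{\text{sym}}$ and $L_{\text{rw}}$ are similar matrices and therefore share a spectrum, and all three have a zero eigenvalue on a connected graph.

```python
import numpy as np

# Hypothetical weighted graph with heterogeneous degrees.
W = np.array([[0.0, 1.0, 0.2],
              [1.0, 0.0, 0.2],
              [0.2, 0.2, 0.0]])
d = W.sum(axis=1)
L = np.diag(d) - W                                # unnormalized: relaxes RatioCut
L_sym = np.diag(d**-0.5) @ L @ np.diag(d**-0.5)   # symmetric normalized
L_rw = np.diag(1.0 / d) @ L                       # random-walk normalized

# L_rw = D^{-1/2} L_sym D^{1/2}, so the two share eigenvalues.
ev_sym = np.sort(np.linalg.eigvalsh(L_sym))
ev_rw = np.sort(np.linalg.eigvals(L_rw).real)
assert np.allclose(ev_sym, ev_rw)

# All three Laplacians have a zero eigenvalue on a connected graph.
assert np.isclose(np.linalg.eigvalsh(L)[0], 0.0)
assert np.isclose(ev_sym[0], 0.0)
```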

6. Applications, Advantages, and Pitfalls

Unnormalized spectral clustering is widely applicable due to its algorithmic simplicity and its close connection to linear algebra and graph theory. It is favored when cluster sizes are of interest, the graph is well-behaved (relatively uniform degrees), and the RatioCut objective is appropriate. Its advantages are direct computation on $L$, algorithmic clarity, and the connection between indicator subspaces and connected components (0711.0189).

However, sensitive dependence on the global graph structure and the lack of cluster-volume normalization can lead to poor performance on graphs with unbalanced or heterogeneous degree distributions, as well as statistical inconsistency in large-sample limits under mild connectivity violations. Empirically, normalized variants often outperform unnormalized ones, especially in the presence of degree heterogeneity and for moderate $n$ (Sarkar et al., 2013).

A plausible implication is that, for rigorous statistical consistency and robust finite-sample behavior, normalized spectral clustering should be preferred under moderate or unknown degree variation. Unnormalized clustering remains useful for pedagogical purposes, for balanced geometric data, and as the foundation of topological or parameter-free spectral methods (Rieser, 2015).

7. Summary Table: Key Properties

| Property | Unnormalized Spectral Clustering | Normalized Spectral Clustering |
| --- | --- | --- |
| Laplacian definition | $L = D - W$ | $L_{\rm sym} = D^{-1/2} L D^{-1/2}$ |
| Balances by | Cluster size ($\|A_j\|$) | Cluster volume ($\operatorname{vol}(A_j)$) |
| Statistical consistency | Only in restricted settings (high minimum degree, low spectrum) | Robust as $n \to \infty$ |
| Sensitivity to degree heterogeneity | High | Low |
| Within-cluster embedding spread | Larger | Shrunk by a constant factor |
| Typical use cases | Uniform geometric data, topological settings | Heterogeneous graphs, real-data analysis |

The selection of unnormalized spectral clustering should be informed by graph structure, application requirements, and theoretical guarantees. In asymptotic and real-data regimes with degree variability or in the presence of sparse clusters, normalized methodologies are often statistically and empirically superior. Nonetheless, unnormalized variants offer insight into the interplay between topology, spectral theory, and combinatorial clustering (Rieser, 2015, Trillos et al., 2015, 0711.0189, Sarkar et al., 2013).
