Unnormalized Spectral Clustering
- Unnormalized spectral clustering is a graph-based algorithm that segments data by analyzing the eigenstructure of the Laplacian matrix derived from the similarity graph.
- The method constructs a similarity graph, computes eigenvectors of L = D - W, and applies k-means on the spectral embedding to recover clusters.
- Despite its clear linear-algebraic foundations and theoretical guarantees, the approach can be sensitive to degree heterogeneity compared to normalized methods.
Unnormalized spectral clustering is a graph-based algorithmic framework for partitioning data into clusters by leveraging the eigenstructure of the unnormalized graph Laplacian matrix. It directly relaxes combinatorial graph-cut objectives and embeds data points into a low-dimensional spectral space where geometric separation reflects underlying cluster structure. The method emphasizes the topology of the constructed similarity graph, using linear algebraic relaxations for computational feasibility, and has well-documented theoretical and practical characteristics in both general and model-based data regimes.
1. Definition of the Unnormalized Laplacian and Graph Construction
Given data points $x_1, \dots, x_n$ and a nonnegative, symmetric similarity function $s$, an undirected similarity graph $G = (V, E)$ is constructed with vertex set $V = \{x_1, \dots, x_n\}$ and edge weights $w_{ij} = s(x_i, x_j)$, where $w_{ij} \ge 0$ and $w_{ij} = w_{ji}$ (0711.0189). Several sparsification schemes are common, such as $k$-nearest-neighbor, $\varepsilon$-radius, or fully connected Gaussian-weighted graphs. The weighted adjacency matrix $W = (w_{ij})_{i,j=1}^{n}$ and the diagonal degree matrix $D$ with entries $d_i = \sum_{j=1}^{n} w_{ij}$ are then defined. The unnormalized graph Laplacian is

$$L = D - W.$$

This matrix is symmetric and positive semidefinite, and it satisfies $L\mathbf{1} = \mathbf{0}$ for the constant vector $\mathbf{1}$ (0711.0189). The fundamental quadratic form is

$$f^{\top} L f = \frac{1}{2} \sum_{i,j=1}^{n} w_{ij} (f_i - f_j)^2,$$

which encodes the connectivity structure of $G$.
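These definitions are easy to check numerically. The sketch below (plain NumPy, with a hypothetical 4-node weight matrix) builds $L = D - W$ and verifies the stated properties: $L\mathbf{1} = \mathbf{0}$, positive semidefiniteness, and the quadratic-form identity.

```python
import numpy as np

# Hypothetical symmetric weight matrix for a 4-node graph.
W = np.array([
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.2, 0.0],
    [0.1, 0.2, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.0],
])

D = np.diag(W.sum(axis=1))   # degree matrix, d_i = sum_j w_ij
L = D - W                    # unnormalized graph Laplacian

# L annihilates the constant vector.
assert np.allclose(L @ np.ones(4), 0.0)

# Quadratic-form identity: f^T L f = (1/2) sum_ij w_ij (f_i - f_j)^2.
f = np.array([1.0, -2.0, 0.5, 3.0])
lhs = f @ L @ f
rhs = 0.5 * (W * (f[:, None] - f[None, :]) ** 2).sum()
assert np.isclose(lhs, rhs)

# Positive semidefinite: the smallest eigenvalue is 0 up to rounding.
assert np.linalg.eigvalsh(L).min() > -1e-12
```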
2. Algorithmic Workflow and Spectral Relaxation
The unnormalized spectral clustering algorithm proceeds as follows (0711.0189):
- Graph Construction: Compute $W$ using the selected similarity function and sparsification scheme.
- Degree and Laplacian: Form $D$ and $L = D - W$.
- Spectral Decomposition: Solve $L u = \lambda u$ and extract the $k$ eigenvectors $u_1, \dots, u_k$ with the smallest eigenvalues.
- Spectral Embedding: Represent each data point $x_i$ by the $i$-th row of the eigenvector matrix $U = [u_1, \dots, u_k] \in \mathbb{R}^{n \times k}$.
- Clustering Assignment: Run $k$-means in $\mathbb{R}^{k}$ on the embedded rows.
- Cluster Recovery: Assign points to clusters according to the $k$-means output.
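The workflow above can be sketched end to end in a few lines. The following is a minimal illustration, not a reference implementation: it assumes two synthetic Gaussian blobs, a fully connected Gaussian-weighted graph with a hand-picked bandwidth `sigma`, and a bare-bones Lloyd iteration in place of a production $k$-means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two well-separated 2-D Gaussian blobs (chosen for illustration).
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(5.0, 0.3, (20, 2))])
n, k, sigma = len(X), 2, 1.0   # sigma: hand-picked Gaussian bandwidth

# Step 1: fully connected Gaussian-weighted similarity graph.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-sq / (2 * sigma ** 2))
np.fill_diagonal(W, 0.0)

# Step 2: degree matrix and unnormalized Laplacian L = D - W.
L = np.diag(W.sum(1)) - W

# Steps 3-4: eigenvectors for the k smallest eigenvalues form the embedding.
_, U = np.linalg.eigh(L)        # eigh returns eigenvalues in ascending order
emb = U[:, :k]

# Step 5: a bare-bones Lloyd (k-means) iteration on the embedded rows,
# seeded deterministically with a farthest-point pair.
centers = np.stack([emb[0], emb[((emb - emb[0]) ** 2).sum(1).argmax()]])
for _ in range(20):
    labels = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
    centers = np.stack([emb[labels == c].mean(0) for c in range(k)])

# Step 6: each blob should be recovered as one cluster.
assert labels[0] != labels[20]
assert (labels[:20] == labels[0]).all() and (labels[20:] == labels[20]).all()
```

With the blobs this far apart, the cross-blob weights are numerically negligible, so the embedding collapses each blob to a tight point cluster and even a crude $k$-means recovers the partition.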
The algorithm is a relaxation of the RatioCut objective

$$\mathrm{RatioCut}(A_1, \dots, A_k) = \sum_{i=1}^{k} \frac{\mathrm{cut}(A_i, \overline{A_i})}{|A_i|},$$

whose exact minimization over partitions is NP-hard. By relaxing the cluster indicator vectors to real-valued vectors with orthogonality and norm constraints, the problem reduces to computing the bottom $k$ eigenvectors of $L$ (0711.0189).
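The role of the relaxation can be made concrete: for a 2-way partition $(A, \overline{A})$, the standard scaled indicator vector satisfies $f^{\top} L f = |V| \cdot \mathrm{RatioCut}(A, \overline{A})$ with $f \perp \mathbf{1}$ and $\lVert f \rVert^2 = n$. A numerical check on a small hypothetical graph:

```python
import numpy as np

# Small graph with a planted 2-way partition A = {0,1,2}, A_bar = {3,4}.
W = np.array([
    [0.0, 1.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.2],
    [0.1, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.2, 1.0, 0.0],
])
L = np.diag(W.sum(1)) - W
n, A, B = 5, [0, 1, 2], [3, 4]

# cut(A, A_bar): total weight of edges crossing the partition.
cut = W[np.ix_(A, B)].sum()
ratiocut = cut / len(A) + cut / len(B)

# Scaled indicator vector from the RatioCut relaxation argument.
f = np.empty(n)
f[A] = np.sqrt(len(B) / len(A))
f[B] = -np.sqrt(len(A) / len(B))

assert np.isclose(f @ L @ f, n * ratiocut)   # f^T L f = |V| * RatioCut(A, A_bar)
assert np.isclose(f.sum(), 0.0)              # f is orthogonal to the constant vector
assert np.isclose(f @ f, n)                  # ||f||^2 = n
```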
3. Theoretical Guarantees and Consistency
The spectral properties of $L$ directly encode cluster structure: the multiplicity of the eigenvalue $0$ equals the number of connected components of the graph, with the corresponding eigenspace spanned by the component indicator vectors (0711.0189). For $n$ i.i.d. samples from an underlying measure on $\mathbb{R}^d$, and with the similarity graph constructed using an appropriate kernel $\eta$ and connectivity radius $\varepsilon_n$, the following holds (Trillos et al., 2015):
- Eigenvalue Convergence: For each fixed $k$, the rescaled graph eigenvalues satisfy $\frac{1}{n \varepsilon_n^2} \lambda_k^{(n)} \to \sigma_\eta \lambda_k$, where $\lambda_k$ is the $k$th eigenvalue of a continuum differential operator and the constant $\sigma_\eta$ depends on the kernel.
- Eigenvector Convergence: For unit-norm eigenvectors $u_k^{(n)}$ of $L$, the associated empirical functions converge, in the $TL^2$ topology, to continuum eigenfunctions $u_k$.
- Cluster Consistency: If the $k$-means algorithm is run on the embedding given by the first $k$ eigenvectors, the resulting clusters converge (weakly, in measure) to the continuum partition induced by $u_1, \dots, u_k$, under assumptions on graph connectivity and the scaling of $\varepsilon_n$.
A $\Gamma$-convergence analysis establishes that discrete graph Dirichlet energies converge to the continuum Dirichlet energy, directly linking spectral clustering on finite data with the underlying population structure. Explicit scaling conditions on $\varepsilon_n$ ensure the spectral limits are meaningful: $\varepsilon_n \gg (\log n / n)^{1/d}$ is sufficient, and the method remains consistent essentially up to the connectivity threshold (Trillos et al., 2015).
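The finite-sample statement about the zero eigenvalue is easy to verify directly. The sketch below builds a toy graph with two connected components and confirms that $\dim \ker L = 2$, and that the kernel is spanned by component indicator vectors:

```python
import numpy as np

# Block-structured weight matrix: two connected components {0,1,2} and {3,4}.
W = np.zeros((5, 5))
W[0, 1] = W[1, 0] = 1.0
W[1, 2] = W[2, 1] = 0.5
W[0, 2] = W[2, 0] = 0.7
W[3, 4] = W[4, 3] = 2.0

L = np.diag(W.sum(1)) - W
vals, vecs = np.linalg.eigh(L)

# Multiplicity of the zero eigenvalue equals the number of components (2 here).
assert np.sum(np.abs(vals) < 1e-10) == 2

# The zero eigenspace is spanned by component indicators: projecting the
# indicator of {0,1,2} onto it recovers that indicator exactly.
U0 = vecs[:, np.abs(vals) < 1e-10]
ind = np.array([1.0, 1.0, 1.0, 0.0, 0.0])
assert np.allclose(U0 @ (U0.T @ ind), ind)
```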
4. Model-Selection, Parameterization, and Practical Implementation
Unnormalized spectral clustering requires careful graph construction and parameter tuning. For geometric data, a topological approach (Rieser, 2015) constructs a one-parameter family of graphs by thresholding ambient distances at scale $\varepsilon$. The appropriate $\varepsilon$ is selected using two data-driven criteria:
- Average Relative Neighborhood Volume: the mean, over data points, of the relative volume of each point's $\varepsilon$-neighborhood, minimized over $\varepsilon$.
- Average Relative Entropy: the mean, over nodes, of the Kullback–Leibler divergence between the heat-diffused distribution at time $t$ and the steady-state distribution within each component, maximized over $\varepsilon$.
Cluster assignment is then obtained by extracting the kernel of $L$ and assigning points by projection onto the space of $0$-eigenvectors (Rieser, 2015). Computationally, for $n$ data points and $m$ candidate $\varepsilon$ values, the worst-case complexity is $O(m n^3)$, though iterative eigensolvers and sparse-matrix methods reduce practical costs.
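A minimal illustration of the one-parameter graph family (the selection criteria above are not implemented here; the synthetic 1-D data and threshold values are chosen purely for exposition): sweep $\varepsilon$ and read off the number of connected components as $\dim \ker L$.

```python
import numpy as np

# 1-D points in two groups separated by a gap of about 1.0 (synthetic data).
x = np.array([0.0, 0.1, 0.25, 1.3, 1.45])
dist = np.abs(x[:, None] - x[None, :])

def n_components(eps):
    """Number of connected components of the eps-threshold graph,
    read off as dim ker L (count of near-zero Laplacian eigenvalues)."""
    W = ((dist <= eps) & (dist > 0)).astype(float)
    L = np.diag(W.sum(1)) - W
    return int(np.sum(np.abs(np.linalg.eigvalsh(L)) < 1e-10))

# Below the within-group spacing every point is isolated; between the
# within-group and between-group scales the two groups emerge as clusters;
# above the gap the graph becomes fully connected.
assert n_components(0.05) == 5
assert n_components(0.2) == 2
assert n_components(1.2) == 1
```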
5. Comparison to Normalized Spectral Clustering
Unnormalized and normalized spectral clustering share the foundational use of graph-Laplacian eigenvectors but differ in normalization and objective. The unnormalized Laplacian relaxes the RatioCut, which balances clusters by cardinality $|A_i|$, while normalized methods ($L_{\mathrm{sym}} = D^{-1/2} L D^{-1/2}$, $L_{\mathrm{rw}} = D^{-1} L$) target the Normalized Cut objective, which accounts for cluster volumes $\mathrm{vol}(A_i) = \sum_{j \in A_i} d_j$. Several key differences are documented (0711.0189, Sarkar et al., 2013):
- Consistency: Unnormalized spectral clustering may fail to be statistically consistent for large graphs unless the degrees $d_i$ are bounded away from zero and the eigenvalues used remain well below the minimum degree. Eigenvectors corresponding to higher eigenvalues can become localized and uninformative.
- Degree Sensitivity: Unnormalized algorithms are sensitive to degree heterogeneity. If degrees vary widely, or if some vertices have small degree, the relevant eigenvectors can behave pathologically.
- Empirical Performance: Both normalized and unnormalized methods achieve the same asymptotic rate of convergence for misclassification in stochastic blockmodels, but normalization consistently shrinks within-cluster spread in the spectral embedding by a constant factor, yielding lower error in finite samples and on real-data link-prediction tasks. For example, normalized clustering attained lower misclassification rates than unnormalized clustering on co-authorship and political-blog datasets, misclassifying 4% of blogs versus 37% for the unnormalized variant after preprocessing (Sarkar et al., 2013).
- When Unnormalized Clustering Fails: Pathological cases exist (e.g., cockroach graphs) where unnormalized spectral clustering produces suboptimal partitions while normalized methods succeed.
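For concreteness, the two normalized Laplacians named above can be built alongside $L$. A standard fact, checked numerically below on a degree-heterogeneous toy graph, is that $L_{\mathrm{sym}}$ and $L_{\mathrm{rw}}$ share the same spectrum, and that eigenpairs of $L_{\mathrm{rw}}$ solve the generalized problem $L u = \lambda D u$.

```python
import numpy as np

# Toy graph with strongly heterogeneous degrees (d ranges from 0.05 to 3.2).
W = np.array([
    [0.0, 3.0, 0.2, 0.0],
    [3.0, 0.0, 0.1, 0.0],
    [0.2, 0.1, 0.0, 0.05],
    [0.0, 0.0, 0.05, 0.0],
])
d = W.sum(1)
D = np.diag(d)
L = D - W

D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = D_inv_sqrt @ L @ D_inv_sqrt      # symmetric normalized Laplacian
L_rw = np.diag(1.0 / d) @ L              # random-walk normalized Laplacian

# L_sym and L_rw are similar matrices, hence share the same spectrum ...
vals_sym = np.sort(np.linalg.eigvalsh(L_sym))
vals_rw = np.sort(np.linalg.eigvals(L_rw).real)
assert np.allclose(vals_sym, vals_rw)

# ... and eigenpairs of L_rw solve the generalized problem L u = lambda D u.
vals, vecs = np.linalg.eig(L_rw)
u, lam = vecs[:, 0].real, vals[0].real
assert np.allclose(L @ u, lam * (D @ u))
```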
6. Applications, Advantages, and Pitfalls
Unnormalized spectral clustering is widely applicable due to its algorithmic simplicity and its close connection to linear algebra and graph theory. It is favored when cluster cardinality (rather than volume) is the relevant balance criterion, the graph is well-behaved (relatively uniform degrees), and the RatioCut objective is appropriate. Its advantages are direct computation on $L$, algorithmic clarity, and the identification of the zero-eigenvalue subspace with indicator vectors of connected components (0711.0189).
However, sensitive dependence on global graph structure and the lack of cluster-volume normalization can lead to poor performance on graphs with unbalanced or heterogeneous degree distributions, as well as to statistical inconsistency in large-sample limits under mild connectivity violations. Empirically, normalized variants often outperform unnormalized ones, especially in the presence of degree heterogeneity (Sarkar et al., 2013).
A plausible implication is that, for rigorous statistical consistency and robust finite-sample behavior, normalized spectral clustering should be preferred under moderate or unknown degree variation. Unnormalized clustering remains useful for pedagogical purposes, for balanced geometric data, and as the foundation of topological or parameter-free spectral methods (Rieser, 2015).
7. Summary Table: Key Properties
| Property | Unnormalized Spectral Clustering | Normalized Spectral Clustering |
|---|---|---|
| Laplacian definition | $L = D - W$ | $L_{\mathrm{sym}} = D^{-1/2} L D^{-1/2}$, $L_{\mathrm{rw}} = D^{-1} L$ |
| Balances by | Cluster size ($\lvert A_i \rvert$) | Cluster volume ($\mathrm{vol}(A_i)$) |
| Statistical consistency | Only in restricted settings (high minimum degree, low spectrum) | Robust as $n \to \infty$ |
| Sensitivity to degree heterogeneity | High | Low |
| Empirical neighbor spread | Larger within-cluster spread | Shrunk by constant factor |
| Typical use cases | Uniform geometric data, topological settings | Heterogeneous graphs, real data analysis |
The selection of unnormalized spectral clustering should be informed by graph structure, application requirements, and theoretical guarantees. In asymptotic and real-data regimes with degree variability or in the presence of sparse clusters, normalized methodologies are often statistically and empirically superior. Nonetheless, unnormalized variants offer insight into the interplay between topology, spectral theory, and combinatorial clustering (Rieser, 2015, Trillos et al., 2015, 0711.0189, Sarkar et al., 2013).