
Persistence Silhouettes in TDA

Updated 19 January 2026
  • Persistence Silhouettes are functional summaries of persistence diagrams, constructed as weighted averages of piecewise-linear tent functions that capture topological feature lifetimes.
  • They offer a highly regularized summary with provable stability guarantees and Lipschitz continuity, enabling robust statistical inference and hypothesis testing.
  • They are efficiently computed and seamlessly integrated into machine learning pipelines, supporting applications in classification, graph analysis, and functional data analysis.

A persistence silhouette is a functional summary of a persistence diagram, a central construct in topological data analysis (TDA) that encodes the birth and death of topological features as a multiset of points in the plane. The silhouette transforms this diagram into a single real-valued, piecewise-linear function: a weighted average of the tent functions associated with the persistence pairs, with the weighting controlled by an explicit parameter. This construction offers a highly regularized summary of topological information, with provable statistical properties and stability guarantees, and serves as a foundational tool for statistical inference and machine learning pipelines in TDA (Chazal et al., 2013, Berry et al., 2018, Segovia-Dominguez et al., 2024).

1. Formal Definition and Construction

Given a persistence diagram $D = \{(b_i, d_i)\}_{i=1}^N$ comprising $N$ points $(b_i, d_i)$ with birth and death times, the tent (triangle) function associated to $(b_i, d_i)$ is

$$A_i(t) = \begin{cases} t - b_i, & t \in \left[b_i, \frac{b_i + d_i}{2}\right], \\ d_i - t, & t \in \left[\frac{b_i + d_i}{2}, d_i\right], \\ 0, & \text{otherwise}. \end{cases}$$

Each $A_i$ is continuous, piecewise-linear, and $1$-Lipschitz with respect to $t$.

A nonnegative weight function $w: \mathbb{R}^2 \to [0, \infty)$, commonly of the form $w_p(b, d) = (d - b)^p$ for $p > 0$, is assigned to each persistence pair. The general weighted silhouette $\mathrm{Sil}_w: D \to C(\mathbb{R})$ is defined as

$$\mathrm{Sil}_w(t) = \frac{\sum_{i=1}^N w(b_i, d_i) A_i(t)}{\sum_{i=1}^N w(b_i, d_i)}.$$

The most frequent choice is the $p$-silhouette:

$$\mathrm{Sil}_p(t) = \frac{\sum_{i=1}^N (d_i - b_i)^p A_i(t)}{\sum_{i=1}^N (d_i - b_i)^p}.$$

As a convex combination of $1$-Lipschitz functions, $\mathrm{Sil}_p$ is always $1$-Lipschitz in $t$. The exponent $p$ interpolates continuously between weighting all features nearly equally ($p$ small) and focusing on the most persistent features ($p$ large) (Chazal et al., 2013, Berry et al., 2018, Segovia-Dominguez et al., 2024).
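As a small worked example of the formula, take a made-up diagram $D = \{(0, 2), (1, 5)\}$ (chosen only for illustration, not taken from the cited papers), with lifetimes $2$ and $4$, and evaluate the $p$-silhouette at $t = 1$ and $t = 3$ for $p = 1$ and $p = 2$:

```latex
\begin{align*}
A_1(1) &= \min(1 - 0,\ 2 - 1) = 1, & A_2(1) &= 0, \\
A_1(3) &= 0, & A_2(3) &= \min(3 - 1,\ 5 - 3) = 2, \\
\mathrm{Sil}_1(1) &= \frac{2 \cdot 1 + 4 \cdot 0}{2 + 4} = \tfrac{1}{3}, &
\mathrm{Sil}_1(3) &= \frac{2 \cdot 0 + 4 \cdot 2}{2 + 4} = \tfrac{4}{3}, \\
\mathrm{Sil}_2(1) &= \frac{4 \cdot 1 + 16 \cdot 0}{4 + 16} = \tfrac{1}{5}, &
\mathrm{Sil}_2(3) &= \frac{4 \cdot 0 + 16 \cdot 2}{4 + 16} = \tfrac{8}{5}.
\end{align*}
```

Raising $p$ from $1$ to $2$ moves the value at $t = 3$ (inside the longer bar) up from $4/3$ to $8/5$ and the value at $t = 1$ (inside the shorter bar only) down from $1/3$ to $1/5$, which is the shift toward persistent features described above.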

2. The Silhouette in the Landscape-Unification Framework

The silhouette belongs to the class of functional summaries $\mathcal{F}: \mathcal{D} \to \mathcal{B}_F$, which map the space of diagrams $\mathcal{D}$ into a Banach space $\mathcal{B}_F$ of real functions (Berry et al., 2018). Unlike the persistence landscape, which forms a sequence $\lambda_D(k, t)$ of the $k$th-largest tent values at $t$, the silhouette is a single function of $t$, which makes it a more compact summary than the full landscape. Because it is an ordinary real-valued function, standard functional data analysis and machine learning techniques apply directly, including averaging, hypothesis testing, and classification in the space of functions.
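To make the contrast with the landscape concrete, the following NumPy sketch builds both summaries from the same tent values. The diagram, the grid, and the helper name `tent_matrix` are assumptions made for illustration; the silhouette uses lifetime weights ($p = 1$).

```python
import numpy as np

def tent_matrix(diagram, grid):
    """Return the (m, N) array whose row i holds the tent A_i on the grid."""
    diagram = np.asarray(diagram, dtype=float)
    grid = np.asarray(grid, dtype=float)
    births = diagram[:, 0][:, None]
    deaths = diagram[:, 1][:, None]
    return np.maximum(0.0, np.minimum(grid[None, :] - births, deaths - grid[None, :]))

# Hypothetical diagram with three (birth, death) pairs.
diagram = np.array([(0.0, 2.0), (1.0, 5.0), (1.5, 3.0)])
grid = np.linspace(0.0, 5.0, 51)
tents = tent_matrix(diagram, grid)

# Landscape: sort the tent values at each t in decreasing order, so that
# row k-1 holds lambda_D(k, .), the k-th largest tent value.
landscape = -np.sort(-tents, axis=0)

# Silhouette with p = 1: a single lifetime-weighted average of the same tents.
lifetimes = diagram[:, 1] - diagram[:, 0]
silhouette = lifetimes @ tents / lifetimes.sum()

print(landscape.shape, silhouette.shape)
```

The landscape keeps one row per level $k$, whereas the silhouette collapses the same tents into one vector. As a weighted average, it lies pointwise between the lowest and the highest landscape level.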

The parameter pp gives fine control over the information retained:

  • As $p \to 0$, $\mathrm{Sil}_p$ approaches the unweighted mean of the $A_i$.
  • As $p \to \infty$, $\mathrm{Sil}_p$ converges to the tent associated to the persistence pair with the longest lifetime, provided that pair is unique.
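The two limits can be checked numerically. The sketch below uses a made-up diagram with lifetimes $1$, $2$, and $4$ and compares the $p$-silhouette for a very small and a very large exponent against the unweighted mean tent and the longest-lived tent. Dividing lifetimes by their maximum before raising to the power $p$ is a numerical-safety choice of this sketch; it leaves the normalized silhouette unchanged.

```python
import numpy as np

def tent_matrix(diagram, grid):
    """Row i holds the tent A_i evaluated on the grid."""
    births = diagram[:, 0][:, None]
    deaths = diagram[:, 1][:, None]
    return np.maximum(0.0, np.minimum(grid[None, :] - births, deaths - grid[None, :]))

def p_silhouette(diagram, grid, p):
    """p-silhouette on a grid; lifetimes are rescaled to avoid overflow."""
    lifetimes = diagram[:, 1] - diagram[:, 0]
    weights = (lifetimes / lifetimes.max()) ** p
    return weights @ tent_matrix(diagram, grid) / weights.sum()

# Hypothetical diagram; the longest bar (2, 6) is unique.
diagram = np.array([(0.0, 1.0), (0.5, 2.5), (2.0, 6.0)])
grid = np.linspace(0.0, 6.0, 121)

tents = tent_matrix(diagram, grid)
lifetimes = diagram[:, 1] - diagram[:, 0]
unweighted_mean = tents.mean(axis=0)
longest_tent = tents[np.argmax(lifetimes)]

# Sup-norm gaps to the two limiting functions.
gap_small_p = np.max(np.abs(p_silhouette(diagram, grid, 1e-6) - unweighted_mean))
gap_large_p = np.max(np.abs(p_silhouette(diagram, grid, 60.0) - longest_tent))
print(gap_small_p, gap_large_p)
```

Both gaps should be tiny for this diagram. If two bars tied for the longest lifetime, the large-$p$ limit would instead be the average of their tents.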

This construction provides a consistent interface for statistical learning, clustering, and permutation testing on diagrams via their functional image (Berry et al., 2018).

3. Statistical Properties and Stochastic Convergence

Under mild conditions, the silhouette satisfies several stochastic-process results:

  • Uniform boundedness: $\mathrm{Sil}_p(t)$ is uniformly bounded over $t$, since each tent is bounded by half the corresponding lifetime (Berry et al., 2018).
  • Lipschitz equicontinuity: Each $A_i$ is $1$-Lipschitz, and a weighted average with nonnegative weights summing to one preserves this property, so the family $\{\mathrm{Sil}_p(D; t)\}_D$ is equicontinuous. Together with uniform boundedness, this yields a strong uniform law of large numbers and hence consistency:

$$\sup_{t \in [0, T]} |\overline{\mathrm{Sil}}_n(t) - \Psi_p(t)| \to 0 \quad \text{a.s.},$$

where $\overline{\mathrm{Sil}}_n(t)$ is the sample mean silhouette and $\Psi_p(t) = \mathbb{E}[\mathrm{Sil}_p(t)]$ is the population mean.

  • Central limit theorem (CLT): The empirical process

$$G_n(t) = \sqrt{n} \left( \overline{\mathrm{Sil}}_n(t) - \Psi_p(t) \right)$$

converges weakly to a mean-zero Gaussian process indexed by $t$, with explicit rates of convergence (Chazal et al., 2013).

  • Bootstrap consistency: Bootstrap samples of $\{\mathrm{Sil}_{p, j}(t)\}$ yield valid $L_\infty$-confidence bands for $\Psi_p(t)$ with asymptotic coverage $1 - \alpha$ up to order $(\log n)^{1/2} n^{-1/8}$ (Chazal et al., 2013, Berry et al., 2018). Both uniform and studentized (variable-width) confidence bands may be constructed.

These results allow direct application of permutation tests and prediction regions in the function space, using classical metrics such as $L_2$ and $L_\infty$ (Berry et al., 2018).
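The sample mean silhouette and a uniform confidence band can be sketched as follows. This example uses a Gaussian multiplier bootstrap, which is one common way to build such bands; whether it matches the cited procedures in every detail is an assumption. The random diagram generator, the grid, and all function names are hypothetical and serve only to produce test data.

```python
import numpy as np

def p_silhouette(diagram, grid, p=1.0):
    """p-silhouette of one diagram (rows are (birth, death)) on a grid."""
    births, deaths = diagram[:, 0], diagram[:, 1]
    weights = (deaths - births) ** p
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births[:, None],
                                       deaths[:, None] - grid[None, :]))
    return weights @ tents / weights.sum()

def random_diagram(rng, n_points=20):
    """Synthetic diagram: uniform births, exponential lifetimes (made up)."""
    births = rng.uniform(0.0, 1.0, size=n_points)
    lifetimes = rng.exponential(0.3, size=n_points)
    return np.column_stack([births, births + lifetimes])

def multiplier_bootstrap_band(sils, alpha=0.05, n_boot=1000, rng=None):
    """Uniform (1 - alpha) band around the mean of the rows of `sils`."""
    rng = np.random.default_rng() if rng is None else rng
    n = sils.shape[0]
    mean_sil = sils.mean(axis=0)
    centered = sils - mean_sil
    # Each bootstrap draw reweights the centered curves by N(0, 1) multipliers.
    xi = rng.standard_normal((n_boot, n))
    sup_stats = np.abs(xi @ centered).max(axis=1) / np.sqrt(n)
    half_width = np.quantile(sup_stats, 1.0 - alpha) / np.sqrt(n)
    return mean_sil, mean_sil - half_width, mean_sil + half_width

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 3.0, 61)
sils = np.stack([p_silhouette(random_diagram(rng), grid) for _ in range(50)])
mean_sil, lower, upper = multiplier_bootstrap_band(sils, rng=rng)
print(float(upper[0] - lower[0]))  # constant band width across the grid
```

The band has constant width because the quantile is taken of the supremum over $t$. A studentized band would divide the centered curves by a pointwise standard deviation estimate before taking the supremum.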

4. Stability and Robustness

The silhouette inherits strong stability properties from persistence landscapes:

  • Lipschitz stability: For diagrams $D, D'$, the uniform distance between their silhouettes is bounded by the bottleneck distance:

$$\|\mathrm{Sil}_p(D) - \mathrm{Sil}_p(D')\|_\infty \le d_B(D, D'),$$

where $d_B$ denotes the standard bottleneck distance (Chazal et al., 2013, Segovia-Dominguez et al., 2024). The underlying ingredient is that each tent $A_i$ moves by at most $\varepsilon$ in the uniform norm when $b_i$ and $d_i$ each move by at most $\varepsilon$. This property implies robustness to noise and small perturbations in the data.

  • All stability results proved for landscapes (in particular stability w.r.t. $p$-Wasserstein distance) carry over directly to silhouettes (Chazal et al., 2013, Segovia-Dominguez et al., 2024).
  • EMP framework extension: In the Effective Multidimensional Persistence (EMP) extension, one computes families of silhouettes across slices of multidimensional parameter grids. The EMP silhouette inherits the single-parameter silhouette’s stability: the sum of uniform deviations across slices is bounded above by the corresponding sum of Wasserstein distances between diagrams (Segovia-Dominguez et al., 2024).

5. Algorithmic Implementation

The silhouette can be computed as follows:

  1. Input: Persistence diagram $D$ with $m = |D|$ points, weight function $\omega$, evaluation grid $\{t_1, \ldots, t_N\}$ (in this section $N$ counts grid points, not diagram points as in Section 1).
  2. Feature computation: For each $j = 1, \ldots, m$, compute the lifetime $\ell_j = d_j - b_j$ and set $w_j = \omega(\ell_j)$; the $p$-silhouette corresponds to $\omega(\ell) = \ell^p$.
  3. Tent evaluation: For each $t_k$, compute $A_j(t_k) = \max\{0, \min(t_k - b_j, d_j - t_k)\}$.
  4. Weighted sum: For each $t_k$, form the numerator $n_k = \sum_j w_j A_j(t_k)$ and the denominator $W = \sum_j w_j$, then output $s_k = n_k / W$.
  5. Complexity: The algorithm requires $O(mN)$ flops for $m$ features and $N$ grid points (Berry et al., 2018).
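The steps above can be sketched in vectorized NumPy as follows. The function name `silhouette_on_grid` and the default weight (the lifetime itself, i.e. $p = 1$) are choices of this sketch, not an API from the cited papers.

```python
import numpy as np

def silhouette_on_grid(diagram, grid, omega=lambda lifetime: lifetime):
    """Evaluate a weighted silhouette on a grid (steps 1-4 above)."""
    diagram = np.asarray(diagram, dtype=float)   # shape (m, 2): rows (b_j, d_j)
    grid = np.asarray(grid, dtype=float)         # shape (N,)
    births, deaths = diagram[:, 0], diagram[:, 1]

    # Step 2: lifetimes and weights w_j = omega(l_j).
    weights = omega(deaths - births)

    # Step 3: tent values A_j(t_k) for every pair (j, k); shape (m, N).
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births[:, None],
                                       deaths[:, None] - grid[None, :]))

    # Step 4: weighted sum over features, normalized by the total weight.
    total = weights.sum()
    if total <= 0:
        raise ValueError("total weight must be positive")
    return weights @ tents / total

# Hypothetical two-point diagram, evaluated at t = 1 and t = 3.
diagram = [(0.0, 2.0), (1.0, 5.0)]
grid = [1.0, 3.0]
values_p1 = silhouette_on_grid(diagram, grid)
values_p2 = silhouette_on_grid(diagram, grid, omega=lambda lifetime: lifetime ** 2)
print(values_p1, values_p2)
```

The $(m, N)$ tent array makes the $O(mN)$ cost explicit. For very large diagrams, the same sum can be accumulated feature by feature to keep memory at $O(N)$.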

In the EMP framework, this computation is repeated for each slice, and the resulting silhouette vectors are assembled into a matrix or higher-dimensional array (Segovia-Dominguez et al., 2024).

6. Applied Usage and Empirical Results

  • Functional inference and classification: Silhouettes have been used as features in $k$-nearest neighbor classification, for example in the analysis of simulated Gleason histology, yielding a test error of 11.75% on a four-class task with 400 held-out regions of interest (Berry et al., 2018).
  • Two-sample testing: The silhouette fits directly into permutation-test frameworks for comparing two populations via $L_2$ or $L_\infty$ distances between sample mean silhouettes (Berry et al., 2018).
  • Machine learning on graphs: EMP silhouettes have been evaluated as input features for standard classifiers (Random Forest, SVM, CNN) on benchmark graph classification datasets (e.g., BZR_MD, COX2_MD, DHFR_MD, MUTAG, REDDIT-B), achieving competitive or state-of-the-art accuracy (Segovia-Dominguez et al., 2024). For instance, combined $H_0$- and $H_1$-EMP silhouettes gave 88.1% accuracy on MUTAG and 88.6% on REDDIT-B.
  • Statistical rigour: All such applications benefit from the silhouette’s stability and the availability of functional CLTs, uniform confidence bands, and valid asymptotic inference (Chazal et al., 2013, Berry et al., 2018).
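The two-sample permutation test mentioned above can be sketched as follows, using the $L_2$ distance between sample mean silhouettes as the test statistic. The synthetic diagram generator, the sample sizes, and the function names are assumptions of this sketch, and the $L_2$ norm is approximated by a Riemann sum on a uniform grid.

```python
import numpy as np

def p_silhouette(diagram, grid, p=1.0):
    """p-silhouette of one diagram (rows are (birth, death)) on a grid."""
    births, deaths = diagram[:, 0], diagram[:, 1]
    weights = (deaths - births) ** p
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births[:, None],
                                       deaths[:, None] - grid[None, :]))
    return weights @ tents / weights.sum()

def mean_gap_l2(sils_a, sils_b, dt):
    """Approximate L2 distance between the two sample mean silhouettes."""
    diff = sils_a.mean(axis=0) - sils_b.mean(axis=0)
    return np.sqrt(np.sum(diff ** 2) * dt)

def permutation_test(sils_a, sils_b, dt, n_perm=999, rng=None):
    """Permutation p-value for equality of the two mean silhouettes."""
    rng = np.random.default_rng() if rng is None else rng
    pooled = np.vstack([sils_a, sils_b])
    n_a = sils_a.shape[0]
    observed = mean_gap_l2(sils_a, sils_b, dt)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled.shape[0])  # reshuffle group labels
        stat = mean_gap_l2(pooled[perm[:n_a]], pooled[perm[n_a:]], dt)
        exceed += int(stat >= observed)
    return (1 + exceed) / (1 + n_perm)

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 4.0, 81)
dt = grid[1] - grid[0]

def random_diagram(scale, n_points=20):
    """Synthetic diagram with exponential lifetimes of the given scale."""
    births = rng.uniform(0.0, 1.0, size=n_points)
    return np.column_stack([births, births + rng.exponential(scale, size=n_points)])

sils_a = np.stack([p_silhouette(random_diagram(0.3), grid) for _ in range(30)])
sils_b = np.stack([p_silhouette(random_diagram(1.0), grid) for _ in range(30)])
p_value = permutation_test(sils_a, sils_b, dt, rng=rng)
print(p_value)
```

Replacing `mean_gap_l2` with the maximum absolute difference gives the $L_\infty$ variant. The `1 +` terms in the p-value count the observed labelling as one of the permutations, so the p-value is never exactly zero.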

7. Limitations and Interpretive Considerations

  • Information compression: By averaging over all tent functions, the silhouette summarizes a persistence diagram as a single function, potentially losing multimodal information captured in higher levels ($k \geq 2$) of the persistence landscape. Thus, secondary and tertiary modes of feature persistence may be obscured (Chazal et al., 2013).
  • Weight sensitivity: Selection of the weight parameter $p$ (or general weight function $w$) directly impacts the prominence given to features of varying persistence. Empirical tuning or application-specific guidance may be required (Chazal et al., 2013, Berry et al., 2018).
  • Implementation: For diagrams with only short-lived features (nearly zero lifetime), normalization may be unstable; thresholds on lifetimes or addition of a small $\epsilon$ may be necessary (Berry et al., 2018).
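A minimal sketch of the two safeguards named in the last item, a lifetime threshold and a small $\epsilon$ in the denominator, might look like this. The parameter names `min_lifetime` and `eps` and their default values are assumptions, not values taken from the cited work.

```python
import numpy as np

def safe_silhouette(diagram, grid, p=1.0, min_lifetime=0.0, eps=1e-12):
    """p-silhouette that tolerates empty or zero-lifetime diagrams.

    Pairs with lifetime <= min_lifetime are dropped, and eps is added to
    the normalizing sum so that the division is always defined.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    births, deaths = diagram[:, 0], diagram[:, 1]
    lifetimes = deaths - births

    keep = lifetimes > min_lifetime              # threshold on lifetimes
    births, deaths, lifetimes = births[keep], deaths[keep], lifetimes[keep]

    weights = lifetimes ** p
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births[:, None],
                                       deaths[:, None] - grid[None, :]))
    # With no surviving pairs the numerator is a zero vector, so the
    # result is the zero function instead of a division by zero.
    return weights @ tents / (weights.sum() + eps)

grid = np.array([1.0, 3.0])
regular = safe_silhouette([(0.0, 2.0), (1.0, 5.0)], grid)
degenerate = safe_silhouette([(1.0, 1.0)], grid)   # single zero-lifetime pair
empty = safe_silhouette([], grid)
print(regular, degenerate, empty)
```

Adding `eps` shrinks every value very slightly toward zero, so it should stay far below the smallest total weight that matters in the application.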

In summary, the persistence silhouette provides a $1$-Lipschitz, single-function summary of a persistence diagram, interpolating between uniform averaging of topological features and maximal emphasis on the longest bars. It is supported by stability results and stochastic-process convergence theory, and it applies directly to hypothesis testing and machine learning tasks. The silhouette fits into frameworks for both single- and multi-parameter persistent homology, supporting a broad range of statistical and computational pipelines (Chazal et al., 2013, Berry et al., 2018, Segovia-Dominguez et al., 2024).
