Time-Varying Persistence Diagrams
- Time-Varying Persistence Diagrams (TVPDs) are sequences of persistence diagrams that capture the evolution of topological features in dynamic datasets by integrating both temporal and spatial information.
- The Continuous Edit Distance (CED) framework provides a mathematically principled method for aligning, averaging, and clustering TVPDs, ensuring robust metric computation and interpretable geodesics.
- Dynamic programming and geodesic-based optimization techniques support scalable algorithmic analysis, improving motif discovery and classification in various real-world applications.
Time-varying persistence diagrams (TVPDs) are sequences of persistence diagrams indexed over time, representing the evolution of topological features within time-varying data. TVPDs constitute a foundational structure in topological data analysis for dynamical systems, capturing how homological features such as connected components, loops, and voids emerge, persist, and vanish as data evolve. The robust comparison, alignment, averaging, and clustering of TVPDs require metrics that are simultaneously sensitive to both topological and temporal aspects. The Continuous Edit Distance (CED) framework for TVPDs provides a mathematically principled and computationally tractable metric that generalizes string edit distances and incorporates temporal elasticity, optimal alignment, and interpretable geodesic paths (Tchitchek et al., 15 Dec 2025).
1. Mathematical Definition and Metric Properties
A TVPD is a map defined on a collection of time intervals, typically discretized at a fixed subdivision scale, that assigns to each time a persistence diagram in the Polish metric space of persistence diagrams equipped with the 2-Wasserstein distance $W_2$. For two TVPDs and a pair of subdivision intervals of equal length, one taken from each TVPD, the local substitution cost combines the $W_2$ mismatch between the corresponding diagrams with the temporal misalignment of the two intervals, with a parameter $\alpha$ trading off spatial diagram mismatch against temporal misalignment. Deletion and insertion costs are defined relative to a reference diagram (typically the empty diagram).
The global edit distance, for gap penalty $\beta$, is defined by minimizing the total cost over all order-preserving partial assignments between the subdivisions of the two TVPDs: matched subdivisions incur substitution costs, and unmatched subdivisions incur deletion or insertion costs. This infimum yields a true metric, satisfying the triangle inequality and nonnegativity for unit-cost choices (Tchitchek et al., 15 Dec 2025).
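As a rough illustration of the ingredients above, the following sketch computes the 2-Wasserstein distance between two very small persistence diagrams by brute force and plugs it into a toy substitution cost. Several things here are assumptions, not taken from the source: the Euclidean ground metric, the function names, and above all the convex-combination form of `toy_substitution_cost`, which is a hypothetical stand-in for the actual local cost.

```python
from itertools import permutations
from math import sqrt


def diagonal_projection(point):
    """Closest point to `point` on the diagonal birth == death."""
    mid = (point[0] + point[1]) / 2.0
    return (mid, mid)


def squared_distance(p, q):
    """Squared Euclidean distance (assumed ground metric)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def wasserstein2(diagram_a, diagram_b):
    """2-Wasserstein distance between two small diagrams.

    Each diagram is a list of (birth, death) pairs. Points may be matched
    to each other or to the diagonal. Brute force over all matchings, so
    only usable when len(diagram_a) + len(diagram_b) is small.
    """
    n, m = len(diagram_a), len(diagram_b)
    size = n + m  # rows: n points of A + m diagonal slots; columns likewise

    def cost(i, j):
        if i < n and j < m:  # point of A matched to point of B
            return squared_distance(diagram_a[i], diagram_b[j])
        if i < n:  # point of A sent to the diagonal
            return squared_distance(diagram_a[i], diagonal_projection(diagram_a[i]))
        if j < m:  # point of B created from the diagonal
            return squared_distance(diagram_b[j], diagonal_projection(diagram_b[j]))
        return 0.0  # diagonal slot matched to diagonal slot

    best = min(
        sum(cost(i, perm[i]) for i in range(size))
        for perm in permutations(range(size))
    )
    return sqrt(best)


def toy_substitution_cost(diagram_a, diagram_b, time_a, time_b, alpha):
    """Hypothetical local cost: NOT the formula from the source.

    Blends diagram mismatch and temporal misalignment with weight alpha.
    """
    spatial = wasserstein2(diagram_a, diagram_b)
    temporal = abs(time_a - time_b)
    return alpha * spatial + (1.0 - alpha) * temporal
```

For realistic diagram sizes, the brute-force search would be replaced by an assignment solver; it is used here only to keep the sketch dependency-free.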
2. Geodesics and Explicit Constructions
For a pair of TVPDs, there exists an explicit constant-speed geodesic, providing an interpretable optimal path between them in edit-distance geometry. The geodesic consists of three phases:
- Deletion: Subintervals of the source TVPD unmatched in the optimal assignment collapse to the reference diagram (typically the empty diagram) at constant speed.
- Substitution: Matched subdivisions move continuously (via local $W_2$-geodesics) toward their targets.
- Insertion: Unmatched subdivisions in the target TVPD are spawned from the reference diagram.
This construction enables extraction of midpoint trajectories, facilitates barycenter algorithms, and guarantees minimum-energy paths (Tchitchek et al., 15 Dec 2025).
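At the level of a single pair of diagrams, the three kinds of motion can be sketched as linear interpolation along a given matching. This is a simplified illustration, not the full TVPD geodesic: it assumes the matching is already known, that the reference is the empty diagram (so deleted points travel to the diagonal and inserted points emerge from it), and the function names are hypothetical.

```python
def diagonal_projection(point):
    """Closest point to `point` on the diagonal birth == death."""
    mid = (point[0] + point[1]) / 2.0
    return (mid, mid)


def interpolate_diagram(matching, t):
    """Diagram at time t in [0, 1] along a matching.

    `matching` is a list of (source_point, target_point) pairs, where
    either entry may be None:
      (p, None) -> p is deleted: it slides to the diagonal;
      (None, q) -> q is inserted: it is spawned from the diagonal;
      (p, q)    -> p is substituted by q: it moves on a straight line.
    """
    result = []
    for source, target in matching:
        if source is None and target is None:
            raise ValueError("a matching pair needs at least one point")
        if source is None:
            source = diagonal_projection(target)
        if target is None:
            target = diagonal_projection(source)
        result.append((
            (1.0 - t) * source[0] + t * target[0],
            (1.0 - t) * source[1] + t * target[1],
        ))
    return result


# Midpoint of a path with one substitution, one deletion, one insertion.
midpoint = interpolate_diagram(
    [((0.0, 2.0), (0.0, 4.0)), ((1.0, 3.0), None), (None, (2.0, 6.0))],
    0.5,
)
```

Evaluating at `t = 0.5` is the diagram-level analogue of the midpoint trajectories mentioned above.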
3. Algorithmic Approaches
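The pairwise CED computation is cast as a dynamic programming problem. Each table entry holds the minimal CED between a prefix of the source subdivisions and a prefix of the target subdivisions, and the Bellman recursion takes the minimum over three moves: substituting the last pair of subdivisions, deleting the last source subdivision, or inserting the last target subdivision. The boundary entries are initialized with the cumulative deletion and insertion costs. The table has one entry per pair of prefixes, so its size is governed by the product of the two subdivision counts. Precomputation of local costs is dominated by the evaluation of $W_2$ for persistence diagrams at multiple time stamps, at a cost that grows with the diagram size. The overall pairwise CED computation therefore scales with the number of subdivision pairs times the cost of one $W_2$ evaluation (Tchitchek et al., 15 Dec 2025).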
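The recursion described above has the same shape as a classical edit-distance table. The sketch below shows that shape with the local costs passed in as callables. The function name, its signature, and the toy costs in the usage example are assumptions made for illustration; in the TVPD setting the callables would evaluate the substitution, deletion, and insertion costs of subdivisions.

```python
def edit_distance(source, target, sub_cost, del_cost, ins_cost):
    """Generic edit-distance dynamic program.

    table[i][j] is the minimal cost of editing source[:i] into target[:j].
    """
    m, n = len(source), len(target)
    table = [[0.0] * (n + 1) for _ in range(m + 1)]

    # Boundary: cumulative deletion costs and cumulative insertion costs.
    for i in range(1, m + 1):
        table[i][0] = table[i - 1][0] + del_cost(source[i - 1])
    for j in range(1, n + 1):
        table[0][j] = table[0][j - 1] + ins_cost(target[j - 1])

    # Bellman recursion: best of substitution, deletion, insertion.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = min(
                table[i - 1][j - 1] + sub_cost(source[i - 1], target[j - 1]),
                table[i - 1][j] + del_cost(source[i - 1]),
                table[i][j - 1] + ins_cost(target[j - 1]),
            )
    return table[m][n]


# Toy usage on sequences of numbers with a constant gap penalty of 0.5.
gap = 0.5
distance = edit_distance(
    [1.0, 2.0, 3.0],
    [1.0, 3.0],
    sub_cost=lambda a, b: abs(a - b),
    del_cost=lambda a: gap,
    ins_cost=lambda b: gap,
)
```

In the toy example the cheapest edit keeps `1.0` and `3.0` matched and deletes `2.0`, so `distance` equals the gap penalty.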
4. Barycenters and Fréchet Means in TVPD Spaces
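Given a collection of TVPDs, a barycenter minimizes the CED-based Fréchet functional, that is, the (possibly weighted) sum of squared CED values between the candidate and each TVPD in the collection.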
Two optimization strategies, both leveraging explicit CED-geodesics, are employed:
- Stochastic Geodesic Descent: Initialize at a randomly chosen TVPD; at each iteration, move along a CED-geodesic toward a randomly selected reference TVPD, accepting only steps that reduce the Fréchet energy.
- Greedy Geodesic Descent: For each reference TVPD and prescribed step discretization along the geodesic, select the candidate with minimal Fréchet energy.
Both approaches guarantee monotonically non-increasing Fréchet energy and yield local minimizers, affording practical computation of barycenters for alignment and clustering (Tchitchek et al., 15 Dec 2025).
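Both descent schemes can be sketched on any space that exposes a distance and a geodesic. The example below uses the real line as a toy stand-in, with `abs(x - y)` as the distance and linear interpolation as the geodesic; for TVPDs these two callables would be the CED and the CED-geodesic. The function names, the fixed step of the stochastic variant, and the stopping rules are assumptions chosen for illustration.

```python
import random


def frechet_energy(candidate, references, dist):
    """Sum of squared distances from the candidate to every reference."""
    return sum(dist(candidate, ref) ** 2 for ref in references)


def stochastic_geodesic_descent(references, dist, geodesic, step=0.5,
                                n_iter=200, seed=0):
    """Move toward a random reference; keep the move only if energy drops."""
    rng = random.Random(seed)
    current = rng.choice(references)
    energy = frechet_energy(current, references, dist)
    history = [energy]
    for _ in range(n_iter):
        candidate = geodesic(current, rng.choice(references), step)
        candidate_energy = frechet_energy(candidate, references, dist)
        if candidate_energy < energy:
            current, energy = candidate, candidate_energy
        history.append(energy)
    return current, history


def greedy_geodesic_descent(references, dist, geodesic, n_steps=10,
                            max_iter=100):
    """Scan all references and all geodesic steps; take the best candidate."""
    current = references[0]
    energy = frechet_energy(current, references, dist)
    history = [energy]
    for _ in range(max_iter):
        candidates = [
            geodesic(current, ref, k / n_steps)
            for ref in references
            for k in range(1, n_steps + 1)
        ]
        best = min(candidates, key=lambda c: frechet_energy(c, references, dist))
        best_energy = frechet_energy(best, references, dist)
        if best_energy >= energy:
            break  # no candidate improves the energy: local minimizer
        current, energy = best, best_energy
        history.append(energy)
    return current, history


# Toy space: the real line, where the Frechet mean is the arithmetic mean.
points = [0.0, 1.0, 5.0]
line_dist = lambda x, y: abs(x - y)
line_geodesic = lambda x, y, t: (1.0 - t) * x + t * y
greedy_center, greedy_history = greedy_geodesic_descent(points, line_dist, line_geodesic)
random_center, random_history = stochastic_geodesic_descent(points, line_dist, line_geodesic)
```

Because a step is accepted only when it lowers the energy, both `greedy_history` and `random_history` are non-increasing by construction, which mirrors the monotonicity property stated above. Each greedy iteration evaluates one candidate per reference and step, which is why its cost grows with the product of the two.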
5. Empirical Performance and Robustness
Empirical evaluation demonstrates the CED framework’s robustness to both temporal and spatial perturbations. Perturbing sample times (“temporal jitter”) or the feature values within persistence diagrams leads to near-linear or piecewise-linear CED growth at small noise levels, quantifying stability to real-world noise. CED alignment correctly recovers temporal offsets (e.g., phase shifts in dynamical regime transitions), outperforms standard elastic dissimilarities (such as Dynamic Time Warping and TWED) for motif search, and supports accurate motif window retrievals (Tchitchek et al., 15 Dec 2025).
Clustering and classification via CED-barycenter-based k-means, both stochastic and greedy, achieve superior or competitive results relative to baseline dissimilarities across sea surface height (SSH), VESTEC combustion, and asteroid impact datasets. Notably, for the VESTEC and Asteroid datasets, only CED-based clustering perfectly recovers the true clusters, as measured by misclassification error (Tchitchek et al., 15 Dec 2025).
6. Computational Bottlenecks and Accelerations
The dominant computational cost is the dynamic program over all subdivision pairs, with each local cost requiring Wasserstein distance calculations. Two prominent accelerations alleviate this bottleneck:
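- Pruning diagram points below a persistence threshold, which reduces the diagram size and hence the cost of each Wasserstein evaluation.
- Restricting the DP table to a Sakoe–Chiba-type temporal corridor of fixed width, so that only cells near the diagonal are evaluated and the number of evaluated cells grows with the corridor width times the number of subdivisions rather than with the number of all subdivision pairs.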
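Both accelerations are simple to sketch. The code below filters low-persistence points and runs the edit-distance recursion only inside a band around the diagonal. The names `prune_diagram`, `banded_edit_distance`, and `width` are hypothetical, and the toy costs in the usage example are placeholders for the actual subdivision costs.

```python
def prune_diagram(diagram, threshold):
    """Keep only points whose persistence (death - birth) reaches the threshold."""
    return [(b, d) for (b, d) in diagram if d - b >= threshold]


def banded_edit_distance(source, target, sub_cost, del_cost, ins_cost, width):
    """Edit-distance DP restricted to cells with |i - j| <= width.

    Cells outside the corridor stay at infinity. For readability the full
    table is still allocated; only the corridor is evaluated. The result is
    infinite if width < |len(source) - len(target)|, since the final cell
    then lies outside the corridor.
    """
    inf = float("inf")
    m, n = len(source), len(target)
    table = [[inf] * (n + 1) for _ in range(m + 1)]
    table[0][0] = 0.0
    for i in range(m + 1):
        for j in range(max(0, i - width), min(n, i + width) + 1):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0 and j > 0:  # substitution
                best = min(best, table[i - 1][j - 1] + sub_cost(source[i - 1], target[j - 1]))
            if i > 0:  # deletion
                best = min(best, table[i - 1][j] + del_cost(source[i - 1]))
            if j > 0:  # insertion
                best = min(best, table[i][j - 1] + ins_cost(target[j - 1]))
            table[i][j] = best
    return table[m][n]


# Toy usage: numbers as stand-ins for subdivisions, constant gap penalty.
gap = 0.5
costs = dict(
    sub_cost=lambda a, b: abs(a - b),
    del_cost=lambda a: gap,
    ins_cost=lambda b: gap,
)
narrow = banded_edit_distance([1.0, 2.0, 3.0], [1.0, 3.0], width=1, **costs)
wide = banded_edit_distance([1.0, 2.0, 3.0], [1.0, 3.0], width=3, **costs)
pruned = prune_diagram([(0.0, 1.0), (0.0, 0.05), (1.0, 3.0)], threshold=0.1)
```

The corridor gives an exact result only when the optimal alignment stays within `width` of the diagonal; otherwise it returns an upper bound, which is the usual trade-off of banded alignment.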
Scaling experiments show that barycenter computations scale principally with the size of the TVPDs and, for greedy descent, also with the product of the number of references and the step count (Tchitchek et al., 15 Dec 2025).
7. Connections and Context
The CED framework for TVPDs generalizes classical edit distances between sequences and exponent-strings (Baek, 2024), and extends these concepts to the setting of topological data evolving in time, combining ideas of elastic alignment and Wasserstein geometry. It unifies temporal and topological dissimilarity under a single metric structure, enabling direct analysis of TVPDs in clustering, motif discovery, and averaging without recourse to feature aggregation or time-series vectorization. The design is flexible in its trade-off and gap-penalty parameters, admits explicit geodesics, and supports scalable barycenter computation, embedding TVPD analysis firmly within the modern metric learning toolkit.
The metric, explicit geodesic structure, and optimization procedures situate CED as a natural extension of edit-distances, analogous to recent developments in Continuous Fréchet Edit Distance for polygonal curves (Fox et al., 2024), but for higher-level topological descriptors over time. Empirical evidence supports its robustness, interpretability, and performance in applied data analysis contexts involving time-evolving shapes and spatial-temporal fields (Tchitchek et al., 15 Dec 2025).