Probabilistic Inclusion Depth (PID)
- PID is a data depth measure that quantifies the centrality of fuzzy and binary contours using a probabilistic inclusion operator.
- It computes inclusion scores as expectations under normalized masks, ensuring binary consistency, coordinate-agnosticism, and robustness to perturbations.
- The PID-mean approximation enables linear-time, GPU-accelerated computation, making it scalable for high-dimensional ensemble visualizations.
Probabilistic Inclusion Depth (PID) is a data depth measure designed for centrality ordering and ensemble visualization of fuzzy contours—that is, soft masks output by modern segmentation models as well as conventional binary contours—across arbitrary spatial domains. PID generalizes previous contour depth concepts by employing a probabilistic inclusion operator that supports both fuzzy and crisp representations, accommodates non-uniform grids, and enables scalable computation for large and high-dimensional ensembles (Wu et al., 17 Dec 2025).
1. Probabilistic Inclusion Operator
Let $\Omega$ denote a spatial domain (e.g., a 2D image or 3D volume) equipped with a measure $\mu$ (counting measure for discrete grids, Lebesgue measure for continuous domains). A fuzzy contour, or soft mask, is described by a function

$$f : \Omega \to [0, 1]$$

such that $f(x)$ indicates the probability or degree of membership of $x$ in the contour. The total mass is $|f| = \int_\Omega f \, d\mu$. Provided $|f| > 0$, $f$ induces a measure $\mu_f(A) = \int_A f \, d\mu$ for measurable $A \subseteq \Omega$.
Given two fuzzy masks $f$ and $g$, the probabilistic inclusion operator is

$$\operatorname{in}(f, g) = \frac{\int_\Omega f\, g \, d\mu}{\int_\Omega f \, d\mu},$$

the expected value of $g$ under the probability measure obtained by normalizing $f$. For indicator functions $f = \mathbb{1}_A$, $g = \mathbb{1}_B$, this reduces to $\mu(A \cap B)/\mu(A)$, recovering the continuous-subset operator used in exact Inclusion Depth (eID).
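As a concrete sketch, the operator can be written in a few lines of NumPy; the function name `inclusion` and the optional weight argument `w` (for non-uniform grids) are illustrative choices, not identifiers from the paper:

```python
import numpy as np

def inclusion(f, g, w=None):
    """Probabilistic inclusion in(f, g): expected value of g under the
    probability measure obtained by normalizing f.

    f, g : arrays of membership values in [0, 1], any shape.
    w    : optional per-cell measure weights (non-uniform grids);
           defaults to the counting measure.
    """
    if w is None:
        w = np.ones_like(f, dtype=float)
    mass = np.sum(w * f)
    if mass == 0:
        raise ValueError("f has zero mass; in(f, g) is undefined")
    return np.sum(w * f * g) / mass

# Binary consistency: for indicator masks, in(1_A, 1_B) = |A ∩ B| / |A|.
A = np.array([1, 1, 1, 1, 0, 0], dtype=float)  # |A| = 4
B = np.array([0, 0, 1, 1, 1, 0], dtype=float)  # |B| = 3, |A ∩ B| = 2
print(inclusion(A, B))  # 2/4 = 0.5
print(inclusion(B, A))  # 2/3 — the operator is asymmetric
```

Note that the two directions differ, illustrating the asymmetry of $\operatorname{in}$.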
2. Mathematical Formulation of Probabilistic Inclusion Depth (PID)
Given an ensemble of fuzzy contours $F = \{f_1, \dots, f_N\}$, PID assigns to each $f_i$ a scalar depth quantifying its centrality. Define

$$D_{\mathrm{in}}(f_i) = \frac{1}{N-1} \sum_{j \neq i} \operatorname{in}(f_i, f_j), \qquad D_{\mathrm{out}}(f_i) = \frac{1}{N-1} \sum_{j \neq i} \operatorname{in}(f_j, f_i).$$

Here, $D_{\mathrm{in}}$ measures the average degree to which $f_i$ is contained in the rest of the ensemble, and $D_{\mathrm{out}}$ measures how well $f_i$ contains the others. PID is then defined as

$$\mathrm{PID}(f_i) = \min\bigl(D_{\mathrm{in}}(f_i),\; D_{\mathrm{out}}(f_i)\bigr).$$

This yields a robust, center-outward ordering. In expanded form:

$$\mathrm{PID}(f_i) = \min\!\left( \frac{1}{N-1} \sum_{j \neq i} \frac{\int_\Omega f_i f_j \, d\mu}{\int_\Omega f_i \, d\mu},\;\; \frac{1}{N-1} \sum_{j \neq i} \frac{\int_\Omega f_j f_i \, d\mu}{\int_\Omega f_j \, d\mu} \right).$$
3. Theoretical Properties
PID and the operator $\operatorname{in}$ exhibit several important properties:
- Binary consistency: For binary masks taking values in $\{0, 1\}$, PID coincides exactly with eID.
- Monotonicity in the second argument: If $g_1 \le g_2$ pointwise, then $\operatorname{in}(f, g_1) \le \operatorname{in}(f, g_2)$.
- Linearity in the second argument: $\operatorname{in}(f, \alpha g_1 + \beta g_2) = \alpha\,\operatorname{in}(f, g_1) + \beta\,\operatorname{in}(f, g_2)$.
- Scale-invariance in the first argument: Multiplying $f$ by a positive scalar does not affect $\operatorname{in}(f, g)$.
- Asymmetry: $\operatorname{in}(f, g) \neq \operatorname{in}(g, f)$ in general.
- Lipschitz continuity: $\operatorname{in}$ and PID are Lipschitz continuous with respect to perturbations of the input masks, providing robustness to noise.
- Coordinate-agnosticism: PID depends solely on the mask values and the measure $\mu$, and is invariant under measure-preserving transformations of $\Omega$, supporting uniform and non-uniform grids as well as manifolds.
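The algebraic properties above follow directly from the definition of $\operatorname{in}$ as a normalized integral, and can be checked numerically. A quick illustrative sketch under the counting measure (random masks, no paper-specific identifiers):

```python
import numpy as np

def inclusion(f, g):
    # in(f, g) under the counting measure
    return np.sum(f * g) / np.sum(f)

rng = np.random.default_rng(0)
f = rng.random(1000)                      # membership values in [0, 1)
g1, g2 = rng.random(1000), rng.random(1000)

# Scale-invariance in the first argument: rescaling f changes nothing.
assert np.isclose(inclusion(3.7 * f, g1), inclusion(f, g1))

# Linearity in the second argument.
a, b = 0.3, 0.7
lhs = inclusion(f, a * g1 + b * g2)
rhs = a * inclusion(f, g1) + b * inclusion(f, g2)
assert np.isclose(lhs, rhs)

# Monotonicity in the second argument: g1 <= min(g1 + 0.1, 1) pointwise.
assert inclusion(f, g1) <= inclusion(f, np.minimum(g1 + 0.1, 1.0))
```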
4. Efficient Computation and PID-mean Approximation
Exact PID (Quadratic Complexity)
For $N$ ensemble members of spatial size $M$, full PID evaluation is $O(N^2 M)$:
- For each pair $(i, j)$, compute the overlap $\int_\Omega f_i f_j \, d\mu$, along with the denominators $\int_\Omega f_i \, d\mu$ and $\int_\Omega f_j \, d\mu$ for $\operatorname{in}(f_i, f_j)$ and $\operatorname{in}(f_j, f_i)$.
- $\mathrm{PID}(f_i)$ is obtained via $\min\bigl(D_{\mathrm{in}}(f_i), D_{\mathrm{out}}(f_i)\bigr)$ as described above.
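A direct quadratic implementation under the counting measure might look as follows; the function name `pid_exact` and the `(N, M)` flattened-mask layout are illustrative choices, not the paper's code:

```python
import numpy as np

def pid_exact(masks):
    """Exact PID, O(N^2 * M).

    masks : (N, M) array of flattened fuzzy masks with values in [0, 1].
    """
    N = masks.shape[0]
    masses = masks.sum(axis=1)        # |f_j| for every member
    overlaps = masks @ masks.T        # all pairwise overlap integrals
    depths = np.empty(N)
    for i in range(N):
        others = [j for j in range(N) if j != i]
        d_in = np.mean(overlaps[i, others] / masses[i])       # in(f_i, f_j)
        d_out = np.mean(overlaps[others, i] / masses[others])  # in(f_j, f_i)
        depths[i] = min(d_in, d_out)
    return depths

# Three nested binary masks: the middle one should be deepest.
masks = np.array([
    [1, 1, 0, 0, 0, 0],   # innermost
    [1, 1, 1, 1, 0, 0],   # middle
    [1, 1, 1, 1, 1, 1],   # outermost
], dtype=float)
print(pid_exact(masks))   # the middle mask attains the largest depth
```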
PID-mean (Linear Complexity)
The ensemble mean mask $\bar{f} = \frac{1}{N} \sum_{j=1}^{N} f_j$ enables a linear-time surrogate:

$$\mathrm{PID}_{\text{mean}}(f_i) = \min\!\bigl( \operatorname{in}(f_i, \bar{f}_{-i}),\; \operatorname{in}(\bar{f}_{-i}, f_i) \bigr), \qquad \bar{f}_{-i} = \frac{1}{N-1} \sum_{j \neq i} f_j.$$

By linearity of $\operatorname{in}$ in its second argument, the $D_{\mathrm{in}}$ term is computed exactly; only $D_{\mathrm{out}}$ is approximated. This involves $O(NM)$ operations since $\bar{f}$ is precomputed. The approximation error is bounded by the coefficient of variation of the mask masses $|f_j|$; if mask masses are homogeneous, PID-mean closely approximates the exact PID.
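A linear-time sketch of this surrogate (illustrative names, counting measure assumed); note that the $D_{\mathrm{in}}$ term uses the leave-one-out mean exactly, while the $D_{\mathrm{out}}$ term replaces the per-member denominators by the mean mass:

```python
import numpy as np

def pid_mean(masks):
    """Linear-time PID-mean surrogate, O(N * M)."""
    N = masks.shape[0]
    total = masks.sum(axis=0)        # N times the ensemble mean mask
    masses = masks.sum(axis=1)       # |f_i|
    depths = np.empty(N)
    for i in range(N):
        rest = (total - masks[i]) / (N - 1)   # leave-one-out mean mask
        overlap = np.dot(masks[i], rest)
        d_in = overlap / masses[i]            # exact, by linearity
        d_out = overlap / rest.sum()          # approximates D_out
        depths[i] = min(d_in, d_out)
    return depths

# Four equal-mass binary masks sliding across the domain; PID-mean is
# exact when mask masses are homogeneous, and the inner two members
# come out deeper than the outer two.
masks = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0],
], dtype=float)
print(pid_mean(masks))
```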
5. GPU Implementation and Computational Scalability
PID-mean's linear structure allows efficient GPU parallelization:
- For each member $f_i$, a CUDA thread block computes the numerators and denominators over $\Omega$, exploiting memory coalescing for maximal bandwidth.
- Within each block, voxels are processed in parallel, accumulating partial sums via a shared-memory tree reduction; per-member totals are transferred to the host for the final computation of $\mathrm{PID}_{\text{mean}}(f_i)$.
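The same per-member structure can be sketched with one vectorized reduction per member; each row reduction below corresponds to the thread-block tree reduction described above. This is a CPU sketch of the data layout, not the paper's CUDA code (swapping `numpy` for the NumPy-compatible `cupy` would run an analogous computation on the GPU):

```python
import numpy as np

def pid_mean_vectorized(masks):
    """Loop-free PID-mean; one reduction per ensemble member."""
    N = masks.shape[0]
    masses = masks.sum(axis=1)                      # per-member |f_i|
    rest = (masks.sum(axis=0) - masks) / (N - 1)    # leave-one-out means
    overlaps = np.einsum('nm,nm->n', masks, rest)   # one reduction per member
    d_in = overlaps / masses
    d_out = overlaps / rest.sum(axis=1)
    return np.minimum(d_in, d_out)

masks = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0],
], dtype=float)
print(pid_mean_vectorized(masks))
```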
Empirical performance demonstrates:
| Dataset | eID (CPU) | PID-mean (CPU) | PID-mean (GPU) |
|---|---|---|---|
| 3D synthetic, k–20k members, vox | 50–250 s | 4–11 s | 1.2–2.0 s (up to 125× speedup over eID) |
| 3D, , – vox | 22–216 s | 14–29 s | 2.3–5.5 s (up to 39× speedup over eID) |
For large or high resolutions, eID becomes intractable in memory, while PID-mean (especially on GPU) remains scalable (Wu et al., 17 Dec 2025).
6. Fuzzy Isovalue Modeling and Sensitivity Encoding
PID supports a probabilistic treatment of isovalue selection in scalar-field ensembles by integrating over an isovalue distribution. For each scalar field $s_j$, instead of extracting a sharp isocontour at a fixed isovalue $c$, model the isovalue as a random variable $C$ with density $p(c)$ and set

$$f_j(x) = \Pr[\,s_j(x) \ge C\,] = \int \mathbb{1}[\,s_j(x) \ge c\,]\; p(c)\, dc.$$

This results in a fuzzy, "soft" isocontour capturing how small changes in the isovalue affect membership, encoding isolevel sensitivity. Such modeling is comparable to interval volumes and probabilistic marching cubes, but specifically adapts to depth-based ensemble visualization via PID. These fuzzy masks directly serve as PID/PID-mean inputs.
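As an illustrative sketch, for a uniform isovalue density on $[c_0 - \delta, c_0 + \delta]$ the soft mask has a simple closed form (the uniform density and the name `soft_isocontour` are assumptions for illustration; the construction works for any density $p(c)$):

```python
import numpy as np

def soft_isocontour(s, c0, delta):
    """Soft superlevel-set mask f(x) = P(s(x) >= C) for an isovalue C
    drawn uniformly from [c0 - delta, c0 + delta]; delta -> 0 recovers
    the sharp binary isocontour at c0."""
    return np.clip((s - (c0 - delta)) / (2 * delta), 0.0, 1.0)

# Membership is graded near the nominal isovalue c0 = 0.5.
s = np.array([0.0, 0.4, 0.5, 0.6, 1.0])
print(soft_isocontour(s, c0=0.5, delta=0.2))  # [0. 0.25 0.5 0.75 1.]
```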
7. Experimental Results and Practical Applications
Rank Stability and Consistency
PID yields significantly more stable ensemble rankings than previous methods under isovalue shifts and member removal:
- For weather-forecast ensembles (N=20), PID achieves high rank–rank Pearson correlation between rankings computed at shifted isolevels, outperforming eID.
- On synthetic contour sets, PID-mean agrees strongly with eID and with Prob-IoU; CBD and ISM show weaker consistency.
- Upon removing extreme outliers from 3D ellipsoid data, PID maintains high rank correlation, exceeding Prob-IoU.
Application Domains
- Medical segmentation: PID-mean ranks outputs from 31 SegResNet models (224×224×144) in the MSD Brain-Tumour dataset, producing 3D ensemble boxplots that delineate tumour shape uncertainty.
- Large-scale binary masks: On IXI data (400 hippocampus, 400 ventricle masks), PID-mean extracts central envelopes and medians without the overplotting typical for spaghetti plots.
- Scalar field ensembles: PID-mean applied to 50×150 ScalarFlow smoke plume reconstructions tracks uncertainty evolution in 3D time-series boxplots.
- Flexible grids/manifolds: Weight factors enable PID-mean on uniform/non-uniform grids, meshes, or manifolds; the method is coordinate-agnostic.
Scalability extends to tens of thousands of ensemble members and large 3D grids using GPU PID-mean. For spatiotemporal ensembles, time is incorporated as an extra spatial axis, enabling 4D centrality rankings (Wu et al., 17 Dec 2025).
PID generalizes contour-depth analysis to fuzzy masks by substituting set inclusion with an expectation under the reference mask, retaining core theoretical guarantees (monotonicity, stability, coordinate-agnosticism, binary specialization) and enabling scalable, high-resolution ensemble visualization via linear-time and GPU-accelerated algorithms. Demonstrated across synthetic and real applications in meteorology, medical imaging, and volumetric data, PID provides robust rankings, increased stability to isovalue fluctuations, and substantial computational acceleration relative to prior approaches (Wu et al., 17 Dec 2025).