Topology-Guided Block Screening (TGBS)
- TGBS is a method that identifies the neural network layer with maximal topological class separability using persistent homology.
- It computes persistence diagrams and distances (e.g. Wasserstein) between activations to differentiate within-class from between-class variations.
- The selected block underpins FedTopo, enhancing representation alignment in federated learning with heterogeneous, non-IID data.
Topology-Guided Block Screening (TGBS) is a principled procedure for selecting, from among all intermediate blocks in a neural network, the single layer whose feature activations encode maximally discriminative topological information as measured by persistent homology. Within the context of federated learning under non-I.I.D. conditions, TGBS serves as the key precursor to topologically-informed representation alignment, targeting features most robust to client data heterogeneity and most semantically useful for class-level discrimination (Hu et al., 16 Nov 2025).
1. Motivation and Role in Federated Learning
In federated learning, each client’s local data distribution can diverge markedly from the joint population, causing learned representations to drift toward incompatible optima. Conventional pixel- or patch-level matching objectives often fail to capture the global, multi-scale geometric structure relevant to high-dimensional tasks. TGBS addresses this gap by screening for network blocks whose feature activations, when expressed as topological summaries—connected components ($H_0$), loops ($H_1$), etc.—exhibit strong separation between within-class and between-class pairs. Empirical findings show that shallow layers are overly responsive to low-level, noisy textures, while overly deep layers can overcompress and lose critical geometry. Selecting an optimally informative block is thus essential for targeting features that are stable for alignment across clients and sampling conditions (Hu et al., 16 Nov 2025).
2. Persistent Homology on Neural Activations
Let $x$ be an input, $z^{(b)}(x) \in \mathbb{R}^{C \times H \times W}$ the tensor of activations at block $b$, and $c$ a channel index. Each channel $z^{(b)}_c$ is treated as a scalar field over the 2D grid $\{1,\dots,H\} \times \{1,\dots,W\}$, inducing a sublevel-set filtration,

$$X_\tau = \{\, (h,w) : z^{(b)}_c(h,w) \le \tau \,\}, \qquad \tau \in \mathbb{R}.$$

As $\tau$ varies, connected components (0-dimensional homology; $H_0$) are born (at $b_i$), merge, and persist until death (at $d_i$); cycles (1-dimensional; $H_1$) similarly appear and vanish. The full persistence diagram in degree $k$ for channel $c$ is

$$\mathrm{PD}_k^{(c)} = \{ (b_i, d_i) \}_{i=1}^{n_k},$$

with $n_k$ the number of $k$-dimensional topological features.
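The sublevel-set filtration above can be made concrete for $H_0$: sweep grid cells in increasing value and track component merges with union-find (a younger component dies when it merges into an older one). The following is a minimal sketch under the assumptions of 4-connectivity and a 2D array; it is illustrative, not the paper's implementation, and it emits zero-persistence merge pairs alongside the meaningful ones.

```python
import numpy as np

def h0_persistence(field):
    """(birth, death) pairs for connected components of the sublevel-set
    filtration of a 2D scalar field; the surviving component gets
    death = +inf. Zero-persistence merge pairs are included."""
    H, W = field.shape
    order = np.argsort(field, axis=None)           # cells by increasing value
    parent = {}                                    # union-find forest
    birth = {}                                     # birth value per root
    pairs = []

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]          # path halving
            p = parent[p]
        return p

    for flat in order:
        i, j = divmod(int(flat), W)
        v = float(field[i, j])
        parent[(i, j)] = (i, j)
        birth[(i, j)] = v                          # new component born at v
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if (ni, nj) in parent:                 # neighbor already entered
                ra, rb = find((i, j)), find((ni, nj))
                if ra != rb:                       # two components merge:
                    young, old = (ra, rb) if birth[ra] >= birth[rb] else (rb, ra)
                    pairs.append((birth[young], v))  # the younger one dies
                    parent[young] = old
    for r in {find(p) for p in parent}:            # survivors never die
        pairs.append((birth[r], float("inf")))
    return pairs
```

For example, a field with two basins at depths 0.0 and 0.5 separated by a ridge at 1.0 yields one essential pair $(0.0, \infty)$ and one finite pair $(0.5, 1.0)$.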
3. Topological Separability and Block Selection Criteria
For block $b$, persistence diagrams are computed for all samples in a validation set. Pairwise diagram distances $d_{jk}$ (using metrics such as Wasserstein or bottleneck) are gathered for same-class and for different-class pairs. With similarity defined as $s_{jk} = -d_{jk}$, the ROC curve is constructed to evaluate how well topological summaries separate class labels. Correspondingly, the block’s topological separability score is

$$\mathrm{AUC}(b) = \frac{1}{|M|} \sum_{m \in M} \mathrm{AUC}_m(b),$$

the mean ROC AUC over the set of distance metrics $M$, a strictly increasing proxy for the mutual information between topological signatures and class labels. The block $b^* = \arg\max_b \mathrm{AUC}(b)$ is chosen as the topology-informative block.
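The score can be computed without building an explicit ROC curve: the AUC equals the probability that a within-class pair receives a higher similarity than a between-class pair (ties counted as 1/2). Here is a small numpy-only sketch of that computation; a library call such as scikit-learn's `roc_auc_score` over the pooled similarities would give the same value.

```python
import numpy as np

def separability_auc(d_within, d_between):
    """Mann–Whitney form of ROC AUC: within-class pairs are positives,
    between-class pairs negatives, similarity s = -distance."""
    s_w = -np.asarray(d_within, dtype=float)
    s_b = -np.asarray(d_between, dtype=float)
    wins = (s_w[:, None] > s_b[None, :]).sum()     # positive outranks negative
    ties = (s_w[:, None] == s_b[None, :]).sum()    # ties count half
    return (wins + 0.5 * ties) / (s_w.size * s_b.size)
```

If every within-class distance is smaller than every between-class distance, the score is 1.0; fully overlapping distance samples score 0.5.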
4. Algorithmic Workflow and Pseudocode
The TGBS procedure operates as follows:
- Candidate screening: For each block $b$ in backbone $f$, activations $z_i = f_b(x_i)$ are extracted for all validation samples $x_i$.
- Dimensionality reduction: (Optional) PCA compresses $z_i$ to $K$ channels.
- Persistence computation: For each sample $i$, persistent homology is calculated for $H_0$ and $H_1$.
- Distance sampling: For each metric $m \in M$, within-class and between-class pairs are sampled, their diagram distances calculated, the distances negated into similarities, and the ROC AUC measured.
- Aggregated scoring: The mean AUC across metrics is computed per block.
- Selection: The block maximizing AUC is returned.
Pseudocode:
Algorithm: TGBS( backbone f, candidate blocks B, validation set D, distance metrics M, N_pairs )
for each block b in B do
Extract activations z_i = f_b(x_i) for all x_i in D
(Optional) Dimensionality reduction: ẑ_i = PCA(z_i) to K channels
For each sample i, compute PD_i = PersistentHomology(ẑ_i) for H_0 and H_1
for each metric m in M do
Sample N_pairs within-class pairs Pw and N_pairs between-class pairs Pb
Compute distances d_{jk} = dist_m(PD_j,PD_k) for (j,k) in P = Pw ∪ Pb
Set similarities s_{jk} = -d_{jk} and compute AUC_m(b) = ROC_AUC( { s_{jk} } )
end for
AUC(b) = (1/|M|) sum_{m in M} AUC_m(b)
end for
b* = argmax_{b in B} AUC(b)
return b*
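The pseudocode above can be sketched as a runnable loop. Note the hedges: diagrams here are plain arrays of (birth, death) points, `toy_dist` is a crude stand-in for a true Wasserstein/bottleneck diagram distance, and pairs are enumerated deterministically rather than sampled at random as in the algorithm.

```python
import itertools
import numpy as np

def toy_dist(D1, D2):
    # compare sorted persistence values -- NOT a real diagram metric
    p1 = np.sort(np.asarray(D1)[:, 1] - np.asarray(D1)[:, 0])
    p2 = np.sort(np.asarray(D2)[:, 1] - np.asarray(D2)[:, 0])
    n = max(len(p1), len(p2))
    p1, p2 = np.pad(p1, (0, n - len(p1))), np.pad(p2, (0, n - len(p2)))
    return float(np.abs(p1 - p2).sum())

def auc(d_within, d_between):
    s_w, s_b = -np.asarray(d_within), -np.asarray(d_between)
    wins = (s_w[:, None] > s_b[None, :]).sum()
    ties = (s_w[:, None] == s_b[None, :]).sum()
    return (wins + 0.5 * ties) / (s_w.size * s_b.size)

def tgbs(diagrams_per_block, labels, n_pairs=64):
    """diagrams_per_block: {block: [diagram per sample]}.
    Returns (best block, per-block separability scores)."""
    scores = {}
    idx_pairs = list(itertools.combinations(range(len(labels)), 2))
    for b, diags in diagrams_per_block.items():
        within = [(i, j) for i, j in idx_pairs if labels[i] == labels[j]]
        between = [(i, j) for i, j in idx_pairs if labels[i] != labels[j]]
        dw = [toy_dist(diags[i], diags[j]) for i, j in within[:n_pairs]]
        db = [toy_dist(diags[i], diags[j]) for i, j in between[:n_pairs]]
        scores[b] = auc(dw, db)
    return max(scores, key=scores.get), scores
```

On synthetic data where one block's diagrams track the class label (small persistence for one class, large for the other) and another block's diagrams are identical for every sample, the first block scores AUC 1.0, the second 0.5, and screening selects the first.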
5. Topological Signature Construction: Persistence Images
Raw persistence diagrams are mapped to birth–persistence coordinates $(b_i, p_i) = (b_i, d_i - b_i)$ for compatibility with learning-based frameworks. The density function

$$\rho(u, v) = \sum_i w(p_i)\, \exp\!\left( -\frac{(u - b_i)^2 + (v - p_i)^2}{2\sigma^2} \right)$$

is rasterized over a fixed grid and flattened to a vector $\mathbf{t}$, yielding a persistence image. Features with persistence below a threshold may be pruned to reduce noise.
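A minimal sketch of this rasterization follows; the grid bounds, bandwidth $\sigma$, and the linear persistence weighting $w(p) = p$ are illustrative choices, not values prescribed by the source.

```python
import numpy as np

def persistence_image(diagram, grid=20, sigma=0.1, bounds=(0.0, 1.0), prune=0.0):
    """Rasterize (birth, death) pairs into a flattened persistence image."""
    lo, hi = bounds
    xs = np.linspace(lo, hi, grid)
    u, v = np.meshgrid(xs, xs, indexing="ij")      # (birth, persistence) axes
    img = np.zeros((grid, grid))
    for b, d in diagram:
        p = d - b                                  # persistence coordinate
        if not np.isfinite(p) or p <= prune:
            continue                               # prune low-persistence noise
        img += p * np.exp(-((u - b) ** 2 + (v - p) ** 2) / (2 * sigma ** 2))
    return img.ravel()                             # flattened vector t
```

Essential features with infinite death are skipped here; in practice they can instead be capped at the filtration maximum before rasterization.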
6. Computational Characteristics and Practical Considerations
For a channel of size $H \times W$ with $n = HW$ grid cells, worst-case persistent-homology reduction is cubic in the number of cells, $O(n^3)$, but practical implementations tend toward near-linear time via union-find and clearing heuristics, roughly $O(n\,\alpha(n))$ after sorting. Typically, the channel count $C$ is reduced to a small $K$ via PCA and only a fixed number of pairs per block are sampled. Conversion from persistence diagrams to persistence images costs on the order of the number of diagram points times the grid resolution per channel, and scales linearly across the $K$ channels. For a typical ResNet-18 layer, 5k validation examples require only minutes of computation. This suggests that TGBS scales efficiently even for moderately sized networks and datasets.
7. Illustrative Usage and Integration within FedTopo
After pre-training, TGBS is run on a held-out validation set to select the topology-informative block $b^*$. During each federated learning round, clients extract activations at $b^*$ and compute topological embeddings,

$$\mathbf{t} = \mathrm{PI}\big( \mathrm{PD}( z^{(b^*)} ) \big),$$

comparing the local embedding $\mathbf{t}_{\text{local}}$ to the global one $\mathbf{t}_{\text{global}}$. Topological Alignment Loss (TAL) penalizes their discrepancy, e.g.

$$\mathcal{L}_{\mathrm{TAL}} = \big\| \mathbf{t}_{\text{local}} - \mathbf{t}_{\text{global}} \big\|^2,$$

with total local loss given by $\mathcal{L} = \mathcal{L}_{\text{task}} + \lambda\, \mathcal{L}_{\mathrm{TAL}}$ and adaptive scheduling for $\lambda$. By concentrating TAL on $b^*$, FedTopo aligns precisely those features whose topology maximizes class discriminability, significantly reducing representation drift under non-I.I.D. client splits (Hu et al., 16 Nov 2025).
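In the spirit of the combined objective above, a hedged sketch (the paper's exact penalty form and $\lambda$ schedule may differ; the squared L2 discrepancy and fixed `lam` are assumptions for illustration):

```python
import numpy as np

def topo_alignment_loss(t_local, t_global):
    """Squared L2 discrepancy between local and global topological embeddings."""
    t_local, t_global = np.asarray(t_local, float), np.asarray(t_global, float)
    return float(np.sum((t_local - t_global) ** 2))

def total_local_loss(task_loss, t_local, t_global, lam=0.1):
    """Task loss plus lambda-weighted topological alignment penalty."""
    return task_loss + lam * topo_alignment_loss(t_local, t_global)
```

In a real client update, `task_loss` would be the local cross-entropy and the penalty would be differentiated through the embedding pipeline rather than applied to fixed vectors.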
In summary, Topology-Guided Block Screening constitutes a computationally efficient, data-driven mechanism for block selection in deep neural networks, identifying layers whose persistent-homology signatures maximize topological class separability. This screening underpins both the compact topological embedding and the robust representational alignment realized in the FedTopo framework, and is essential for mitigating the representational divergence endemic to non-I.I.D. federated learning environments.