Federated Learning for WBC Morphology Analysis
- The paper demonstrates a federated framework that aggregates model updates to enable privacy-preserving learning on non-IID white blood cell datasets across multiple clinical institutions.
- It employs advanced aggregation methods like FedAvg, FedMedian, FedProx, and FedOpt alongside robust preprocessing to tackle staining variability and label skew.
- Empirical results show that federated models achieve competitive balanced accuracy and improved generalization compared to local models, supporting equitable diagnostic applications.
Federated learning for white blood cell (WBC) morphology analysis is a distributed approach enabling robust, privacy-preserving AI development across clinical institutions. Rather than transferring raw patient data, federated learning frameworks aggregate model updates to collaboratively train deep neural networks on diverse, non-IID blood film datasets, thus overcoming data-sharing restrictions and mitigating domain shifts caused by staining and imaging variance. This paradigm supports generalizable hematological diagnostics, particularly for resource-limited healthcare environments (Ansah et al., 7 Jan 2026, Lee et al., 28 Apr 2025).
1. Mathematical Formulation and Federated Optimization
In federated WBC morphology analysis, a set of $K$ clinically distinct institutions ("clients") each hold a private dataset $D_k$ of size $n_k$, with $n = \sum_{k=1}^{K} n_k$. The goal is to learn a global model $w$ (e.g., ResNet-34 or Vision Transformer weights) minimizing the weighted average of the local empirical risks:

$$\min_{w} F(w) = \sum_{k=1}^{K} \frac{n_k}{n} F_k(w), \qquad F_k(w) = \frac{1}{n_k} \sum_{(x_i, y_i) \in D_k} \ell(w; x_i, y_i),$$

where $\ell$ is the cross-entropy loss, modulated by Focal Loss for class imbalance. This federated objective intrinsically accommodates non-IID label distributions across sites (Ansah et al., 7 Jan 2026).
2. Federated Algorithms and Communication Protocols
Federated training in WBC morphology analysis leverages multiple aggregation strategies, notably FedAvg, FedMedian, FedProx, and FedOpt, typically implemented using synchronous communication rounds within frameworks such as Flower.
- FedAvg: Weighted average of client updates: $w^{t+1} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_k^{t+1}$.
- FedMedian: Coordinate-wise median aggregation: $w^{t+1}_j = \operatorname{median}_k \big( w^{t+1}_{k,j} \big)$, providing increased robustness to outlier updates.
- FedProx: Applies proximal regularization for stability under heterogeneity.
- FedOpt: Utilizes Adam optimizer on aggregated gradients.
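The two non-adaptive aggregation rules above can be sketched in a few lines of NumPy (a minimal illustration, not the Flower implementation; `fed_avg` and `fed_median` are illustrative names, and parameters are flattened to vectors for clarity):

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors (FedAvg)."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)                    # shape: (K, d)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

def fed_median(client_weights):
    """Coordinate-wise median of client parameter vectors (FedMedian)."""
    return np.median(np.stack(client_weights), axis=0)

# With one outlier client, the median stays near the honest updates
# while the weighted mean is pulled toward the outlier.
updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([100.0, -100.0])]
robust = fed_median(updates)          # [3.0, 2.0]
pulled = fed_avg(updates, [10, 10, 10])
```

This makes concrete why FedMedian tolerates outlier updates: each coordinate ignores extreme values rather than averaging them in.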
A canonical protocol comprises a fixed number of synchronous global rounds, each with several local training epochs per client. Clients employ batch size 8, gradient clipping, and gradient accumulation. ResNet-34 (11M parameters) and DINOv2-Small (9M parameters) models yield round-wise communication of 44 MB and 36 MB, respectively, with potential reductions via quantization or sparse updates (Ansah et al., 7 Jan 2026).
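The quoted round-wise payloads follow directly from parameter counts at fp32 precision (4 bytes per parameter), as a quick sanity check shows (illustrative helper, not from the paper):

```python
def round_payload_mb(num_params, bytes_per_param=4):
    """Size of one full-model update in MB (fp32 by default)."""
    return num_params * bytes_per_param / 1e6

resnet_mb = round_payload_mb(11e6)                       # 11M params -> 44 MB
dino_mb = round_payload_mb(9e6)                          # 9M params  -> 36 MB
quantized_mb = round_payload_mb(11e6, bytes_per_param=1) # 8-bit: 4x smaller
```

The same arithmetic shows why 8-bit quantization cuts per-round traffic by a factor of four.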
3. Data Preprocessing, Staining Variability, and Non-IID Label Skew
Preprocessing pipelines standardize clinical blood film images across sites:
- Resizing to a fixed pixel resolution.
- Morphology-preserving augmentation: small random translations and rotations.
- Class imbalance tackled via Focal Loss and weighted random sampling.
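The Focal Loss used against class imbalance can be sketched as follows (a minimal NumPy version; the focusing parameter `gamma=2.0` is a common default and an assumption here, as the source does not state the value):

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0):
    """Cross-entropy down-weighted by (1 - p_t)^gamma, so confident,
    easy examples contribute less and rare classes dominate the gradient."""
    p_t = probs[np.arange(len(labels)), labels]   # probability of the true class
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))

probs = np.array([[0.9, 0.1],    # easy example: model is already confident
                  [0.5, 0.5]])   # hard example: model is uncertain
labels = np.array([0, 0])
fl = focal_loss(probs, labels, gamma=2.0)
ce = focal_loss(probs, labels, gamma=0.0)  # gamma=0 recovers plain cross-entropy
```

With `gamma > 0`, the confident example's contribution is suppressed, so the focal value sits well below the plain cross-entropy.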
In comparative benchmark studies (Lee et al., 28 Apr 2025), preprocessing also includes Otsu thresholding, morphological opening/closing, and extraction of cell eccentricity, solidity, and perimeter as auxiliary input channels. Non-IID partitioning is simulated by Dirichlet sampling, where the concentration parameter $\alpha$ controls heterogeneity; lower $\alpha$ exacerbates cross-site label imbalance and increases the number of rounds required for convergence.
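The Dirichlet label-skew simulation can be sketched as follows (illustrative NumPy code; the function name and seeding are assumptions, but the per-class Dirichlet draw is the standard construction):

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients using per-class Dirichlet
    proportions. Lower alpha -> more skewed (non-IID) label distributions."""
    rng = np.random.default_rng(seed)
    clients = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Draw this class's share for each client from Dirichlet(alpha, ..., alpha).
        props = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, shard in zip(clients, np.split(idx, cuts)):
            client.extend(shard.tolist())
    return clients

labels = np.repeat([0, 1, 2], 100)                 # 3 balanced classes
parts = dirichlet_partition(labels, num_clients=3, alpha=0.5)
```

Every sample lands on exactly one client, but with small `alpha` each client's class histogram becomes highly imbalanced, which is the stress case for the aggregation rules above.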
4. Model Architectures: CNN, Transformer, and Kolmogorov–Arnold Networks
Federated frameworks for WBC morphology analysis employ diverse architectures:
- ResNet-34: Pre-trained on ImageNet, with partial block freezing (first 2 blocks) and fine-tuning of remaining layers; output dimension of 11 (cell classes).
- DINOv2-Small: Vision Transformer model; 12 blocks with selective fine-tuning (blocks 8–11), trained via self-supervised objectives on heterogeneous data (explicit domain classifier absent).
- Kolmogorov–Arnold Networks (KAN): Universal function approximators using a fixed Gaussian/RBF basis, single-layer or shallow depth, optimized for width (i.e., grid size and hidden-node count) rather than depth. KAN input tensors concatenate RGB with morphological side channels (Lee et al., 28 Apr 2025).
In federated settings, KAN architectures demonstrate higher accuracy than multi-layer perceptrons (MLPs), with minimal depth and increased width yielding the best performance in benchmark studies.
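The "shallow but wide, fixed RBF basis" idea can be illustrated with a one-layer Gaussian-basis approximator (a simplified stand-in for the paper's KAN, not its actual architecture; the grid of 16 centers and the width 0.3 are illustrative choices):

```python
import numpy as np

def rbf_features(x, centers, width=1.0):
    """Gaussian/RBF basis expansion: one feature per (input dim, grid center)."""
    # x: (n, d); centers: (g,) grid shared across input dimensions
    diff = x[:, :, None] - centers[None, None, :]   # (n, d, g)
    return np.exp(-(diff / width) ** 2).reshape(len(x), -1)

# Fit a wide single-layer model: a linear readout over the fixed RBF grid.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * x[:, 0])                             # smooth 1-D target
phi = rbf_features(x, centers=np.linspace(-1, 1, 16), width=0.3)
w, *_ = np.linalg.lstsq(phi, y, rcond=None)
pred = phi @ w
```

Because the basis is fixed, capacity is scaled by widening the grid rather than stacking layers, which is the width-over-depth trade-off the benchmark results point to.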
5. Privacy, Security, and Communication Efficiency
Federated learning ensures strict privacy—no raw data transmission; only model weights/gradients are exchanged. Secure channels (e.g., TLS) are standard, while advanced privacy-preserving mechanisms such as secure aggregation, differential privacy (DP), and homomorphic encryption (e.g., SHEFL) are proposed as future enhancements.
- Per-round communication cost for the KAN model is approximately 8 MB per hospital; total training cost over 100–150 rounds is on the order of 1 GB, with further reduction achievable via 8-bit quantization or top-$k$ sparsification.
- MORPHFED's pilot excluded DP and secure aggregation, but these are recommended for production deployments (Ansah et al., 7 Jan 2026, Lee et al., 28 Apr 2025).
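The top-$k$ sparsification option can be sketched as follows (illustrative; in a real system only the $k$ nonzero indices and values would be transmitted, rather than the dense zeroed array shown here):

```python
import numpy as np

def top_k_sparsify(update, k):
    """Keep only the k largest-magnitude entries of a model update,
    zeroing the rest (the zeros would simply not be transmitted)."""
    flat = update.ravel()
    keep = np.argpartition(np.abs(flat), -k)[-k:]   # indices of top-k magnitudes
    sparse = np.zeros_like(flat)
    sparse[keep] = flat[keep]
    return sparse.reshape(update.shape)

u = np.array([0.1, -5.0, 0.2, 3.0])
s = top_k_sparsify(u, k=2)   # keeps only -5.0 and 3.0
```

Combined with 8-bit quantization of the surviving values, this is where the multi-fold reduction in per-round traffic comes from.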
6. Empirical Results and Generalization to Novel Institutions
Performance evaluation demonstrates notable federated learning effectiveness:
- Balanced accuracy (BA) and macro F1 for federated ResNet-34 and DINOv2-Small models approach those of centralized training.
- FedMedian aggregation provided superior cross-model stability and robustness to outlier client updates, but may suppress minority-class signal.
- On external validation (Client 3: Barcelona), federated models generalized better than locally trained ones, particularly improving band-neutrophil F1 scores (FedMedian: 0.62 vs. centralized: 0.30).
- Local models suffered substantial cross-site drops (ResNet-34: Local1 $0.4497$, Local2 $0.4106$ vs. Federated $0.5738$).
- KAN-based federated models maintained statistically significant accuracy advantages over MLP-3 across all optimization methods.
| Aggregation | ResNet-34 (BA) | DINOv2-Small (BA) |
|---|---|---|
| FedAvg | 0.5679 | 0.5591 |
| FedMedian | 0.5738 | 0.5797 |
| FedProx | 0.5546 | 0.5718 |
| FedOpt | 0.3638 | 0.5594 |
Class-wise F1 performance for rare cells is sensitive to aggregation rule selection. Adaptive aggregation (FedOpt) favors minority-class recovery at the expense of overall BA.
7. Architectural–Aggregation Interplay and Future Directions
Interplay between network architecture and aggregation protocol is critical. Median-based aggregation confers resilience but may attenuate rare-class learning; adaptive aggregation (e.g., FedOpt, FedDyn, FedSAM) facilitates minority-class sensitivity, but with potential risk of divergence, especially for CNNs.
Recommended future directions include:
- Implementation of secure aggregation and DP on model updates.
- Stain-normalization GANs (e.g., FedOrch GAN) to further harmonize domain features.
- Expansion via hierarchical FL and personalized aggregation (ALT, FedLA).
- Synthetic oversampling in embedding space to address ultra-rare morphologies (ESRT).
- Integration of prompt-based adapters (FLoRA) for continuous adaptation and onboarding new clinic data.
A plausible implication is that federated learning frameworks, when coupled with robust aggregation strategies and scalable architectures like KAN and transformer models, may significantly empower equitable, privacy-preserving, and domain-invariant WBC morphology analysis in global, distributed healthcare settings (Ansah et al., 7 Jan 2026, Lee et al., 28 Apr 2025).