Interpretable VAMP-Net for MTB Drug Resistance
- The paper introduces VAMP-Net, a hybrid deep learning model that combines a permutation-invariant set attention transformer with a quality-aware 1D-CNN to accurately predict drug resistance in MTB.
- It employs an adaptive gating fusion mechanism to integrate biological and technical features, achieving superior performance with AUC scores close to 97%.
- The framework enhances interpretability with attention analyses and integrated gradients, facilitating insights into epistatic interactions and variant-level quality for clinical applications.
The Interpretable Variant-Aware Multi-Path Network (VAMP-Net) is a supervised deep learning framework for robust genomic prediction of drug resistance in Mycobacterium tuberculosis (MTB). It implements two complementary machine learning pathways: a permutation-invariant Set Attention Transformer to capture epistatic interactions between genomic loci, and a quality-aware 1D Convolutional Neural Network (CNN) for adaptive assessment of variant-level quality metrics. These are fused via an adaptive gating mechanism for final resistance classification, yielding high predictive accuracy while providing interpretable outputs at both the biological and technical levels (Boutorh et al., 25 Dec 2025).
1. Permutation-Invariant Set Attention Transformer: Path-1
Path-1 processes each MTB isolate as an unordered set of genomic variant tokens $\{v_1, \dots, v_n\}$, with each token represented as ChromPos_Ref>Alt (e.g., “761139_C>A”). Tokens are embedded via a shared embedding matrix $E$ to yield $X \in \mathbb{R}^{L \times d}$ for padded length $L$, with no positional encoding, enforcing strict permutation invariance: $f(\pi(X)) = f(X)$ for all permutations $\pi$.
The model utilizes multi-head Set Attention Blocks (SAB) with $h$ heads. Each SAB computes:

$$\mathrm{SAB}(X) = \mathrm{LayerNorm}\big(H + \mathrm{FFN}(H)\big), \qquad H = \mathrm{LayerNorm}\big(X + \mathrm{MHA}(X, X, X)\big)$$

For unmasked attention:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V$$

With padding-masked attention, a mask $M$ is introduced, with $M_{ij} = -\infty$ where position $j$ is padding and $0$ otherwise, modifying attention to:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}} + M\right) V$$

SABs are stacked with layer-norm and residual connections (no causal masking). After the SABs, outputs are globally pooled over the non-padded positions, e.g. by mean pooling:

$$z_1 = \frac{1}{|S|} \sum_{i \in S} X^{(\mathrm{out})}_i$$

where $S$ is the set of non-padded token positions and $X^{(\mathrm{out})}$ is the final SAB output.
This pathway enables sensitive modeling of multilocus epistatic architectures by eschewing input order, a crucial feature when variant call sets exhibit arbitrary ordering.
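The order-invariance property described above can be checked directly: a Set Attention Block without positional encoding is permutation-equivariant, and mean pooling makes the pooled embedding permutation-invariant. The sketch below is a minimal illustration under assumed hyperparameters (embedding size 32, 4 heads); it is not the paper's implementation.

```python
# Minimal sketch of Path-1's permutation-invariant set attention.
# Sizes (d_model=32, n_heads=4, vocab=100) are illustrative assumptions.
import torch
import torch.nn as nn

class SAB(nn.Module):
    """One Set Attention Block: multi-head self-attention + FFN, residuals, layer norm."""
    def __init__(self, d_model=32, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, 2 * d_model), nn.ReLU(),
                                 nn.Linear(2 * d_model, d_model))
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x, pad_mask=None):
        # pad_mask: (B, L) bool, True at padded positions (excluded from attention).
        h, _ = self.attn(x, x, x, key_padding_mask=pad_mask)
        x = self.norm1(x + h)
        return self.norm2(x + self.ffn(x))

torch.manual_seed(0)
emb = nn.Embedding(100, 32)               # shared token embedding, no positional encoding
sab = SAB().eval()

tokens = torch.tensor([[5, 17, 42, 8]])   # one isolate's variant set
perm   = tokens[:, [2, 0, 3, 1]]          # same set, different order
z  = sab(emb(tokens)).mean(dim=1)         # global mean pooling
zp = sab(emb(perm)).mean(dim=1)
print(torch.allclose(z, zp, atol=1e-5))   # pooled output is order-invariant
```

Because variant call sets arrive in arbitrary order, this invariance means the model's prediction cannot depend on how the VCF happened to be sorted.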
2. Quality-Aware 1D-CNN: Path-2
Path-2 processes variant-level quality features per isolate. Inputs are matrices $X \in \mathbb{R}^{8 \times L}$ over the padded variant axis, with the 8 channels comprising GT, DP, DPF, COV_REF, COV_ALT, FRS, GT_CONF, and GT_CONF_PERCENTILE. Missing values are imputed and features min–max scaled.
Each convolutional block applies a 1D convolution followed by batch normalization, a nonlinearity, and dropout:

$$h = \mathrm{Dropout}\big(\sigma(\mathrm{BN}(\mathrm{Conv1D}(x)))\big)$$

Activation functions use ReLU (or optionally GELU), with kernel size, dropout rate, and L2 regularization treated as tuned hyperparameters. The best configuration (Model A) uses 3 convolutional layers followed by flattening or pooling to generate the quality embedding $z_2$.
This pathway affords adaptive modeling of sequencing evidence, tracking technical variables that influence confidence in each variant call.
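A small sketch of such a quality-aware CNN follows. The 8 channel names match the text; the depth, width, and pooling choice are assumptions for illustration, not the paper's Model A configuration.

```python
# Illustrative sketch of Path-2: a 1D-CNN over per-variant quality channels
# (GT, DP, DPF, COV_REF, COV_ALT, FRS, GT_CONF, GT_CONF_PERCENTILE).
# Depth/width/pooling are assumed, not taken from the paper.
import torch
import torch.nn as nn

class QualityCNN(nn.Module):
    def __init__(self, in_ch=8, width=16, k=3, p_drop=0.2):
        super().__init__()
        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv1d(c_in, c_out, kernel_size=k, padding=k // 2),
                nn.BatchNorm1d(c_out),
                nn.ReLU(),
                nn.Dropout(p_drop),
            )
        self.net = nn.Sequential(block(in_ch, width),
                                 block(width, width),
                                 block(width, width))   # 3 conv layers, as in Model A
        self.pool = nn.AdaptiveAvgPool1d(1)             # collapse variant axis

    def forward(self, x):                               # x: (B, 8, L), min-max scaled
        return self.pool(self.net(x)).squeeze(-1)       # quality embedding z2: (B, width)

x = torch.rand(4, 8, 50)            # 4 isolates, 8 quality channels, 50 variant positions
z2 = QualityCNN().eval()(x)
print(z2.shape)
```

Pooling over the variant axis keeps the output size fixed regardless of how many variants an isolate carries.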
3. Fusion and Classification Architecture
VAMP-Net integrates both pathways using a gated amplification module, selected as the optimal fusion strategy. A sigmoid gate $g = \sigma(W_g z_2 + b_g)$ computed from the quality embedding modulates the SAB output:

$$z = z_1 \odot (1 + g)$$
Alternatives such as suppression and bipolar adaptive gating were evaluated, but amplification yielded superior accuracy.
The fused vector passes through fully-connected layers with dropout and ReLU, producing a final logit with output probability $p = \sigma(\mathrm{logit})$. Classification uses weighted binary cross-entropy to address class imbalance, with L2 regularization:

$$\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \big[ w_{+}\, y_i \log p_i + w_{-}\,(1 - y_i) \log(1 - p_i) \big] + \lambda \lVert \theta \rVert_2^2$$
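The fusion-and-classification head can be sketched as follows. The gate parameterization (a single sigmoid layer amplifying $z_1$ elementwise) and the layer sizes are assumptions; the weighted loss uses PyTorch's `pos_weight` mechanism as a stand-in for the paper's class weighting.

```python
# Sketch of gated-amplification fusion + weighted BCE, under assumed
# parameterizations (single-layer sigmoid gate, illustrative layer sizes).
import torch
import torch.nn as nn

class GatedFusionHead(nn.Module):
    def __init__(self, d1=32, d2=16, hidden=32):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(d2, d1), nn.Sigmoid())
        self.clf = nn.Sequential(nn.Linear(d1, hidden), nn.ReLU(),
                                 nn.Dropout(0.3), nn.Linear(hidden, 1))

    def forward(self, z1, z2):
        g = self.gate(z2)               # gate in (0, 1), from quality embedding
        z = z1 * (1.0 + g)              # amplification: the gate can only boost z1
        return self.clf(z).squeeze(-1)  # final logit

torch.manual_seed(0)
head = GatedFusionHead().eval()
z1, z2 = torch.randn(4, 32), torch.randn(4, 16)
logit = head(z1, z2)
y = torch.tensor([1., 0., 1., 1.])
# Weighted BCE: pos_weight upweights the positive (resistant) class.
loss = nn.functional.binary_cross_entropy_with_logits(
    logit, y, pos_weight=torch.tensor(2.0))
print(logit.shape, loss.item() >= 0.0)
```

Because the amplification gate is bounded in $(1, 2)$, it can strengthen but never erase the biological signal from Path-1, which is one intuition for why it outperformed suppression-style gates.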
4. Interpretability Mechanisms
VAMP-Net provides dual interpretability layers:
- Attention Weight Analysis (Path-1): Extracts the self-attention matrices $A^{(\ell, h)}$ from each layer $\ell$ and head $h$, averages over heads and samples, ranks pairwise interactions, and defines epistatic networks. Variant hubs with high summed attention signal significant genetic interactions.
- Integrated Gradients (Path-1): Computes per-variant saliency for token embedding $x_i$ given baseline $x'$:

  $$\mathrm{IG}_i(x) = (x_i - x'_i) \int_{0}^{1} \frac{\partial f\big(x' + \alpha (x - x')\big)}{\partial x_i} \, d\alpha$$
Averaging over the test set highlights critical resistance loci (notably rpoB).
- Gradient-based Feature Importance and Ablation (Path-2): Saliency maps enable ranking and test-time ablation, assessing channel relevance by drop in AUC or accuracy.
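The integrated-gradients computation above is typically approximated with a Riemann sum over the straight-line path from baseline to input. The sketch below is an assumed minimal implementation (not the paper's code); a useful sanity check is the completeness axiom, which holds exactly for a linear model: the attributions sum to $f(x) - f(x')$.

```python
# Minimal integrated-gradients sketch (Riemann-sum approximation of the
# path integral). Implementation details here are assumptions.
import torch

def integrated_gradients(f, x, baseline, steps=64):
    # Straight-line path from baseline to x, evaluated at `steps` points.
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * x.dim()))
    path = baseline + alphas * (x - baseline)      # (steps, *x.shape)
    path.requires_grad_(True)
    f(path).sum().backward()                        # gradients at every path point
    avg_grad = path.grad.mean(dim=0)                # path-averaged gradient
    return (x - baseline) * avg_grad                # per-dimension attribution

torch.manual_seed(0)
w = torch.randn(6)
f = lambda x: (x * w).sum(dim=-1)                   # toy scalar model
x, base = torch.randn(6), torch.zeros(6)
ig = integrated_gradients(f, x, base)
# Completeness: attributions sum to f(x) - f(baseline) (exact for linear f).
print(torch.allclose(ig.sum(), f(x) - f(base), atol=1e-4))
```

In VAMP-Net's setting, $f$ would be the resistance logit and $x$ a variant-token embedding, so averaging these attributions over the test set surfaces loci such as rpoB.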
This interpretability design enables auditable correlation between genotype, variant confidence, and the final resistance call, facilitating biological inference and technical quality control.
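The Path-2 ablation protocol can be sketched as zeroing one quality channel at a time and measuring the change in a model score. The model and scoring function below are toy stand-ins (the paper uses the trained CNN and AUC/accuracy); the channel list follows the text.

```python
# Sketch of test-time channel ablation for Path-2 (assumed protocol:
# zero one channel, re-score, record the drop). Toy model and score.
import torch

CHANNELS = ["GT", "DP", "DPF", "COV_REF", "COV_ALT", "FRS",
            "GT_CONF", "GT_CONF_PERCENTILE"]

def ablation_drops(model, x, score_fn):
    base = score_fn(model(x))
    drops = {}
    for i, name in enumerate(CHANNELS):
        xa = x.clone()
        xa[:, i, :] = 0.0                  # knock out one quality channel
        drops[name] = base - score_fn(model(xa))
    return drops

torch.manual_seed(0)
model = torch.nn.Conv1d(8, 1, kernel_size=1, bias=False)  # toy stand-in model
x = torch.rand(16, 8, 20)                                  # (isolates, channels, variants)
drops = ablation_drops(model, x, lambda out: out.mean().item())
most_influential = max(drops, key=lambda k: abs(drops[k]))
print(len(drops), most_influential in CHANNELS)
```

Ranking channels by the magnitude of the score drop gives the channel-relevance ordering described above.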
5. Empirical Performance and Robustness
Evaluation used CRyPTIC MTB isolates for Rifampicin (RIF) and Rifabutin (RFB). Model A with gated amplification fusion achieved:
| Drug | Accuracy | Precision | Recall | F1 | AUC |
|---|---|---|---|---|---|
| RIF | 0.952 | 0.951 | 0.960 | 0.955 | 0.969 |
| RFB | 0.939 | 0.952 | 0.955 | 0.954 | 0.968 |
Baseline comparisons consisted of an MLP and a 1D-CNN trained on a binary SNP matrix after feature selection, yielding AUC 0.85–0.87. VAMP-Net's SAB pathway outperformed both by approximately 10 AUC points.
Additional ablation demonstrated minor improvements from padding-masked attention (AUC = 0.969, balanced accuracy 0.945) over unmasked SAB (AUC 0.967). Early stopping and stable cross-validation splits confirmed reproducible gains, though no formal p-values were reported.
A plausible implication is that fusion of biologically and technically grounded pathways produces generalizable, high-precision clinical resistance prediction not achievable with single-modal baselines.
6. Context, Significance, and Future Directions
VAMP-Net establishes a new paradigm for interpretable, actionable genomics in resistance prediction, combining state-of-the-art accuracy (AUC ≈ 0.97) with comprehensive dual-layer auditability. The SAB pathway enables detection and visualization of epistatic genomic networks, while the quality-aware CNN contextually weighs input confidence on a per-drug basis.
This suggests a generalizable framework for genotype-to-phenotype models where variant call quality and nonlinear genetic interactions are critical. A plausible implication is that further extension of permutation-invariant attention and adaptive gating may benefit other clinical genomics domains. The architecture is designed with modularity to accommodate additional input modalities or alternative fusion mechanisms.
No indicators of controversy were reported regarding this approach in the referenced literature (Boutorh et al., 25 Dec 2025). The combination of permutation-invariant modeling, set attention, and interpretable feature attribution mechanisms positions VAMP-Net as a reference model for computational resistance prediction pipelines, especially where interpretability at both genetic and technical levels is essential for clinical adoption.