VAMP-Net Architecture for Genomic Drug Resistance
- VAMP-Net is a deep learning architecture that predicts binary drug resistance by modeling complex epistatic interactions among genetic variants.
- It employs a dual-path framework: a set attention transformer to capture variant dependencies and a 1D-CNN to evaluate per-variant quality, fused via an amplification gate.
- The model offers dual-layer interpretability through attention mapping and integrated gradients, yielding insights into both biological drivers and technical confidence.
VAMP-Net (Variant-Aware Multi-Path Network) is a deep neural architecture designed for the prediction of binary drug resistance in Mycobacterium tuberculosis from genome-wide variant calls. It integrates permutation-invariant set attention with parallel confidence estimation from raw sequencing quality metrics. The architecture addresses two major challenges in genomic resistance prediction: modeling the epistatic interactions among genetic variants and accounting for the variable reliability of sequencing data. VAMP-Net couples a Set Attention Transformer for variant sets and a 1D convolutional network for per-variant confidence, uniting both in a late-fusion classification module. The framework achieves over 95% accuracy and area under the receiver operating characteristic curve (AUC ∼97%) for rifampicin and rifabutin resistance classification, and provides dual-layer interpretability into both biological and technical determinants of model outputs (Boutorh et al., 25 Dec 2025).
1. Dual-Path Network Architecture
VAMP-Net consists of two complementary computational pathways:
- Path-1: Variant-Aware Set Attention Transformer operates on the unordered set of variant tokens. It uses an embedding of each variant and stacks multiple permutation-invariant Set Attention Blocks (SABs) to capture epistatic, non-local dependencies among genomic loci.
- Path-2: Quality-Aware 1D-CNN ingests the quality associated with each variant as provided by Variant Call Format (VCF) FORMAT fields. This pathway applies a sequence of 1D convolutions and pooling on per-variant feature matrices to model the adaptive confidence of variant calls.
A late-fusion module adaptively combines the outputs by modulating the embedding from Path-1 with a sigmoid-gated vector from Path-2. The fused representation is then forwarded to a multi-layer perceptron (MLP) and a final sigmoid for resistance probability prediction.
2. Path-1: Set Attention Transformer for Variant Sets
Input Representation
Each sample is represented by a set of variants, encoded as strings (e.g. “761139_C>A”), tokenized with a pretrained BERT tokenizer and embedded into a matrix $X \in \mathbb{R}^{N \times d}$, where $N$ is the maximum number of variants per sample after padding and $d$ is the embedding dimension. A binary mask designates non-informative (padded) positions.
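A minimal sketch of this encoding step, replacing the pretrained BERT tokenizer with a toy whole-variant vocabulary; the `encode_variants` helper and its ID scheme are illustrative assumptions, not the paper's preprocessing:

```python
import numpy as np

def encode_variants(variant_sets, max_len):
    """Toy whole-variant vocabulary standing in for the pretrained BERT
    tokenizer: each distinct variant string gets one integer ID, 0 is padding."""
    vocab = {}
    def tok(v):
        if v not in vocab:
            vocab[v] = len(vocab) + 1   # 0 is reserved for padding
        return vocab[v]
    ids = np.zeros((len(variant_sets), max_len), dtype=np.int64)
    mask = np.zeros((len(variant_sets), max_len), dtype=bool)  # True = real variant
    for i, variants in enumerate(variant_sets):
        for j, v in enumerate(variants[:max_len]):
            ids[i, j] = tok(v)
            mask[i, j] = True
    return ids, mask

samples = [["761139_C>A", "761155_A>T"], ["761139_C>A"]]
ids, mask = encode_variants(samples, max_len=4)
```

The mask produced here is what the SAB path later uses to ignore padded positions.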
Set Attention Blocks (SABs)
The core block applies multi-head self-attention without positional encoding, $H = X + \mathrm{MultiHead}(X, X, X)$ followed by $\mathrm{SAB}(X) = H + \mathrm{FFN}(H)$, with FFN a two-layer position-wise network with ReLU activation; several such blocks are stacked.
SAB blocks are either fully unmasked (for strict permutation invariance) or apply the padding mask to maintain set-equivariance on valid tokens. After the SAB stack, features are pooled (mean or max) across the variant dimension to yield a fixed-size embedding $z_{\mathrm{SAB}} \in \mathbb{R}^{D}$.
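The SAB computation can be sketched in NumPy as follows. This single-head reduction with layer normalization omitted is illustrative, not the paper's implementation; the permutation-equivariance property it is meant to guarantee can be checked directly by permuting the valid tokens:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sab(X, mask, Wq, Wk, Wv, W1, b1, W2, b2):
    """Single-head Set Attention Block sketch: masked self-attention with a
    residual connection, then a ReLU feed-forward sublayer with a residual.
    Layer normalization and multiple heads are omitted for brevity."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # N×N attention logits, no positions
    scores[:, ~mask] = -1e9                   # padded keys get ~zero attention
    H = X + softmax(scores) @ V
    return H + np.maximum(H @ W1 + b1, 0.0) @ W2 + b2

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))                       # 5 variant tokens, 2 padded
mask = np.array([True, True, True, False, False])
Wq, Wk, Wv, W1, W2 = [rng.normal(size=(d, d)) * 0.1 for _ in range(5)]
out = sab(X, mask, Wq, Wk, Wv, W1, np.zeros(d), W2, np.zeros(d))
```

Because nothing in the block depends on token order, reordering the valid rows of `X` simply reorders the rows of `out`.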
3. Path-2: Quality-Aware 1D-CNN
Input Representation
For each variant, a vector of eight VCF FORMAT quality fields is assembled, yielding a per-sample matrix $F \in \mathbb{R}^{N \times 8}$, padded to match Path-1.
CNN Architecture and Confidence Scoring
$F$ is transposed to channels-first ($8 \times N$), then passes through a series of Conv1D → ReLU → MaxPool1D modules. Example configuration:
- Conv1D → ReLU → MaxPool1D($3$)
- Conv1D → ReLU → MaxPool1D($3$)
- Conv1D → ReLU → MaxPool1D($3$)
The final feature map is flattened and projected with a linear layer to $z_{\mathrm{CNN}} \in \mathbb{R}^{D}$, followed by dropout (0.15 typical). A sigmoid is applied, $g = \sigma(z_{\mathrm{CNN}}) \in (0, 1)^{D}$, interpreted as per-dimension adaptive confidence scores.
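A NumPy sketch of this pathway under assumed filter counts (16 and 32 channels, kernel size 3) and with dropout omitted; `confidence_gate` and all dimensions are illustrative stand-ins, not the published configuration:

```python
import numpy as np

def conv1d(x, w, b):
    """Valid 1D convolution: x is C_in×L, w is C_out×C_in×k, b is C_out."""
    c_out, _, k = w.shape
    L = x.shape[1] - k + 1
    out = np.empty((c_out, L))
    for t in range(L):
        out[:, t] = np.tensordot(w, x[:, t:t + k], axes=([1, 2], [0, 1])) + b
    return out

def maxpool1d(x, k):
    """Non-overlapping max pooling along the length axis."""
    L = x.shape[1] // k
    return x[:, :L * k].reshape(x.shape[0], L, k).max(axis=2)

def confidence_gate(F, conv_params, Wp, bp):
    """Quality path: stacked Conv1D -> ReLU -> MaxPool1D(3), flatten,
    linear projection, then a sigmoid gate with entries in (0, 1)."""
    x = F.T                                        # channels-first: 8×N
    for w, b in conv_params:
        x = maxpool1d(np.maximum(conv1d(x, w, b), 0.0), 3)
    z = x.reshape(-1) @ Wp + bp                    # dropout omitted in this sketch
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
F = rng.normal(size=(30, 8))                       # 30 variants × 8 quality fields
conv_params = [(rng.normal(size=(16, 8, 3)) * 0.1, np.zeros(16)),
               (rng.normal(size=(32, 16, 3)) * 0.1, np.zeros(32))]
g = confidence_gate(F, conv_params, rng.normal(size=(64, 8)) * 0.05, np.zeros(8))
```

The output `g` has one confidence score per fused embedding dimension, matching the gate dimensionality used in the fusion module.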
4. Fusion Mechanism and Classification Head
The fusion module combines the biological and confidence pathways. The optimal "Amplification" gate, as evaluated in comparative tests, is $z_{\mathrm{fused}} = (1 + g) \odot z_{\mathrm{SAB}}$, where $\odot$ denotes elementwise multiplication.
For classification, a two-layer feed-forward network applies $h = \mathrm{ReLU}(z_{\mathrm{fused}} W_h + b_h)$ and $p = \sigma(h \cdot w_o + b_o)$, producing a resistance probability $p \in (0, 1)$.
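The gate and head reduce to a few lines; the sketch below assumes row-vector conventions and illustrative dimensions ($D = 8$, $H = 4$):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classify(z_sab, g, W_h, b_h, w_o, b_o):
    """Amplification gate then the two-layer head:
    z_fused = (1 + g) * z_SAB, h = ReLU(z_fused·W_h + b_h), p = sigmoid(h·w_o + b_o)."""
    z_fused = (1.0 + g) * z_sab            # gate scales each dimension by 1..2
    h = np.maximum(z_fused @ W_h + b_h, 0.0)
    return sigmoid(h @ w_o + b_o)

rng = np.random.default_rng(1)
D, H = 8, 4
z_sab = rng.normal(size=D)
g = sigmoid(rng.normal(size=D))            # stand-in for the CNN confidence gate
p = classify(z_sab, g, rng.normal(size=(D, H)), np.zeros(H), rng.normal(size=H), 0.0)
```

Since $g \in (0,1)^D$, the gate can only amplify (never suppress) the biological embedding, which is the design choice the "Amplification" name refers to.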
5. Optimization, Training, and Validation
Weighted binary cross-entropy is used to correct class imbalance. Optimization uses Adam with a tuned learning rate (∼1.3×10⁻³) and weight decay (≲10⁻⁴). Regularization is enforced via dropout (range: 0.1–0.4), with random shuffling of variant tokens as data augmentation to reinforce permutation invariance. Batch size is 32. Training proceeds for 30–50 epochs with early stopping on validation AUC.
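The loss and the shuffling augmentation can be sketched as follows; `weighted_bce` and `shuffle_variants` are hypothetical helpers illustrating the stated recipe, not the authors' training code:

```python
import numpy as np

def weighted_bce(p, y, w_pos, w_neg, eps=1e-7):
    """Class-weighted binary cross-entropy to counter label imbalance."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(w_pos * y * np.log(p) + w_neg * (1.0 - y) * np.log(1.0 - p))

def shuffle_variants(ids, mask, rng):
    """Augmentation: randomly permute each sample's valid (unpadded) tokens,
    leaving padding positions untouched."""
    out = ids.copy()
    for i in range(ids.shape[0]):
        idx = np.flatnonzero(mask[i])
        out[i, idx] = ids[i, rng.permutation(idx)]
    return out

rng = np.random.default_rng(4)
ids = np.array([[5, 9, 2, 0], [7, 3, 0, 0]])   # 0 = padding token
mask = ids > 0
shuffled = shuffle_variants(ids, mask, rng)
loss = weighted_bce(np.array([0.9, 0.2]), np.array([1.0, 0.0]), w_pos=3.0, w_neg=1.0)
```

Because the model is permutation-invariant by construction, shuffling changes the input tensor but not the target, making it a label-preserving augmentation.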
Validation is performed using hold-out or 5-fold cross-validation, with performance reported in accuracy, AUC, and F1 score. Demonstrated results include accuracy >95% and AUC ∼97% for RIF and RFB resistance (Boutorh et al., 25 Dec 2025).
6. Interpretability: Epistatic and Technical Attribution
VAMP-Net introduces two levels of interpretability:
- Attention Weight Analysis (SAB Path): Extraction and averaging of first-layer multi-head self-attention maps across heads and samples yields epistatic interaction graphs. This enables visualization of variant co-attendance and identification of network modules.
- Integrated Gradients (SAB Path): IG attributions quantify per-variant importance, $\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F(x' + \alpha(x - x'))}{\partial x_i}\, d\alpha$, where $x'$ is a zero (all-padding) baseline. Variants within rpoB consistently receive high attribution scores, corroborating known causal loci for rifampicin resistance.
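The IG computation can be illustrated on a toy logistic scorer with an analytic gradient (the real model's gradient would come from autodiff); the midpoint Riemann sum below approximates the path integral, and the completeness axiom, attributions summing to $F(x) - F(x')$, provides a sanity check:

```python
import numpy as np

def integrated_gradients(x, baseline, w, steps=256):
    """Integrated Gradients for a toy logistic scorer F(v) = sigmoid(w·v),
    whose gradient is F(v)(1 - F(v))·w; the path integral from the baseline
    to x is approximated with a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.zeros_like(x)
    for a in alphas:
        z = baseline + a * (x - baseline)
        s = 1.0 / (1.0 + np.exp(-(w @ z)))
        grads += s * (1.0 - s) * w
    return (x - baseline) * grads / steps

rng = np.random.default_rng(2)
w = rng.normal(size=6)
x = rng.normal(size=6)
x0 = np.zeros(6)                # zero (all-padding) baseline
attr = integrated_gradients(x, x0, w)
```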
- Gradient-Based Feature Attribution (CNN Path): Channel importance is assessed via the gradient of the output probability with respect to each input quality channel, $\partial p / \partial F_{:,c}$, aggregating mean absolute gradients globally. Ablation (zeroing out individual channels at test time and tracking AUC drop) identified FRS and GT_CONF_PERCENTILE as critical for accurate predictions.
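The ablation protocol can be sketched on synthetic data: zero one channel at a time and measure the drop in a rank-based AUC. The helper names and the toy scoring function are illustrative, not the paper's code:

```python
import numpy as np

def auc(scores, labels):
    """Rank-based AUC: probability that a positive outranks a negative."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

def channel_ablation(F, labels, score_fn):
    """Zero each quality channel in turn and record the resulting AUC drop."""
    base = auc(score_fn(F), labels)
    drops = []
    for c in range(F.shape[-1]):
        F_abl = F.copy()
        F_abl[:, c] = 0.0
        drops.append(base - auc(score_fn(F_abl), labels))
    return base, np.array(drops)

# Toy data: only channel 0 carries label signal, so zeroing it
# should produce the largest AUC drop.
rng = np.random.default_rng(3)
labels = rng.integers(0, 2, size=200)
F = rng.normal(size=(200, 4))
F[:, 0] += 3.0 * labels
base, drops = channel_ablation(F, labels, lambda M: M.sum(axis=-1))
```

In the paper's setting, the same loop over the eight FORMAT channels is what surfaced FRS and GT_CONF_PERCENTILE as the critical quality signals.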
7. Implementation Blueprint and Model Reproducibility
VAMP-Net’s design is specified to enable direct re-implementation. All relevant tensor shapes, computational sequence, and hyperparameters are provided. Pseudocode, as furnished in the original work, outlines preprocessing, transformer and CNN computation, fusion via amplification gating, and final MLP-based classification, detailed as:
```
tokens: B×N        # integer IDs
F:      B×N×8      # float32 quality
...
...
g       = sigmoid(z_CNN)              # B×D
z_fused = (1 + g) * z_SAB             # B×D
h       = ReLU(z_fused · W_h + b_h)   # B×H
p       = sigmoid(h · w_o + b_o)      # B×1
loss    = weighted_binary_cross_entropy(p, label)
```
This enables faithful reproduction of the reported results under the stated validation protocols (Boutorh et al., 25 Dec 2025).
8. Context and Significance
VAMP-Net establishes a reference paradigm for genomic drug resistance prediction in settings characterized by both epistatic genetic mechanisms and substantial technical variance in sequencing data quality. The explicit modeling of variant set structure coupled with per-variant technical confidence offers improved predictive performance and interpretability compared to CNN or MLP baselines. Attention-derived epistatic networks and integrated gradients provide direct biological insight, while confidence attributions from the CNN pathway yield actionable quality control for variant calls. This dual-layer interpretability is posited as necessary for robust, clinically-actionable deployment of resistance prediction in genomics (Boutorh et al., 25 Dec 2025).