VAMP-Net Architecture for Genomic Drug Resistance
- VAMP-Net is a deep learning architecture that predicts binary drug resistance by modeling complex epistatic interactions among genetic variants.
- It employs a dual-path framework: a set attention transformer to capture variant dependencies and a 1D-CNN to evaluate per-variant quality, fused via an amplification gate.
- The model offers dual-layer interpretability through attention mapping and integrated gradients, yielding insights into both biological drivers and technical confidence.
VAMP-Net (Variant-Aware Multi-Path Network) is a deep neural architecture designed for the prediction of binary drug resistance in Mycobacterium tuberculosis from genome-wide variant calls. It integrates permutation-invariant set attention with parallel confidence estimation from raw sequencing quality metrics. The architecture addresses two major challenges in genomic resistance prediction: modeling the epistatic interactions among genetic variants and accounting for the variable reliability of sequencing data. VAMP-Net couples a Set Attention Transformer for variant sets and a 1D convolutional network for per-variant confidence, uniting both in a late-fusion classification module. The framework achieves over 95% accuracy and area under the receiver operating characteristic curve (AUC ∼97%) for rifampicin and rifabutin resistance classification, and provides dual-layer interpretability into both biological and technical determinants of model outputs (Boutorh et al., 25 Dec 2025).
1. Dual-Path Network Architecture
VAMP-Net consists of two complementary computational pathways:
- Path-1: Variant-Aware Set Attention Transformer operates on the unordered set of variant tokens. It uses an embedding of each variant and stacks multiple permutation-invariant Set Attention Blocks (SABs) to capture epistatic, non-local dependencies among genomic loci.
- Path-2: Quality-Aware 1D-CNN ingests the quality associated with each variant as provided by Variant Call Format (VCF) FORMAT fields. This pathway applies a sequence of 1D convolutions and pooling on per-variant feature matrices to model the adaptive confidence of variant calls.
A late-fusion module adaptively combines the outputs by modulating the embedding from Path-1 with a sigmoid-gated vector from Path-2. The fused representation is then forwarded to a multi-layer perceptron (MLP) and a final sigmoid for resistance probability prediction.
2. Path-1: Set Attention Transformer for Variant Sets
Input Representation
Each sample is represented by a set of variants, encoded as strings (e.g. “761139_C>A”), tokenized with a pretrained BERT tokenizer and embedded into a matrix $X \in \mathbb{R}^{N \times d}$, where $N$ is the maximum number of variants per sample after padding and $d$ is the embedding dimension. A binary mask designates non-informative (padded) positions.
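A minimal sketch of this encoding step, replacing the pretrained BERT tokenizer with a toy whole-variant vocabulary; the `encode_variants` helper and its ID scheme are illustrative assumptions, not the paper's preprocessing:

```python
import numpy as np

def encode_variants(variant_sets, max_len):
    """Toy whole-variant vocabulary standing in for the pretrained BERT
    tokenizer: each distinct variant string gets one integer ID, 0 is padding."""
    vocab = {}
    def tok(v):
        if v not in vocab:
            vocab[v] = len(vocab) + 1   # 0 is reserved for padding
        return vocab[v]
    ids = np.zeros((len(variant_sets), max_len), dtype=np.int64)
    mask = np.zeros((len(variant_sets), max_len), dtype=bool)  # True = real variant
    for i, variants in enumerate(variant_sets):
        for j, v in enumerate(variants[:max_len]):
            ids[i, j] = tok(v)
            mask[i, j] = True
    return ids, mask

samples = [["761139_C>A", "761155_A>T"], ["761139_C>A"]]
ids, mask = encode_variants(samples, max_len=4)
```

The mask produced here is what the SAB path later uses to ignore padded positions.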
Set Attention Blocks (SABs)
The core block applies multi-head self-attention without positional encoding, $H = X + \mathrm{MultiHead}(X, X, X)$ followed by $\mathrm{SAB}(X) = H + \mathrm{FFN}(H)$, with FFN a two-layer position-wise network with ReLU activation; several such blocks are stacked.
SAB blocks are either fully unmasked (for strict permutation invariance) or apply the padding mask to maintain set-equivariance on valid tokens. After the SAB stack, features are pooled (mean or max) across the variant dimension to yield a fixed-size embedding $z_{\mathrm{SAB}} \in \mathbb{R}^{D}$.
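The SAB computation can be sketched in NumPy as follows. This single-head reduction with layer normalization omitted is illustrative, not the paper's implementation; the permutation-equivariance property it is meant to guarantee can be checked directly by permuting the valid tokens:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sab(X, mask, Wq, Wk, Wv, W1, b1, W2, b2):
    """Single-head Set Attention Block sketch: masked self-attention with a
    residual connection, then a ReLU feed-forward sublayer with a residual.
    Layer normalization and multiple heads are omitted for brevity."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # N×N attention logits, no positions
    scores[:, ~mask] = -1e9                   # padded keys get ~zero attention
    H = X + softmax(scores) @ V
    return H + np.maximum(H @ W1 + b1, 0.0) @ W2 + b2

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))                       # 5 variant tokens, 2 padded
mask = np.array([True, True, True, False, False])
Wq, Wk, Wv, W1, W2 = [rng.normal(size=(d, d)) * 0.1 for _ in range(5)]
out = sab(X, mask, Wq, Wk, Wv, W1, np.zeros(d), W2, np.zeros(d))
```

Because nothing in the block depends on token order, reordering the valid rows of `X` simply reorders the rows of `out`.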
3. Path-2: Quality-Aware 1D-CNN
Input Representation
For each variant, a vector of eight VCF FORMAT quality fields is assembled, yielding a per-sample matrix $F \in \mathbb{R}^{N \times 8}$, padded to match Path-1.
CNN Architecture and Confidence Scoring
$F$ is transposed to channels-first ($8 \times N$), then passes through a series of Conv1D → ReLU → MaxPool1D modules. Example configuration:
- Conv1D → ReLU → MaxPool1D($3$)
- Conv1D → ReLU → MaxPool1D($3$)
- Conv1D → ReLU → MaxPool1D($3$)
The final feature map is flattened and projected with a linear layer to $z_{\mathrm{CNN}} \in \mathbb{R}^{D}$, followed by dropout (0.15 typical). A sigmoid is applied, $g = \sigma(z_{\mathrm{CNN}}) \in (0, 1)^{D}$, interpreted as per-dimension adaptive confidence scores.
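A NumPy sketch of this pathway under assumed filter counts (16 and 32 channels, kernel size 3) and with dropout omitted; `confidence_gate` and all dimensions are illustrative stand-ins, not the published configuration:

```python
import numpy as np

def conv1d(x, w, b):
    """Valid 1D convolution: x is C_in×L, w is C_out×C_in×k, b is C_out."""
    c_out, _, k = w.shape
    L = x.shape[1] - k + 1
    out = np.empty((c_out, L))
    for t in range(L):
        out[:, t] = np.tensordot(w, x[:, t:t + k], axes=([1, 2], [0, 1])) + b
    return out

def maxpool1d(x, k):
    """Non-overlapping max pooling along the length axis."""
    L = x.shape[1] // k
    return x[:, :L * k].reshape(x.shape[0], L, k).max(axis=2)

def confidence_gate(F, conv_params, Wp, bp):
    """Quality path: stacked Conv1D -> ReLU -> MaxPool1D(3), flatten,
    linear projection, then a sigmoid gate with entries in (0, 1)."""
    x = F.T                                        # channels-first: 8×N
    for w, b in conv_params:
        x = maxpool1d(np.maximum(conv1d(x, w, b), 0.0), 3)
    z = x.reshape(-1) @ Wp + bp                    # dropout omitted in this sketch
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
F = rng.normal(size=(30, 8))                       # 30 variants × 8 quality fields
conv_params = [(rng.normal(size=(16, 8, 3)) * 0.1, np.zeros(16)),
               (rng.normal(size=(32, 16, 3)) * 0.1, np.zeros(32))]
g = confidence_gate(F, conv_params, rng.normal(size=(64, 8)) * 0.05, np.zeros(8))
```

The output `g` has one confidence score per fused embedding dimension, matching the gate dimensionality used in the fusion module.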
4. Fusion Mechanism and Classification Head
The fusion module combines the biological and confidence pathways. The optimal "Amplification" gate, as evaluated in comparative tests, is $z_{\mathrm{fused}} = (1 + g) \odot z_{\mathrm{SAB}}$, where $\odot$ denotes elementwise multiplication.
For classification, a two-layer feed-forward network applies $h = \mathrm{ReLU}(z_{\mathrm{fused}} W_h + b_h)$ and $p = \sigma(h \cdot w_o + b_o)$, producing a resistance probability $p \in (0, 1)$.
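The gate and head reduce to a few lines; the sketch below assumes row-vector conventions and illustrative dimensions ($D = 8$, $H = 4$):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classify(z_sab, g, W_h, b_h, w_o, b_o):
    """Amplification gate then the two-layer head:
    z_fused = (1 + g) * z_SAB, h = ReLU(z_fused·W_h + b_h), p = sigmoid(h·w_o + b_o)."""
    z_fused = (1.0 + g) * z_sab            # gate scales each dimension by 1..2
    h = np.maximum(z_fused @ W_h + b_h, 0.0)
    return sigmoid(h @ w_o + b_o)

rng = np.random.default_rng(1)
D, H = 8, 4
z_sab = rng.normal(size=D)
g = sigmoid(rng.normal(size=D))            # stand-in for the CNN confidence gate
p = classify(z_sab, g, rng.normal(size=(D, H)), np.zeros(H), rng.normal(size=H), 0.0)
```

Since $g \in (0,1)^D$, the gate can only amplify (never suppress) the biological embedding, which is the design choice the "Amplification" name refers to.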
5. Optimization, Training, and Validation
Weighted binary cross-entropy is used to correct class imbalance. Optimization uses Adam with a tuned learning rate (∼1.3×10⁻³) and weight decay (≲10⁻⁴). Regularization is enforced via dropout (range: 0.1–0.4), with random shuffling of variant tokens as data augmentation to reinforce permutation invariance. Batch size is 32. Training proceeds for 30–50 epochs with early stopping on validation AUC.
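The loss and the shuffling augmentation can be sketched as follows; `weighted_bce` and `shuffle_variants` are hypothetical helpers illustrating the stated recipe, not the authors' training code:

```python
import numpy as np

def weighted_bce(p, y, w_pos, w_neg, eps=1e-7):
    """Class-weighted binary cross-entropy to counter label imbalance."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(w_pos * y * np.log(p) + w_neg * (1.0 - y) * np.log(1.0 - p))

def shuffle_variants(ids, mask, rng):
    """Augmentation: randomly permute each sample's valid (unpadded) tokens,
    leaving padding positions untouched."""
    out = ids.copy()
    for i in range(ids.shape[0]):
        idx = np.flatnonzero(mask[i])
        out[i, idx] = ids[i, rng.permutation(idx)]
    return out

rng = np.random.default_rng(4)
ids = np.array([[5, 9, 2, 0], [7, 3, 0, 0]])   # 0 = padding token
mask = ids > 0
shuffled = shuffle_variants(ids, mask, rng)
loss = weighted_bce(np.array([0.9, 0.2]), np.array([1.0, 0.0]), w_pos=3.0, w_neg=1.0)
```

Because the model is permutation-invariant by construction, shuffling changes the input tensor but not the target, making it a label-preserving augmentation.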
Validation is performed using hold-out or 5-fold cross-validation, with performance reported in accuracy, AUC, and F1 score. Demonstrated results include accuracy >95% and AUC ∼97% for RIF and RFB resistance (Boutorh et al., 25 Dec 2025).
6. Interpretability: Epistatic and Technical Attribution
VAMP-Net introduces two levels of interpretability:
- Attention Weight Analysis (SAB Path): Extraction and averaging of first-layer multi-head self-attention maps across heads and samples yields epistatic interaction graphs. This enables visualization of variant co-attendance and identification of network modules.
- Integrated Gradients (SAB Path): IG attributions quantify per-variant importance, $\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F(x' + \alpha(x - x'))}{\partial x_i}\, d\alpha$, where $x'$ is a zero (all-padding) baseline. Variants within rpoB consistently receive high attribution scores, corroborating known causal loci for rifampicin resistance.
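The IG computation can be illustrated on a toy logistic scorer with an analytic gradient (the real model's gradient would come from autodiff); the midpoint Riemann sum below approximates the path integral, and the completeness axiom, attributions summing to $F(x) - F(x')$, provides a sanity check:

```python
import numpy as np

def integrated_gradients(x, baseline, w, steps=256):
    """Integrated Gradients for a toy logistic scorer F(v) = sigmoid(w·v),
    whose gradient is F(v)(1 - F(v))·w; the path integral from the baseline
    to x is approximated with a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.zeros_like(x)
    for a in alphas:
        z = baseline + a * (x - baseline)
        s = 1.0 / (1.0 + np.exp(-(w @ z)))
        grads += s * (1.0 - s) * w
    return (x - baseline) * grads / steps

rng = np.random.default_rng(2)
w = rng.normal(size=6)
x = rng.normal(size=6)
x0 = np.zeros(6)                # zero (all-padding) baseline
attr = integrated_gradients(x, x0, w)
```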
- Gradient-Based Feature Attribution (CNN Path): Channel importance is assessed via the gradient of the output probability with respect to each input quality channel, $\partial p / \partial F_{:,c}$, aggregating mean absolute gradients globally. Ablation (zeroing out individual channels at test time and tracking AUC drop) identified FRS and GT_CONF_PERCENTILE as critical for accurate predictions.
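The ablation protocol can be sketched on synthetic data: zero one channel at a time and measure the drop in a rank-based AUC. The helper names and the toy scoring function are illustrative, not the paper's code:

```python
import numpy as np

def auc(scores, labels):
    """Rank-based AUC: probability that a positive outranks a negative."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

def channel_ablation(F, labels, score_fn):
    """Zero each quality channel in turn and record the resulting AUC drop."""
    base = auc(score_fn(F), labels)
    drops = []
    for c in range(F.shape[-1]):
        F_abl = F.copy()
        F_abl[:, c] = 0.0
        drops.append(base - auc(score_fn(F_abl), labels))
    return base, np.array(drops)

# Toy data: only channel 0 carries label signal, so zeroing it
# should produce the largest AUC drop.
rng = np.random.default_rng(3)
labels = rng.integers(0, 2, size=200)
F = rng.normal(size=(200, 4))
F[:, 0] += 3.0 * labels
base, drops = channel_ablation(F, labels, lambda M: M.sum(axis=-1))
```

In the paper's setting, the same loop over the eight FORMAT channels is what surfaced FRS and GT_CONF_PERCENTILE as the critical quality signals.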
7. Implementation Blueprint and Model Reproducibility
VAMP-Net’s design is specified to enable direct re-implementation. All relevant tensor shapes, computational sequence, and hyperparameters are provided. Pseudocode, as furnished in the original work, outlines preprocessing, transformer and CNN computation, fusion via amplification gating, and final MLP-based classification, detailed as:
```
tokens: B×N        # integer IDs
F:      B×N×8      # float32 quality
...
...
g       = sigmoid(z_CNN)              # B×D
z_fused = (1 + g) * z_SAB             # B×D
h       = ReLU(z_fused · W_h + b_h)   # B×H
p       = sigmoid(h · w_o + b_o)      # B×1
loss    = weighted_binary_cross_entropy(p, label)
```
This enables faithful reproduction of the reported results under the stated validation protocols (Boutorh et al., 25 Dec 2025).
8. Context and Significance
VAMP-Net establishes a reference paradigm for genomic drug resistance prediction in settings characterized by both epistatic genetic mechanisms and substantial technical variance in sequencing data quality. The explicit modeling of variant set structure coupled with per-variant technical confidence offers improved predictive performance and interpretability compared to CNN or MLP baselines. Attention-derived epistatic networks and integrated gradients provide direct biological insight, while confidence attributions from the CNN pathway yield actionable quality control for variant calls. This dual-layer interpretability is posited as necessary for robust, clinically-actionable deployment of resistance prediction in genomics (Boutorh et al., 25 Dec 2025).