PLANET v2.0: Deep Learning for Binding Affinity
- PLANET v2.0 is a deep learning framework for protein–ligand binding affinity prediction using integrated 2D molecular graphs and 3D structural pocket information.
- It employs a 10-layer GNN, dual-stage pocket embedding, and cross-attention mechanisms to capture detailed atom–residue interactions with Gaussian mixture modeling.
- The model achieves state-of-the-art docking accuracy and millisecond inference speeds, making it ideal for ultra-large-scale virtual screening in drug discovery.
PLANET v2.0 is a deep learning-based protein–ligand binding affinity prediction framework designed to accelerate virtual screening in drug discovery by integrating 2D/3D molecular representations with mixture density network (MDN) modeling. It achieves state-of-the-art scoring, ranking, and docking power while maintaining millisecond-level inference speed, making it suitable for ultra-large-scale compound libraries. PLANET v2.0 addresses the limited protein–ligand contact map accuracy of earlier models by explicitly learning non-covalent interaction distributions and coupling them with a differentiable Gaussian mixture energy model for affinity estimation (Gao et al., 12 Jan 2026).
1. Model Architecture
PLANET v2.0 processes (a) the 2D molecular graph of the ligand and (b) the 3D atomic structure of the protein binding pocket (all residues within 12 Å of the ligand centroid). Its architecture is modular:
- Ligand Representation: 10-layer message-passing GNN, inheriting the L₁ layer structure from JT-VAE, encodes atom-level features.
- Pocket Embedding: A two-stage process:
  - A five-layer Distance Attention Network (DAN) aggregates local and global information across pocket residues.
  - Five Equivariant Graph Convolutional Layers (EGCL) further refine residue features under SO(3) equivariance constraints.
- Protein–Ligand Communication: Two cross-attention layers mediate bidirectional message passing between ligand atoms and pocket residues.
- Pairwise Interaction Modeling: For each atom–residue pair, the two embeddings are concatenated and passed to an MDN that predicts the parameters of one or more Gaussians describing distance distributions and energy couplings. The MDN outputs GMM parameters for both the interaction probability density and the pair “energy”.
The following table summarizes the forward workflow:
| Input Components | Processing Pipeline | Outputs |
|---|---|---|
| 2D Ligand Graph | 10-layer GNN | Atom Features |
| 3D Pocket Structure | DAN (5 layers) → EGCL (5 layers) | Residue Features |
| Atom–Residue Embeddings | 2-layer Cross-attention → MDN | GMM Parameters for the distance density and pair energy |
| Full Model Output | Aggregation over atom–residue pairs via MDN-to-affinity integration | Predicted Binding Affinity |
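The pocket definition used as input (residues within 12 Å of the ligand centroid) can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: it assumes each residue is represented by a single coordinate (for example its Cα atom), which the description above does not specify, and the function name `select_pocket_residues` is hypothetical.

```python
import numpy as np

def select_pocket_residues(residue_coords, ligand_coords, cutoff=12.0):
    """Return indices of residues within `cutoff` angstroms of the ligand centroid.

    Assumption: one representative coordinate per residue (e.g. C-alpha).
    """
    residue_coords = np.asarray(residue_coords, dtype=float)
    ligand_coords = np.asarray(ligand_coords, dtype=float)
    centroid = ligand_coords.mean(axis=0)
    dists = np.linalg.norm(residue_coords - centroid, axis=1)
    return np.flatnonzero(dists <= cutoff)

# Toy coordinates: the ligand centroid is (1, 0, 0).
ligand = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
residues = np.array([
    [1.0, 5.0, 0.0],    # 5 A from the centroid: kept
    [1.0, 0.0, 11.0],   # 11 A from the centroid: kept
    [14.0, 0.0, 0.0],   # 13 A from the centroid: dropped
])
pocket_idx = select_pocket_residues(residues, ligand)
```

Only the selected residues would then be passed on to the pocket embedding stage.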
2. Multi-Objective Loss and Supervised Tasks
PLANET v2.0 is trained end-to-end using a composite loss function with five terms, (a) through (e) below, that combine supervised objectives with auxiliary regularization:
- (a) Binding Affinity Regression: Minimizes the MSE between predicted and experimental binding affinities, expressed on the pK scale.
- (b) Ligand Distance Matrix Recovery: Encourages latent structure preservation by reconstructing which pairs of ligand atoms are within 4 Å in the crystal structure.
- (c) MDN Distance-Density Fitting: The log-likelihood of the observed atom–residue pairwise distance under the MDN-predicted mixture distribution is maximized.
- (d) Decoy Non-Interaction Regularization: A negative-density loss is imposed on ligand–protein pairs known to be inactive, enforcing low density at small distances.
- (e) Atom-Type Auxiliary Prediction: Node type (atom and residue) classifiers provide regularization to promote embedding robustness in deep GNNs.
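The total loss is a weighted sum of the five terms above:

$$\mathcal{L}_{\text{total}} = \lambda_a \mathcal{L}_a + \lambda_b \mathcal{L}_b + \lambda_c \mathcal{L}_c + \lambda_d \mathcal{L}_d + \lambda_e \mathcal{L}_e,$$

with $\lambda_a, \dots, \lambda_e$ the scalar weights of terms (a) through (e).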
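The weighted-sum structure of the loss can be sketched in NumPy. This is an illustration under stated assumptions rather than the published training code: the weight values, the dictionary keys, and the placeholder values for terms (b), (d), and (e) are all hypothetical; only the MSE term (a) and the mixture negative log-likelihood term (c) are actually computed here.

```python
import numpy as np

def mse(pred, target):
    """Term (a): mean squared error between predicted and experimental pK."""
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

def gmm_nll(distances, weights, means, sigmas):
    """Term (c): mean negative log-likelihood of observed distances under a GMM."""
    d = np.asarray(distances, dtype=float)[:, None]
    weights, means, sigmas = (np.asarray(a, dtype=float) for a in (weights, means, sigmas))
    comps = np.exp(-0.5 * ((d - means) / sigmas) ** 2) / (sigmas * np.sqrt(2.0 * np.pi))
    pdf = (weights * comps).sum(axis=1)
    return float(-np.mean(np.log(pdf + 1e-12)))  # small constant guards against log(0)

def total_loss(terms, weights):
    """Weighted sum over named loss terms."""
    return sum(weights[name] * value for name, value in terms.items())

terms = {
    "affinity": mse([6.2, 7.9], [6.0, 8.0]),
    "ligand_contacts": 0.30,   # placeholder value for term (b)
    "mdn_nll": gmm_nll([3.4, 4.1], [0.6, 0.4], [3.5, 4.5], [0.4, 0.6]),
    "decoy": 0.10,             # placeholder value for term (d)
    "node_type": 0.20,         # placeholder value for term (e)
}
# Hypothetical weights; the actual values are not given in the text above.
weights = {"affinity": 1.0, "ligand_contacts": 0.5, "mdn_nll": 1.0,
           "decoy": 0.5, "node_type": 0.1}
loss = total_loss(terms, weights)
```

Keeping the terms in a dictionary makes it easy to log each contribution separately, which helps when tuning the relative weights.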
3. Mixture Density Model and Affinity Determination
PLANET v2.0 synthesizes pairwise atom–residue interactions using GMMs predicted by the MDN:
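- Distance Density $p_{ij}(d)$: For ligand atom $i$ and pocket residue $j$, the MDN predicts a $K$-component Gaussian mixture over the pair distance $d$:

$$p_{ij}(d) = \sum_{k=1}^{K} \pi_{ij,k}\, \mathcal{N}\!\left(d \mid \mu_{ij,k}, \sigma_{ij,k}^{2}\right), \qquad \sum_{k=1}^{K} \pi_{ij,k} = 1$$

- Energy Coupling $e_{ij}(d)$: Modeled with a GMM whose coefficients are allowed to be positive or negative, capturing both favorable and unfavorable interactions.
- Affinity Calculation: The expected short-range energy for each pair, restricted to distances below a cutoff $d_c$ (in Å), is:

$$E_{ij} = \int_{0}^{d_c} p_{ij}(d)\, e_{ij}(d)\, \mathrm{d}d$$

The aggregate predicted binding free energy is the sum over all (atom, residue) pairs:

$$\Delta G_{\text{pred}} = \sum_{i} \sum_{j} E_{ij}$$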
This formulation enables the model to subsume both discrete contact-map and distance-based affinity information in a continuous, differentiable, and probabilistically interpretable manner.
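One possible reading of this formulation is sketched below: each pair's energy is the density-weighted energy profile integrated numerically up to a short-range cutoff, and the pair energies are then summed. The cutoff value, the mixture parameters, and all function names are assumptions made for illustration; the text above does not fix them.

```python
import numpy as np

def gmm_pdf(d, weights, means, sigmas):
    """Normalized Gaussian mixture density evaluated at distances d."""
    d = np.asarray(d, dtype=float)[..., None]
    weights, means, sigmas = (np.asarray(a, dtype=float) for a in (weights, means, sigmas))
    comps = np.exp(-0.5 * ((d - means) / sigmas) ** 2) / (sigmas * np.sqrt(2.0 * np.pi))
    return (weights * comps).sum(axis=-1)

def signed_gaussian_sum(d, amplitudes, means, sigmas):
    """Energy profile: Gaussians with signed amplitudes (negative = favorable)."""
    d = np.asarray(d, dtype=float)[..., None]
    amplitudes, means, sigmas = (np.asarray(a, dtype=float) for a in (amplitudes, means, sigmas))
    return (amplitudes * np.exp(-0.5 * ((d - means) / sigmas) ** 2)).sum(axis=-1)

def trapezoid(y, x):
    """Trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def pair_energy(density, energy, cutoff, n_grid=2001):
    """Integrate density * energy over distances from 0 to the cutoff."""
    grid = np.linspace(0.0, cutoff, n_grid)
    return trapezoid(gmm_pdf(grid, *density) * signed_gaussian_sum(grid, *energy), grid)

CUTOFF = 5.0  # hypothetical short-range cutoff in angstroms

# Illustrative parameters as (weights or amplitudes, means, sigmas).
density = ([0.7, 0.3], [3.5, 6.0], [0.5, 1.0])
favorable = ([-1.0], [3.5], [0.6])    # negative amplitude lowers the energy
unfavorable = ([0.5], [3.0], [0.6])   # positive amplitude raises it

pair_energies = [pair_energy(density, e, CUTOFF) for e in (favorable, unfavorable)]
total_energy = sum(pair_energies)     # sum over all (atom, residue) pairs
```

Because every step is a smooth function of the mixture parameters, the same computation written in an autodiff framework would let gradients flow from the affinity loss back into the MDN.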
4. Data Representation and Contact Modeling
- Ligand: Atom-level features are one-hot encoded: chemical type (C, N, O, S, P, halogen), degree, charge, hybridization, and aromaticity. Edge features encode bond order, conjugation, ring membership, and stereochemistry.
- Pocket: Each residue node includes the BLOSUM62 profile (20D) plus two 15D RBF-encoded features, one for the backbone distance and one for the side-chain distance to the pocket center, yielding a feature size of 20 + 2 × 15 = 50D.
- Contact Maps: No explicit input contact map is used; instead, the learned distance density operates as a “soft” contact map, feeding directly into affinity integration.
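The 50D residue feature described in this list can be assembled as in the sketch below. The RBF centers and width are assumptions (the text gives only the dimensionality), the BLOSUM62 row is replaced by a zero placeholder rather than real substitution scores, and the function names are hypothetical.

```python
import numpy as np

# Hypothetical RBF grid: 15 centers spaced 1 A apart, with unit width.
RBF_CENTERS = np.linspace(0.0, 14.0, 15)
RBF_GAMMA = 1.0

def rbf_encode(distance, centers=RBF_CENTERS, gamma=RBF_GAMMA):
    """Expand a scalar distance into Gaussian radial basis activations."""
    return np.exp(-gamma * (distance - centers) ** 2)

def residue_features(blosum_row, backbone_dist, sidechain_dist):
    """Concatenate 20D profile + 15D backbone RBF + 15D side-chain RBF."""
    blosum_row = np.asarray(blosum_row, dtype=float)
    return np.concatenate([
        blosum_row,
        rbf_encode(backbone_dist),
        rbf_encode(sidechain_dist),
    ])

placeholder_profile = np.zeros(20)  # stand-in for a real BLOSUM62 row
features = residue_features(placeholder_profile, backbone_dist=6.2, sidechain_dist=3.0)
```

An RBF expansion turns one distance into a smooth, localized vector, which is generally easier for a network to use than a raw scalar.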
5. Dataset Preparation and Training Protocol
- Training Data: 22,920 protein–ligand complexes from PDBbind v2021 (“general set”).
- Test Set: 285 complexes from CASF-2016.
- Deduplication: To prevent overlap, proteins are clustered at 90% sequence identity using CD-HIT; within each cluster, ligand pairs with ECFP4 Tanimoto similarity > 0.9 are removed (400 complexes in total).
- Dataset Splits: ~18,455 training and ~4,614 validation complexes.
- Decoy Augmentation: Approx. 1,000,000 ChEMBL molecules per target, Lipinski-filtered, Tanimoto < 0.2 to actives.
- Optimization: Adam optimizer with an exponentially decaying learning rate (multiplied by 0.9 every 50,000 batches); the total loss includes all task terms.
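The two Tanimoto thresholds in this protocol (> 0.9 for near-duplicate ligands, < 0.2 for decoys relative to actives) can be illustrated with fingerprints represented as sets of on-bit indices. This is a simplified sketch: real ECFP4 fingerprints would come from a cheminformatics toolkit, and the helper names here are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def is_near_duplicate(fp_a, fp_b, threshold=0.9):
    """Deduplication rule: ligand pairs above the threshold count as redundant."""
    return tanimoto(fp_a, fp_b) > threshold

def is_valid_decoy(candidate_fp, active_fps, threshold=0.2):
    """Decoy rule: the candidate must be dissimilar to every known active."""
    return all(tanimoto(candidate_fp, fp) < threshold for fp in active_fps)

# Toy fingerprints (bit indices are arbitrary).
active = set(range(1, 11))            # bits 1..10
close_analog = set(range(1, 12))      # bits 1..11, similarity 10/11
unrelated = {100, 101, 102}

duplicate_flag = is_near_duplicate(active, close_analog)
decoy_flag = is_valid_decoy(unrelated, [active])
```

In this toy case the close analog is flagged as a near-duplicate and the unrelated molecule qualifies as a decoy.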
6. Computational Performance and Benchmarking
CASF-2016 Results
| Model | RMSE (pK) | Pearson | Spearman | Docking Top-1 (%) | Docking Top-3 (%) |
|---|---|---|---|---|---|
| PLANET v2.0 | 1.171 | 0.848 | 0.669 | 85.2 | 97.2 |
| PLANET v1.0 | 1.247 | 0.824 | 0.666 | — | — |
| Glide SP | 1.89 | 0.513 | 0.419 | — | — |
| AutoDock Vina | 1.73 | 0.604 | — | — | — |
- Screening Power (LIT-PCBA, mean AUROC over 15 targets): Glide SP 0.536; PLANET v1.0 0.556; PLANET v2.0 0.576.
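The scoring-power columns of the CASF-2016 table (RMSE, Pearson, Spearman) can be computed from paired predicted and experimental pK values as sketched below. The data are made up for illustration, and the rank function is a simplification that does not average tied ranks.

```python
import numpy as np

def rmse(pred, target):
    """Root-mean-square error between predictions and targets."""
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def pearson(x, y):
    """Pearson correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def ranks(values):
    """Ordinal ranks (0 = smallest); ties are not averaged in this sketch."""
    order = np.argsort(values)
    result = np.empty(len(values), dtype=float)
    result[order] = np.arange(len(values))
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up pK values for four complexes.
predicted = [5.0, 6.5, 7.0, 9.0]
experimental = [5.5, 6.0, 7.5, 8.5]

metrics = {
    "rmse": rmse(predicted, experimental),
    "pearson": pearson(predicted, experimental),
    "spearman": spearman(predicted, experimental),
}
```

RMSE measures absolute error in pK units, while the two correlations capture how well the predicted ordering and linear trend match experiment.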
Ultra-Large-Scale Virtual Screening
- Benchmark using modified LIT-PCBA with 19 million decoys:
  - PLANET v2.0: AUROC = 0.463; EF₁% = 1.115
  - + Lipinski filter: AUROC = 0.501; EF₁% = 3.627
  - + Heavy-atom normalization: AUROC = 0.522; EF₁% = 4.001
- Runtime: On a single AMD EPYC 9654 with an NVIDIA RTX 4090, PLANET v2.0 scored 20 million complexes in about 10 hours (roughly 1.5 to 1.8 ms per complex), substantially faster than traditional docking.
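The enrichment factor and the per-complex runtime quoted in this benchmark follow from simple arithmetic, sketched below. The toy library is invented for illustration, and the EF definition used (hit rate in the top-scoring fraction divided by the overall hit rate) is the common convention, assumed here rather than taken from the source.

```python
import numpy as np

def enrichment_factor(scores, labels, fraction=0.01):
    """Hit rate among the top-scoring fraction divided by the overall hit rate."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    n_top = max(1, int(round(fraction * len(scores))))
    top = np.argsort(-scores)[:n_top]          # indices of the highest scores
    return float(labels[top].mean() / labels.mean())

def ms_per_item(total_hours, n_items):
    """Average wall-clock cost per scored complex, in milliseconds."""
    return total_hours * 3600.0 * 1000.0 / n_items

# Toy library: 1000 compounds, 10 actives, 5 of them ranked in the top 1%.
scores = np.arange(1000, 0, -1)                # index 0 has the best score
labels = np.zeros(1000)
labels[[0, 1, 2, 3, 4, 500, 501, 502, 503, 504]] = 1.0

ef1 = enrichment_factor(scores, labels, fraction=0.01)
cost_ms = ms_per_item(total_hours=10, n_items=20_000_000)
```

An EF₁% of 1 corresponds to random selection, so the reported values between about 1 and 4 indicate modest enrichment on this very large decoy set.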
7. Recommendations for Virtual Screening Integration
- Prefiltering: Employ PLANET v2.0 as an ultra-fast prefilter after elementary druglikeness filters (Lipinski, Veber) in very large compound libraries.
- Normalization: Reduce ligand size bias by either dividing predicted affinity by heavy-atom count or leveraging ligand-efficiency thresholds.
- Rescoring: For higher accuracy, rerank top candidates with physics-based methods (e.g., FEP+).
- Pose Prediction: The MDN-derived score can be used for pose discrimination, reaching 85.2% top-1 success (RMSD < 2 Å) on CASF-2016 without explicit docking.
- Retraining: Regular updates with new PDBbind releases are recommended; performance declines modestly on PDBbind 2024, indicating the benefit of periodic (for example, annual) retraining.
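The prefiltering and normalization recommendations above can be combined into a small ranking routine. This is a sketch under assumptions: the compound records, field names, and function names are hypothetical, the Lipinski check is the strict form with no violations allowed, and the predicted pK values stand in for scores that would come from the model.

```python
def passes_lipinski(mw, logp, hbd, hba):
    """Strict rule-of-five check (no violations allowed in this sketch)."""
    return mw <= 500 and logp <= 5 and hbd <= 5 and hba <= 10

def ligand_efficiency(predicted_pk, heavy_atoms):
    """Size-normalized score: predicted pK per heavy atom."""
    return predicted_pk / heavy_atoms

def prefilter_and_rank(compounds, top_k):
    """Drop non-druglike compounds, then rank the rest by ligand efficiency."""
    kept = [c for c in compounds
            if passes_lipinski(c["mw"], c["logp"], c["hbd"], c["hba"])]
    kept.sort(key=lambda c: ligand_efficiency(c["pred_pk"], c["heavy_atoms"]),
              reverse=True)
    return kept[:top_k]

# Hypothetical library entries with made-up properties and predicted pK values.
library = [
    {"name": "A", "mw": 560, "logp": 4.2, "hbd": 2, "hba": 8, "heavy_atoms": 40, "pred_pk": 8.0},
    {"name": "B", "mw": 300, "logp": 2.1, "hbd": 1, "hba": 4, "heavy_atoms": 20, "pred_pk": 7.0},
    {"name": "C", "mw": 420, "logp": 3.5, "hbd": 2, "hba": 6, "heavy_atoms": 30, "pred_pk": 7.5},
    {"name": "D", "mw": 220, "logp": 1.0, "hbd": 1, "hba": 3, "heavy_atoms": 15, "pred_pk": 6.0},
]
shortlist = [c["name"] for c in prefilter_and_rank(library, top_k=2)]
```

Compound A has the highest raw score but is removed by the druglikeness filter, and the smaller compounds D and B rise to the top once scores are normalized by size; the shortlist would then go on to physics-based rescoring.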
In summary, PLANET v2.0 systematically advances protein–ligand affinity modeling via integration of deep graph embeddings, continuous interaction distributions, and a multi-task training regime. This framework yields specific improvements in speed and predictive fidelity, facilitating its deployment as a pragmatic tool for modern virtual screening workflows across both academic and industrial settings (Gao et al., 12 Jan 2026).