Fidelity-Aware Feature Modulation (FAFM) in GeoOpt-Net

Updated 6 February 2026
  • Fidelity-Aware Feature Modulation (FAFM) is a technique that modulates feature vectors through learnable scale and shift parameters to adapt network outputs to high-fidelity quantum chemical conditions.
  • It integrates seamlessly with a multi-branch SE(3)-equivariant network architecture, combining radial, angular, and dihedral features for precise molecular geometry refinement.
  • Within GeoOpt-Net, FAFM helps accelerate quantum-chemical workflows: geometries refined from inexpensive force-field conformers need roughly half as many DFT optimization steps as baseline starting structures (about 14 versus 30–35).

GeoOpt-Net is a multi-branch SE(3)-equivariant deep learning architecture designed for rapid and accurate refinement of molecular geometries, targeting single-shot prediction of density functional theory (DFT)-quality structures at the B3LYP/TZVP level directly from inexpensive, force-field-generated starting conformers. Implemented as an integrated graph neural operator with fidelity-aware calibration, GeoOpt-Net enables high-throughput, physically consistent geometry preparation for downstream quantum chemical workflows, substantially accelerating the pre-DFT optimization process without compromising on energetic or structural fidelity (Liu et al., 30 Jan 2026).

1. SE(3)-Equivariant Multi-Branch Network Architecture

GeoOpt-Net accepts as input a molecular graph $G = (V, E)$ and an initial coordinate matrix $R_\text{initial}$, such as those produced by RDKit’s ETKDG+MMFF94 pipeline. Its architecture features three explicit message-passing streams, each encoding geometric invariants of a different order:

  • 2-body stream: Processes bond lengths $r_{ij}$.
  • 3-body stream: Encodes angles $\theta_{ijk}$.
  • 4-body stream: Encodes dihedrals $\phi_{ijkl}$.
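The three invariants that feed the streams can be computed directly from Cartesian coordinates. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not taken from the GeoOpt-Net code, and the dihedral sign follows the common IUPAC-style convention, which is an assumption here.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def bond_length(ri, rj):
    """2-body invariant r_ij: distance between atoms i and j."""
    return norm(sub(rj, ri))

def bond_angle(ri, rj, rk):
    """3-body invariant theta_ijk: angle at atom j, in degrees."""
    u, v = sub(ri, rj), sub(rk, rj)
    c = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def dihedral(ri, rj, rk, rl):
    """4-body invariant phi_ijkl: signed torsion about the j-k bond, in degrees."""
    b1, b2, b3 = sub(rj, ri), sub(rk, rj), sub(rl, rk)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    b2_hat = tuple(x / norm(b2) for x in b2)
    x = dot(n1, n2)
    y = dot(cross(b2_hat, n1), n2)
    return math.degrees(math.atan2(y, x))

# Four atoms forming a right-angled chain.
atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
r01 = bond_length(atoms[0], atoms[1])
theta012 = bond_angle(atoms[0], atoms[1], atoms[2])
phi0123 = dihedral(*atoms)
```

All three quantities depend only on relative positions, so they are unchanged when the whole molecule is translated or rotated.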

Scalar ($\ell=0$) features are represented using radial basis expansions of distances, while directional ($\ell\geq1$) features use real spherical harmonics $Y^{(\ell)}(\hat r_{ij})$ to represent geometric orientation. These features are combined in each stream by Clebsch–Gordan projections:

$$m_{ij}^{(\ell)} = \sum_{\ell_1, \ell_2} \left[ h_i^{(\ell_1)} \otimes Y^{(\ell_2)}(\hat r_{ij}) \right]_{CG} \cdot \phi(r_{ij})$$

where $\otimes$ denotes the tensor product, the subscript $CG$ denotes Clebsch–Gordan projection, and $\phi(r)$ is a learnable radial filter.
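To make the message formula concrete, the sketch below works through its simplest term, $\ell_1 = 0$ and $\ell_2 = 1$, where the Clebsch–Gordan projection reduces to multiplying a scalar feature by a vector. Several choices are assumptions made for illustration only: a Gaussian radial basis, fixed weights standing in for the learnable filter $\phi$, and the $\ell = 1$ harmonics taken as the unit-vector components (their form up to normalization and component ordering).

```python
import math

def gaussian_rbf(r, centers, width=0.5):
    """Radial basis expansion of a distance (Gaussian basis is an assumption)."""
    return [math.exp(-((r - c) / width) ** 2) for c in centers]

def radial_filter(r, weights, centers):
    """Stand-in for the learnable filter phi(r): a linear map over the basis."""
    return sum(w * b for w, b in zip(weights, gaussian_rbf(r, centers)))

def y1(rvec):
    """l = 1 real spherical harmonics, up to normalization: the unit vector."""
    r = math.sqrt(sum(x * x for x in rvec))
    return [x / r for x in rvec]

def message_l1(h_scalar, rvec, weights, centers):
    """l1 = 0, l2 = 1 term of the message: scalar feature times Y^(1) times phi(r)."""
    r = math.sqrt(sum(x * x for x in rvec))
    phi = radial_filter(r, weights, centers)
    return [h_scalar * phi * y for y in y1(rvec)]

def rotate_z(v, angle):
    c, s = math.cos(angle), math.sin(angle)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1], v[2]]

centers = [0.5, 1.0, 1.5, 2.0]     # hypothetical basis centers (angstrom)
weights = [0.2, -0.4, 0.7, 0.1]    # hypothetical filter weights
r_ij = [0.8, -0.3, 1.1]

m_then_rotate = rotate_z(message_l1(1.5, r_ij, weights, centers), 0.7)
rotate_then_m = message_l1(1.5, rotate_z(r_ij, 0.7), weights, centers)
```

Rotating the input edge vector and then computing the message gives the same result as computing the message and then rotating it, which is the equivariance property the full tensor-product construction preserves for higher $\ell$.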

Nonlinearities (e.g., GELU) and LayerNorm are applied strictly to scalar channels, keeping vector channels linearly updated and gated for equivariance. The three streams’ equivariant embeddings are fused via a lightweight Transformer decoder to yield a global latent $F_\theta(G, R_\text{initial}, d)$. The refined geometry is given by

$$R_\text{refined} = R_\text{initial} + F_\theta(G, R_\text{initial}, d)$$

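Because the predicted displacement is built from equivariant features, it rotates with the input and is unaffected by translations, so any SE(3) transformation applied to the input is applied identically to the output.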
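The equivariance of the residual update can be checked numerically. The sketch below replaces the network with a toy displacement (each atom moves a small fraction toward the centroid), chosen only because it rotates with the input and ignores translations; it is a stand-in for illustration, not the GeoOpt-Net predictor.

```python
import math

def toy_displacement(coords, alpha=0.1):
    """Hypothetical stand-in for F_theta: pull each atom toward the centroid."""
    n = len(coords)
    centroid = [sum(r[k] for r in coords) / n for k in range(3)]
    return [[alpha * (centroid[k] - r[k]) for k in range(3)] for r in coords]

def refine(coords):
    """Residual update: R_refined = R_initial + F(R_initial)."""
    disp = toy_displacement(coords)
    return [[r[k] + d[k] for k in range(3)] for r, d in zip(coords, disp)]

def transform(coords, angle, shift):
    """Apply an SE(3) element: rotation about z followed by a translation."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]]
            for x, y, z in coords]

def max_deviation(a, b):
    return max(abs(p[k] - q[k]) for p, q in zip(a, b) for k in range(3))

r_initial = [[0.0, 0.0, 0.0], [1.5, 0.1, -0.2], [2.1, 1.3, 0.4], [3.4, 1.2, 1.0]]
angle, shift = 1.1, [2.0, -3.0, 0.5]

refine_after_transform = refine(transform(r_initial, angle, shift))
transform_after_refine = transform(refine(r_initial), angle, shift)
deviation = max_deviation(refine_after_transform, transform_after_refine)
```

The two orderings agree to floating-point precision, so transforming the input conformer and then refining it is the same as refining first and transforming afterwards.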

2. Fidelity-Aware Feature Modulation (FAFM) and Two-Stage Training

GeoOpt-Net incorporates Fidelity-Aware Feature Modulation (FAFM) to inject theory- and basis-set-specific responses. Each hidden feature vector $h$ within the message-passing layers is modulated:

$$\tilde h = h \odot (1 + g_d) + b_d$$

with $d$ a one-hot domain embedding (e.g., “6-31G(2df,p)” or “TZVP”), and $g_d$, $b_d$ as learnable scale and shift vectors, respectively. FAFM enables rapid re-calibration to higher-fidelity conditions via:

  • Stage 1 (Pre-training): The network is trained on $\sim$290k molecules from QM9+QM40 at B3LYP/6-31G(2df,p), with $g_d = b_d = 0$.
  • Stage 2 (Fine-tuning): Weights are warm-started; FAFM is turned on for “TZVP” (high-fidelity), optimizing $g_d$, $b_d$, and the output layers on $\sim$180k molecules (QMe14S dataset at B3LYP/TZVP). Only these parameters are updated.

This mechanism allows efficient specialization without full retraining, capturing systematic shifts required by a larger basis set while preserving generalizability.
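The modulation and the two-stage schedule can be sketched in a few lines. This is a minimal illustration under stated assumptions: the class name, the dictionary lookup standing in for the one-hot embedding $d$, the example gain and shift values, and the parameter-group names are all hypothetical and not taken from the GeoOpt-Net implementation.

```python
class FAFM:
    """Per-domain scale and shift: h_tilde = h * (1 + g_d) + b_d."""

    def __init__(self, domains, dim):
        # Zero initialization makes the layer an identity map, which matches
        # the Stage 1 setting g_d = b_d = 0.
        self.gain = {name: [0.0] * dim for name in domains}
        self.shift = {name: [0.0] * dim for name in domains}

    def __call__(self, h, domain):
        # Selecting by name plays the role of multiplying by a one-hot d.
        g, b = self.gain[domain], self.shift[domain]
        return [x * (1.0 + gi) + bi for x, gi, bi in zip(h, g, b)]

def trainable_groups(stage):
    """Hypothetical parameter-group names updated in each training stage."""
    if stage == 1:
        return ["backbone", "output_layers"]
    if stage == 2:
        return ["fafm_gain", "fafm_shift", "output_layers"]
    raise ValueError("stage must be 1 or 2")

fafm = FAFM(["6-31G(2df,p)", "TZVP"], dim=2)
h = [1.0, 2.0]

low_fidelity = fafm(h, "6-31G(2df,p)")   # identity while g_d = b_d = 0

# Pretend Stage 2 fine-tuning learned these values for the TZVP domain.
fafm.gain["TZVP"] = [0.1, 0.1]
fafm.shift["TZVP"] = [0.5, 0.5]
high_fidelity = fafm(h, "TZVP")
```

Writing the scale as $1 + g_d$ means a freshly added domain starts from the pre-trained behavior, so fine-tuning only has to learn the correction for the larger basis set.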

3. Training Protocols and Loss Functions

The loss function is a composite over multiple geometric targets:

$$L = L_\text{rmsd} + \lambda_b L_\text{bond} + \lambda_a L_\text{angle} + \lambda_d L_\text{dihedral} + \lambda_r L_\text{bond\_range}$$

With definitions:

  • $L_\text{rmsd}$: Root-mean-square deviation between predicted and reference coordinates.
  • $L_\text{bond}, L_\text{angle}, L_\text{dihedral}$: MSE for bond lengths, angles, dihedrals.
  • $L_\text{bond\_range}$: Soft bond-range constraint via softplus penalties outside physically plausible intervals.
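A composite loss of this shape might be assembled as follows. The sketch makes several assumptions that the source does not specify: the weights $\lambda$ default to 1, the plausible bond interval is 0.9–1.8 Å, the softplus uses a sharpness factor, dihedral differences are wrapped into $[-180°, 180°)$, and the RMSD is computed without prior alignment.

```python
import math

def softplus(x, beta=10.0):
    """Numerically stable log(1 + exp(beta * x)) / beta."""
    z = beta * x
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta

def rmsd(pred, ref):
    """Root-mean-square deviation over atoms (no alignment step)."""
    sq = sum((p[k] - r[k]) ** 2 for p, r in zip(pred, ref) for k in range(3))
    return math.sqrt(sq / len(pred))

def mse(pred, ref):
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred)

def wrapped_mse(pred_deg, ref_deg):
    """MSE for periodic angles, with differences wrapped into [-180, 180)."""
    diffs = [((p - r + 180.0) % 360.0) - 180.0 for p, r in zip(pred_deg, ref_deg)]
    return sum(d * d for d in diffs) / len(diffs)

def bond_range_penalty(lengths, lo=0.9, hi=1.8):
    """Soft penalty that grows once a bond leaves the interval [lo, hi]."""
    return sum(softplus(lo - d) + softplus(d - hi) for d in lengths) / len(lengths)

def composite_loss(pred, ref, lam_b=1.0, lam_a=1.0, lam_d=1.0, lam_r=1.0):
    return (rmsd(pred["coords"], ref["coords"])
            + lam_b * mse(pred["bonds"], ref["bonds"])
            + lam_a * mse(pred["angles"], ref["angles"])
            + lam_d * wrapped_mse(pred["dihedrals"], ref["dihedrals"])
            + lam_r * bond_range_penalty(pred["bonds"]))

reference = {
    "coords": [[0.0, 0.0, 0.0], [1.4, 0.0, 0.0]],
    "bonds": [1.4],
    "angles": [109.5],
    "dihedrals": [179.0],
}
prediction = {
    "coords": [[1.0, 0.0, 0.0], [2.4, 0.0, 0.0]],
    "bonds": [1.4],
    "angles": [109.5],
    "dihedrals": [-179.0],
}
loss_value = composite_loss(prediction, reference)
```

In this toy case the prediction is a rigid shift of the reference, so only the coordinate term and the wrapped dihedral term contribute meaningfully; the internal-coordinate terms are what keep the loss sensitive to local geometry rather than overall placement.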

Optimization uses AdamW (lr $= 10^{-3}$), batch size 64, with multistep learning rate decay and gradient clipping. Implementation is in PyTorch + e3nn for equivariant operations, with a custom Transformer decoder.
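The two schedule ingredients named here, multistep decay and gradient clipping, are easy to state in plain Python. The milestones, decay factor, and clipping threshold below are placeholders chosen for illustration; the source gives only the base learning rate.

```python
import math

def multistep_lr(base_lr, epoch, milestones, gamma=0.1):
    """Learning rate after multiplying by gamma at each milestone reached."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed

def clip_by_global_norm(grads, max_norm):
    """Rescale a flat gradient list so its L2 norm does not exceed max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, max_norm / (total + 1e-12))
    return [g * scale for g in grads]

base_lr = 1e-3                # from the text
milestones = [30, 60]         # hypothetical decay epochs
schedule = [multistep_lr(base_lr, e, milestones) for e in (0, 29, 30, 60)]
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

In a PyTorch training loop the same roles would typically be filled by the framework's own optimizer, scheduler, and clipping utilities rather than hand-written functions.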

4. Geometric, Energetic, and Electronic Performance

GeoOpt-Net achieves sub-milli-Å all-atom RMSD for most molecules in the ZINC20 test set ($N=1000$), with the $\log_{10}(\text{RMSD})$ distribution sharply peaked at $-4$. Baseline methods (UMA, xTB, Auto3D, RDKit) show broader distributions between 0.1 and 1 Å. Single-point energy deviations at B3LYP/TZVP are centered near zero for GeoOpt-Net ($\sigma < 0.05$ kcal/mol), compared to multi-kcal/mol errors for baselines.

Error decomposition:

| Metric        | GeoOpt-Net | Best baseline range |
|---------------|------------|---------------------|
| Bonds (Å)     | $10^{-4}$  | 0.01–0.05           |
| Angles (°)    | < 0.05     | 0.5–2               |
| Dihedrals (°) | $\sim$0.1  | 5–30                |

Dipole moments ($\mu$, Debye) at B3LYP/TZVP are preserved (GeoOpt-Net: 3.167 D vs. reference: 3.165 D; baselines deviate by $\sim$0.37–0.5 D).

5. DFT Convergence and Workflow Acceleration

GeoOpt-Net’s refined geometries satisfy 40–58% of individual DFT convergence criteria (max force, RMS force, max displacement, RMS displacement) versus $\sim$0% for baselines. The “All-YES” convergence rate (satisfying all four) is 65.0% under loose and 33.4% under default criteria; UMA, xTB, Auto3D, and RDKit attain 0%. Using GeoOpt-Net as a pre-DFT guess reduces the average number of DFT geometry optimization steps by $\sim$50% (GeoOpt-Net: $\sim$14 vs. $\sim$30–35 for baselines), leading to wall-clock time speedups of 2×–2.5×. This streamlines quantum-chemical workflows and reduces failure rates and manual intervention.
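The distinction between per-criterion and “All-YES” rates can be made explicit with a small checker. The threshold values below are placeholders in atomic units, patterned on commonly used optimizer defaults; the source does not list the thresholds it applied, and the sample molecules are made up for illustration.

```python
CRITERIA = ("max_force", "rms_force", "max_disp", "rms_disp")

# Placeholder thresholds (atomic units); not taken from the source.
DEFAULT_THRESHOLDS = {
    "max_force": 4.5e-4,
    "rms_force": 3.0e-4,
    "max_disp": 1.8e-3,
    "rms_disp": 1.2e-3,
}

def check_criteria(values, thresholds=DEFAULT_THRESHOLDS):
    """Return a YES/NO flag (True/False) for each of the four criteria."""
    return {name: values[name] <= thresholds[name] for name in CRITERIA}

def is_all_yes(values, thresholds=DEFAULT_THRESHOLDS):
    return all(check_criteria(values, thresholds).values())

def per_criterion_rates(batch, thresholds=DEFAULT_THRESHOLDS):
    """Fraction of molecules passing each individual criterion."""
    return {name: sum(check_criteria(v, thresholds)[name] for v in batch) / len(batch)
            for name in CRITERIA}

def all_yes_rate(batch, thresholds=DEFAULT_THRESHOLDS):
    """Fraction of molecules passing all four criteria at once."""
    return sum(is_all_yes(v, thresholds) for v in batch) / len(batch)

# Made-up first-step diagnostics for three molecules.
batch = [
    {"max_force": 2.0e-4, "rms_force": 1.0e-4, "max_disp": 9.0e-4, "rms_disp": 5.0e-4},
    {"max_force": 3.0e-4, "rms_force": 2.0e-4, "max_disp": 4.0e-3, "rms_disp": 2.0e-3},
    {"max_force": 9.0e-3, "rms_force": 4.0e-3, "max_disp": 5.0e-2, "rms_disp": 2.0e-2},
]
rates = per_criterion_rates(batch)
overall = all_yes_rate(batch)
```

Because a structure must pass every criterion to count as “All-YES”, that rate can never exceed the lowest individual pass rate evaluated under the same thresholds.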

6. Scalability, Robustness, and Practical Considerations

GeoOpt-Net generalizes robustly to drug-like molecules with up to 20 rotatable bonds and 40 heavy atoms, maintaining $\Delta E < 0.1$ kcal/mol, while baseline errors grow to several kcal/mol. The network’s SE(3)-equivariant design ensures geometric and energetic consistency under spatial symmetry operations. The method is implemented for neutral, closed-shell organic molecules; extension to open-shell or transition-metal systems would necessitate further training data. Memory and compute demands scale with the angular cutoff $\ell_{\max}$ and molecule size, which may become nontrivial for $>50$ heavy atoms.
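A rough sense of how cost grows with $\ell_{\max}$ comes from counting components. Each order $\ell$ carries $2\ell + 1$ components, so one feature channel spanning $\ell = 0, \dots, \ell_{\max}$ has $(\ell_{\max}+1)^2$ entries, and the number of allowed Clebsch–Gordan couplings grows faster still. The sketch below only counts these quantities (ignoring parity and channel multiplicities); it is not a measurement of GeoOpt-Net's actual memory use.

```python
def irreps_dim(l_max):
    """Components per channel for orders 0..l_max: sum of (2l + 1)."""
    return sum(2 * l + 1 for l in range(l_max + 1))

def coupling_paths(l_max):
    """Count (l1, l2, l_out) triples allowed by the triangle rule
    |l1 - l2| <= l_out <= l1 + l2, with every order capped at l_max."""
    count = 0
    for l1 in range(l_max + 1):
        for l2 in range(l_max + 1):
            lo, hi = abs(l1 - l2), min(l1 + l2, l_max)
            count += max(0, hi - lo + 1)
    return count

growth = [(l, irreps_dim(l), coupling_paths(l)) for l in range(4)]
```

Both counts multiply with the number of edges, triplets, and quadruplets in the molecular graph, which is why larger molecules combined with a high angular cutoff can become expensive.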

7. Limitations and Extensions

GeoOpt-Net’s current domain is restricted to neutral, closed-shell organic structures. Integration of FAFM for additional quantum chemical fidelities beyond DFT (e.g., MP2, CCSD) would require further extension and calibration of the modulation mechanism. Model and hardware optimizations will be necessary for routine applications to very large biomolecules or inorganic clusters. Nonetheless, the approach transforms molecular geometry refinement preceding DFT from a multi-step bottleneck into a mesh-invariant, one-shot operation (Liu et al., 30 Jan 2026).
