Foundational Machine Learning Potentials
- Foundational Machine Learning Potentials are parameterized models that predict interatomic energies and forces with quantum-chemical accuracy, serving as universal surrogates for the Born–Oppenheimer surface.
- They employ symmetry-adapted descriptors and advanced architectures such as graph neural networks to capture both local and nonlocal atomic interactions effectively.
- Meta-learning and extensive pretraining on diverse datasets enable these models to rapidly adapt to new chemical environments while maintaining high transferability and efficiency.
Foundational Machine Learning Potentials (MLPs) are parameterized models that predict interatomic energies, forces, and related properties with quantum-chemical accuracy, trained on large, diverse datasets. They are constructed to be broadly applicable ("universal" or "foundational") across chemical and materials spaces, in contrast to narrowly targeted, task-specific models. These potentials serve as surrogates for the Born–Oppenheimer potential energy surface, allowing efficient atomistic simulations of molecules, materials, and reactions at scales and speeds unachievable by direct electronic structure methods (Xia et al., 11 Feb 2025, Kempen et al., 18 Dec 2025).
1. Mathematical Formulation and Symmetry Principles
Foundational MLPs employ an explicit energy decomposition and symmetry-adapted structure representations to guarantee physical invariances:
- Energy Decomposition: The total potential energy is written as a sum of atomic (or site-specific) contributions,

$$E_{\text{total}} = \sum_{i=1}^{N} E_i(\mathbf{G}_i),$$

where the descriptor $\mathbf{G}_i$ encodes the local environment of atom $i$ (Tokita et al., 2023, Fabritiis, 2024, Xia et al., 11 Feb 2025); a minimal code sketch of this decomposition follows this list.
- Symmetry Enforcement: Physical invariance under translation, rotation, and permutation of chemically identical atoms is imposed either via explicit construction of descriptors (e.g., atom-centered symmetry functions, ACSFs; smooth overlap of atomic positions, SOAP; cluster expansions) or via symmetry-aware model architectures (e.g., equivariant graph neural networks) (Xia et al., 11 Feb 2025, Lysogorskiy et al., 25 Aug 2025, Kocer et al., 2021).
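As a concrete illustration, the following minimal sketch implements the additive decomposition in Python; the per-element models and fixed-length descriptors are hypothetical placeholders for trained networks and real symmetry-adapted features:

```python
import numpy as np

# Hypothetical per-element models standing in for trained element-specific
# neural networks (as in Behler-Parrinello HDNNPs).
def make_element_model(element, n_features=8):
    rng = np.random.default_rng(abs(hash(element)) % 2**32)
    weights = rng.normal(size=n_features)
    def atomic_energy(descriptor):
        # Map a fixed-length environment descriptor G_i to a scalar E_i.
        return float(weights @ np.tanh(descriptor))
    return atomic_energy

def total_energy(elements, descriptors, models):
    # E_total = sum_i E_i(G_i): one additive contribution per atom.
    return sum(models[el](g) for el, g in zip(elements, descriptors))

elements = ["O", "H", "H"]                          # a water-like toy system
descriptors = [0.1 * np.ones(8) for _ in elements]  # placeholder G_i vectors
models = {el: make_element_model(el) for el in set(elements)}
print(total_energy(elements, descriptors, models))
```

Because the decomposition is a sum over atoms, the energy scales extensively with system size, and forces follow as analytic gradients of the atomic contributions.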
2. Descriptor Frameworks and Model Architectures
A variety of representation techniques and regression models have been developed and benchmarked as foundational MLPs:
- Handcrafted Descriptors and Local Networks: Behler–Parrinello HDNNPs use ACSFs to map local atomic neighborhoods to fixed-length feature vectors, which are processed by element-specific feed-forward NNs, combining radial and angular terms over a finite cutoff (Tokita et al., 2023, Xia et al., 11 Feb 2025); a sketch of one such radial symmetry function appears after this list.
- Kernel Methods: Gaussian Approximation Potentials (GAP) utilize kernel ridge regression on high-dimensional neighbor density (e.g., SOAP) descriptors. These encode angular and radial distribution of neighbors with rotational averaging (Xia et al., 11 Feb 2025, Kocer et al., 2021).
- Systematic Basis Expansions: Moment Tensor Potentials (MTP) and Atomic Cluster Expansion (ACE) construct hierarchies of body-ordered invariant basis functions, contracting tensor products of geometric quantities (Xia et al., 11 Feb 2025).
- Graph Neural Networks (GNNs): Message-passing NNs (SchNet, DimeNet, MACE) and their equivariant variants (NequIP, Allegro, GRACE) leverage learned, rotationally-adapted features for scalability and data efficiency. GRACE, for example, extends ACE to graph-based architectures with semi-local message passing and chemical embeddings spanning the entire periodic table (Lysogorskiy et al., 25 Aug 2025).
- Generational Progression: MLPs have evolved from (1) low-dimensional, system-specific fits, to (2) high-dimensional potentials built from local descriptors alone, (3) schemes that add long-range electrostatics via environment-dependent charges, and (4) fully nonlocal, self-consistent global charge equilibration (e.g., 4G-HDNNP, CENT, BpopNN) (Ko et al., 2023, Kocer et al., 2024).
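As an example of a handcrafted descriptor, the sketch below implements a Behler-type radial symmetry function (the G² form with a cosine cutoff). The parameter values are illustrative, not taken from any published parameterization:

```python
import numpy as np

def cosine_cutoff(r, rc):
    # Behler cosine cutoff: decays smoothly to zero at r = rc.
    return np.where(r < rc, 0.5 * (np.cos(np.pi * r / rc) + 1.0), 0.0)

def radial_acsf(positions, i, eta, r_s, rc):
    """Radial symmetry function G2_i = sum_{j != i} exp(-eta (r_ij - r_s)^2) fc(r_ij).

    Invariant to translation, rotation, and neighbor permutation by
    construction, since it depends only on interatomic distances.
    """
    diffs = np.delete(positions - positions[i], i, axis=0)
    r = np.linalg.norm(diffs, axis=1)
    return float(np.sum(np.exp(-eta * (r - r_s) ** 2) * cosine_cutoff(r, rc)))

pos = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
print(radial_acsf(pos, i=0, eta=4.0, r_s=1.0, rc=6.0))  # one entry of G_0
```

In practice, many (eta, r_s) pairs plus angular functions are stacked into the fixed-length vector G_i fed to the element-specific network.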
3. Meta-Learning and Foundation Model Paradigms
The shift to foundational MLPs is exemplified by leveraging meta-learning and pretraining on extensive, heterogeneous datasets:
- Task Unification via Meta-Learning: The Reptile algorithm demonstrates that model parameters can be meta-trained across multiple quantum-mechanical datasets (differing in QM method or chemical coverage) so that a few gradient steps suffice to adapt to a new task (Allen et al., 2023). Each dataset is treated as a separate meta-task; the meta-objective aligns the parameter space to support efficient fine-tuning (see the Reptile sketch after this list).
- Pretraining and Adaptation: Foundational MLPs are pretrained on millions of molecular and materials configurations, across diverse chemistries and QM levels, producing a parameter initialization θ that readily adapts to unseen scenarios, including new molecules, functionals, or property regimes (Allen et al., 2023, Lysogorskiy et al., 25 Aug 2025).
- Zero-Shot and Fine-Tuning Benchmarks: Benchmarks indicate that foundational models trained in this manner need fewer labels for new tasks (up to a 5–10x reduction) and yield smoother potential energy surfaces than classic transfer learning or single-dataset training (Allen et al., 2023).
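The Reptile meta-update itself is simple: run a few inner optimization steps on a sampled task, then move the shared parameters toward the adapted ones. The sketch below uses a toy quadratic objective as a stand-in for a per-dataset energy/force loss; the task structure and hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def task_loss_grad(theta, target):
    # Toy objective ||theta - target||^2 standing in for the energy/force
    # loss on one QM dataset; its gradient is 2 (theta - target).
    return 2.0 * (theta - target)

# Each "task" mimics one QM dataset (different method or chemistry).
task_targets = [rng.normal(size=4) for _ in range(5)]

theta = np.zeros(4)                        # shared, meta-trained parameters
inner_lr, meta_lr, inner_steps = 0.1, 0.5, 10

for _ in range(200):
    target = task_targets[rng.integers(len(task_targets))]
    phi = theta.copy()
    for _ in range(inner_steps):           # inner-loop adaptation (plain SGD)
        phi -= inner_lr * task_loss_grad(phi, target)
    theta += meta_lr * (phi - theta)       # Reptile meta-update

print(theta)                   # drifts toward a consensus across the tasks
```

The meta-update needs no second-order derivatives, which is what makes Reptile practical for large potential models.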
4. Evaluation Methodologies, Transferability, and Practical Considerations
Accurate, transferable foundational MLPs require stringent validation and rational design of datasets and workflows:
- Loss Functions: Training objectives typically combine energy and force errors, often with regularization and explicit charge or other auxiliary losses in advanced generations (Tokita et al., 2023, Kocer et al., 2024); a sketch of a combined energy/force loss appears after this list.
- Validation Metrics: Besides test RMSE for energy and forces, chemical fidelity is tested via property prediction—torsion scans, bond dissociation curves, phonon spectra, and radial distribution functions. Robust extrapolation checks and active-learning loops are common practices (Tokita et al., 2023, Xia et al., 11 Feb 2025).
- Transferability: Foundational MLPs exhibit strong zero-shot generalization when the training data are sufficiently diverse in chemical and configurational space. However, in narrow or highly specific regimes, task-specific models may still surpass foundational ones in pure accuracy (Kempen et al., 18 Dec 2025).
- Failure Modes: Catastrophic failures are observed when critical features (spin polarization, rare environments) are absent from training data or descriptor frameworks. Relaxing structures on the MLP surface can also increase energy discrepancies relative to the reference method, necessitating careful selection and testing of candidate models for a given application (Kempen et al., 18 Dec 2025).
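A minimal sketch of a combined energy/force objective of the kind described above; the weights are illustrative, and production trainings typically add per-atom normalization, regularization, and auxiliary (e.g., charge) terms:

```python
import numpy as np

def energy_force_loss(e_pred, e_ref, f_pred, f_ref, w_e=1.0, w_f=10.0):
    # L = w_e * MSE(energies) + w_f * MSE(force components); the force term
    # averages over all structures, atoms, and Cartesian components.
    e_term = np.mean((np.asarray(e_pred) - np.asarray(e_ref)) ** 2)
    f_term = np.mean((np.asarray(f_pred) - np.asarray(f_ref)) ** 2)
    return w_e * e_term + w_f * f_term

# Two toy structures of 3 atoms each: energies plus (n_struct, n_atom, 3) forces.
e_pred, e_ref = [-5.01, -7.48], [-5.00, -7.50]
f_pred = np.zeros((2, 3, 3))
f_ref = 0.02 * np.ones((2, 3, 3))
print(energy_force_loss(e_pred, e_ref, f_pred, f_ref))
```

Weighting forces relatively strongly is common because each structure contributes one energy label but 3N force labels that constrain the local shape of the surface.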
5. Nonlocal Physics: Charge Transfer, Redox, and Electrostatics
Advanced foundational MLPs incorporate long-range physics and resolve limitations of local ansatzes:
- 4th-Generation MLPs (4G-HDNNP, CENT, BpopNN): These models integrate global charge equilibration procedures, enabling handling of multiple charge states, oxidation-state discrimination, and physically correct redox/electron-transfer events that are inaccessible to locality-bound architectures (Ko et al., 2023, Kocer et al., 2024); a minimal charge-equilibration sketch follows this list.
- Electrostatic Embedding: Electrostatic-potential descriptors and charge-equilibration schemes add nonlocal character and improve transferability to charged clusters and periodic systems, reaching near-chemical accuracy on challenging benchmarks such as the NaCl cluster test set (Ko et al., 2023).
- Descriptor Limits and Extension: For redox chemistry (e.g., Fe²⁺/Fe³⁺ in solution), only nonlocal charge-aware models reproduce correct solvation structures and dynamic electron transfer; local 2G/3G MLPs fail to distinguish oxidation states when the counter-ions are outside the local cutoff (Kocer et al., 2024).
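The sketch below solves a textbook charge-equilibration (Qeq) problem as a single constrained linear system. It is a simplified stand-in for the 4G schemes, which predict per-atom electronegativities with neural networks and screen the short-range Coulomb interaction; here the inputs are fixed and the bare 1/r form is used:

```python
import numpy as np

def charge_equilibration(chi, hardness, positions, q_total=0.0):
    """Minimize E(q) = sum_i chi_i q_i + 1/2 sum_ij q_i A_ij q_j subject to
    sum_i q_i = q_total, via one linear solve with a Lagrange multiplier."""
    chi = np.asarray(chi, dtype=float)
    n = len(chi)
    r = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    A = np.where(r > 0, 1.0 / np.where(r > 0, r, 1.0), 0.0)  # bare Coulomb off-diagonal
    A[np.diag_indices(n)] = hardness                          # atomic hardness on-diagonal
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = A
    M[:n, n] = M[n, :n] = 1.0             # total-charge constraint row/column
    b = np.concatenate([-chi, [q_total]])
    return np.linalg.solve(M, b)[:n]      # equilibrated charges q_i

# Toy two-site "NaCl-like" dimer: the higher-electronegativity site (chi = 1.0)
# acquires negative charge; every q_i depends on ALL atoms via the global solve.
q = charge_equilibration(chi=[-1.0, 1.0], hardness=[2.0, 2.0],
                         positions=np.array([[0.0, 0.0, 0.0], [2.4, 0.0, 0.0]]))
print(q)
```

The global solve is precisely what gives 4G models their nonlocality: moving a distant counter-ion changes the matrix A and hence every equilibrated charge.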
6. Unification, Representation Analysis, and Model Interoperability
The latent geometry of atomic representations across independently trained foundational MLPs exhibits universal structure:
- Platonic Representation: By projecting model-specific atomic embeddings onto a set of common anchor vectors, a model-agnostic "Platonic" representation space is formed, in which chemical trends (element groups, oxidation states) and structural invariants organize into well-defined clusters. This enables quantitative cross-model comparison, embedding arithmetic, and detection of representational or physical failure modes (Li et al., 5 Dec 2025); a toy anchor-projection sketch follows this list.
- Interoperability: The Platonic framework unifies equivariant, non-equivariant, conservative, and direct-force models, providing a metric for assessing representational bias and catastrophic forgetting, and offering tools for diagnostics and interpretability (Li et al., 5 Dec 2025).
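The sketch below illustrates the general anchor-projection idea in the spirit of relative representations: cosine similarities to a shared set of anchor atoms yield coordinates invariant to model-specific rotations of the embedding space. This is a toy under stated assumptions; the exact construction in (Li et al., 5 Dec 2025) may differ in detail:

```python
import numpy as np

def anchor_projection(embeddings, anchor_embeddings):
    # Cosine similarity of each atomic embedding to each anchor embedding;
    # the result is a model-agnostic coordinate vector per atom.
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    return normalize(embeddings) @ normalize(anchor_embeddings).T

rng = np.random.default_rng(0)
emb_a = rng.normal(size=(6, 16))            # atomic embeddings from "model A"
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
emb_b = emb_a @ rotation                    # "model B": same content, rotated gauge
anchors_a, anchors_b = emb_a[:3], emb_b[:3] # the same anchor atoms in each model
proj_a = anchor_projection(emb_a, anchors_a)
proj_b = anchor_projection(emb_b, anchors_b)
print(np.allclose(proj_a, proj_b))          # True: shared coordinates coincide
```

Agreement in the projected space is what makes embeddings from independently trained models directly comparable.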
7. Impact, Limitations, and Future Directions
Foundational machine learning potentials are reshaping atomistic simulation and discovery:
- Extensive Applications: MLPs are deployed across quantum-chemical property prediction, molecular simulations (binding free energies, folding), materials discovery, catalysis (reaction energetics, defect chemistry), and electronic processes (redox, charge transfer) (Xia et al., 11 Feb 2025, Kempen et al., 18 Dec 2025, Fabritiis, 2024, Kocer et al., 2024).
- Computational Scaling: Linear/near-linear scaling is achieved via local decomposition, sparse basis expansion, and graph/message-passing frameworks. Recent Pareto-optimal models (e.g., GRACE) combine high accuracy with efficiency at scale, enabling simulations with hundreds of thousands of atoms on a single GPU (Lysogorskiy et al., 25 Aug 2025).
- Open Challenges: Descriptor completeness, data-set diversity, uncertainty quantification, nonlocal physics, and multi-fidelity data integration remain active research areas (Allen et al., 2023, Xia et al., 11 Feb 2025). Automating adaptation, preventing catastrophic forgetting during fine-tuning/distillation, and coupling to explicit long-range electrostatics, spin, and QM calculations are ongoing frontiers.
- Limitations: No single foundational model dominates across all chemical and materials domains. Success depends crucially on dataset selection, architecture family, and task-specific fine-tuning (Kempen et al., 18 Dec 2025). Foundational MLPs typically exhibit longer inference times and “black box” character compared to minimal, interpretable task-specific models.
- Implications: Anchoring model representations and correctness to shared chemical manifolds (Platonic representation), coordinating model interoperability, and developing robust foundation models for atomistic simulations are expected to become central in future progress (Li et al., 5 Dec 2025).
References:
- "Learning Together: Towards foundational models for machine learning interatomic potentials with meta-learning" (Allen et al., 2023)
- "Tutorial: How to Train a Neural Network Potential" (Tokita et al., 2023)
- "The Evolution of Machine Learning Potentials for Molecules, Reactions and Materials" (Xia et al., 11 Feb 2025)
- "Platonic representation of foundation machine learning interatomic potentials" (Li et al., 5 Dec 2025)
- "Machine learning potentials for redox chemistry in solution" (Kocer et al., 2024)
- "Accurate Fourth-Generation Machine Learning Potentials by Electrostatic Embedding" (Ko et al., 2023)
- "Machine Learning Potentials: A Roadmap Toward Next-Generation Biomolecular Simulations" (Fabritiis, 2024)
- "How accurate are foundational machine learning interatomic potentials for heterogeneous catalysis?" (Kempen et al., 18 Dec 2025)
- "Graph atomic cluster expansion for foundational machine learning interatomic potentials" (Lysogorskiy et al., 25 Aug 2025)
- "Neural Network Potentials: A Concise Overview of Methods" (Kocer et al., 2021)