Gaussian Approximation Potentials
- Gaussian Approximation Potentials are machine-learned interatomic potential models that use Gaussian process regression and invariant atomic descriptors like SOAP to capture local environments.
- The methodology employs advanced descriptors and sparsification techniques to ensure near-quantum accuracy while maintaining efficiency in molecular dynamics and defect studies.
- GAP models are validated against DFT with energy errors as low as 1–5 meV/atom, enabling reliable simulations of materials, alloys, and complex atomic systems.
Gaussian Approximation Potentials (GAP) are a class of machine-learned interatomic potentials constructed by fitting total energies and their derivatives (forces, stresses) using Gaussian process regression (GPR) on descriptors of local atomic environments. GAP models combine the near-ab initio accuracy of quantum-mechanical methods, such as density functional theory (DFT), with efficiency suitable for large-scale molecular dynamics and Monte Carlo simulations, enabling quantum-level force field predictions for systems containing up to millions of atoms (Bartók, 2010, Klawohn et al., 2023, Bartók et al., 2015, Szlachta, 2014).
1. Theoretical Framework and Gaussian Process Regression
In GAP, the total energy of an atomic configuration is expressed as a sum of local atomic contributions,

$$E = \sum_i \varepsilon(\mathbf{q}_i),$$

where each atomic energy $\varepsilon$ is modeled as a nonlinear function of a descriptor $\mathbf{q}_i$ of the atom's neighborhood. The core principle is to model $\varepsilon$ as a sample from a zero-mean Gaussian process prior with a covariance kernel $K$:

$$\varepsilon(\mathbf{q}) \sim \mathcal{GP}\left(0,\, K(\mathbf{q}, \mathbf{q}')\right).$$

The regression is performed by minimizing a regularized least-squares loss involving both the target observables and a prior over the model parameters; in the sparse formulation the weight vector is

$$\boldsymbol{\alpha} = \left(K_{MM} + K_{MN} \Lambda^{-1} K_{NM}\right)^{-1} K_{MN} \Lambda^{-1} \mathbf{y},$$

where $K_{MM}$ is the kernel matrix evaluated between the $M$ representative ("sparse") environments, $K_{MN} = K_{NM}^{\top}$ couples these to the $N$ training data, $\Lambda$ is a diagonal matrix of assumed noise levels (distinct for energies, forces, and stresses), and $\mathbf{y}$ is the vector of training targets (energies or linear combinations including forces and virials). Prediction for a new atomic environment $\mathbf{q}^{*}$ uses the GPR mean,

$$\varepsilon(\mathbf{q}^{*}) = \sum_{s=1}^{M} \alpha_s\, K(\mathbf{q}^{*}, \mathbf{q}_s).$$

Sparsification techniques (e.g., pseudo-inputs, CUR decomposition) are employed to manage scaling to large training databases (Bartók, 2010, Szlachta, 2014, Klawohn et al., 2023, Klawohn et al., 2022).
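The sparse-GPR fit and prediction described above can be sketched in a few lines of numpy. The 1D "descriptors", the sine target, and the even subsampling used as the sparse set are toy stand-ins for illustration, not the actual GAP implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def kernel(a, b, theta=0.5):
    # Squared-exponential covariance between 1D descriptor arrays a and b
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * theta**2))

# Toy training "environments" and noisy target atomic energies
q_train = rng.uniform(0.0, 5.0, 200)
y_train = np.sin(q_train) + 0.01 * rng.normal(size=200)

# Representative ("sparse") environments: an even subsample of the sorted data
q_sparse = np.sort(q_train)[::5]               # M = 40 points

K_MM = kernel(q_sparse, q_sparse)
K_MN = kernel(q_sparse, q_train)
Lam_inv = np.eye(len(q_train)) / 0.01**2       # diagonal noise matrix Lambda^-1

# alpha = (K_MM + K_MN Lam^-1 K_NM)^-1 K_MN Lam^-1 y   (sparse-GPR weights)
A = K_MM + K_MN @ Lam_inv @ K_MN.T
alpha, *_ = np.linalg.lstsq(A, K_MN @ Lam_inv @ y_train, rcond=None)

# GPR mean at new environments: eps(q*) = sum_s alpha_s K(q*, q_s)
q_test = np.linspace(0.5, 4.5, 50)
y_pred = kernel(q_test, q_sparse) @ alpha
print("max |error|:", np.max(np.abs(y_pred - np.sin(q_test))))
```

The least-squares solve stands in for the (numerically more careful) factorizations used in production fitting codes; the point is that prediction cost depends only on the M sparse points, not on the full training set.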
2. Atomic Environment Descriptors and Invariant Kernels
Central to GAP are descriptors that uniquely capture local atomic environments subject to key physical symmetries. Two main families are employed:
- Bispectrum Descriptors: The original GAP models used a 4D bispectrum based on hyperspherical-harmonic expansions of the atomic neighbor density projected onto the 3-sphere $S^3$. The resulting invariants are unchanged under rigid translations, rotations, and permutations of identical atoms (Bartók, 2010).
- SOAP (Smooth Overlap of Atomic Positions): The majority of modern GAPs employ the SOAP descriptor, whereby the neighbor density around each atom is represented as a sum of Gaussians, expanded in a basis of orthonormal radial functions and spherical harmonics:

$$\rho_i(\mathbf{r}) = \sum_j e^{-|\mathbf{r} - \mathbf{r}_{ij}|^2 / (2\sigma_{\mathrm{at}}^2)} = \sum_{nlm} c_{nlm}\, g_n(r)\, Y_{lm}(\hat{\mathbf{r}}).$$

Rotational invariance is incorporated by summing over $m$ to yield the power spectrum $p_{nn'l} = \sum_m c_{nlm}\, c_{n'lm}^{*}$. The SOAP kernel between two environments is then

$$K(\mathbf{q}, \mathbf{q}') = \left(\hat{\mathbf{p}} \cdot \hat{\mathbf{p}}'\right)^{\zeta},$$

with $\hat{\mathbf{p}}$ the normalized power-spectrum vector. Typically, $\zeta$ is a small integer (commonly 2–4) controlling kernel sharpness. This construction ensures invariance to translation, rotation, permutation, and, when required, reflection (Szlachta, 2014, Klawohn et al., 2023).
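The step from expansion coefficients to the invariant power spectrum and the normalized SOAP kernel can be sketched in numpy; the random complex coefficients below are toy stand-ins for a real neighbor-density expansion:

```python
import numpy as np

rng = np.random.default_rng(1)
n_max, l_max = 4, 3

def random_coeffs():
    # Toy complex expansion coefficients c_{nlm}; slots 0..2l store m = -l..l
    c = np.zeros((n_max, l_max + 1, 2 * l_max + 1), dtype=complex)
    for l in range(l_max + 1):
        size = (n_max, 2 * l + 1)
        c[:, l, : 2 * l + 1] = rng.normal(size=size) + 1j * rng.normal(size=size)
    return c

def power_spectrum(c):
    # p_{n n' l} = sum_m c_{nlm} conj(c_{n'lm}): rotationally invariant
    p = np.einsum("nlm,klm->nkl", c, np.conj(c)).real.ravel()
    return p / np.linalg.norm(p)               # normalized feature vector p_hat

def soap_kernel(p1, p2, zeta=2):
    # K(q, q') = (p_hat . p_hat')^zeta
    return float(np.dot(p1, p2) ** zeta)

p_a, p_b = power_spectrum(random_coeffs()), power_spectrum(random_coeffs())
print(soap_kernel(p_a, p_a), soap_kernel(p_a, p_b))
```

Because the power spectrum is normalized, the self-similarity is exactly 1 and the kernel between distinct environments lies in [0, 1] for even ζ.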
Other descriptors, such as localized Coulomb matrices and cluster-distance representations, have been integrated for specific applications (Barker et al., 2016, John, 2016).
3. Model Construction: Kernel Decomposition, Training, and Regularization
GAP models frequently combine multiple descriptor channels (2-body, 3-body, many-body SOAP), each with associated kernels and hyperparameters:

$$E = \sum_d \delta_d^{2} \sum_i \sum_{s} \alpha_s^{(d)}\, K_d\!\left(\mathbf{q}_i^{(d)}, \mathbf{q}_s^{(d)}\right),$$

where $\delta_d$ sets the energy scale of channel $d$. Regression weights are obtained by solving the regularized normal equations, where the regularization ("noise") levels $\sigma_E$, $\sigma_F$, $\sigma_V$ control the tradeoff between overfitting and transferability. Weighted block-diagonal noise matrices enable different tolerances for configurations of diverse physical character (e.g., liquids vs. crystals vs. defects) (Klawohn et al., 2023, Rosenbrock et al., 2019, Szlachta, 2014, Byggmästar et al., 2020).
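To illustrate the multi-channel construction, a toy combination of a 2-body kernel and a SOAP-like many-body kernel with per-channel energy scales follows; the kernels, descriptors, and delta values are all illustrative, not taken from any fitted model:

```python
import numpy as np

def k_2body(r1, r2, theta=0.3):
    # Squared-exponential kernel on pair distances (2-body channel)
    return float(np.exp(-((r1 - r2) ** 2) / (2 * theta**2)))

def k_soap(p1, p2, zeta=2):
    # Normalized polynomial kernel on power-spectrum vectors (many-body channel)
    return float((np.dot(p1, p2) / (np.linalg.norm(p1) * np.linalg.norm(p2))) ** zeta)

def combined_kernel(env1, env2, delta_2b=1.0, delta_mb=0.5):
    # Total covariance is a delta_d^2-weighted sum over descriptor channels
    return (delta_2b**2 * k_2body(env1["r"], env2["r"])
            + delta_mb**2 * k_soap(env1["p"], env2["p"]))

env_a = {"r": 2.1, "p": np.array([0.20, 0.50, 0.10])}
env_b = {"r": 2.3, "p": np.array([0.25, 0.45, 0.15])}
print(combined_kernel(env_a, env_b))
```

The delta weights play the same role as the per-channel energy scales in the fit: they set how much of the total energy variance each channel is allowed to explain.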
Descriptor choice and kernel hyperparameters—cutoff radii, smearing widths, basis sizes, polynomial exponents, number of sparse environments—are typically optimized by cross-validation, physical intuition, grid search, or type-II marginal likelihood maximization. Constraints from the locality of interactions (i.e., finite cutoff) impose a lower bound on achievable force accuracy (~0.1 eV/Å for a ~4 Å cutoff in semiconductors) (Bartók, 2010, Szlachta, 2014, Deringer et al., 2016).
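As a concrete illustration of hyperparameter selection, the following is a minimal sketch of holdout cross-validation over a kernel length-scale, with toy 1D data standing in for real descriptors and a simple kernel-ridge fit standing in for the full GAP regression:

```python
import numpy as np

rng = np.random.default_rng(2)
q = rng.uniform(0.0, 5.0, 120)
y = np.cos(q) + 0.2 * rng.normal(size=120)     # noisy toy targets
q_tr, y_tr = q[:80], y[:80]
q_va, y_va = q[80:], y[80:]

def fit_predict(theta, reg=0.04):
    # Kernel ridge fit on the training split, evaluated on the holdout split
    K = np.exp(-((q_tr[:, None] - q_tr[None, :]) ** 2) / (2 * theta**2))
    alpha = np.linalg.solve(K + reg * np.eye(len(q_tr)), y_tr)
    K_star = np.exp(-((q_va[:, None] - q_tr[None, :]) ** 2) / (2 * theta**2))
    return K_star @ alpha

# Grid search: too-small theta overfits noise, too-large theta oversmooths
errors = {theta: float(np.sqrt(np.mean((fit_predict(theta) - y_va) ** 2)))
          for theta in (0.02, 0.5, 5.0)}
best = min(errors, key=errors.get)
print(errors, "-> best length-scale:", best)
```

Real GAP fits tune several coupled quantities in the same spirit (cutoffs, Gaussian widths, basis sizes, sparse-set size), typically with physical intuition narrowing the grid first.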
4. Data Generation, Training Protocols, and Parallel Implementation
Construction of a transferable GAP requires a DFT database that densely and diversely samples the relevant configuration space. Representative protocols include:
- Elastic and phonon sampling: Random strains on primitive cells, frozen-phonon and MD sampling for phonons at multiple temperatures.
- Defects and surfaces: Defect creation (vacancies, interstitials, surfaces, stacking faults, γ-surfaces), their thermal sampling, and subsequent DFT re-evaluation.
- Phase and structural diversity: Multiple crystal structures, liquid states, and transition pathways (e.g., alloying or intercalation configurations).
- Active learning: Iterative cycles augmenting the database in regions where model uncertainty, as predicted by the GPR posterior variance, is largest (Szlachta, 2014, Rosenbrock et al., 2019, Zhang et al., 2022).
- Sparsification and high-performance fitting: Scalability to very large numbers of sparse points and training environments is achieved via curated selection (e.g., CUR decomposition, k-means clustering) and distributed linear algebra (MPI/ScaLAPACK-based parallel solvers), eliminating the serial-memory bottleneck and enabling model fitting on thousands of CPU cores (Klawohn et al., 2023, Klawohn et al., 2022).
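The CUR-style selection of representative environments mentioned above can be sketched via leverage scores computed from a truncated SVD of the descriptor matrix; this is a simplified variant (random toy descriptors, deterministic top-score selection), not the exact procedure of the fitting codes:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 16))          # 500 environments, 16-dim toy descriptors

def cur_select(X, n_sparse, k=8):
    # Leverage scores from the top-k left singular vectors: rows carrying the
    # most weight in the dominant subspace become the sparse set.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    scores = np.sum(U[:, :k] ** 2, axis=1)
    return np.argsort(scores)[-n_sparse:]

idx = cur_select(X, n_sparse=50)
print(len(idx), "representative environments selected")
```

The selected rows then serve as the sparse environments indexing the kernel matrices of the regression, keeping the fit cost independent of the full database size.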
5. Model Validation, Benchmarking, and Typical Applications
GAP models are validated against DFT and experiment for a wide range of physical observables:
| Property Class | Systematic Validations (examples) |
|---|---|
| Elastic & structural | Lattice constants, elastic moduli, stress–strain up to ±10%, phase energetics |
| Vibrational/phonon | Phonon dispersions, Grüneisen parameters, thermal expansion, phonon-frequency RMSE < 0.15 THz |
| Defect energetics | Vacancy/interstitial formation and migration—typically within 0.05–0.1 eV of DFT |
| Surfaces & grain boundaries | Surface energies (within 2% of DFT), γ-surface profiles, interface stabilities |
| Dislocations & plasticity | Core structure and Peierls barriers (e.g., 0.31 eV in W, matching DFT), dislocation–defect binding |
| Melting and liquids | Coexistence curves, volume–pressure EOS, radial/angular distribution functions, correct melting trends |
| Radiation damage | Threshold displacement energies, defect cluster sizes, Frenkel pair evolution, in agreement with DFT/experiment |
| Alloys, nanoparticles, guest–host | Energetics and phase diagrams of multicomponent (e.g., Ag–Pd, Fe-NPs, Li–intercalation), surface motifs |
GAP is routinely used for advanced atomistic studies including fracture, phonon transport, large-scale defect evolution, battery-material intercalation, amorphous/biomolecular modeling, and radiation-damage in alloys (Szlachta, 2014, Bartók, 2010, Rosenbrock et al., 2019, Fujikake et al., 2017, Unruh et al., 2021, Deringer et al., 2016, Byggmästar et al., 2020, Koskenniemi et al., 2022, Zhang et al., 2022).
RMSEs in energy can reach 1–5 meV/atom, with force errors of 0.05–0.2 eV/Å (saturating at the DFT locality limit). Transferability across phases and to complex, large-scale systems is a key strength compared to traditional potentials.
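At its core, a validation pipeline of this kind reduces to computing per-atom energy and force RMSEs against DFT reference data; a sketch with synthetic numbers chosen to mimic the error magnitudes quoted above:

```python
import numpy as np

rng = np.random.default_rng(4)
n_structs, n_atoms = 40, 64
# Synthetic DFT reference data and "model" predictions with known error levels
e_dft = rng.normal(-5.0, 0.1, n_structs) * n_atoms            # total energies (eV)
e_gap = e_dft + rng.normal(0.0, 0.003, n_structs) * n_atoms   # ~3 meV/atom error
f_dft = rng.normal(0.0, 1.0, (n_structs, n_atoms, 3))         # forces (eV/Angstrom)
f_gap = f_dft + rng.normal(0.0, 0.1, f_dft.shape)             # ~0.1 eV/A error

rmse_e = np.sqrt(np.mean(((e_gap - e_dft) / n_atoms) ** 2))   # eV/atom
rmse_f = np.sqrt(np.mean((f_gap - f_dft) ** 2))               # eV/Angstrom
print(f"energy RMSE: {rmse_e * 1e3:.2f} meV/atom, force RMSE: {rmse_f:.3f} eV/A")
```

Normalizing energy errors per atom (and reporting force errors per Cartesian component) is what makes the quoted meV/atom and eV/Å figures comparable across systems of different size.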
6. Tabulated GAP (tabGAP), Descriptor Compression, and Emerging Directions
To further extend computational efficiency, GAP models have been compressed, tabulated, or hybridized:
- tabGAP: High-dimensional SOAP or 3-body terms are replaced by tabulated low-dimensional 2- and 3-body (plus EAM-like) terms, with energies precomputed on dense grids. At run time, cubic-spline or multilinear interpolation yields speedups of one to two orders of magnitude, allowing MD of multimillion-atom systems at near-EAM computational cost, with minor losses in accuracy (a few meV/atom) relative to SOAP-GAP (Fellman et al., 2024, Koskenniemi et al., 2022).
- Descriptor compression: Tensor-reduced, element-embedded, or turbo-SOAP methods lower the dimension of SOAP features, making modeling of multicomponent (10+ elements) alloys tractable by removing the quadratic growth of the descriptor dimension with the number of chemical species (Klawohn et al., 2023).
- Active learning and uncertainty quantification: The GP posterior yields pointwise model uncertainties (predictive variances), enabling online selection of configurations to extend the training set and ensure robust accuracy in previously unvisited regions of configuration space, including during dynamical simulations ("learn on the fly") (Bartók, 2010, Zhang et al., 2022).
- Hybrid and coarse-grained approaches: GAP regression has been extended to learning coarse-grained molecular free-energy surfaces (GAP-CG) and to embedding in multi-scale workflows coupling atomistic to mesoscale models (John, 2016).
- Algorithmic advances: Recent implementation improvements include OpenMP/MPI parallelism, optimized QR solvers, and planned GPU offloading to further accelerate large-scale training and inference (Klawohn et al., 2023, Klawohn et al., 2022).
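The tabulation idea behind tabGAP can be illustrated by evaluating a kernel-expansion pair energy once on a dense grid and replacing it by fast interpolation at run time. The pair function and weights below are toys (real tabGAP tabulates 2-body, 3-body, and EAM-like terms and uses cubic splines rather than the linear interpolation shown):

```python
import numpy as np

# "Expensive" 2-body energy: a kernel expansion over sparse points (toy weights)
r_sparse = np.linspace(1.5, 5.0, 30)
alpha = np.sin(r_sparse)                     # stand-in regression weights

def e_pair_kernel(r, theta=0.4):
    # Direct kernel-sum evaluation: cost grows with the number of sparse points
    return np.sum(alpha * np.exp(-((r - r_sparse) ** 2) / (2 * theta**2)))

# Precompute the pair energy on a dense grid once...
grid = np.linspace(1.5, 5.0, 2000)
table = np.array([e_pair_kernel(r) for r in grid])

# ...then replace the kernel sum by interpolation at run time
def e_pair_tab(r):
    return np.interp(r, grid, table)

r_test = np.linspace(1.6, 4.9, 200)
err = max(abs(e_pair_tab(r) - e_pair_kernel(r)) for r in r_test)
print("max tabulation error:", err)
```

The run-time cost of the tabulated form is independent of the sparse-set size, which is where the order-of-magnitude speedups come from.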
7. Relation to Competing Methodologies and Limitations
Compared to analytic empirical potentials, GAP achieves near-DFT accuracy across diverse structures, configurations, and chemistries, while maintaining computational tractability. It uniquely integrates a non-parametric (GPR) regression approach with systematically improvable descriptors and kernels. However, limitations persist:
- Scaling and efficiency: The main bottleneck is in descriptor computation and kernel evaluation, particularly for large and multi-element training sets, though tabGAP and descriptor compression significantly mitigate this (Fellman et al., 2024, Klawohn et al., 2023).
- Transferability: GAP's accuracy is contingent on the diversity and coverage of the quantum-mechanical training set; extrapolation beyond the training domain (e.g., to different elements, high-charge, magnetically complex states) requires explicit new training data (Bartók, 2010, Jana et al., 2023).
- Long-range effects: Standard GAP omits explicit Coulombic or dispersion interactions unless specific extensions (e.g., environment-dependent charges, Ewald summation, or explicit many-body charge models) are incorporated (Barker et al., 2016, Klawohn et al., 2023).
- Comparison to ACE and MACE: The Atomic Cluster Expansion (ACE) formalism recasts the energy as an explicit polynomial expansion with truncatable body order, offering more direct control of systematic convergence and typically higher efficiency at intermediate accuracy. Multilayer ACE models (MACE) implement neural-network architectures atop ACE bases for "foundation" multi-element models with excellent accuracy/speed tradeoffs (Bernstein, 2024).
References
- (Bartók, 2010) Gaussian Approximation Potentials: The Accuracy of Quantum Mechanics, without the Electrons by Bartók, Payne, Kondor, Csányi (2010)
- (Szlachta, 2014) First principles interatomic potential for tungsten based on Gaussian process regression (2014)
- (Bartók et al., 2015) Gaussian Approximation Potentials: a brief tutorial introduction (2015)
- (Deringer et al., 2016) Machine-learning based interatomic potential for amorphous carbon (2016)
- (Rosenbrock et al., 2019) Machine-learned Interatomic Potentials for Alloys and Alloy Phase Diagrams (2019)
- (Klawohn et al., 2023) Gaussian Approximation Potentials: theory, software implementation and application examples (2023)
- (Klawohn et al., 2022) Massively Parallel Fitting of Gaussian Approximation Potentials (2022)
- (Byggmästar et al., 2020) Gaussian approximation potentials for body-centered-cubic transition metals (2020)
- (Fellman et al., 2024) Fast and accurate machine-learned interatomic potentials for large-scale simulations of Cu, Al and Ni (2024)
- (Bernstein, 2024) From GAP to ACE to MACE (2024)
- (Koskenniemi et al., 2022) Efficient atomistic simulations of radiation damage in W and W-Mo using machine-learning potentials (2022)
- (Zhang et al., 2022) Atomistic fracture in bcc iron revealed by active learning of Gaussian approximation potential (2022)
- (Jana et al., 2023) Searching for iron nanoparticles with a general-purpose Gaussian approximation potential (2023)
- (Fujikake et al., 2017) Gaussian approximation potential modeling of lithium intercalation in carbon nanostructures (2017)
- (Unruh et al., 2021) A Gaussian Approximation Potential for Amorphous Si:H (2021)
- (Barker et al., 2016) Localized Coulomb Descriptors for the Gaussian Approximation Potential (2016)
- (John, 2016) Many-Body Coarse-Grained Interactions using Gaussian Approximation Potentials (2016)
- (Rowe et al., 2017) A Machine Learning Potential for Graphene (2017)
- (George et al., 2020) Combining phonon accuracy with high transferability in Gaussian approximation potential models (2020)
Gaussian Approximation Potentials thus represent a mature, technically robust framework for constructing transferable, systematically improvable, and near-quantum accurate interatomic potentials suitable for high-fidelity atomistic simulation.