SPA$^\mathrm{H}$M(a,b): encoding the density information from guess Hamiltonian in quantum machine learning representations

Published 6 Sep 2023 in physics.chem-ph | (2309.02950v2)

Abstract: Recently, we introduced a class of molecular representations for kernel-based regression methods, the spectrum of approximated Hamiltonian matrices (SPA$^\mathrm{H}$M), that takes advantage of lightweight one-electron Hamiltonians traditionally used as an SCF initial guess. The original SPA$^\mathrm{H}$M variant is built from occupied-orbital energies (i.e., eigenvalues) and naturally contains all the information about nuclear charges, atomic positions, and symmetry requirements. Its advantages were demonstrated on datasets featuring a wide variation of charge and spin, for which traditional structure-based representations commonly fail. SPA$^\mathrm{H}$M(a,b), as introduced here, expand the eigenvalue SPA$^\mathrm{H}$M into local and transferable representations. They rely upon one-electron density matrices to build fingerprints from atomic and bond density overlap contributions, inspired by preceding state-of-the-art representations. The performance and efficiency of SPA$^\mathrm{H}$M(a,b) are assessed on predictions for datasets of prototypical organic molecules (QM7) with different charges and of azoheteroarene dyes in an excited state. Overall, both SPA$^\mathrm{H}$M(a) and SPA$^\mathrm{H}$M(b) outperform state-of-the-art representations on difficult prediction tasks such as the atomic properties of charged open-shell species and of $\pi$-conjugated systems.
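
The idea behind the original eigenvalue-based variant can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `toy_hamiltonian` and `eigenvalue_fingerprint` are hypothetical, and the one-orbital-per-atom matrix below is an invented stand-in for a real SCF guess Hamiltonian, chosen only so that the example runs with NumPy alone. It shows how the lowest eigenvalues of a one-electron matrix give a fixed-length vector that does not depend on atom ordering.

```python
import numpy as np


def toy_hamiltonian(charges, coords):
    """Hypothetical one-orbital-per-atom Hamiltonian (NOT a real SCF guess)."""
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    # Pairwise interatomic distances, shape (n_atoms, n_atoms).
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    # Assumed off-diagonal coupling that decays with distance.
    h = -np.sqrt(np.outer(z, z)) * np.exp(-dist)
    # Assumed hydrogen-like on-site energies on the diagonal.
    np.fill_diagonal(h, -0.5 * z**2)
    return h


def eigenvalue_fingerprint(h, n_occ, size):
    """Lowest n_occ eigenvalues of a symmetric matrix, zero-padded to `size`."""
    if n_occ > size:
        raise ValueError("size must be at least n_occ")
    eps = np.linalg.eigvalsh(h)  # eigenvalues in ascending order
    occ = eps[:n_occ]            # "occupied" orbital energies
    out = np.zeros(size)
    out[: len(occ)] = occ
    return out


# Water-like toy geometry (arbitrary units); the values are illustrative only.
charges = [8, 1, 1]
coords = [[0.0, 0.0, 0.0], [0.0, 1.4, 1.1], [0.0, -1.4, 1.1]]
fp = eigenvalue_fingerprint(toy_hamiltonian(charges, coords), n_occ=2, size=4)
```

Because eigenvalues are unchanged when the atoms are relabelled, the same vector is obtained for any atom ordering. The local SPA$^\mathrm{H}$M(a,b) variants described in the abstract use density matrices rather than eigenvalues and are not covered by this sketch.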