From Promise to Practice: Benchmarking Quantum Chemistry on Quantum Hardware

Published 30 Nov 2025 in quant-ph and math-ph | (2512.01012v1)

Abstract: We provide a systematic evaluation of the sample-based quantum diagonalization (SQD) method for electronic structure based on the W4-11 thermochemistry dataset, comprising 124 total atomization, 83 bond dissociation, 20 isomerization, 505 heavy-atom transfer, and 13 nucleophilic substitution processes, covering diverse bonding situations and reaction mechanisms. This is the largest study assessing the accuracy and precision of a quantum-hybrid algorithm on a digital quantum device across a variety of molecular systems and chemical reactions, using 16.85 hours on the superconducting quantum processor ibm_rensselaer and 724.22 node hours on the supercomputer AiMOS. To ensure a fair comparison, our study employs commensurate resource allocation for both classical and quantum simulations. Although SQD exhibits large statistical deviations from ground-state reference energies, energy extrapolations yield CCSD-level accuracy. While bond-breaking reactions show a systematic improvement as computational resources increase, nucleophilic substitution or heavy atom transfer reactions do not. The limitations quantified in this manuscript indicate opportunities for improvement in SQD-based algorithms. This work provides a benchmark and community resource for exploring new quantum algorithms and devices, supported by an online benchmark challenge and an open-source Python library for direct comparison.


Summary

The paper benchmarks the SQD algorithm on quantum hardware by evaluating its capacity to solve the electronic Schrödinger equation for many-body systems.
It integrates classical preprocessing with LUCJ ansatz-based quantum circuit construction and advanced error mitigation to manage hardware-induced noise.
Energy-variance extrapolation elevates SQD accuracy to CCSD levels, highlighting both its potential and limitations in raw quantum sampling.

Benchmarking Quantum Chemistry Algorithms on Quantum Hardware: Evaluation of Sample-Based Quantum Diagonalization

Overview and Motivation

This work presents a systematic benchmark of the Sample-Based Quantum Diagonalization (SQD) algorithm for quantum chemistry, executed on the superconducting quantum processor ibm_rensselaer and supported by high-performance classical post-processing on the AiMOS supercomputer. The evaluation uses the W4-11 benchmark suite, which encompasses 745 thermochemical reactions and 152 molecules, providing a broad testbed for assessing algorithmic accuracy, resource requirements, error scaling, and chemical coverage.

The challenge lies in accurately solving the electronic Schrödinger equation for many-body systems, where the exponential growth of Hilbert space precludes exact solutions beyond modest electron counts using classical approaches. Quantum algorithms have the potential to overcome this bottleneck, yet current pre-fault-tolerant devices impose significant constraints on circuit depth, gate fidelity, and qubit coherence, requiring hybrid quantum-classical techniques and rigorous benchmarking to establish numerical reliability.

SQD Algorithmic Workflow and Quantum Hardware Integration

The paper employs SQD, a hybrid quantum-classical approach in which electronic configurations (Slater determinants) are sampled from quantum circuits and the Hamiltonian is then diagonalized classically in the subspace spanned by the sampled configurations. The SQD workflow comprises:

Classical Preprocessing: Hartree-Fock, MP2, CISD, CCSD, and CCSD(T) calculations provide reference energies and furnish amplitude tensors for circuit parametrization.
Quantum Circuit Generation and Execution: Circuits are constructed using the local unitary cluster Jastrow (LUCJ) ansatz, which leverages low-rank decompositions of CCSD amplitudes to reduce gate complexity and enable heavy-hex hardware connectivity.
Postprocessing and Error Mitigation: Configuration recovery techniques enforce conservation of electron number and spin projection, mitigating noise-induced state pollution. Final energies are obtained via classical diagonalization of the projected electronic Hamiltonian.
Figure 1: Computational flow of the SQD method, encompassing classical electronic-structure computations, circuit construction and transpilation, quantum device execution, and classical postprocessing.
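
The final step of the workflow, classical diagonalization in the sampled subspace, can be illustrated with a minimal sketch. It is not the paper's implementation: the Hamiltonian here is a random symmetric toy matrix indexed by configuration number (a real calculation would build molecular matrix elements between Slater determinants), and the function name `subspace_energy` and the sampled indices are hypothetical.

```python
import numpy as np

def subspace_energy(hamiltonian, sampled):
    """Lowest eigenvalue of the Hamiltonian projected onto sampled configurations."""
    idx = sorted(set(sampled))                 # duplicate samples add nothing to the subspace
    h_sub = hamiltonian[np.ix_(idx, idx)]      # principal submatrix = projected Hamiltonian
    return float(np.linalg.eigvalsh(h_sub)[0])

# Toy stand-in for a Hamiltonian over 16 configurations (assumption, not molecular data).
rng = np.random.default_rng(0)
dim = 16
a = rng.normal(size=(dim, dim))
h = (a + a.T) / 2                              # symmetrize so the matrix is Hermitian

exact = float(np.linalg.eigvalsh(h)[0])        # full-space ground-state energy
partial = subspace_energy(h, [0, 3, 5, 5, 9])  # energy from a few "sampled" configurations
full = subspace_energy(h, range(dim))          # sampling every configuration
```

Because the projected matrix is a principal submatrix of the full Hamiltonian, `partial` can never fall below `exact`, and it reaches `exact` only when the subspace contains all configurations that carry weight in the ground state. This is why the quality of the sampled configurations, and not only their number, determines the accuracy.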

The study ensures resource commensurability, matching quantum circuit sizes and classical configuration space truncations, allowing direct algorithmic performance comparisons.

Resource Scaling and Quantum Configuration Subspace Selection

The LUCJ ansatz employed in SQD offers favorable gate and depth scaling, with circuit depth $D \simeq 2.35 N_q$ and two-qubit gate count $n_g \simeq 0.98 N_q^2$, where $N_q=2M$ is the qubit count for $M$ orbitals. The selection of sampled configurations is parameterized by $d \le \zeta N_{CCSD}$, with $\zeta$ sweeping $25\%$ to $400\%$ relative to CCSD parameter space, and $N_{CCSD} = \mathcal{O}(N^2 M^2)$.
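
As a quick worked example, the scalings quoted above can be evaluated for a given orbital count. The function names are hypothetical, and the value used for $N_{CCSD}$ below is a placeholder chosen for illustration, not a number from the paper.

```python
import math

def lucj_resources(n_orbitals):
    """Evaluate the quoted LUCJ scalings for M spatial orbitals."""
    n_qubits = 2 * n_orbitals                # N_q = 2M
    depth = 2.35 * n_qubits                  # D ~ 2.35 N_q
    two_qubit_gates = 0.98 * n_qubits ** 2   # n_g ~ 0.98 N_q^2
    return {"n_qubits": n_qubits, "depth": depth, "two_qubit_gates": two_qubit_gates}

def subspace_cap(zeta, n_ccsd):
    """Largest subspace dimension d allowed by d <= zeta * N_CCSD."""
    return math.floor(zeta * n_ccsd)

res = lucj_resources(10)                     # hypothetical 10-orbital molecule
caps = [subspace_cap(z, 1000) for z in (0.25, 1.0, 4.0)]  # N_CCSD = 1000 is a placeholder
```

For 10 orbitals this gives 20 qubits, a depth of about 47, and roughly 392 two-qubit gates, so depth grows linearly and gate count quadratically with system size.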

Figure 2: Gate count and depth scaling (a), and achievable configuration subspace dimensions (b), relative to practical and full CI limits.

This systematic scaling allows the study to disentangle quantum noise effects from intrinsic algorithmic truncation and enables critical assessment of SQD against equivalent configuration-interaction spaces in CISD.

Accuracy and Error Propagation across Chemical Domains

Absolute and relative energy errors in ground state and reaction energies are reported across five chemical reaction classes: total atomization (TAE), bond dissociation (BDE), isomerization (ISO), heavy atom transfer (HAT), and nucleophilic substitution (SN).

Figure 3: Statistical distributions of ground-state and reaction energy errors and averaged per-reaction-class error metrics for SQD, CCSD, CISD, and MP2.
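
To make the distinction between ground-state and reaction energy errors concrete, the sketch below computes a reaction energy as a stoichiometric sum of total energies and compares a method against a reference. All energies are invented placeholders, the species labels and function names are hypothetical, and the conversion factor is the standard value of about 627.5 kcal/mol per Hartree.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095

def reaction_energy(energies, stoichiometry):
    """Sum of nu_i * E_i, with nu_i < 0 for reactants and nu_i > 0 for products."""
    return sum(nu * energies[species] for species, nu in stoichiometry.items())

def reaction_error_kcal(method, reference, stoichiometry):
    """Reaction energy error of a method relative to a reference, in kcal/mol."""
    delta = reaction_energy(method, stoichiometry) - reaction_energy(reference, stoichiometry)
    return delta * HARTREE_TO_KCAL_PER_MOL

# Placeholder total energies in Hartree for a hypothetical isomerization A -> B.
reference = {"A": -1.000, "B": -1.100}
uniform = {"A": -0.995, "B": -1.095}   # both species too high by 0.005 E_h
uneven = {"A": -0.995, "B": -1.099}    # errors of 0.005 and 0.001 E_h
isomerization = {"A": -1, "B": 1}

err_uniform = reaction_error_kcal(uniform, reference, isomerization)
err_uneven = reaction_error_kcal(uneven, reference, isomerization)
```

With identical errors on both species the reaction energy error cancels, while unequal errors of only a few millihartree leave a reaction error of about 2.5 kcal/mol. Reaction energies therefore test how evenly a method treats reactants and products, not just how low its total energies are.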

Key findings:

Raw SQD performance: SQD without extrapolation exhibits larger mean and outlier errors than CISD and CCSD, with error distributions broadening and mean errors decreasing as configuration subspace size increases. Crucially, SQD with configuration spaces matching the CISD dimension remains less accurate than CISD, indicating suboptimal configuration selection from quantum sampling.
Extrapolated SQD: Using energy-variance extrapolation, SQD achieves CCSD-level accuracy, with mean absolute ground-state energy errors reduced to $\sim 0.005\, E_h$. The extrapolation is essential: unextrapolated SQD does not reach CCSD accuracy even at substantial configuration counts.
Reaction class dependency: Bond-breaking reactions benefit directly from increased configuration spaces, while SN reactions counterintuitively display increased errors with larger subspaces, suggesting an intricate interplay between quantum noise and configuration selection in highly charge-separated systems. Isomerization and HAT reactions reveal method-dependent deficiencies, tied to multireference character and electron correlation balance.
Figure 4: Extrapolation workflow and statistical error analysis for selected molecules, demonstrating performance of linear mixture and generalized eigenvalue approaches.

Extrapolation Techniques and Statistical Reliability

Two extrapolation strategies are benchmarked:

Linear Mixture Model (LMM): Clustering energy-variance pairs and performing local regressions in each cluster, requiring manual labeling and visual inspection.
Generalized Eigenvalue Extrapolation (GEV): Automated, constructs lowest-energy linear combinations via regularized generalized eigenvalue problems, yielding linearly extrapolated ground-state energies.

Statistical uncertainties are larger for GEV than for LMM but more representative of actual performance limits. The presence of outlier molecules for which the extrapolated energy is statistically incompatible with the CCSD(T) reference highlights algorithmic boundaries and quantum sampling limitations.
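
The basic operation behind energy-variance extrapolation, fitting energy against variance and reading off the value at zero variance, can be sketched as follows. This is a single straight-line fit on synthetic data, simpler than either the LMM or GEV procedure; the function name and all numbers are assumptions made for illustration.

```python
import numpy as np

def extrapolate_to_zero_variance(variances, energies):
    """Fit E = E0 + k * variance and return (E0, k); E0 is the zero-variance estimate."""
    slope, intercept = np.polyfit(variances, energies, deg=1)
    return float(intercept), float(slope)

# Synthetic (variance, energy) pairs lying on a line with E0 = -1.25 and k = 2.0.
variances = np.array([0.004, 0.003, 0.002, 0.001])
energies = -1.25 + 2.0 * variances

e0, k = extrapolate_to_zero_variance(variances, energies)
```

The rationale is that an exact eigenstate has zero energy variance, so the intercept of the fit estimates the energy a converged subspace would give. On real data the points scatter around the line, which is where the clustering and regularization steps of the two strategies above come in.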

Quantum Sampling, Error Suppression, and Hardware-Induced Limitations

The analysis systematically quantifies hardware-induced errors, specifically violations of electron number and spin conservation due to noise channels. Dynamical decoupling (DD) techniques are shown to improve signal fidelity, with the DD-$XY_4$ sequence offering superior suppression without excess gate overhead.

Figure 5: Asymmetry in sampled states due to noise, illustrating improved signal symmetry using advanced dynamical decoupling sequences across orbital counts.

The study characterizes the failure modes of SQD arising from noise-induced spin and particle number violations, thereby attributing part of the residual error to finite hardware fidelity rather than to the structure of the algorithm.
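
A simple way to quantify such violations is to count the measured bitstrings whose spin-up and spin-down occupations differ from the target electron numbers. The sketch below assumes a hypothetical layout in which the first half of each bitstring encodes spin-up orbitals and the second half spin-down orbitals; the counts are invented. It only detects and discards violating samples, whereas the configuration recovery described earlier attempts to repair them.

```python
def conserves_symmetry(bitstring, n_alpha, n_beta):
    """True if the sample has n_alpha spin-up and n_beta spin-down electrons."""
    half = len(bitstring) // 2
    alpha, beta = bitstring[:half], bitstring[half:]   # assumed qubit layout
    return alpha.count("1") == n_alpha and beta.count("1") == n_beta

def violation_fraction(counts, n_alpha, n_beta):
    """Fraction of shots that break particle-number or spin-projection conservation."""
    total = sum(counts.values())
    bad = sum(c for b, c in counts.items() if not conserves_symmetry(b, n_alpha, n_beta))
    return bad / total

# Invented measurement counts for 4 orbitals (8 qubits), 2 spin-up and 2 spin-down electrons.
counts = {"11001100": 70, "10101100": 10, "11101100": 15, "11001000": 5}

fraction = violation_fraction(counts, n_alpha=2, n_beta=2)
valid = {b: c for b, c in counts.items() if conserves_symmetry(b, 2, 2)}
```

In this toy data set 20 of 100 shots violate a symmetry. Tracking that fraction as a function of circuit size or decoupling sequence gives a direct, hardware-level measure of how much of the sampling budget is lost to noise.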

Chemical and Algorithmic Deficiency Analysis

Extensive chemical-domain-specific analyses demonstrate that SQD fails to recover correct energetics in systems with:

Delocalized $\pi$-bonding networks
Multicenter and near-degenerate states
Strong charge separation and correlation-sensitive rearrangements (e.g., SN and isomerization pathways)

CCSD shows robust performance, with deficiencies only in a narrow, strongly correlated chemical space. SQD's error mechanism is identified as configuration sampling suboptimality, amplified by quantum noise and configuration recovery limitations.

Practical and Theoretical Implications

This benchmark provides:

A community resource for systematic device and algorithm evaluation, enabling direct comparison across hardware generations and alternative quantum algorithms.
A foundation for algorithmic improvements targeting configuration sampling, ansatz design (e.g., more expressive circuits or improved amplitude surrogation), noise resilience, and resource economization.
Empirical baselines for method development in quantum computational chemistry, emphasizing the necessity of classical postprocessing and error mitigation even as hardware scales.

Future work must address the extension to larger basis sets (e.g., 6-31G), adaptivity in quantum circuit construction, and automated recovery schemes to lower extrapolation sensitivity and improve raw quantum sampling fidelity.

Conclusion

The study establishes rigorous accuracy and precision benchmarks for SQD quantum chemistry algorithms executed on modern superconducting quantum hardware. The necessity of systematic configuration sampling, sophisticated error mitigation, and energy-variance extrapolation is underscored by extensive numerical results. While extrapolated SQD approaches CCSD accuracy in many cases, raw sampling remains suboptimal, and quantum noise imposes statistically significant errors, particularly in electronically complex systems. The results delineate boundaries for practical quantum chemistry on extant hardware and furnish a scalable reference point for future algorithm and device advances.