Neural-Network Quantum States
- Neural-network quantum states (NQS) are variational wavefunction ansätze that use neural architectures to represent amplitudes and phases in many-body quantum systems.
- They employ architectures like RBMs, CNNs, and RNNs to capture key quantum correlations, entanglement properties, and phase transitions with efficient optimization.
- NQS have practical applications in ground state approximation, quantum phase detection, and circuit simulation, offering scalable and flexible computational tools.
Neural-network quantum states (NQS) are variational wavefunction ansätze for many-body quantum systems in which the amplitudes and phases of the quantum state are represented by artificial neural networks. NQS unify variational Monte Carlo approaches with the representational flexibility of neural architectures, enabling the approximation, simulation, and analysis of quantum many-body wavefunctions beyond the reach of traditional methods. They have been shown to accurately capture complex correlations, entanglement structure, and emergent order in a variety of prototypical spin, bosonic, and fermionic models, in both one and higher dimensions (Vivas et al., 2022, Deng et al., 2017, Zen et al., 2020, Pei et al., 2021, Kim et al., 2023, Döschl et al., 2024). Below, the formulation, methodologies, expressive power, entanglement properties, practical applications, and contemporary research directions in NQS are systematically detailed.
1. Formalism and Representative Architectures
In the NQS paradigm, a many-body quantum state is parameterized via a neural network mapping the $N$-body configuration $s = (s_1, \ldots, s_N)$ (with, e.g., $s_i = \pm 1$ for spin-1/2 systems) to a complex amplitude $\psi_\theta(s)$. The variational principle drives the optimization of the weights $\theta$ to approximate ground or low-lying excited states.
Key architectures include:
- Restricted Boltzmann Machines (RBMs): Two-layer networks with $N$ visible units $s_i$ (physical degrees of freedom) and $M$ hidden binary units, with the amplitude given by
$$\psi_\theta(s) = e^{\sum_i a_i s_i} \prod_{j=1}^{M} 2\cosh\Big(b_j + \sum_i W_{ji} s_i\Big),$$
where $\theta = \{a_i, b_j, W_{ji}\}$ and $\alpha = M/N$ sets the hidden-unit density (Zen et al., 2020, Vivas et al., 2022).
- Feedforward and Deep Networks: Multi-layer perceptrons with nonlinearities acting on the input configuration (Choo et al., 2018, Vivas et al., 2022).
- Convolutional Neural Networks (CNNs): Architectures encoding locality and translational symmetry, often used for lattice models (Gutiérrez et al., 2019, Vivas et al., 2022).
- Autoregressive and Recurrent Networks (RNNs): Models with outputs factorized as products of conditional probabilities, enabling exact sampling and efficient scaling to larger systems (McNaughton et al., 2025, Döschl et al., 2024).
- Hybrid or Enhanced Ansätze: Combinations include Slater or Pfaffian determinants (for antisymmetry), graph neural networks, or tensor-network augmentations (Pei et al., 2021, Kim et al., 2023).
Parameter optimization employs gradient-based methods, notably stochastic reconfiguration (SR), natural gradient descent, RMSprop, or Adam (Vivas et al., 2022, Zen et al., 2020).
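As a concrete illustration of the RBM ansatz above, the following minimal NumPy sketch evaluates the (unnormalized) log-amplitude for a spin-1/2 configuration $s_i = \pm 1$; the function and variable names are illustrative, not taken from any particular library:

```python
import numpy as np

def rbm_log_psi(s, a, b, W):
    """Log-amplitude of an RBM ansatz:
    log psi(s) = sum_i a_i s_i + sum_j log(2 cosh(b_j + sum_i W_ji s_i)).
    s: array of +/-1 spins, shape (N,); a: visible biases (N,);
    b: hidden biases (M,); W: weights (M, N). All parameters may be complex."""
    theta = b + W @ s                      # hidden-unit pre-activations
    return a @ s + np.sum(np.log(2.0 * np.cosh(theta)))

# Tiny example: N = 4 visible spins, hidden-unit density alpha = M/N = 2
rng = np.random.default_rng(0)
N, M = 4, 8
a = 0.01 * (rng.normal(size=N) + 1j * rng.normal(size=N))
b = 0.01 * (rng.normal(size=M) + 1j * rng.normal(size=M))
W = 0.01 * (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N)))
s = np.array([1, -1, 1, 1])
amp = np.exp(rbm_log_psi(s, a, b, W))     # unnormalized complex amplitude
```

Working with log-amplitudes rather than amplitudes avoids overflow for large systems and makes the wavefunction ratios needed by Monte Carlo updates cheap to compute.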
2. Variational Monte Carlo and Optimization Schemes
The variational energy for a Hamiltonian $H$ is
$$E(\theta) = \frac{\langle \psi_\theta | H | \psi_\theta \rangle}{\langle \psi_\theta | \psi_\theta \rangle} = \frac{\sum_s |\psi_\theta(s)|^2\, E_{\mathrm{loc}}(s)}{\sum_s |\psi_\theta(s)|^2},$$
with the local energy $E_{\mathrm{loc}}(s) = \sum_{s'} H_{s s'}\, \psi_\theta(s') / \psi_\theta(s)$, and $\sum_s |\psi_\theta(s)|^2$ the normalization.
Expectation values and gradients are sampled via Markov chain Monte Carlo (MCMC) from $|\psi_\theta(s)|^2$ or, for autoregressive NQS, via direct sampling (Vivas et al., 2022). The variational gradient
$$\partial_{\theta_k} E = 2\, \mathrm{Re}\, \big\langle \big( E_{\mathrm{loc}}(s) - \langle E_{\mathrm{loc}} \rangle \big)\, O_k^*(s) \big\rangle,$$
where $O_k(s) = \partial_{\theta_k} \ln \psi_\theta(s)$, informs parameter updates (Zen et al., 2020). For complex NQS, natural gradient (stochastic reconfiguration) is often used, updating $\theta$ by solving $S\, \delta\theta = -\eta F$, with SR matrix $S_{k k'} = \langle O_k^* O_{k'} \rangle - \langle O_k^* \rangle \langle O_{k'} \rangle$ and force vector $F_k = \langle E_{\mathrm{loc}}\, O_k^* \rangle - \langle E_{\mathrm{loc}} \rangle \langle O_k^* \rangle$ (Choo et al., 2018, Vivas et al., 2022).
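For concreteness, here is a small self-contained sketch of the local energy $E_{\mathrm{loc}}(s)$ for the 1D transverse-field Ising model $H = -J\sum_i Z_i Z_{i+1} - h\sum_i X_i$, evaluated with an RBM ansatz; for a tiny chain the variational energy is computed by exact enumeration instead of MCMC (all names are illustrative):

```python
import itertools
import numpy as np

def log_psi(s, a, b, W):
    # RBM log-amplitude, as defined in Sec. 1
    return a @ s + np.sum(np.log(2.0 * np.cosh(b + W @ s)))

def local_energy_tfim(s, params, J=1.0, h=1.0):
    """E_loc(s) = sum_{s'} H_{ss'} psi(s')/psi(s) for the periodic TFIM chain."""
    lp = log_psi(s, *params)
    diag = -J * np.sum(s * np.roll(s, -1))       # ZZ term is diagonal
    offdiag = 0.0
    for i in range(len(s)):                       # X_i flips spin i
        s2 = s.copy(); s2[i] = -s2[i]
        offdiag += -h * np.exp(log_psi(s2, *params) - lp)
    return diag + offdiag

def variational_energy(params, N):
    """Exact <H> by summing over all 2^N configurations (tiny N only)."""
    num = den = 0.0
    for bits in itertools.product([-1, 1], repeat=N):
        s = np.array(bits, dtype=float)
        w = np.exp(2.0 * np.real(log_psi(s, *params)))   # |psi(s)|^2
        num += w * np.real(local_energy_tfim(s, params))
        den += w
    return num / den

# Sanity check: with all parameters zero, psi(s) is constant, i.e. the
# product state |+>^N, for which <H> = -h * N exactly (here N = 4, h = 1).
N, M = 4, 8
params0 = (np.zeros(N), np.zeros(M), np.zeros((M, N)))
energy = variational_energy(params0, N)   # approximately -4.0
```

In production codes the exhaustive sum is replaced by Metropolis sampling from $|\psi_\theta(s)|^2$, but the local-energy kernel is identical.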
Architectural and optimizer hyperparameters (e.g., hidden-unit density, learning rate, batch size) are adjusted according to system size and model complexity. Adaptive schemes, such as growth of the RNN hidden-state dimension, further improve computational efficiency and stability (McNaughton et al., 2025).
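The direct-sampling advantage of autoregressive NQS mentioned above can be sketched in a few lines: each spin is drawn from its exact conditional distribution in one sequential pass, with no Markov chain, no burn-in, and no autocorrelation. The "network" here is a deliberately trivial stand-in for an RNN conditional; only the sampling pattern is the point:

```python
import numpy as np

def sample_autoregressive(conditionals, N, rng):
    """Draw one configuration s in {0,1}^N exactly:
    p(s) = prod_i p(s_i | s_1..s_{i-1}), where conditionals(prefix)
    returns p(s_i = 1 | prefix) as supplied by the network."""
    s = []
    for _ in range(N):
        p1 = conditionals(s)
        s.append(1 if rng.random() < p1 else 0)
    return s

# Toy "network": probability of s_i = 1 depends on the running sum of the
# prefix (a stand-in for an RNN hidden state; purely illustrative).
def toy_conditional(prefix):
    return 1.0 / (1.0 + np.exp(-(0.5 * sum(prefix) - 0.25)))

rng = np.random.default_rng(42)
batch = [sample_autoregressive(toy_conditional, 10, rng) for _ in range(100)]
```

Because each sample is independent and its probability is known exactly, gradient estimators are unbiased and the normalization of the wavefunction comes for free.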
3. Expressivity, Entanglement, and Relation to Tensor Networks
NQS exhibit expressive power surpassing many standard tensor network approaches:
- Entanglement Structure:
- Short-range RBMs rigorously satisfy area-law bounds on the entanglement entropy:
$$S_\alpha(A) = O(|\partial A|),$$
for any Rényi index $\alpha$, region $A$, and finite hidden-visible interaction range $R$ (Deng et al., 2017, Jia et al., 2018).
- Fully-connected RBMs can realize exact volume-law scaling, $S(A) \propto \mathrm{Vol}(A)$, with only $O(N)$ parameters, capturing massive long-range entanglement typically inaccessible to matrix product states (MPS) of moderate bond dimension (Deng et al., 2017, Pei et al., 2021).
Comparison with Tensor Networks:
- Short-range RBMs are equivalent to Entangled Plaquette States (EPS) with cosh-parameterized local correlators.
- Fully-connected RBMs map to nonlocal String-Bond States, with each hidden unit contributing a string of bond dimension 2.
- Deep Boltzmann Machines can exactly represent finite-depth quantum circuits of polynomial size (Jia et al., 2018, Glasser et al., 2017).
- Hybrid Constructions: NQS can be combined with tensor networks (e.g., MPS or PEPS) to form states with enriched expressivity while retaining the benefits of each framework (Glasser et al., 2017, Vivas et al., 2022).
Key implication: By tuning connectivity and the number of hidden units, NQS interpolate between conventional area-law–dominated and highly entangled volume-law–dominated quantum states, providing a universal variational toolbox (Deng et al., 2017, Pei et al., 2021).
4. Practical Applications and Extensions
NQS have demonstrated high fidelity in various prominent physical contexts:
- Ground State Approximation: RBM and deep architectures recover ground-state energies of the 1D and 2D transverse-field Ising, Heisenberg, and Hubbard models to within small relative errors per site of exact diagonalization or density-matrix renormalization group (DMRG) benchmarks, with substantially fewer parameters (Vivas et al., 2022, Zen et al., 2020, Jia et al., 2018).
- Quantum Phase Transitions: Neural-network states can efficiently locate quantum critical points via unsupervised detection of order-parameter inflection points, using transfer learning to accelerate parameter scans and analytic RBM initialization in deep phases (Zen et al., 2020, Nomura, 2022). Optimized network weights directly reflect changes in quantum phases, exposing phase diagrams without explicit measurement.
- Real-time and Imaginary-time Dynamics: Time-dependent variational Monte Carlo (t-VMC) with NQS, or implicit midpoint-rule integrators, preserves the unitary structure of real-time evolution and matches the accuracy of established approaches, with computational cost scaling linearly in the number of network parameters and the batch size (Gutiérrez et al., 2019, Vivas et al., 2022).
- State Tomography: Neural-network tomography reconstructs pure or mixed quantum states from measurement data, leveraging NQS as parametrized density operators or via purification methods (Vivas et al., 2022, Jia et al., 2018).
- Classical Simulation of Quantum Circuits: NQS provide an efficient representation for simulating quantum gates and circuits, with exact parameter updates for $Z$-diagonal gates and variational learning for nondiagonal gates (e.g., Hadamard) (Jónsson et al., 2018, Vivas et al., 2022). Simulation accuracy can exceed that of noisy hardware at realistic per-gate error rates.
- Fermionic and Topologically Ordered States: Hybrid architectures such as Pfaffian-Jastrow NQS with message-passing backflow have produced lower ground-state energies for ultra-cold Fermi gases than fixed-node diffusion Monte Carlo, and can be generalized to capture strong pairing and symmetry constraints (Kim et al., 2023, Glasser et al., 2017, Pei et al., 2021).
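The exact update for a $Z$-diagonal gate mentioned above can be made concrete for the RBM ansatz: since $RZ(\theta)$ multiplies the amplitude of configuration $s$ by a phase depending only on $s_j$, it is absorbed into the visible bias $a_j$ without any optimization. The sketch below assumes the convention $RZ(\theta)\,|s_j\rangle = e^{-i\theta s_j/2}\,|s_j\rangle$ for $s_j = \pm 1$ and verifies the update against the gate's direct action:

```python
import itertools
import numpy as np

def rbm_psi(s, a, b, W):
    # unnormalized RBM amplitude for spins s_i = +/-1
    return np.exp(a @ s + np.sum(np.log(2.0 * np.cosh(b + W @ s))))

def apply_rz(a, j, theta):
    """Exact RBM update for the diagonal gate RZ(theta) on site j:
    the phase exp(-i theta s_j / 2) is absorbed as a_j -> a_j - i theta / 2."""
    a = a.astype(complex).copy()
    a[j] -= 1j * theta / 2.0
    return a

# Check: the updated network reproduces the gate action on every configuration.
rng = np.random.default_rng(1)
N, M = 3, 6
a = rng.normal(size=N) + 0j
b = rng.normal(size=M) + 0j
W = rng.normal(size=(M, N)) + 0j
theta, j = 0.7, 1
a2 = apply_rz(a, j, theta)
for bits in itertools.product([-1, 1], repeat=N):
    s = np.array(bits, dtype=float)
    assert np.isclose(rbm_psi(s, a2, b, W),
                      np.exp(-1j * theta * s[j] / 2) * rbm_psi(s, a, b, W))
```

Nondiagonal gates such as Hadamard have no such closed-form bias update, which is why they require a variational fit in this framework.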
5. Architectural Advances and Current Research Directions
The scope of NQS has broadened considerably:
- Autoregressive Models: PixelCNN, masked autoencoding, and RNN-based NQS allow exact sampling and improved training scalability for large systems (McNaughton et al., 2025, Döschl et al., 2024, Vivas et al., 2022). Tensorized RNNs are practical for 2D models with large local Hilbert space and long-range interactions, discovering crystalline, stripe, and fractionalized quantum phases (Döschl et al., 2024).
- Adaptive Model Growth: Dynamic increase of model capacity during training, for example by expanding the RNN hidden-state dimension, enables rapid convergence, better handling of rugged optimization landscapes, and more efficient GPU resource usage (McNaughton et al., 2025).
- Hybrid and Graph-based Networks: Tensor-network/NQS hybrids, self-attention, and message-passing architectures have improved accuracy for frustrated spin systems, topologically ordered states, and models with arbitrary graph connectivity (Vivas et al., 2022, Kim et al., 2023).
- Compact and Exact Representations: NQS can provide exact, compact representations of Jastrow and stabilizer states with a number of hidden units scaling at most quadratically in system size, explaining observed numerical behavior and suggesting efficient architectural designs for further generalization (Pei et al., 2021).
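The adaptive-growth idea above can be sketched as a function-preserving expansion of a vanilla RNN's hidden dimension: new rows and columns of the weight matrices are zero-initialized, so the enlarged network computes exactly the same outputs before further training. This is an illustrative sketch of the general principle; the published scheme may differ in detail:

```python
import numpy as np

def grow_hidden(Wxh, Whh, bh, Who, new_dim):
    """Zero-padded growth of a vanilla RNN's hidden dimension.
    New hidden units start at tanh(0) = 0 and neither feed into the old
    units nor into the output, so the function is unchanged at first."""
    old = Whh.shape[0]
    assert new_dim >= old
    Wxh2 = np.zeros((new_dim, Wxh.shape[1])); Wxh2[:old] = Wxh
    Whh2 = np.zeros((new_dim, new_dim));      Whh2[:old, :old] = Whh
    bh2  = np.zeros(new_dim);                 bh2[:old] = bh
    Who2 = np.zeros((Who.shape[0], new_dim)); Who2[:, :old] = Who
    return Wxh2, Whh2, bh2, Who2

def rnn_forward(xs, Wxh, Whh, bh, Who):
    h = np.zeros(Whh.shape[0])
    for x in xs:
        h = np.tanh(Wxh @ x + Whh @ h + bh)
    return Who @ h

rng = np.random.default_rng(3)
Wxh = rng.normal(size=(4, 2)); Whh = rng.normal(size=(4, 4))
bh = rng.normal(size=4); Who = rng.normal(size=(1, 4))
xs = [rng.normal(size=2) for _ in range(5)]
out_small = rnn_forward(xs, Wxh, Whh, bh, Who)
out_big = rnn_forward(xs, *grow_hidden(Wxh, Whh, bh, Who, 7))
assert np.allclose(out_small, out_big)   # identical before retraining
```

Because the expansion preserves the learned variational state, training can resume from the current energy rather than restarting from scratch.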
Limitations and Challenges:
- Sampling bottlenecks and slow mixing in conventional MCMC-based NQS, especially near phase transitions (Vivas et al., 2022).
- SR matrix inversion becomes computationally intensive for large parameter counts, but can be alleviated with autoregressive and gradient-descent–based approaches (Gutiérrez et al., 2019).
- Generalization to continuous degrees of freedom and explicit enforcement of particle statistics (antisymmetry for fermions, Bose symmetry) remain open technical challenges (Vivas et al., 2022, Kim et al., 2023).
6. Benchmarks, Accuracy, and Computational Cost
Empirical evidence supports the competitiveness and scalability of NQS:
| Model / Method | System Size | Error (per site or total) | Parameters / Features | Reference |
|---|---|---|---|---|
| 1D TFIM (RBM NQS) | up to 128 | n/a | RBM hidden units | (Zen et al., 2020) |
| 2D Heisenberg (CNN NQS) | n/a | n/a | CNN parameters | (Vivas et al., 2022) |
| 2D Hofstadter (RNN NQS) | n/a | energy error (value n/a) | tensorized RNN | (Döschl et al., 2024) |
| 1D TFIM (adaptive RNN NQS) | n/a | energy error (value n/a) | reduced wall time vs. static network | (McNaughton et al., 2025) |
| Ultra-cold Fermi gas (Pfaffian NQS) | n/a | ground-state energy below fixed-node DMC | neural backflow, message passing | (Kim et al., 2023) |
These benchmarks demonstrate practical applicability across a spectrum of quantum regimes, often with computational speedups exceeding an order of magnitude over traditional tensor-network approaches and with far fewer variational parameters.
7. Perspectives and Future Outlook
NQS have established a new computational paradigm for quantum many-body physics, characterized by:
- Universality and flexibility: Encompassing area-law and volume-law entangled states, topologically ordered phases, and strong correlations (Deng et al., 2017, Pei et al., 2021).
- Algorithmic advances: Adaptive architectures, autoregressive sampling, variational principle–based optimization, and efficient classical simulation of quantum circuits (McNaughton et al., 2025, Jónsson et al., 2018, Vivas et al., 2022).
- Cross-fertilization: Interplay with tensor networks, machine learning techniques, and developments in classical and quantum hardware.
Future directions include deeper neural architectures, adaptive capacity during real- and imaginary-time evolution, symmetry-enforced learning, generalization to larger Hilbert spaces, practical simulation of fermionic systems at scale, and integration with modern optimization and sampling algorithms. Algebraic characterization of NQS expressivity, especially in the context of quantum criticality and non-Abelian topological phases, remains an active research frontier (Pei et al., 2021, Zen et al., 2020).
References:
(Deng et al., 2017, Glasser et al., 2017, Choo et al., 2018, Jónsson et al., 2018, Jia et al., 2018, Gutiérrez et al., 2019, Zen et al., 2020, Pei et al., 2021, Nomura, 2022, Vivas et al., 2022, Kim et al., 2023, Döschl et al., 2024, McNaughton et al., 2025)