Quantum Support Vector Machines
- Quantum Support Vector Machines (QSVMs) are quantum-enhanced models that leverage quantum feature maps and kernel methods to potentially outperform classical SVMs.
- They encode classical data into high-dimensional Hilbert spaces using optimized quantum circuits, enabling efficient kernel estimation through techniques like compute-uncompute and HHL inversion.
- QSVM designs focus on minimizing circuit depth and mitigating noise, aligning with NISQ hardware constraints to solve convex learning problems effectively.
Quantum Support Vector Machines (QSVM)
Quantum Support Vector Machines (QSVM) realize the kernel support vector machine paradigm in the context of quantum information processing. By leveraging quantum feature maps—parametrized quantum circuits that embed classical inputs into high-dimensional Hilbert spaces—and quantum algorithms for kernel evaluation or linear-system solving, QSVMs seek to achieve computational enhancements in sample complexity, runtime, and representational power relative to classical SVMs when solving convex learning problems or structured least-squares tasks. The QSVM architecture encompasses data encoding, kernel estimation, convex optimization, and classification, and demands precise co-design with the physical constraints, noise models, and connectivity of near-term quantum hardware (Yang et al., 2019; Gentinetta et al., 2022).
1. Data Embedding and Quantum Feature Maps
The foundation of QSVMs is the encoding (or feature mapping) of classical inputs into quantum states in Hilbert space. This embedding proceeds through unitary circuits designed for expressivity and hardware efficiency:
- Affine and L² normalization: For low-dimensional applications, data may be rescaled and translated via an affine transformation so that points map onto the unit circle, with eigenvalues of the resulting kernel matrix normalized to a specific interval (e.g., [0.5, 1.5]). Subsequent L² normalization ensures unit Bloch-vector magnitude (Yang et al., 2019).
- Single-qubit and multi-qubit feature maps: Simple problems (e.g., linearly separable in two dimensions) admit embedding into a single-qubit state using a rotation |φ(x)⟩ = cos θ|0⟩ + sin θ|1⟩, where θ is determined by the normalized two-dimensional point. For general d-dimensional problems, common quantum feature maps include:
- Product states: |φ(x)⟩ = ⊗ᵢ (cos xᵢ|0⟩ + sin xᵢ|1⟩), one qubit per feature.
- Entangled maps: Pauli-structured circuits (e.g., layers of exp(i φ_S(x) ∏_{i∈S} Zᵢ) interleaved with Hadamards), hardware-efficient SU(2)-rotational blocks, and compositions thereof (Yang et al., 2019).
- Hardware-aware embedding: For NISQ devices, circuits are optimized for depth and connectivity, often favoring shallow layers and direct implementation of product or entangling unitaries with bounded two-qubit gate count.
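The embedding steps above can be sketched numerically. The following is a minimal illustration, assuming the single-qubit encoding maps each normalized feature to an RY-style rotation angle; the function names are hypothetical, and statevectors stand in for hardware circuits:

```python
import numpy as np

def embed_point(x):
    """Single-qubit embedding of a 2-D point (hypothetical encoding).

    After affine rescaling, L2 normalization places (x1, x2) on the unit
    circle, so it can be read directly as the real amplitude vector
    (cos theta, sin theta) of a one-qubit state RY(2*theta)|0>.
    """
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)   # L2 normalization -> unit Bloch magnitude

def product_feature_map(xs):
    """d-dimensional product-state map: tensor product of one qubit per feature."""
    state = np.array([1.0])
    for xi in xs:
        qubit = np.array([np.cos(xi), np.sin(xi)])  # RY(2*xi)|0>
        state = np.kron(state, qubit)
    return state
```

Because the product map uses one qubit per feature, the statevector dimension grows as 2^d, which is exactly the high-dimensional Hilbert space the kernel trick exploits.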
2. Quantum Kernel Estimation
QSVMs employ quantum circuits to evaluate the kernel matrix K with entries K_ij = K(x_i, x_j). Distinct quantum routines are used:
- Overlap estimation via the compute-uncompute method: Prepare the state U(x_j)|0…0⟩, then apply U(x_i)†, and measure the probability of the all-zeros outcome. This yields K(x_i, x_j) = |⟨φ(x_i)|φ(x_j)⟩|², the fidelity quantum kernel.
- Constant-depth data loading: For scaling to larger datasets, a "one-shot" multi-qubit oracle can load M vectors in parallel with circuit depth 1 at the cost of increased qubit count; classical post-processing reconstructs the reduced kernel matrix, obviating the need for full quantum state tomography (Yang et al., 2019).
- Efficient statistical readout: By classically aggregating computational basis outcomes into raw counts, it is possible to estimate the required principal components of the kernel matrix with high efficiency, sidestepping quantum tomography overhead.
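A statevector sketch of the compute-uncompute fidelity kernel, assuming a product feature map (one RY-style rotation per feature); on hardware the all-zeros probability would be estimated from measurement counts, which the shots parameter mimics here:

```python
import numpy as np

def feature_state(x):
    """Product-state feature map (assumption): RY(2*x_i)|0> per qubit."""
    state = np.array([1.0])
    for xi in x:
        state = np.kron(state, np.array([np.cos(xi), np.sin(xi)]))
    return state

def fidelity_kernel(x, y, shots=None, rng=None):
    """Estimate K(x, y) = |<phi(x)|phi(y)>|^2 via the compute-uncompute overlap.

    shots=None returns the exact overlap; with shots given, the all-zeros
    probability is estimated from sampled outcomes, mimicking statistical
    readout on hardware.
    """
    overlap = abs(np.dot(feature_state(x), feature_state(y))) ** 2
    if shots is None:
        return overlap
    rng = rng or np.random.default_rng(0)
    return rng.binomial(shots, overlap) / shots

def kernel_matrix(X, shots=None):
    """Assemble the M x M fidelity kernel matrix from pairwise overlaps."""
    M = len(X)
    K = np.empty((M, M))
    for i in range(M):
        for j in range(M):
            K[i, j] = fidelity_kernel(X[i], X[j], shots)
    return K
```

The exact (shots=None) matrix is symmetric with unit diagonal, since each state has unit norm; sampled estimates fluctuate around those values at the usual 1/√shots statistical rate.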
3. QSVM Optimization: Least-Squares and Dual Formulations
QSVM optimization is most commonly formulated in two convex settings:
- Least-squares SVM (LS-SVM): The linear system
  [ 0  1ᵀ ; 1  K + γ⁻¹I ] [ b ; α ] = [ 0 ; y ]
  is solved for dual variables α (support vector coefficients) and offset b, with γ⁻¹ the regularizer. The kernel matrix K is supplied by quantum evaluation. For problems with no offset (b = 0), the system collapses to (K + γ⁻¹I)α = y (Yang et al., 2019).
- Classical dual (quadratic programming): The dual objective, with quantum kernel, is
  max_α Σᵢ αᵢ − ½ Σᵢⱼ αᵢαⱼ yᵢyⱼ K(xᵢ, xⱼ) subject to 0 ≤ αᵢ ≤ C and Σᵢ αᵢyᵢ = 0.
  The solution yields a classifier f(x) = sign(Σᵢ αᵢyᵢ K(xᵢ, x) + b) (Gentinetta et al., 2022).
- Quantum HHL-based inversion: The HHL quantum linear-system solver is used for small matrices, requiring a number of queries polylogarithmic in the system size for well-conditioned, sparse systems, with circuit depth as low as 7 for 2×2 cases. This delivers an exponential quantum speed-up over classical inversion in theory, though with NISQ caveats (Yang et al., 2019).
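The LS-SVM system above can be assembled and solved classically once the kernel matrix is available, whether from quantum estimation or simulation. A minimal sketch, with np.linalg.solve standing in for the HHL inversion:

```python
import numpy as np

def ls_svm_train(K, y, gamma=10.0):
    """Solve the LS-SVM linear system for (b, alpha) given a kernel matrix K.

    System (with offset):
        [ 0      1^T         ] [ b     ]   [ 0 ]
        [ 1   K + gamma^-1 I ] [ alpha ] = [ y ]
    In a QSVM, K comes from quantum kernel estimation and the inversion
    could be delegated to HHL; here a classical solve is used for illustration.
    """
    M = len(y)
    F = np.zeros((M + 1, M + 1))
    F[0, 1:] = 1.0
    F[1:, 0] = 1.0
    F[1:, 1:] = K + np.eye(M) / gamma
    sol = np.linalg.solve(F, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]          # offset b, dual coefficients alpha

def ls_svm_predict(alpha, b, K_test):
    """Classify: sign(sum_i alpha_i K(x_i, x) + b); K_test has shape (M, n_test)."""
    return np.sign(alpha @ K_test + b)
```

Note that the LS-SVM variant replaces the inequality-constrained quadratic program of the classical dual with a single linear solve, which is precisely what makes it amenable to quantum linear-system algorithms.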
4. Circuit Depth, Complexity, and NISQ Constraints
Resource and scaling analysis is critical for QSVM implementation:
- Circuit depth: Optimized quantum circuits for feature mapping, kernel estimation, and HHL-based inversion can achieve total depths in the 7–20 range, compatible with current superconducting-qubit coherence times. For example, kernel-matrix generation can be realized in depth 1 (multi-qubit loading variant), and the full pipeline in depth < 20 for two-dimensional problems (Yang et al., 2019).
- Scaling of operations: Classical LS-SVM training scales as O(M³) for direct matrix inversion, or roughly O(M²) per iteration with kernel tricks. Quantum HHL inversion reduces this to a polylogarithmic number of queries and gates, assuming the kernel matrix is well-conditioned and sparse.
- Scaling bottlenecks: Direct estimation of the M×M kernel matrix requires a number of quantum circuits (or shots) that grows quadratically in M, making large-scale QSVM training currently challenging. For support size M > 10 on NISQ-era devices, the "multi-qubit one-shot" circuit can reduce depth at the cost of additional qubits.
- NISQ-specific limitations: Stability is challenged by noise, the absence of full fault tolerance, and the need for error-mitigation strategies (e.g., measurement-error mitigation with Qiskit Ignis). Circuit connectivity may necessitate additional SWAP gates depending on hardware topology.
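The quadratic kernel-estimation bottleneck is easy to quantify. A small helper (with a hypothetical shot budget of 1024 per entry) counting the distinct entries of a symmetric M×M kernel matrix:

```python
def kernel_estimation_budget(M, shots_per_entry=1024):
    """Quadratic cost of direct kernel estimation (assumed shot count).

    The symmetric M x M kernel matrix has M*(M+1)//2 distinct entries
    (diagonal included); each needs its own compute-uncompute circuit,
    so both circuit count and total shots grow quadratically with M.
    """
    entries = M * (M + 1) // 2
    return entries, entries * shots_per_entry
```

For M = 100 training points this is already 5,050 distinct circuits, which illustrates why constant-depth parallel loading and reduced-readout schemes matter for scaling.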
5. Empirical Results, Performance, and Use Cases
QSVM pipelines have been implemented and benchmarked on real and simulated quantum hardware:
- Simple linearly separable data: The full QSVM flow can classify two-dimensional, origin-separable datasets perfectly using optimized preprocessing, single-qubit rotations, constant-depth kernel estimation, and a shallow HHL solver (Yang et al., 2019).
- Kernel eigenvalue normalization: Preprocessing to align kernel eigenvalues into a designated range [0.5, 1.5] is essential to minimize circuit overhead and ensure binary-encoded representations can be realized directly on a minimal qubit register.
- Circuit and measurement efficiency: Removal of unnecessary tomography and classical aggregation of measurement results lead to notable reductions in runtime and gate depth, critical for executing within NISQ coherence times.
- Scalability prospects: Exponential speedups are theoretically realizable under the assumptions of kernel matrix sparsity and conditioning, particularly for high-dimensional data or quantum-accessible oracles (Yang et al., 2019). However, actual implementations in current technology are limited by noise and the practical costs of kernel evaluation.
6. Comparison with Classical SVMs and Quantum Advantage
The central promise of QSVMs arises from potential exponential advantages in sample and computational complexity in scenarios where the quantum feature map accesses high-dimensional entangled Hilbert spaces intractable to classical simulation. For standard SVMs, kernel evaluation and optimization require time polynomial in the dataset size M in the general case (e.g., O(M²) kernel evaluations and up to O(M³) for training), but quantum approaches asymptotically promise polylogarithmic scaling in dataset and feature size, assuming ideal quantum resources.
Careful quantum circuit design, including tailored preprocessing, normalized embeddings, shallow oracles, and resource-efficient inversion, is fundamental to extracting advantages over classical methods, especially in the presence of realistic hardware noise and limited system size (Yang et al., 2019).
References:
- "Support Vector Machines on Noisy Intermediate Scale Quantum Computers" (Yang et al., 2019)
- "The complexity of quantum support vector machines" (Gentinetta et al., 2022)