Projected Quantum Kernels Overview
- Projected quantum kernels are defined by mapping quantum states to lower-dimensional classical features using local observables, reducing quantum resource requirements.
- They decouple quantum state preparation from classical kernel computations, enabling efficient hyperparameter tuning and enhanced noise resilience compared to fidelity-based kernels.
- Practical implementations using shallow circuits and local measurements achieve competitive performance in pattern recognition tasks while mitigating vanishing similarity issues.
Projected quantum kernels are a class of quantum kernels that leverage quantum data-embedding circuits followed by local classical projections to construct resource-efficient and hardware-robust similarity measures for quantum machine learning. Unlike fidelity-based (global) quantum kernels, which involve overlaps in the full Hilbert space, projected quantum kernels extract subsystem or local observables from quantum states and apply conventional classical kernels on these lower-dimensional classical feature vectors. This decouples the quantum and classical layers, reduces measurement overhead, and allows efficient hyperparameter tuning and expressivity control, while mitigating the vanishing similarity problem that affects deep or highly entangling quantum circuits.
1. Mathematical Formulation and Construction
Let $x \in \mathbb{R}^d$ be a classical input encoded into an $n$-qubit quantum state via a parameterized unitary $U(x)$:
$$\rho(x) = U(x)\,(|0\rangle\langle 0|)^{\otimes n}\,U^\dagger(x).$$
Projected quantum kernels "project" the quantum state to a classical feature vector using local (subsystem) observables. Denoting by $\{S_k\}$ a collection of subsystems (e.g., single qubits or qubit pairs), the reduced density matrix on subsystem $S_k$ is:
$$\rho_k(x) = \mathrm{Tr}_{\bar{S}_k}\!\left[\rho(x)\right],$$
where $\bar{S}_k$ is the complement of $S_k$. Features can then be defined by measuring local observables $O_j$ over these marginals:
$$f_{k,j}(x) = \mathrm{Tr}\!\left[O_j\,\rho_k(x)\right].$$
Collecting all such expectation values defines a feature vector $\phi(x) = \big(f_{k,j}(x)\big)_{k,j}$, on which a classical positive-definite kernel is evaluated (typically Gaussian/RBF):
$$k^{\mathrm{PQ}}(x, x') = \exp\!\left(-\gamma\,\|\phi(x) - \phi(x')\|^2\right),$$
or, commonly, defined directly on the reduced density matrices:
$$k^{\mathrm{PQ}}(x, x') = \exp\!\left(-\gamma \sum_k \big\|\rho_k(x) - \rho_k(x')\big\|_F^2\right).$$
The special case of the linear projected kernel is:
$$k^{\mathrm{lin}}(x, x') = \sum_k \mathrm{Tr}\!\left[\rho_k(x)\,\rho_k(x')\right].$$
This construction projects the global quantum feature space into a classical subspace defined by physically measurable quantities, reducing the dimensionality and resource requirements (Suzuki et al., 2023, Marcantonio et al., 2022, d'Amore et al., 20 May 2025, Gil-Fuster et al., 2023, Schnabel et al., 2024).
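The construction above can be sketched in a few lines of NumPy. The two-qubit angle-encoding circuit, the choice of single-qubit marginals with Pauli observables, and the bandwidth value are illustrative assumptions, not taken from the cited papers:

```python
import numpy as np

# Pauli matrices used as local observables
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)

def encode(x):
    """Toy 2-qubit encoding U(x)|00>: Ry(x0) x Ry(x1), then a CNOT."""
    def ry(t):
        return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                         [np.sin(t / 2),  np.cos(t / 2)]], dtype=complex)
    cnot = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                     [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
    psi = cnot @ np.kron(ry(x[0]), ry(x[1])) @ np.array([1, 0, 0, 0], dtype=complex)
    return np.outer(psi, psi.conj())                 # density matrix rho(x)

def reduced(rho, keep):
    """Single-qubit marginal rho_k = Tr_{other qubit}[rho] of a 2-qubit state."""
    r = rho.reshape(2, 2, 2, 2)                      # indices (q0, q1, q0', q1')
    return np.trace(r, axis1=1, axis2=3) if keep == 0 else np.trace(r, axis1=0, axis2=2)

def features(x):
    """phi(x): expectation of each Pauli on each single-qubit marginal."""
    rho = encode(x)
    return np.array([np.real(np.trace(P @ reduced(rho, k)))
                     for k in (0, 1) for P in (X, Y, Z)])

def pq_kernel(x, xp, gamma=1.0):
    """Gaussian outer kernel on the projected feature vectors."""
    d = features(x) - features(xp)
    return float(np.exp(-gamma * d @ d))
```

Note that the quantum part (state preparation and local measurement) and the classical part (the RBF on the features) are cleanly separated, which is exactly the decoupling the text describes.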
2. Theoretical Hierarchy and Expressivity
Projected quantum kernels fit into the broader framework of trace-induced quantum kernels, as formalized by Gan et al. (Gan et al., 2023). Any kernel of the form:
$$k(x, x') = \sum_j w_j\, \mathrm{Tr}\!\left[P_j\, \rho(x)\right] \mathrm{Tr}\!\left[P_j\, \rho(x')\right], \qquad w_j \ge 0,$$
can be expressed as a sum over "Lego" kernels, i.e., products of expectation values of orthonormal operator basis elements $P_j$. Projected quantum kernels correspond to placing nonzero weights on a small subset of such operators (e.g., all local Pauli strings up to a fixed weight), thereby trading off expressivity and resource requirements.
The expressivity of a projected quantum kernel is controlled by the number of nonzero weights (active Lego kernels), which simultaneously bounds the hypothesis space dimensionality and generalization error. Increasing the subsystem size or observable set increases expressivity at the cost of higher measurement and classical processing overhead. The generalized projected quantum kernel offers a systematic path for complexity tuning between local and global kernels (Gan et al., 2023).
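The decomposition into Lego terms can be checked concretely for a single-qubit marginal, where the identity $\mathrm{Tr}[\rho\sigma] = \tfrac{1}{2}\big(1 + \sum_{P \in \{X,Y,Z\}} \langle P\rangle_\rho \langle P\rangle_\sigma\big)$ expresses the linear projected kernel as a weighted sum of products of Pauli expectation values. A minimal numerical check (random mixed states are an illustrative stand-in for data-encoded marginals):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)

def random_qubit_state(rng):
    """Random single-qubit density matrix (normalized A A^dagger)."""
    a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = a @ a.conj().T
    return rho / np.trace(rho)

rng = np.random.default_rng(0)
rho, sigma = random_qubit_state(rng), random_qubit_state(rng)

# Linear projected (trace) kernel on one single-qubit marginal ...
linear = float(np.real(np.trace(rho @ sigma)))

# ... equals a sum of "Lego" terms: products of Pauli expectation values
# over the orthonormal operator basis {I, X, Y, Z}/sqrt(2) of a qubit
lego = 0.5 * (1 + sum(float(np.real(np.trace(P @ rho))) *
                      float(np.real(np.trace(P @ sigma)))
                      for P in (X, Y, Z)))
```

Restricting the sum to a subset of Pauli strings (here, single-qubit ones) is precisely the expressivity/resource trade-off the text describes.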
3. Implementation Strategies and Resource Considerations
Circuit Architecture and Measurement
The typical procedure involves three components:
- Data Parameterization: Input is mapped via a shallow or moderate-depth parametric quantum circuit (PQC), often with separate single-qubit rotations and optionally local entangling layers (d'Amore et al., 20 May 2025, Suzuki et al., 2023).
- Projection: Local observables (often Pauli operators) are measured on reduced density matrices of chosen qubit subsets. For each data sample, measurements in several bases (e.g., Pauli $X$, $Y$, and $Z$) are required per qubit.
- Classical Kernel Computation: The resulting expectation values form a feature vector for classical kernel construction (e.g., RBF kernel).
Resource efficiency arises from the following mechanisms:
- The number of quantum circuit executions per data point is $O(1)$, i.e., $O(N)$ in total, as opposed to $O(N^2)$ for pairwise fidelity kernels in a dataset of size $N$ (Marcantonio et al., 2022, Miroszewski et al., 2024).
- Avoidance of deep circuits and global SWAP tests; only local measurements are needed.
- Option to use classical shadow tomography to further reduce sample complexity (Suzuki et al., 2023, d'Amore et al., 20 May 2025).
However, as the subsystem size and the number of observables grow, the total number of required measurements and the classical processing cost both increase, creating a trade-off between trainability, generalization, and resource usage.
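The linear-versus-quadratic shot scaling can be made explicit with a simple counting function; the three measurement bases and the shot count per circuit are illustrative parameters, not values from the cited works:

```python
def circuit_runs(n_samples, n_bases=3, shots=1024):
    """Shot budgets for building an N x N kernel matrix (illustrative counts).

    Projected kernel: each sample is measured once per local basis, so the
    quantum cost grows linearly in N; the N x N Gram matrix is then formed
    classically from the stored features.  Fidelity kernel: one overlap
    circuit per unordered pair, so the quantum cost grows quadratically.
    """
    projected = n_samples * n_bases * shots
    fidelity = (n_samples * (n_samples - 1) // 2) * shots
    return projected, fidelity
```

Another practical consequence of the linear scaling: adding a new data point only requires measuring that one point, whereas a fidelity kernel needs $N$ new overlap circuits against the existing dataset.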
4. Trainability, Variance Scaling, and Vanishing Similarity
Projected quantum kernels were originally motivated by the need to avoid the exponential concentration ("vanishing similarity") of global quantum kernels. For a fully random n-qubit circuit (2-design), the variance of the projected kernel decays exponentially with the number of qubits, quickly rendering the kernel matrix nearly constant and uninformative (Suzuki et al., 2023).
Key findings from analytic and numerical studies:
- The variance of local projected terms depends not on the total qubit number, but on the local block size, the circuit depth, and the entanglement structure of the initial state (Suzuki et al., 2023).
- For shallow alternating-layered ansatzes (ALA) with low entanglement, the variance decays only polynomially (rather than exponentially) in the number of qubits, thus preserving non-vanishing similarity in the large-$n$ limit.
- Highly entangled initial states or deep circuits cause rapid concentration, reproducing the infeasibility of global quantum kernels.
- Practical recommendation: Use shallow, low-locality circuits and low-entanglement inputs to maintain trainability (Suzuki et al., 2023).
These results are essential for scalable quantum kernel methods on real hardware, as they enable nontrivial learning even as the problem size grows.
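The concentration effect for fully random states is easy to observe numerically. For Haar-random $n$-qubit states, the variance of a single-qubit expectation value such as $\langle Z_0\rangle$ is exactly $1/(2^n+1)$, so the local features feeding the projected kernel concentrate exponentially in $n$. The sketch below estimates this empirically (the sample count and seed are arbitrary choices):

```python
import numpy as np

def haar_state(dim, rng):
    """Haar-random pure state: a normalized complex Gaussian vector."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def z0_expectation(psi):
    """<Z> on qubit 0 of an n-qubit state (diagonal, so no matrices needed)."""
    probs = np.abs(psi) ** 2
    half = len(psi) // 2
    return probs[:half].sum() - probs[half:].sum()

def feature_variance(n_qubits, n_samples=2000, seed=7):
    """Empirical variance of the local feature <Z_0> over Haar-random states.

    The exact value is 1/(2^n + 1), so local projected features (and hence
    the kernel built on them) concentrate exponentially in n under fully
    random circuits -- the regime the shallow-circuit recommendation avoids.
    """
    rng = np.random.default_rng(seed)
    vals = [z0_expectation(haar_state(2 ** n_qubits, rng)) for _ in range(n_samples)]
    return float(np.var(vals))
```

Running this for increasing qubit counts shows the variance shrinking toward zero, matching the 2-design analysis cited above.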
5. Practical Performance: Benchmarks, Hyperparameter Tuning, and Comparison
Comprehensive benchmarking studies (Schnabel et al., 2024, Alagiyawanna et al., 6 Jan 2026, d'Amore et al., 20 May 2025) have yielded the following empirical insights:
- Projected quantum kernels perform competitively with fidelity-based kernels and classical RBF kernels across various classification and regression problems, provided that hyperparameters (notably, bandwidth, regularization, and projection choices) are carefully tuned.
- On small, noisy, or data-scarce tasks (e.g., MNIST/CIFAR-10 with limited training samples), PQK-enhanced classifiers (e.g., CNNs) can significantly outperform baseline classical architectures (Alagiyawanna et al., 6 Jan 2026).
- For practical datasets, the choice of projection (which qubits/observables), outer kernel function, and feature rescaling dominate performance more than the presence of entanglement in the state-preparation circuit (Schnabel et al., 2024).
- PQKs are robust against hardware noise due to reliance on local measurements rather than global overlap.
However, extensive hyperparameter tuning is required, as bandwidth settings alone can make PQKs nearly indistinguishable from classical RBF or low-degree polynomial kernels, implying that observed performance gains in many regimes do not necessarily reflect quantum advantage (Flórez-Ablan et al., 7 Mar 2025).
| Kernel Type | Expressivity Controlled By | Resource Scaling | Noise Robustness | Empirical Performance |
|---|---|---|---|---|
| Fidelity (Global) QK | Circuit depth, number of qubits | $O(N^2)$ circuit shots | Low | Matches or trails PQK with tuning |
| Projected QK | Subsystem/observable choice, bandwidth | $O(N)$ circuit shots | High | On par with or above fidelity QK with tuning |
| Classical RBF | Bandwidth, feature scaling | Classical only | N/A | Very strong with tuning |
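The bandwidth sensitivity noted above has a simple mechanical explanation: an ill-chosen RBF bandwidth collapses the Gram matrix to a trivial limit regardless of what quantum features feed it. A minimal demonstration, with random vectors standing in for measured projected features and arbitrary bandwidth values:

```python
import numpy as np

def rbf_gram(features, gamma):
    """Gram matrix K_ij = exp(-gamma * ||f_i - f_j||^2)."""
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * features @ features.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(1)
feats = rng.normal(size=(8, 6))   # stand-in for measured projected features

# Bandwidth too narrow (gamma too large): K -> identity, every point
# looks dissimilar and the kernel carries no information
k_narrow = rbf_gram(feats, gamma=100.0)

# Bandwidth too wide (gamma too small): K -> all-ones, every point
# looks identical -- the same uninformative limit from the other side
k_wide = rbf_gram(feats, gamma=1e-4)
```

Only in an intermediate bandwidth regime does the Gram matrix retain structure, which is why tuning can make a PQK nearly indistinguishable from a classical RBF kernel on the same features.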
6. Limitations, Resource Scaling, and Classical Simulatability
Despite their practical advantages, projected quantum kernels are subject to several limitations:
- Exponential decay of kernel value variance (vanishing similarity) can re-emerge from high circuit depth, extensive entanglement, or use of global projections, thereby demanding an exponential number of measurement shots for precise kernel estimation (Suzuki et al., 2023, Miroszewski et al., 2024).
- Hardware resource savings may be offset by the increased number of features or measurement bases required for high expressivity.
- Bandwidth and hyperparameter tuning frequently cause the PQK to align closely—sometimes identically—with classical RBF or low-order polynomial kernels, thus erasing any quantum advantage in standard classical-data settings (Flórez-Ablan et al., 7 Mar 2025).
- No clear regime of provable quantum advantage has been observed for classical datasets or typical projections, except in group-structured learning tasks with quantum-inaccessible features (Naguleswaran, 2024).
In large-$n$ regimes or when operating with deep circuits, the practical value of PQKs is limited unless combined with advanced techniques such as classical shadows or careful hardware-aware ansatz design (Suzuki et al., 2023, Miroszewski et al., 2024).
7. Applications and Outlook
Projected quantum kernels have been applied in a variety of settings:
- Pattern recognition and small dataset learning, including hybrid PQK–CNN architectures for MNIST and CIFAR-10 classification (Alagiyawanna et al., 6 Jan 2026).
- IoT data analysis, where PQKs match or slightly outperform classical RBF SVMs and facilitate shot-efficient measurement (d'Amore et al., 20 May 2025).
- All-optical experimental setups demonstrating exponential compression of the required quantum resources for certain kernel families (Bartkiewicz et al., 2019).
Emerging directions include integration with convolutional architectural motifs, dynamical projection strategies, and efficient quantum feature extraction for NISQ devices (Naguleswaran, 2024, Altmann et al., 30 Jan 2026).
Open questions remain concerning the development of genuinely quantum-advantaged kernels, the mitigation of resource and concentration-related bottlenecks, and the identification of learning tasks where quantum projections yield provable improvements over classical analogs (Gil-Fuster et al., 2023, Schnabel et al., 2024).
In summary, projected quantum kernels provide a principled and highly tunable mechanism for extracting classical features from quantum-encoded data, forming a bridge between quantum and classical kernel methods that is well-suited to the current capabilities and constraints of NISQ hardware. Their performance and practicality are heavily contingent upon circuit depth, subsystem/projector designs, input state entanglement, and hyperparameter selection, with their expressivity and quantum resource demand modulated via the size and structure of the projection and the associated classical kernel function.