
Gaussian Kernel Feature Space

Updated 28 January 2026
  • The feature space of a Gaussian kernel is the infinite-dimensional reproducing kernel Hilbert space (RKHS) induced by the kernel, enabling inner-product representations and universal approximation.
  • Various constructions including Mercer, Hermite, Fourier, and Segal–Bargmann expansions provide distinct analytical frameworks based on eigen-decompositions and spectral methods.
  • Practical techniques like random Fourier features and explicit finite-dimensional mappings offer efficient approximations and implementations in kernel-based statistical learning.

A Gaussian kernel is a symmetric, positive-definite function of the form

$$k(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right), \quad x, y \in \mathbb{R}^d$$

or, more generally, defined on a real separable Hilbert space $\mathcal{H}$. The feature space corresponding to a Gaussian kernel is the Hilbert space into which data are mapped such that $k(x, y)$ equals the inner product between the mapped points. This feature space is typically infinite-dimensional, and its structure underlies the expressive power, universality, and practical implementation of kernel methods in statistical learning, Gaussian processes, SVMs, and related fields. Several analytically distinct but isometrically equivalent constructions capture the geometry and analytic regularity of the Gaussian kernel's feature space across finite- and infinite-dimensional domains.

1. Reproducing Kernel Hilbert Space of the Gaussian Kernel

Let $\mathcal{H}$ be a real separable Hilbert space. The Gaussian kernel $k(x, y) = \exp(-\|x-y\|^2/(2\sigma^2))$ is positive definite on $\mathcal{H}$. By Aronszajn's theorem, there exists a unique reproducing kernel Hilbert space (RKHS) $\mathcal{H}_k$ consisting of functions $f:\mathcal{H}\to\mathbb{R}$ with the following properties (Guella, 2020):

  • For each $y\in\mathcal{H}$, $k(\cdot, y) \in \mathcal{H}_k$
  • The reproducing property: $\forall f \in \mathcal{H}_k$, $f(y) = \langle f, k(\cdot, y) \rangle_{\mathcal{H}_k}$
  • The span of $\{k(\cdot, y) : y \in \mathcal{H}\}$ is dense in $\mathcal{H}_k$

Every $f \in \mathcal{H}_k$ admits a representation $f(\cdot) = \sum_{i=1}^\infty c_i k(\cdot, x_i)$ (converging in $\mathcal{H}_k$), with norm $\|f\|_{\mathcal{H}_k}^2 = \sum_{i,j} c_i c_j k(x_i, x_j)$. The feature map $\Phi: x \mapsto k(\cdot, x)$ embeds $\mathcal{H}$ into $\mathcal{H}_k$ so that $k(x, y) = \langle \Phi(x), \Phi(y) \rangle_{\mathcal{H}_k}$.
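These properties can be checked numerically on a finite sample. The sketch below (sample size, dimension, and bandwidth are arbitrary illustrative choices) builds a Gaussian Gram matrix, confirms it is positive semi-definite, and evaluates the RKHS norm $\|f\|_{\mathcal{H}_k}^2 = c^\top K c$ of a finite kernel expansion:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gram matrix of k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
K = gaussian_kernel(X, X)

# Positive definiteness: every eigenvalue of the Gram matrix is >= 0
# (up to floating-point roundoff).
eigs = np.linalg.eigvalsh(K)
assert eigs.min() > -1e-10

# RKHS norm of f = sum_i c_i k(., x_i) is c^T K c, which is never negative.
c = rng.normal(size=50)
assert c @ K @ c >= 0
```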

2. Mercer and Hermite Expansions: Explicit Feature Maps

On $\mathbb{R}^d$, Mercer's theorem applies to the Gaussian kernel and gives an eigen-decomposition

$$k(x,y) = \sum_{n=0}^\infty \lambda_n \varphi_n(x)\, \varphi_n(y).$$

The expansion is explicitly realized in terms of Hermite polynomials. In the one-dimensional case, with standard normal weight $\mu_0(dx)$ and orthonormal Hermite functions $\psi_n(x)$, the kernel admits

$$\exp\left(-\sigma^2(x-y)^2\right) = \sum_{n=0}^\infty \lambda_n \varphi_n(x)\, \varphi_n(y)$$

where the $\varphi_n$ are scaled Hermite functions and $\lambda_n$ depends explicitly on $\sigma^2$ (Gnewuch et al., 2021). In higher dimensions, the eigenfunctions and eigenvalues tensorize across input coordinates.

The associated feature map is

$$\Phi(x) = \left(\sqrt{\lambda_\beta}\, \Phi_\beta(x)\right)_{\beta \in \mathbb{N}_0^s} \in \ell^2(\mathbb{N}_0^s)$$

where $\Phi_\beta(x)$ is the tensor product of univariate Hermite functions. For general (possibly infinite) $s$, the maximal domain consists of sequences $(x_j)$ with $\sum_j \sigma_j^2 x_j^2 < \infty$, and the RKHS is the incomplete tensor product of the univariate spaces (Gnewuch et al., 2021).
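The rapid eigenvalue decay behind this expansion is easy to observe empirically: eigendecomposing a Gaussian Gram matrix (a numerical stand-in for the Mercer operator, with arbitrary sample and bandwidth choices rather than the analytic Hermite construction) shows that a low-rank spectral truncation already reproduces the kernel to high accuracy.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(200, 1))
K = np.exp(-((x - x.T) ** 2) / 2.0)      # sigma = 1, illustrative

lam, U = np.linalg.eigh(K)               # ascending eigenvalues
lam, U = lam[::-1], U[:, ::-1]           # sort descending

r = 30                                   # truncation order
Phi = U[:, :r] * np.sqrt(np.clip(lam[:r], 0.0, None))  # 200 x r features

# Rank-30 features reproduce the full Gram matrix almost exactly,
# reflecting the near-geometric decay of the Gaussian kernel's spectrum.
err = np.abs(Phi @ Phi.T - K).max()
assert err < 1e-6
```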

3. Fourier (Bochner) Representations and Random Features

For shift-invariant kernels, Bochner's theorem yields a spectral representation

$$k(x,y) = \int_{\mathbb{R}^d} e^{i \omega \cdot (x-y)}\, \mu(d\omega).$$

For the Gaussian kernel, $\mu$ is a Gaussian measure with density $\mu(d\omega) = \left(\frac{\sigma^2}{2\pi}\right)^{d/2} e^{-\sigma^2 \|\omega\|^2/2}\, d\omega$ (Jorgensen et al., 2017). The (complex-valued) feature map is $\phi(x,\omega) = e^{i \omega \cdot x}$, and $\phi(x, \cdot)$ belongs to $L^2(\mu)$.

In practical applications, random Fourier features (RFF) approximate the Gaussian kernel by finite-dimensional real-valued feature maps (Ton et al., 2017). Drawing $w_i \sim \mathcal{N}(0,\sigma^{-2} I_d)$ and $b_i \sim \mathrm{Uniform}[0, 2\pi]$,

$$\phi(x) = \sqrt{\frac{2}{D}} \left( \cos(w_1^\top x + b_1), \ldots, \cos(w_D^\top x + b_D) \right)^\top$$

The inner products of the finite feature vectors converge to $k(x,y)$, with the approximation error (e.g., of the Gram matrix in spectral norm) decaying at rate $O(1/\sqrt{D})$ as $D\to\infty$ (Ton et al., 2017).
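A compact NumPy sketch of this construction (sample size, dimension, bandwidth, and number of features are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, D, sigma = 100, 3, 10_000, 1.5
X = rng.normal(size=(n, d))

# Spectral samples: w_i ~ N(0, sigma^{-2} I_d), b_i ~ Uniform[0, 2*pi]
W = rng.normal(scale=1.0 / sigma, size=(D, d))
b = rng.uniform(0.0, 2 * np.pi, size=D)
Phi = np.sqrt(2.0 / D) * np.cos(X @ W.T + b)   # n x D real feature matrix

# Compare the RFF inner products against the exact Gaussian Gram matrix
K_exact = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1) / (2 * sigma**2))
err = np.abs(Phi @ Phi.T - K_exact).max()
assert err < 0.1   # Monte Carlo error shrinks as O(1/sqrt(D))
```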

4. Polynomial and Maclaurin Expansions; Localized Feature Maps

Equivalently, the exponential can be expanded via its Maclaurin series (written here with amplitude $\sigma^2$ and lengthscale $\ell$),

$$k(x,y) = \sigma^2 \exp\left(-\frac{\|x\|^2 + \|y\|^2}{2\ell^2}\right) \sum_{n=0}^\infty \frac{(x\cdot y)^n}{n!\, \ell^{2n}}$$

yielding the "polynomial sketch" feature map, in which monomials are approximated by random projections (polynomial sketches) (Wacker et al., 2022). Localization, i.e., centering the feature map around the test point, cures the pathologies of naive Maclaurin truncation, yielding accurate, finite-dimensional local approximations that are especially effective in high-frequency (short-lengthscale) data regimes.
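In one dimension (dropping the $\sigma^2$ amplitude for simplicity), the series yields explicit features $\phi_n(x) = e^{-x^2/(2\ell^2)}\, x^n / (\ell^n \sqrt{n!})$; the sketch below (truncation order and evaluation points chosen arbitrarily) checks that their truncated inner product recovers the kernel near the origin, which is where naive truncation is accurate:

```python
import numpy as np
from math import factorial

def maclaurin_features(x, ell=1.0, order=20):
    """Truncated Maclaurin features phi_n(x) = exp(-x^2/(2 ell^2)) x^n / (ell^n sqrt(n!))."""
    n = np.arange(order)
    norm = np.sqrt(np.array([factorial(k) for k in n], dtype=float))
    return np.exp(-x**2 / (2 * ell**2)) * x**n / (ell**n * norm)

x, y, ell = 0.7, -0.4, 1.0
approx = maclaurin_features(x, ell) @ maclaurin_features(y, ell)
exact = np.exp(-(x - y) ** 2 / (2 * ell**2))
assert abs(approx - exact) < 1e-12   # accurate for points near the origin
```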

5. Finite-Dimensional Explicit Feature Maps for Finite Sets

For a finite dataset $\{x_1, \ldots, x_N\}$, an explicit, exact $N$-dimensional feature map can be constructed (Ghiasi-Shirazi et al., 2024). Defining the kernel matrix $K \in \mathbb{R}^{N \times N}$, its inverse square root $K^{-1/2}$, and $k_z = (k(x_1, z), \ldots, k(x_N, z))^\top$, one defines

$$\phi(z) = K^{-1/2} k_z$$

This map reproduces the kernel exactly between any pair $x, y$ where at least one argument is a training point: $\phi(x)^\top \phi(y) = k(x, y)$. The explicit feature representation enables kernel PCA and other linear methods to be implemented directly in the primal space.
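A minimal NumPy sketch of this construction (dataset, bandwidth, and sizes are illustrative; in practice a small ridge is often added to $K$ before inverting):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
sigma = 1.0

def k(A, B):
    return np.exp(-((A[:, None] - B[None, :]) ** 2).sum(-1) / (2 * sigma**2))

K = k(X, X)
lam, U = np.linalg.eigh(K)
lam = np.clip(lam, 1e-12, None)          # guard against roundoff negatives
K_inv_sqrt = U @ np.diag(lam ** -0.5) @ U.T

def phi(Z):
    """Explicit N-dimensional feature map phi(z) = K^{-1/2} k_z."""
    return (K_inv_sqrt @ k(X, Z)).T

# Exact reproduction whenever one argument is a training point:
P = phi(X)
assert np.allclose(P @ P.T, K, atol=1e-6)
```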

6. Fock and Segal–Bargmann Space Constructions

The Gaussian feature space can be realized as a Fock (Segal–Bargmann) space of entire functions $F: \mathbb{C}^d \to \mathbb{C}$,

$$\|F\|_{H_\sigma}^2 = \left(\frac{2}{\pi \sigma^2}\right)^d \int_{\mathbb{C}^d} |F(z)|^2\, e^{-2|z|^2 / \sigma^2}\, dA(z) < \infty$$

with an orthonormal basis of monomials and explicit reproducing kernel $K_F(z,w) = \exp(a\, z \cdot \bar{w})$, where $a = 2/\sigma^2$ and $\bar{w}$ denotes complex conjugation (Alpay et al., 2022). The Segal–Bargmann transform provides an isometric isomorphism between $L^2(\mathbb{R}^d)$ and $H_\sigma$. The Gaussian kernel is realized as an inner product in this space,

$$k_\sigma(x, y) = \langle \Phi_\sigma(x), \Phi_\sigma(y) \rangle_{H_\sigma}$$

with explicit Hermite-basis expansion.
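In the one-dimensional case, writing $a = 2/\sigma^2$ and $\bar{w}$ for complex conjugation, the monomial basis and the kernel expansion can be made fully explicit (a standard Fock-space computation, sketched here for concreteness):

$$e_n(z) = \sqrt{\frac{a^n}{n!}}\, z^n, \qquad K_F(z, w) = \sum_{n=0}^{\infty} e_n(z)\, \overline{e_n(w)} = \sum_{n=0}^{\infty} \frac{a^n (z\bar{w})^n}{n!} = \exp(a\, z \bar{w}),$$

so the reproducing property $F(w) = \langle F, K_F(\cdot, w) \rangle_{H_\sigma}$ follows term by term from the orthonormality of the $e_n$.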

7. Universality, ISPD, and Domain Extensions

The Gaussian kernel is universal, integrally strictly positive definite (ISPD), and $C_0$-universal on Hilbert spaces; $\mathcal{H}_k$ is dense in $C_0(\mathcal{H})$, ensuring that the feature-map embedding is rich enough to approximate all continuous functions vanishing at infinity (Guella, 2020). These properties ensure strong consistency and flexibility for statistical learning. Analogous constructions exist for Gaussian-type kernels on non-Euclidean domains, including hyperbolic spaces via Schoenberg's theorem and generalizations to conditionally negative definite (CND) kernels.
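Universality can be illustrated numerically: a Gaussian-kernel interpolant drives the maximum error on a smooth target essentially to zero (the target function, grid, bandwidth, and ridge level below are arbitrary illustrative choices):

```python
import numpy as np

x = np.linspace(-3, 3, 200)[:, None]
f = np.sin(3 * x[:, 0]) + 0.5 * np.cos(x[:, 0])   # generic continuous target

def k(A, B, s=0.5):
    return np.exp(-((A[:, None] - B[None, :]) ** 2).sum(-1) / (2 * s**2))

# Kernel interpolation with a tiny ridge for numerical stability
K = k(x, x)
alpha = np.linalg.solve(K + 1e-8 * np.eye(len(x)), f)
f_hat = K @ alpha

assert np.abs(f_hat - f).max() < 1e-2   # near-uniform approximation
```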

Table: Major Constructions of Gaussian Kernel Feature Spaces

| Construction Type | Space / Domain | Feature Map / Expansion |
| --- | --- | --- |
| Hermite/Mercer expansion | $\mathbb{R}^d$, $\mathcal{H}$ | Hermite polynomials via Mercer, $\ell^2$ |
| Bochner/Fourier representation | $\mathbb{R}^d$ | $L^2(\mu)$ (Fourier basis, RFF) |
| Polynomial/Maclaurin | $\mathbb{R}^d$ | Monomials/polynomial sketches, localized |
| Fock/Segal–Bargmann space | $\mathbb{C}^d$ | Entire holomorphic functions, Hermite basis |
| Explicit finite-dimensional | $N$-point subset | $N$-dimensional exact construction |

Each construction realizes the RKHS with different analytical and computational properties: the Hermite expansion gives orthonormal bases, the Bochner representation yields spectral sampling, polynomial/Maclaurin expansions facilitate sparse approximations, Fock spaces connect to complex analysis and quantum mechanics, and finite-dimensional explicit maps enable primal implementations over finite data sets.

