
Subsampled Randomized Hadamard Transform (SRHT)

Updated 11 November 2025
  • SRHT is a structured random projection method that combines randomized sign-flips, the fast Walsh–Hadamard transform, and uniform subsampling for efficient dimensionality reduction.
  • It provides nearly optimal subspace embeddings with faster geometric error decay than Gaussian projections, enhancing iterative solver performance.
  • SRHT achieves condition-number-free computational complexity and scales efficiently to large datasets, making it ideal for high-precision, distributed environments.

The Subsampled Randomized Hadamard Transform (SRHT) is a structured random projection technique that enables efficient dimensionality reduction, fast randomized matrix algorithms, and nearly optimal subspace embeddings for large-scale computational linear algebra and statistical learning. By combining a randomized diagonal sign-flip, a fast Walsh–Hadamard transform, and uniform subsampling, SRHT matches the embedding quality of dense Gaussian projections at a fraction of the computational cost. Recent advances have rigorously analyzed its spectral properties, convergence rates in iterative sketching, explicit polynomial acceleration schemes, and practical deployment in distributed and high-precision environments.

1. Formal Definition and Algorithmic Structure

Given input dimension $n = 2^p$, the SRHT is typically defined by the matrix

$$S = B\,H_n\,D\,P$$

where:

  • $H_n \in \mathbb{R}^{n \times n}$ is the normalized Walsh–Hadamard matrix, constructed by recursion:

$$H_1 = 1, \quad H_{2k} = \frac{1}{\sqrt{2}} \begin{pmatrix} H_k & H_k \\ H_k & -H_k \end{pmatrix}$$

  • $P$ is a random $n \times n$ permutation matrix.
  • $D = \text{diag}(d_i)$ has i.i.d. Rademacher signs ($d_i = \pm 1$).
  • $B = \text{diag}(b_i)$ is a diagonal sampling matrix with $b_i \sim \text{Bernoulli}(m/n)$.

The all-zero rows are discarded, resulting in a final sketch $S \in \mathbb{R}^{\tilde m \times n}$ whose row count $\tilde m$ is random, with $\tilde m \sim \text{Binomial}(n, m/n)$ and $\mathbb{E}[\tilde m] = m$.

Applying $S$ to a vector $x \in \mathbb{R}^n$ is performed via the fast Walsh–Hadamard transform in $O(n \log n)$ time, and hence in $O(nd \log n)$ time for a matrix $A \in \mathbb{R}^{n \times d}$. The transform randomizes and spreads the data energy across coordinates, so that no small set of rows dominates and uniform subsampling becomes effective.
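The construction $S = B\,H_n\,D\,P$ can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a tuned implementation: the helper names `fwht` and `srht_sketch` are hypothetical, the transform is a plain iterative butterfly, and the sketch is applied to the rows of a data matrix `A` whose row count is a power of two.

```python
import numpy as np


def fwht(x):
    """Normalized fast Walsh-Hadamard transform along axis 0.

    The leading dimension must be a power of two. The ordering matches the
    recursion H_{2k} = [[H_k, H_k], [H_k, -H_k]] / sqrt(2).
    """
    x = np.array(x, dtype=float)  # copies the input
    shape, n = x.shape, x.shape[0]
    if n == 0 or n & (n - 1):
        raise ValueError("leading dimension must be a power of two")
    h = 1
    while h < n:
        # Pair entries j and j + h inside each block of length 2h.
        x = x.reshape(n // (2 * h), 2, h, -1)
        x = np.stack([x[:, 0] + x[:, 1], x[:, 0] - x[:, 1]], axis=1)
        h *= 2
    return x.reshape(shape) / np.sqrt(n)


def srht_sketch(A, m, rng):
    """Return S @ A for S = B H D P, with the all-zero rows discarded."""
    n = A.shape[0]
    perm = rng.permutation(n)                # P: random row permutation
    signs = rng.choice([-1.0, 1.0], size=n)  # D: Rademacher signs
    keep = rng.random(n) < m / n             # B: Bernoulli(m/n) row sampling
    return fwht(signs[:, None] * A[perm])[keep]


rng = np.random.default_rng(0)
A = rng.standard_normal((64, 5))
SA = srht_sketch(A, m=16, rng=rng)
print(SA.shape)  # the row count is random, with mean 16
```

Because $H_n D P$ is orthogonal and $B$ only selects rows, the sketch never increases the Frobenius norm of `A`. No rescaling by $\sqrt{n/m}$ is applied here, which matches the definition above but differs from some other SRHT conventions.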

2. Limiting Spectral Distribution and Random Matrix Theory

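Let $U \in \mathbb{R}^{n \times d}$ have orthonormal columns (for instance, the left singular vectors of the data matrix), and consider $C_S = U^\top S^\top S\,U$. Under the proportional asymptotic regime,

$$n, d, m \to \infty, \qquad d/n \to \gamma \in (0, 1), \qquad m/n \to \xi \in (\gamma, 1),$$

the empirical spectral distribution of $C_S$ converges to a deterministic law $F_h$ supported in $[0, 1]$, with density

$$f_h(x) = \frac{1}{2\pi\gamma}\,\frac{\sqrt{(\Lambda_h - x)_+\,(x - \lambda_h)_+}}{x\,(1 - x)},$$

where

$$\lambda_h = \left(\sqrt{(1-\gamma)\,\xi} - \sqrt{(1-\xi)\,\gamma}\right)^2, \qquad \Lambda_h = \left(\sqrt{(1-\gamma)\,\xi} + \sqrt{(1-\xi)\,\gamma}\right)^2.$$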

This explicit description enables precise analysis of the preconditioned Hessian $H_S^{-1} A^\top A$, where $A$ is the data matrix and $H_S = A^\top S^\top S A$ is its sketched Hessian; this analysis is critical to accelerated first-order optimization. The edge eigenvalues control the numerical stability and error contraction.
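A quick numerical check of the limiting law is sketched below. It assumes the density takes the form $f_h(x) = \sqrt{(\Lambda_h - x)(x - \lambda_h)} / (2\pi\gamma\,x(1-x))$ on $[\lambda_h, \Lambda_h]$, with edges $(\sqrt{(1-\gamma)\xi} \mp \sqrt{(1-\xi)\gamma})^2$, where $\gamma$ and $\xi$ are the limits of $d/n$ and $m/n$. The function names are hypothetical, and the example uses $\gamma + \xi \le 1$ so that all of the mass is carried by the density.

```python
import numpy as np


def srht_edges(gamma, xi):
    """Assumed edges (lambda_h, Lambda_h) of the limiting SRHT spectrum."""
    a = np.sqrt((1 - gamma) * xi)
    b = np.sqrt((1 - xi) * gamma)
    return (a - b) ** 2, (a + b) ** 2


def srht_density(x, gamma, xi):
    """Assumed limiting density f_h, zero outside [lambda_h, Lambda_h]."""
    lo, hi = srht_edges(gamma, xi)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    inside = (x > lo) & (x < hi)
    xs = x[inside]
    out[inside] = np.sqrt((hi - xs) * (xs - lo)) / (2 * np.pi * gamma * xs * (1 - xs))
    return out


gamma, xi = 0.1, 0.5
lo, hi = srht_edges(gamma, xi)

# Midpoint rule on [lo, hi]; the integrand vanishes like a square root at both edges.
N = 200_000
grid = lo + (hi - lo) * (np.arange(N) + 0.5) / N
mass = srht_density(grid, gamma, xi).sum() * (hi - lo) / N
print(lo, hi, mass)
```

The edges shrink toward each other as $\xi$ grows for fixed $\gamma$, which is the mechanism behind the faster contraction discussed in the later sections.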

3. Polynomial Accelerators and Orthogonal Polynomial Recurrences

For accelerated iterative methods, one constructs normalized orthogonal polynomials with respect to the SRHT spectral measure. Key steps:

  • Define the auxiliary parameters from the limiting aspect ratios and spectral edges of Section 2.
  • The standard orthogonal polynomials for the Marchenko–Pastur law satisfy a three-term recurrence.
  • The optimal polynomials for the SRHT-weighted measure are obtained from a corresponding recurrence and then normalized.

These polynomials realize optimal error decay and are rescaled for use in practical acceleration. The explicit recurrences are given by Lacotte et al. (2020).

4. Optimal First-Order Method Construction

For least-squares minimization

$$x^\star = \arg\min_{x \in \mathbb{R}^d} f(x), \qquad f(x) = \tfrac{1}{2}\,\|Ax - b\|_2^2,$$

one utilizes preconditioned Heavy-Ball style updates:

$$x_{t+1} = x_t - a_t\,H_S^{-1}\,\nabla f(x_t) + b_t\,(x_t - x_{t-1}),$$

where $H_S = A^\top S^\top S A$, and the step sizes $a_t$ and momentum coefficients $b_t$ are extracted from the recurrence and normalization of the SRHT polynomial sequence.

The iterates achieve an asymptotic error decay rate determined by the SRHT spectrum and expressed through the normalized polynomial of the sequence. Full formulas for the update coefficients are provided as functions of the limiting sketch parameters (Lacotte et al., 2020).
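The following sketch illustrates a sketch-preconditioned Heavy-Ball iteration. It is not the optimal method described above: instead of the time-varying coefficients derived from the SRHT polynomials, it swaps in the classical constant-parameter Polyak Heavy-Ball rule, tuned to assumed asymptotic spectral edges $(\sqrt{(1-\gamma)\xi} \mp \sqrt{(1-\xi)\gamma})^2$ with $\gamma = d/n$ and $\xi = m/n$. All function names are hypothetical.

```python
import numpy as np


def fwht(x):
    """Normalized fast Walsh-Hadamard transform along axis 0 (length 2^p)."""
    x = np.array(x, dtype=float)
    shape, n = x.shape, x.shape[0]
    if n == 0 or n & (n - 1):
        raise ValueError("leading dimension must be a power of two")
    h = 1
    while h < n:
        x = x.reshape(n // (2 * h), 2, h, -1)
        x = np.stack([x[:, 0] + x[:, 1], x[:, 0] - x[:, 1]], axis=1)
        h *= 2
    return x.reshape(shape) / np.sqrt(n)


def srht_sketch(A, m, rng):
    """Return S @ A for S = B H D P, dropping the all-zero rows."""
    n = A.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)
    keep = rng.random(n) < m / n
    return fwht(signs[:, None] * A[rng.permutation(n)])[keep]


def heavy_ball_lstsq(A, b, m, iters, rng):
    """Constant-parameter Heavy-Ball preconditioned by H_S = (SA)^T (SA)."""
    n, d = A.shape
    gamma, xi = d / n, m / n
    # Assumed asymptotic edges of the spectrum of U^T S^T S U.
    lo = (np.sqrt((1 - gamma) * xi) - np.sqrt((1 - xi) * gamma)) ** 2
    hi = (np.sqrt((1 - gamma) * xi) + np.sqrt((1 - xi) * gamma)) ** 2
    # H_S^{-1} A^T A then has eigenvalues roughly in [1/hi, 1/lo].
    mu, L = 1.0 / hi, 1.0 / lo
    step = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
    momentum = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

    SA = srht_sketch(A, m, rng)
    chol = np.linalg.cholesky(SA.T @ SA)  # one-time factorization of H_S
    x_prev = x = np.zeros(d)
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        # Two solves with the Cholesky factor apply H_S^{-1}; a dedicated
        # triangular solver would be cheaper than the general solve used here.
        direction = np.linalg.solve(chol.T, np.linalg.solve(chol, grad))
        x, x_prev = x - step * direction + momentum * (x - x_prev), x
    return x


rng = np.random.default_rng(0)
A = rng.standard_normal((2048, 64))
b = A @ rng.standard_normal(64) + 0.1 * rng.standard_normal(2048)
x_hb = heavy_ball_lstsq(A, b, m=1024, iters=40, rng=rng)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(x_hb - x_ls) / np.linalg.norm(x_ls))
```

With these constant parameters the squared error contracts asymptotically by about $\left((\sqrt{\Lambda_h} - \sqrt{\lambda_h})/(\sqrt{\Lambda_h} + \sqrt{\lambda_h})\right)^2$ per iteration. In finite dimensions the eigenvalues can fall slightly outside the asymptotic edges, which slows the iteration but does not by itself break convergence.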

5. Comparative Convergence Analysis: SRHT vs Gaussian Sketches

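Classical Gaussian projections yield contraction rate

$$\rho_g = \frac{\gamma}{\xi}$$

under analogous asymptotics, where $\gamma = \lim d/n$ and $\xi = \lim m/n$. For SRHT,

$$\rho_h = \rho_g \cdot \frac{1 - \xi}{1 - \gamma}.$$

Given $\xi > \gamma$ (i.e., sketch size exceeds data dimension), the factor $(1-\xi)/(1-\gamma)$ is strictly less than one, so SRHT delivers strictly faster geometric error decay than Gaussian sketching.

Method     | Asymptotic contraction rate
Gaussian   | $\rho_g = \gamma/\xi$
SRHT/Haar  | $\rho_h = \rho_g\,(1-\xi)/(1-\gamma)$

Equivalently, for a fixed sketch-size ratio $\xi \in (\gamma, 1)$, SRHT always beats Gaussian in convergence rate.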
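To make the comparison concrete, the snippet below evaluates both rates for a few sketch-size ratios. It assumes the rates take the form $\rho_g = \gamma/\xi$ and $\rho_h = \rho_g\,(1-\xi)/(1-\gamma)$, with $\gamma = d/n$ and $\xi = m/n$; both forms follow from the classical Heavy-Ball rate applied to the respective spectral edges. The function names are hypothetical.

```python
def _check(gamma, xi):
    # The rates are only meaningful when the sketch is larger than the data dimension.
    if not 0 < gamma < xi < 1:
        raise ValueError("require 0 < gamma < xi < 1")


def gaussian_rate(gamma, xi):
    """Assumed asymptotic contraction rate for Gaussian sketches."""
    _check(gamma, xi)
    return gamma / xi


def srht_rate(gamma, xi):
    """Assumed asymptotic contraction rate for SRHT/Haar sketches."""
    _check(gamma, xi)
    return (gamma / xi) * (1 - xi) / (1 - gamma)


gamma = 0.1
for xi in (0.2, 0.5, 0.8):
    print(xi, gaussian_rate(gamma, xi), srht_rate(gamma, xi))
```

The gap widens as $\xi$ approaches one: the Gaussian rate tends to $\gamma$ while the SRHT rate tends to zero, reflecting that the full orthogonal transform preserves the subspace exactly.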

6. Computational Complexity, Scaling, and Condition Number Independence

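For accuracy $\varepsilon$, the iteration count is

$$T = O\!\left(\frac{\log(1/\varepsilon)}{\log(1/\rho_h)}\right),$$

where $\rho_h$ is the SRHT contraction rate. Per iteration cost is $O(nd)$ for matrix-vector products. The one-time formation and factorization of $H_S = A^\top S^\top S A$ costs $O(md^2)$. Constructing $SA$ costs $O(nd \log n)$.

Thus,

$$\text{total cost} = O\!\left(nd \log n + md^2 + nd\,\frac{\log(1/\varepsilon)}{\log(1/\rho_h)}\right).$$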

Optimizing this bound over the sketch size $m$ gives the final complexity reported by Lacotte et al. (2020).

Notably, the complexity is independent of $\kappa$, the condition number of the data matrix. Compared to randomized preconditioned conjugate gradient (PCG), the SRHT-based method attains a lower complexity in the regime analyzed there.
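The iteration count implied by a geometric contraction rate can be computed directly. The sketch below assumes that the error ratio shrinks by a factor $\rho$ per iteration and that $\varepsilon$ bounds that same ratio, so that $\rho^T \le \varepsilon$ gives $T = \lceil \log(1/\varepsilon) / \log(1/\rho) \rceil$. The function name is hypothetical, and the count ignores constants and lower-order terms.

```python
import math


def iterations_needed(eps, rho):
    """Smallest integer T with rho**T <= eps, for 0 < eps, rho < 1."""
    if not (0 < eps < 1 and 0 < rho < 1):
        raise ValueError("require 0 < eps < 1 and 0 < rho < 1")
    return math.ceil(math.log(1 / eps) / math.log(1 / rho))


# Rates for gamma = 0.1, xi = 0.5 under the assumed formulas
# rho_g = gamma / xi and rho_h = rho_g * (1 - xi) / (1 - gamma).
rho_g = 0.1 / 0.5
rho_h = rho_g * (1 - 0.5) / (1 - 0.1)
print(iterations_needed(1e-10, rho_g), iterations_needed(1e-10, rho_h))
```

The count depends only on $\varepsilon$ and the sketch aspect ratios, not on the conditioning of the data matrix, which is the sense in which the complexity is condition-number-free.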

7. Practical Impact and Implementation Considerations

The SRHT embedding supplies:

  • Explicit limiting spectral distributions for accurate algorithm design.
  • Closed-form optimal polynomial accelerators for Heavy-Ball style iterative methods.
  • Provably faster convergence for sketch-based solvers than Gaussian analogues.
  • Fast matrix-vector products leveraging the FWHT, scaling to massive datasets.
  • Full independence from the conditioning of the data matrix, enabling robust randomized solvers.

Empirically, SRHT-based solvers substantially outperform Gaussian-sketch and PCG solvers when computational or memory constraints preclude dense embeddings, especially on large problems. SRHT should be preferred for large-scale least-squares problems and in settings where the spectral properties of the sketch affect the variance of a statistical estimator (Lacotte et al., 2020).

8. Contextual Significance and Future Directions

Recent random matrix theory results, particularly the derivation of limiting SRHT spectra and the polynomials that arise from them, have enabled both practical deployment and theoretical guarantees for fast, robust linear solvers. The optimal Heavy-Ball algorithm constructed with SRHT embedding provides condition-number-free complexity, which, up to logarithmic factors, is the best known in contemporary randomized numerical linear algebra (Lacotte et al., 2020). Extensions to block-wise distributed architectures, streaming algorithms, and mixed precision computations (e.g., RHQR with SRHT) further demonstrate its versatility across computational environments.

References (1)
