Discrete Convolutions
- Discrete convolution is a bilinear operator on discrete domains that systematically combines signals while preserving translation equivariance.
- Computational approaches leverage FFT and optimized algorithms to reduce complexity from O(N²) to O(N log N), enhancing high-throughput processing.
- Generalizations extend discrete convolutions to adaptive grids, group structures, and spectral methods, expanding applications in machine learning and additive number theory.
A discrete convolution is a bilinear operator acting on sequences, functions on finite sets, or more broadly, objects indexed by discrete domains. It forms the backbone of digital signal processing, probability, numerical methods, combinatorics, and modern machine learning. The algebraic and algorithmic properties of discrete convolution are tightly linked to harmonic analysis, computational group theory, and functional analysis, exhibiting rich interactions with fast transforms (FFT, FWT), sparsity, algebraic number theory, operator theory, and adaptive representations.
1. Algebraic Definition and Structural Theory
Let $a = (a_k)$ and $b = (b_k)$ be sequences indexed by $\mathbb{Z}$ (or, more generally, $\mathbb{Z}^d$ or any discrete group), with $b$ finitely supported. The (linear) discrete convolution is defined as
$$(a * b)_n = \sum_{k} a_k \, b_{n-k},$$
with natural adaptations to finite abelian groups or finite intervals. The operator $a \mapsto a * b$ for a fixed mask $b$ is translation equivariant and admits a representation as a difference operator whose symbol is the Laurent polynomial $b^*(z) = \sum_k b_k z^k$. The kernel of a discrete convolution operator consists precisely of exponential polynomials of the form $n \mapsto p(n)\,\theta^n$, where $p$ is a polynomial and $\theta$ a root (possibly of higher multiplicity) of the associated symbol. This structure generalizes to zero-dimensional ideals generated by multiple masks in several variables: the kernel is a direct sum of spaces of exponential-polynomial sequences, with each summand corresponding to a distinct common zero of the symbols and with polynomial spaces determined by multiplicity theory as in Gröbner's work and Hermite interpolation (Sauer, 2014).
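The definition and the kernel structure can be checked numerically. The sketch below (a minimal NumPy illustration, with arbitrary example data) convolves with the first-difference mask $b = (1, -1)$, whose symbol $1 - z$ has the single root $\theta = 1$, so constant sequences lie in the operator's kernel:

```python
import numpy as np

# Direct discrete convolution (a * b)_n = sum_k a_k b_{n-k} with a finitely
# supported mask b; np.convolve computes exactly this sum for finite inputs.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.0, -1.0])            # symbol b*(z) = 1 - z, root theta = 1

c = np.convolve(a, b)                # full-mode linear convolution

# Translation equivariance: delaying the input delays the output identically.
c_shifted = np.convolve(np.concatenate([[0.0], a]), b)
assert np.allclose(c_shifted[1:], c)

# Kernel structure: constant sequences (exponential polynomials with
# theta = 1, p constant) are annihilated away from the boundary.
interior = np.convolve(np.ones(6), b, mode="valid")
assert np.allclose(interior, 0.0)
```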
2. Computational Approaches and FFT-Based Algorithms
Direct computation of an $N$-point convolution requires $O(N^2)$ operations. However, the convolution theorem states that for periodic (circular) convolution,
$$\widehat{a * b} = \hat{a} \cdot \hat{b},$$
and thus
$$a * b = \mathcal{F}^{-1}\bigl(\mathcal{F}(a) \cdot \mathcal{F}(b)\bigr).$$
The Fast Fourier Transform (FFT) reduces the computation to $O(N \log N)$ for $N$ a power of two. For purely discrete, lattice-supported distributions, zero-padding to a length at least the sum of the supports prevents aliasing artifacts. For continuous distributions, convolution can be performed via discretization on a linear lattice followed by FFT, smoothing, and renormalization. This approach, as implemented in the R package "distr," achieves near-machine-precision error for lattice distributions and small, controlled discretization error in continuous approximations (Ruckdeschel et al., 2010).
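A minimal NumPy sketch of the zero-padded FFT convolution just described (padding to at least $\mathrm{len}(a) + \mathrm{len}(b) - 1$ is what prevents aliasing; rounding up to a power of two is only for FFT speed):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(100)
b = rng.standard_normal(30)

# Linear convolution via the circular convolution theorem: zero-pad both
# inputs so wrap-around (aliasing) cannot corrupt the result, then multiply
# pointwise in the frequency domain.
n = len(a) + len(b) - 1            # minimum alias-free length
N = 1 << (n - 1).bit_length()      # next power of two for fast transforms

c_fft = np.fft.irfft(np.fft.rfft(a, N) * np.fft.rfft(b, N), N)[:n]
c_direct = np.convolve(a, b)       # O(N^2) reference

assert np.allclose(c_fft, c_direct)   # agreement to machine precision
```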
Permutation-avoiding FFT convolution kernels further optimize performance by eliminating costly index-reversal permutations, which is especially beneficial when repeatedly convolving with a fixed filter. This yields significant memory-bound speedups in 1D and more modest improvements in higher dimensions, primarily by deferring index permutations to an offline pre-processing step on the filter, so that only the high-arithmetic-intensity butterfly operations are executed for each new input (Venkovic et al., 15 Jun 2025).
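The offline/online split at the heart of this scheme can be sketched in plain NumPy. The class below (a hypothetical illustration, not the cited kernels, which additionally restructure the FFT butterflies themselves) precomputes the fixed filter's transform once and reuses it for every new input:

```python
import numpy as np

class FixedFilterConvolver:
    """Amortize repeated convolution with one fixed filter: the filter's
    FFT is computed once, offline; each new input then costs only two
    transforms and a pointwise multiply.  (Simplified sketch of the
    offline/online split; bit-reversal avoidance is not modeled here.)"""

    def __init__(self, filt, max_input_len):
        self.flen = len(filt)
        # Pad so any input up to max_input_len convolves without aliasing.
        self.N = 1 << (max_input_len + self.flen - 2).bit_length()
        self.filt_hat = np.fft.rfft(filt, self.N)   # offline pre-processing

    def convolve(self, x):
        n = len(x) + self.flen - 1
        return np.fft.irfft(np.fft.rfft(x, self.N) * self.filt_hat,
                            self.N)[:n]

conv = FixedFilterConvolver(np.array([0.25, 0.5, 0.25]), max_input_len=64)
y = conv.convolve(np.ones(8))       # smoothing filter applied to a constant
```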
3. Generalizations and Alternative Discrete Transforms
Discrete convolution naturally extends to more general algebra settings, including:
- Krawtchouk Transforms: Define an efficient operator calculus over the binomial space, using Krawtchouk polynomials and their orthogonal transform. The associated discrete convolution in this basis is performed via a transform–multiply–inverse scheme. The convolution itself is defined by explicit multinomial sums and retains commutativity, associativity, and invertibility. This approach is especially suitable for finite-support problems with binomial boundary conditions (e.g., discrete image moments, random walk models) (Feinsilver et al., 2014).
- Affine Discrete Fractional Fourier Transform (ADFrFT): Generalizes the DFT convolution-multiplication property to a circular convolution for the ADFrFT, with additional phase and chirp pre-compensation terms. At the fractional order corresponding to the classical DFT, the standard convolution theorem is recovered; at other orders, the correspondence incorporates a quadratic phase modulation, vital for digital communications and time-frequency signal analysis (Nafchi et al., 2020).
- Walsh–Hadamard and Polynomial Methods: Fast algorithms for the Walsh (XOR) and ordinary (shift) convolutions depend on length reduction for sparse inputs. Modulo hashing works for shift convolution but fails for XOR; polynomial-evaluation-based hashing enables efficient sparse convolution in both settings by mapping nonzero entries to random field locations according to their polynomial index, then detecting and resolving collisions (Amir et al., 2014).
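For the dense XOR case that these sparse methods reduce to, the workhorse is the fast Walsh–Hadamard transform, under which XOR convolution becomes pointwise multiplication. A minimal sketch:

```python
import numpy as np

def fwht(a):
    """Fast Walsh-Hadamard transform of a length-2^m array, O(N log N)."""
    a = a.copy()
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            x, y = a[i:i + h].copy(), a[i + h:i + 2 * h].copy()
            a[i:i + h], a[i + h:i + 2 * h] = x + y, x - y
        h *= 2
    return a

def xor_convolve(a, b):
    """XOR (dyadic) convolution: c[k] = sum over i ^ j == k of a[i] * b[j]."""
    N = len(a)
    return fwht(fwht(a) * fwht(b)) / N    # inverse FWHT = FWHT scaled by 1/N

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.0, 0.0, 1.0, 0.0])
c = xor_convolve(a, b)                    # c[k] = a[k] + a[k ^ 2]
```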
4. Discrete Convolutions in Adaptive and Structured Domains
Discrete convolutions extend beyond uniform grids to adaptive representations:
- Adaptive Particle Representation (APR): For content-adaptive, multiresolution image representations, convolution is defined natively over irregular, particle-based grids. Stencils are adapted to the local resolution, with data structures enabling efficient patch assembly and convolution by parallel iteration over particle indices. GPU and multi-core parallelization via row-wise domain decomposition yield throughput up to $1$ TB/s and memory reductions by factors of $10$–$300$, significantly outpacing traditional uniform-grid convolutions for large and sparse images (Jonsson et al., 2021).
- Rotationally-Invariant Settings via Discrete Spectral Methods: On domains like disks in $\mathbb{R}^2$, convolution can be structured using discrete Sturm-Liouville eigenfunction expansions (Fourier–Bessel series). Here, convolution coefficients are computed in the frequency domain, with the radial discretization enforced through the zeros of Bessel functions (for Dirichlet or Neumann conditions). This approach achieves efficient and mathematically precise convolutions for functions with circular support, useful for rotation-invariant feature extraction and analysis (Farashahi et al., 2019).
5. Analytical Properties and Norm Inequalities
Sharp operator inequalities for convolutions are central in analysis, probability, and combinatorics. For tuples of functions on a discrete domain, the optimal constant in the associated convolution norm inequality has been computed exactly for all admissible exponents and numbers of factors, with sharpness achieved when the functions are suitably structured product functions. This yields exact Sidon set bounds in the hypercube and informs continuous autoconvolution bounds (Gaitan et al., 20 Dec 2025).
In additive number theory, discrete convolution of indicator functions models representation functions (e.g., the number of ways to write $n$ as a sum of two elements from a set $A$). Erdős–Fuchs type theorems show that no set $A$ can achieve pointwise or cumulative approximation of a given smooth model convolution with error smaller than the square root of the main term (up to logarithmic factors), a direct consequence of central-limit fluctuation phenomena in additive problems (Sándor, 2020).
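The representation function is literally the autoconvolution of the set's indicator vector, as the short NumPy sketch below illustrates (the set $A$ here is an arbitrary example):

```python
import numpy as np

# Representation function of a set A in {0, ..., N-1}: r[n] counts ordered
# pairs (x, y) in A x A with x + y = n, i.e. the autoconvolution of the
# indicator vector of A.
N = 10
A = {1, 3, 4, 9}
ind = np.zeros(N)
ind[list(A)] = 1.0

r = np.convolve(ind, ind)     # r[n] = #{(x, y) in A^2 : x + y = n}

# Sanity checks: 4 = 1 + 3 = 3 + 1 (two ordered pairs), 2 = 1 + 1 (one),
# and the total count over all n is |A|^2.
assert r[4] == 2 and r[2] == 1 and r.sum() == len(A) ** 2
```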
6. Discrete Scale and Group-Convolution Generalizations
Recent extensions focus on equivariant convolutions over symmetry groups:
- Scale-Equivariant Discrete Convolution (DISCO): For convolutional neural networks with discrete scale-equivariance, the discrete scale convolution operator ensures equivariance precisely for integer scaling factors via dilated kernels, and provides optimal approximations for non-integer scaling by minimizing equivariance error. In practical networks, this yields significant reductions in both error and computation compared to continuous rescaling, and improved classification and tracking performance (Sosnovik et al., 2021).
- The discrete framework supports further extensions to any group structure where convolution and equivariance constraints are relevant.
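The kernel dilation underlying exact integer scale factors can be sketched in a few lines of NumPy; this hypothetical helper shows only the dilation step, not the paper's optimal approximation for non-integer scales:

```python
import numpy as np

def dilate_kernel(k, s):
    """Dilate a 1D kernel by integer factor s: insert s - 1 zeros between
    taps.  Convolving a signal with the dilated kernel reproduces, on the
    s-strided samples, the action of the original kernel on the s-fold
    downsampled signal, which is the basis for exact integer
    scale-equivariance in DISCO-style layers."""
    out = np.zeros((len(k) - 1) * s + 1)
    out[::s] = k
    return out

k = np.array([1.0, 2.0, 1.0])
k2 = dilate_kernel(k, 2)           # taps spread to [1, 0, 2, 0, 1]
```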
7. Implementation Paradigms and Applications
Object-oriented frameworks (e.g., S4 classes in the "distr" R package) encapsulate discrete convolution logic, enabling overloading and generic convolution arithmetic over both discrete and continuous distributions. Platform-specific optimizations—such as permutation-avoiding FFTs—are critical for high-throughput applications. Discrete convolution remains essential in probability (convolution of measures), signal processing, fast numerical linear algebra, combinatorial design, adaptive image analysis, and the effective deployment of equivariant neural architectures (Ruckdeschel et al., 2010, Venkovic et al., 15 Jun 2025, Jonsson et al., 2021, Sosnovik et al., 2021).
Collectively, the theoretical, algorithmic, and computational innovations in discrete convolution underpin much of modern digital and statistical processing, with ongoing expansions into group-theoretic, adaptive, and numeric domains. The highly optimized computational frameworks and sharp analytical inequalities established in this literature provide the technical foundation for further advancements across both classical and emerging applications.