Deep Composite Polynomial Approximations
- Deep composite polynomial approximations are functions defined as compositions of low-degree polynomials (often with nonlinear mappings) that enable exponential convergence for nonsmooth or singular targets.
- They exploit a layered, compositional structure to achieve significant parameter efficiency and improved error bounds compared to traditional single-layer polynomial methods.
- These methods integrate analytic recursion schemes with adaptive, graph-based and homeomorphic parameterizations to effectively address high-dimensional regression, PDE surrogates, and manifold learning.
Deep composite polynomial approximations are a class of function approximation strategies wherein the target function is represented as a composition of several low-degree polynomials, often accompanied by auxiliary nonlinear transformations or weights. This approach generalizes classical polynomial approximation by exploiting the compositional structure, leading to substantially improved approximation rates, particularly for nonsmooth functions, singularities, or functions with complex geometric or topological features. Deep composite polynomial architectures form the foundation of both theoretical advancements in approximation theory and practical developments in high-dimensional scientific computing, machine learning, and PDE surrogate modeling.
1. Fundamental Definitions and Structures
A deep composite polynomial approximant is a function of the form

$$\Phi(x) = \big(p_L \circ p_{L-1} \circ \cdots \circ p_1\big)(x),$$

where each $p_\ell$ is a (typically low-degree) univariate or multivariate polynomial. In some frameworks, the composition may include non-polynomial bijections such as homeomorphisms or conformal maps. The overall degree of $\Phi$ is the product $d_1 d_2 \cdots d_L$ of the degrees of the individual layers. However, after normalization constraints (e.g., monic polynomials, affine-free layers), the total number of free parameters grows only linearly in the number of layers and their degrees, enabling highly efficient representations of complex mappings with nontrivial total degree (Yeon, 2 Mar 2025, Corral et al., 14 Dec 2025).
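To make the degree-versus-parameter tradeoff concrete, the following sketch (using numpy's polynomial module; the specific degrees and coefficients are illustrative, not taken from the cited works) composes three low-degree layers and checks that the total degree multiplies across layers while the stored coefficient count only adds:

```python
import numpy as np
from numpy.polynomial import Polynomial

# Three low-degree layers (degrees 2, 3, 2) with illustrative coefficients.
p1 = Polynomial([0.0, 1.0, 0.5])        # degree 2
p2 = Polynomial([0.1, -1.0, 0.0, 2.0])  # degree 3
p3 = Polynomial([0.0, 2.0, -1.0])       # degree 2

# Composition p3(p2(p1(x))): Polynomial.__call__ accepts a Polynomial
# argument, so this expands the composite symbolically.
composite = p3(p2(p1))

# Total degree multiplies across layers: 2 * 3 * 2 = 12 ...
print(composite.degree())               # 12

# ... while the number of stored layer coefficients only adds: 3 + 4 + 3 = 10,
# fewer than the 13 coefficients of a generic degree-12 polynomial.
n_layer_params = sum(len(p.coef) for p in (p1, p2, p3))
print(n_layer_params)                   # 10
```

The gap widens rapidly with depth: $L$ layers of degree $d$ store only $O(Ld)$ coefficients yet reach total degree $d^L$.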
In weighted or asymmetric settings, the model generalizes to

$$\Phi(x) = w(x)\,\big(p_L \circ p_{L-1} \circ \cdots \circ p_1\big)(x),$$

where $w$ is a continuous weight controlling growth or decay, whose tunable shape parameter is designed to match global target behavior, especially for functions with unbounded or one-sided features (Yeon et al., 26 Jun 2025).
Key architectural features in practice include:
- Depth (number of compositions, $L$)
- Per-layer degree sequence $(d_1, \dots, d_L)$
- Parameter count: for univariate layers and normalized composites, $N = O\big(\sum_{\ell=1}^{L} d_\ell\big)$ (Yeon, 2 Mar 2025)
- Optional nonlinear transformations: e.g., homeomorphisms or invertible neural networks as pre-processors (Corral et al., 14 Dec 2025)
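These ingredients can be exercised in a small end-to-end sketch: a depth-2 weighted composite fit by plain gradient descent, with finite-difference gradients standing in for the automatic differentiation used in practice (the target function, exponential weight form, layer degrees, and step-size scheme are all illustrative assumptions, not choices from the cited works):

```python
import numpy as np

rng = np.random.default_rng(0)

def model(theta, x):
    """Weighted depth-2 composite w(x) * p2(p1(x)): two quadratic
    layers and an illustrative exponential weight exp(a * x)."""
    a = theta[0]
    c1 = theta[1:4]  # coefficients of the inner quadratic p1
    c2 = theta[4:7]  # coefficients of the outer quadratic p2
    u = c1[0] + c1[1] * x + c1[2] * x**2
    v = c2[0] + c2[1] * u + c2[2] * u**2
    return np.exp(a * x) * v

def loss(theta, x, y):
    r = model(theta, x) - y
    return np.mean(r * r)

# Illustrative target with one-sided exponential growth.
x = np.linspace(-1.0, 1.0, 200)
y = np.exp(2.0 * x) * (1.0 + x**2)

theta = 0.1 * rng.standard_normal(7)
loss0 = loss(theta, x, y)
eps = 1e-6
for _ in range(500):
    # Central finite-difference gradient (autodiff stand-in).
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (loss(theta + d, x, y) - loss(theta - d, x, y)) / (2 * eps)
    # Backtracking line search: accept only loss-decreasing steps.
    step = 0.1
    while step > 1e-12:
        cand = theta - step * g
        if loss(cand, x, y) < loss(theta, x, y):
            theta = cand
            break
        step *= 0.5
print(loss0, "->", loss(theta, x, y))
```

Replacing the finite differences with reverse-mode automatic differentiation, and organizing the parameters on a graph, is the scalable version of this loop.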
2. Theoretical Approximation Results
Deep composite polynomial approximations fundamentally break the classical barrier imposed by unadorned polynomial approximation, namely the algebraic decay rates for the error provided by Jackson-type theorems. Several theoretical advances define the error bounds and approximation possibilities of the composite approach:
- Exponential convergence for nonsmooth functions: For functions such as $|x|$, $\sqrt{x}$, and more generally $x^{\alpha}$ with $0 < \alpha < 1$, explicit iterative constructions (such as Newton-type recursions) yield composite polynomials with uniform (or $L^2$) error decaying exponentially in the number of free parameters (Yeon, 2 Mar 2025, Yeon et al., 26 Jun 2025). A typical result is

$$\|f - \Phi_N\|_{\infty} \le C\,e^{-cN},$$

where $\Phi_N$ is a composite with $O(N)$ free parameters and total degree growing exponentially in $N$ (Yeon, 2 Mar 2025).
- Weighted and asymmetric targets: For functions exhibiting different asymptotic behaviors on the two sides of the domain (e.g., decay on one side and unbounded growth on the other), one-sided weighted deep polynomials can achieve exponential or root-exponential convergence, with the weight $w$ matched to the target's growth or decay (Yeon et al., 26 Jun 2025).
- Universal approximation via homeomorphisms: Any continuous univariate function with $n$ local extrema can be approximated arbitrarily well by a composition $p \circ h$, where $h$ is a homeomorphism and $p$ is a polynomial of degree $n+1$. This degree is both necessary and sufficient, and the construction is explicit via prescribed critical points and piecewise monotonic warping (Corral et al., 14 Dec 2025).
- Error estimates in weighted Sobolev spaces: Precise bounds are available for the approximation of composite functions in weighted Sobolev spaces, reflecting how errors propagate through layers and with composition depth. For deeply composed functions, the best-approximation error can be controlled in terms of products of Sobolev norms and Bell numbers associated with each layer's regularity; degree allocation across layers is critical for attaining a global error target (Fermo et al., 2023).
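A minimal worked instance of the homeomorphism factorization described above (my own illustrative example, not taken from the cited paper): $|x|$ has a single local extremum, so it should factor exactly through a degree-2 outer polynomial and a homeomorphism. Indeed $h(x) = \mathrm{sign}(x)\sqrt{|x|}$ is a homeomorphism of $[-1,1]$ and $p(y) = y^2$ gives $p(h(x)) = |x|$:

```python
import numpy as np

def h(x):
    """A homeomorphism of [-1, 1]: odd, continuous, strictly increasing."""
    return np.sign(x) * np.sqrt(np.abs(x))

def p(y):
    """Outer polynomial of degree 2 (the target has one local extremum)."""
    return y * y

x = np.linspace(-1.0, 1.0, 1001)
err = np.max(np.abs(p(h(x)) - np.abs(x)))
print(err)  # exact up to floating-point rounding
```

The warp $h$ absorbs the cusp at the origin, so the outer layer only needs the minimal degree dictated by the number of extrema.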
3. Methodologies and Algorithmic Implementations
The construction and optimization of deep composite polynomial approximants leverage both analytic recursion schemes and data-driven learning approaches:
- Division-free iterative schemes: Recursions, such as Newton-type iterations for fractional powers, yield explicit and numerically stable composite polynomials for certain singular targets (e.g., $\sqrt{x}$, $|x|$) (Yeon, 2 Mar 2025).
- Adaptive graph-based parameterization: For weighted deep polynomials, parameters are organized in a directed acyclic graph (DAG), each node corresponding to a layer or a weighted monomial. Optimization is performed using automatic differentiation and empirical loss minimization (e.g., via stochastic gradient descent), with backpropagation through the composition (Yeon et al., 26 Jun 2025).
- Homeomorphism parameterization via invertible neural networks: In high-dimensional or locally intricate settings, the warping function is represented by an iResNet (invertible ResNet), trained jointly with the outer polynomial to minimize regression or interpolation loss (Corral et al., 14 Dec 2025).
- Compositional networks for manifold learning: Deep polynomial decoders, built as tree-structured compositions, reconstruct low-dimensional PDE solution manifolds from samples. Adaptive algorithms balance accuracy, Lipschitz-continuity, and parameter efficiency (Bensalah et al., 7 Feb 2025).
- Lanczos-based composite surrogates: Tensorized quadrature and Lanczos iterations construct efficient approximations of compositions $f(g(x))$ using only a small number of evaluations of the expensive outer function $f$, yielding dramatic computational savings for high-dimensional or expensive forward models (Constantine et al., 2011).
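The core trick behind the Lanczos surrogate can be sketched with a generic Golub–Welsch construction on an empirical measure (in the spirit of the cited approach, but not its exact algorithm): run Lanczos on the diagonal matrix of inner-function samples to obtain a $k$-point Gauss quadrature for the distribution of $g(X)$, then estimate $\mathbb{E}[f(g(X))]$ with only $k$ evaluations of the expensive outer function $f$:

```python
import numpy as np

def lanczos_quadrature(gvals, k):
    """k-point Gauss quadrature (nodes, weights) for the empirical
    distribution of the samples gvals, via Lanczos + Golub-Welsch."""
    n = len(gvals)
    A = gvals  # diagonal of diag(gvals); its matvec is elementwise product
    q = np.full(n, 1.0 / np.sqrt(n))
    q_prev = np.zeros(n)
    beta = 0.0
    alphas, betas = [], []
    for _ in range(k):
        w = A * q - beta * q_prev
        alpha = q @ w
        w = w - alpha * q
        beta = np.linalg.norm(w)
        alphas.append(alpha)
        betas.append(beta)
        q_prev, q = q, w / beta
    # Jacobi matrix of the three-term recurrence.
    T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
    nodes, V = np.linalg.eigh(T)
    weights = V[0, :] ** 2  # squared first eigenvector components
    return nodes, weights

rng = np.random.default_rng(1)
g = np.tanh(rng.standard_normal(20_000))  # cheap inner-function samples
f = np.exp                                # stand-in for an expensive outer f

nodes, weights = lanczos_quadrature(g, 8)
surrogate = weights @ f(nodes)            # only 8 evaluations of f
direct = np.mean(f(g))                    # 20,000 evaluations of f
print(abs(surrogate - direct))
```

Because an 8-point Gauss rule integrates polynomials of degree up to 15 exactly against the empirical measure, smooth outer functions like `exp` are reproduced to near machine precision with three orders of magnitude fewer $f$-evaluations.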
4. Applications and Empirical Performance
Deep composite polynomial methods have demonstrated superior empirical performance across a range of scenarios compared to single-layer, high-degree polynomials:
- Singular and cusp-type functions: Division-free composite approximants achieve exponential accuracy for targets such as $|x|$, $\sqrt{x}$, and $x^{\alpha}$, where classical least-squares or Chebyshev polynomials are limited to algebraic rates (Yeon, 2 Mar 2025, Yeon et al., 26 Jun 2025).
- Functions with asymmetric growth/decay: Weighted deep polynomials provide accurate, uniformly convergent approximants even for targets growing unboundedly on one side, whereas traditional Chebyshev or Taylor polynomials fail for such domains (Yeon et al., 26 Jun 2025).
- High-dimensional regression and molecular modeling: Regression tasks and molecular potential-energy surface fitting with homeomorphically lifted polynomials achieve order-of-magnitude reductions in both basis-set size and error compared to direct polynomial fits. For example, quantum chemistry PES fitting on H₂S with a composite variable fit (degree 4, 35 basis functions) yields RMSE = 18.93 cm⁻¹, versus RMSE = 105.54 cm⁻¹ for a degree-18 direct polynomial with 1330 basis functions (Corral et al., 14 Dec 2025).
- Manifold approximation for PDE solution sets: Tree-structured compositional polynomials for low-dimensional manifolds in Hilbert spaces yield 1–3 orders of magnitude lower mean-squared errors than additive or quadratic decoders, with significant reductions in total degrees of freedom (Bensalah et al., 7 Feb 2025).
- Privacy-preserving deep nets: In privacy-preserving DNN inference using polynomial replacement of ReLU, deep composite quadratics outperform plain squaring in empirical test accuracy by 4–10.4 percentage points across multiple datasets and architectures (Ali et al., 2020).
- Polynomial approximation of Boolean circuits: Compositional/probabilistic polynomial frameworks achieve improved degree bounds for $\varepsilon$-error approximations of $\mathsf{AC}^0$ circuits, enabling optimal pseudorandom generators with tight dependence on the circuit size, depth, and error (Harsha et al., 2016).
5. Computational Complexity and Practical Tradeoffs
Deep composite polynomial approximations offer substantial gains in efficiency and stability:
| Method | Degrees of Freedom | Error Rate | Applicability |
|---|---|---|---|
| Single polynomial | $n$ coefficients | Algebraic: $O(n^{-s})$ | Smooth or mildly nonsmooth |
| Deep composite polynomial | $O\big(\sum_{\ell} d_\ell\big)$ | Exponential: $O(e^{-cN})$ | Nonsmooth, singular, high-complexity |
| Weighted deep polynomial | $O\big(\sum_{\ell} d_\ell\big)$ plus weight parameters | Exponential/uniform | Asymmetric, growing/decaying targets |
| Composed polynomial with homeomorphism | Degree-$(n{+}1)$ outer polynomial | Arbitrarily small | Functions with $n$ local extrema |
| Lanczos-based composite surrogate | $O(k)$ evals of $f$ | Gauss-quadrature accuracy | High-dimensional compositional models |
- For a fixed parameter budget $N$, composite polynomials can represent much higher overall degrees than a single degree-$N$ polynomial, leveraging the parameter efficiency of layered structures (Yeon, 2 Mar 2025).
- Conditioning is typically better in deep composites, as operations consist only of additions and multiplications, avoiding instability from division or high-degree monomials (Yeon et al., 26 Jun 2025).
- Graph-based optimization scales efficiently due to modular layerwise loss and sparsity (Yeon et al., 26 Jun 2025, Bensalah et al., 7 Feb 2025).
- Parameter allocation and degree splitting among layers are crucial; exponential convergence is attained when each stage's error contracts geometrically, motivating balanced choices of per-layer degree and depth depending on target regularity (Fermo et al., 2023, Yeon, 2 Mar 2025).
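The addition-and-multiplication-only structure behind the conditioning point above can be seen in the classical Heron/Visser-type recursion $p_{k+1}(t) = p_k(t) + \tfrac12\big(t - p_k(t)^2\big)$, a textbook division-free construction in the same spirit as the cited schemes (a sketch, not the papers' exact iteration). Each step composes a fixed quadratic update with the previous iterate, and the resulting composite polynomial converges to $\sqrt{t}$ on $[0,1]$ (hence to $|x|$ via $t = x^2$):

```python
import numpy as np

def visser_sqrt(t, iters):
    """Division-free composite polynomial approximation of sqrt(t) on [0, 1].

    Each iteration composes the quadratic update p + (t - p**2)/2 with the
    previous iterate, using only additions and multiplications.
    """
    p = np.zeros_like(t)
    for _ in range(iters):
        p = p + 0.5 * (t - p * p)
    return p

t = np.linspace(0.25, 1.0, 501)  # away from the singularity at t = 0
approx = visser_sqrt(t, 30)
err = np.max(np.abs(approx - np.sqrt(t)))
print(err)
```

Away from the singularity the error contracts geometrically per composition, while the iteration never divides or forms an explicit high-degree monomial, which is exactly the source of its favorable conditioning.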
6. Connections to Related Fields and Extensions
- Rational and conformal approximations: Classical rational approximations achieve root-exponential rates for certain singular functions via composition, but require divisions and are numerically ill-conditioned. Weighted deep polynomials recover these rates in a purely polynomial, division-free framework (Yeon et al., 26 Jun 2025).
- Neural network theory: The universality and expressivity properties of composite polynomial approximants are closely related to the theoretical underpinnings of deep neural networks, with polynomials serving as surrogates for activation or layerwise nonlinearities; for instance, entire ReLU networks can be replaced by depth-$L$ composites with polynomial activation, yielding a polynomial map whose degree grows exponentially with $L$ (Ali et al., 2020).
- Boolean function and circuit complexity: Deep/probabilistic polynomial approximations to $\mathsf{AC}^0$ circuits provide optimal or near-optimal degree and $k$-wise independence bounds for pseudorandomness and derandomization (Harsha et al., 2016).
7. Open Problems and Future Directions
Several challenges and avenues for further development remain:
- Rigorous characterization of exponential convergence for weighted deep composites in the most general nonsmooth/asymmetric settings; presently, sharp theorems are available primarily for division-free composite schemes in compact or analytic cases (Yeon, 2 Mar 2025, Yeon et al., 26 Jun 2025).
- Systematic weight design and regularization to optimize parameter efficiency and approximant uniqueness, particularly in multivariate and nonconvex regimes (Yeon et al., 26 Jun 2025).
- Extension to multivariate and non-Euclidean domains, with applications to manifold learning, PDE approximation, and scientific modeling (Bensalah et al., 7 Feb 2025, Corral et al., 14 Dec 2025).
- Automated parameter allocation across layers and links, guided by analysis of Sobolev regularity and best-approximation error propagation (Fermo et al., 2023).
- Compositional surrogates for high-dimensional scientific computation, further exploiting efficiency gains in reduced-order modeling, uncertainty quantification, and inverse problems (Constantine et al., 2011, Bensalah et al., 7 Feb 2025).
The deep composite polynomial framework thus unifies advances in approximation theory, computational efficiency, and practical data-driven modeling, offering robust strategies for the approximation and representation of complex functions well beyond the classical bounds of single-layer polynomials.