Solving Linear-Quadratic Stochastic Control Problems with Signatures

Published 26 Feb 2026 in math.OC | (2602.23473v1)

Abstract: We study a signature-driven numerical scheme to solve multi-dimensional linear-quadratic (LQ) stochastic control problems. Using that linear signature functionals are dense in the natural class of admissible controls, we show that our approach turns the original LQ problem into a deterministic convex quadratic polynomial optimisation. To underpin a numerical approach based on truncated signatures, we prove that the problem's value function can be approximated by finite-dimensional polynomial approximations when the truncation levels are chosen sufficiently high. Remarkably, our numerical experiments show very decent accuracy already for small truncation levels. Key tools for our analysis are (i) the algebraic representation of controlled stochastic differential equations and the associated cost function as linear functionals of the path signatures of the driving noise, (ii) the convergence of the truncated linear functionals, and (iii) the density of signature controls.


Summary

The paper presents a novel method using path signatures to convert multi-dimensional LQ stochastic control problems into deterministic convex quadratic optimization tasks.
It rigorously proves the density and universality of linear signature functionals in approximating admissible controls, ensuring consistent convergence as truncation levels increase.
Numerical experiments under both Brownian and fractional noise validate the approach, demonstrating rapid convergence and strong performance even with low truncation orders.

Signature-Based Approaches for Linear-Quadratic Stochastic Control

Introduction and Problem Context

This paper introduces a numerical method for solving multi-dimensional linear-quadratic (LQ) stochastic control problems in which admissible controls are parametrized by path signatures (2602.23473). LQ stochastic control is a classical problem class with wide-ranging applications; it is usually solved through an HJB PDE under dynamic programming or a backward SDE under the Pontryagin framework. These approaches are theoretically well understood but become intractable in high-dimensional or non-Markovian regimes. The signature-based method sidesteps these limitations by parametrizing the control directly as a linear functional of the time-augmented path signature of the driving noise.

The central theoretical contribution is that linear signature functionals are dense in the space of admissible controls. Restricting to such controls therefore loses no optimality in the limit, while turning the stochastic optimization problem into a deterministic convex quadratic polynomial optimization. The paper proves that the value function is approximated by finite-dimensional polynomial problems as the truncation levels of the state (tensor) representation and of the control are increased.

Signature Representations and Control Parametrization

The authors employ the algebraic machinery of path signatures, building on the rough path literature, to represent admissible controls as finite linear functionals on the tensor algebra of the time-augmented driving path. The essential theoretical step is the universality result for linear signature functionals on Brownian path space, guaranteeing dense approximation in the relevant $L^2$ topologies. Formally, admissible controls $u_t$ are represented as

$u_t^{(k)} = \langle \ell^{(k)}, S_t \rangle,$

where $S_t$ is the truncated signature of the time-augmented driving path up to time $t$ and $\ell^{(k)}$ is the coefficient tensor of the $k$-th control component. The multi-dimensional controlled state process depends on these functionals through a closed recursive algebraic relation, which yields a linear system for the associated control tensors. Existence and uniqueness of solutions to the resulting tensor equations are rigorously established.
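As a minimal sketch of this parametrization, the following computes running truncated signatures of a discretized time-augmented path $(t, W_t)$ and evaluates a scalar control as a linear functional of them. It assumes piecewise-linear interpolation of the sampled path (which approximates the Stratonovich signature; the paper's own convention is not reproduced here), and the helper names `segment_signature`, `chen_product`, `signature_path`, and `pairing`, as well as the coefficients in `ell`, are hypothetical and chosen purely for illustration.

```python
import numpy as np

def segment_signature(delta, depth):
    """Signature of a straight segment with increment `delta` (tensor exponential)."""
    levels = [np.array(1.0)]
    for n in range(1, depth + 1):
        levels.append(np.multiply.outer(levels[-1], delta) / n)
    return levels

def chen_product(a, b, depth):
    """Truncated tensor product of two signatures (Chen's identity)."""
    return [sum(np.multiply.outer(a[k], b[n - k]) for k in range(n + 1))
            for n in range(depth + 1)]

def signature_path(path, depth):
    """Running truncated signatures of a piecewise-linear path of shape (steps + 1, dim)."""
    sig = segment_signature(np.zeros(path.shape[1]), depth)  # signature of the empty path
    out = [sig]
    for delta in np.diff(path, axis=0):
        sig = chen_product(sig, segment_signature(delta, depth), depth)
        out.append(sig)
    return out

def pairing(ell, sig):
    """<ell, S>: level-by-level sum of coefficient times signature entry."""
    return sum(float(np.sum(l * s)) for l, s in zip(ell, sig))

# Illustrative data: one Brownian sample path on [0, 1], time-augmented.
rng = np.random.default_rng(0)
n_steps, T = 200, 1.0
t = np.linspace(0.0, T, n_steps + 1)
W = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(T / n_steps), n_steps))])
path = np.column_stack([t, W])

sigs = signature_path(path, depth=2)

# Hypothetical coefficients: u_t = 0.5 + 0.1 * t - W_t (levels 0 and 1 only).
ell = [np.array(0.5), np.array([0.1, -1.0]), np.zeros((2, 2))]
u = np.array([pairing(ell, s) for s in sigs])
```

Because the control at time $t$ only uses the signature up to $t$, it is non-anticipative by construction; higher-level coefficients in `ell` would add iterated-integral terms in the same way.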

Reduction to Deterministic Convex Optimization

Both the controlled SDE and the cost functional are re-expressed as pairings on the tensor algebra, with the quadratic LQ cost written as a finite combination of inner products of truncated signature variables and associated tensors. By truncating both the state representation (level $L$) and the control parametrization (level $M$), the authors reduce the original infinite-dimensional stochastic control problem to a finite-dimensional deterministic optimization:

$\inf_{\ell \in (T^M((\mathbb{R}^{D+1})^*))^K} \langle J^{L, M}(\ell), \mathbb{E}[S_T] \rangle$

where $J^{L, M}(\ell)$ is the truncated polynomial representation of the performance criterion. The polynomial problem is proven to be convex, and indeed strictly convex, in the coefficients of the signature functional, so the optimal control parametrization within the truncated feasible set is unique.
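To illustrate why the cost becomes a convex quadratic in the coefficients, here is a toy sketch that is not the paper's algebraic construction. It assumes hypothetical scalar dynamics $dX_t = u_t\,dt + dW_t$ with running cost $X_t^2 + u_t^2$, restricts the control to level-1 signature features $(1, t, W_t)$, and replaces the exact pairing with $\mathbb{E}[S_T]$ by an empirical average over simulated paths. Since the Euler state is affine in the coefficients, the discretized cost is a least-squares objective that a linear solver minimizes exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps, T, x0 = 2000, 50, 1.0, 1.0   # illustrative choices
dt = T / n_steps
t = np.linspace(0.0, T, n_steps + 1)[:-1]       # left grid points
dW = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
W = np.cumsum(dW, axis=1) - dW                   # Brownian motion at left grid points

# Signature features up to level 1 of (t, W_t): 1, t, W_t.
phi = np.stack([np.ones_like(W), np.broadcast_to(t, W.shape), W], axis=-1)

# Euler scheme: X_k = x0 + W_k + psi_k @ ell, with psi_k = dt * sum_{j<k} phi_j.
cum = dt * np.cumsum(phi, axis=1)
psi = np.concatenate([np.zeros_like(cum[:, :1]), cum[:, :-1]], axis=1)

# Empirical cost mean_paths sum_k dt * (X_k^2 + u_k^2) = ||A @ ell + r0||^2.
scale = np.sqrt(dt / n_paths)
A = scale * np.concatenate([psi.reshape(-1, 3), phi.reshape(-1, 3)])
r0 = scale * np.concatenate([(x0 + W).ravel(), np.zeros(W.size)])

ell, *_ = np.linalg.lstsq(A, -r0, rcond=None)    # unique minimiser of the quadratic
cost_opt = float(np.sum((A @ ell + r0) ** 2))
cost_zero = float(np.sum(r0 ** 2))               # cost of the zero control, for comparison
```

The Hessian of this objective is `2 * A.T @ A`, which is positive definite whenever the features are linearly independent; that is the toy analogue of the strict convexity and uniqueness stated above.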

Rigorous Consistency and Convergence Analysis

A core theoretical result of the paper is a two-stage limit: as the truncation levels $L$ and $M$ tend to infinity in the appropriate order, the infimum of the truncated polynomial problems converges to the value of the true LQ problem. This is shown by combining universality theorems for signatures with stability properties of linear SDEs with respect to weak $L^2$ convergence. The order of the limits (first the state truncation, then the control truncation) is shown to be essential for consistency. In addition, the optimal controls of the truncated problems are shown to converge weakly in $L^2(dP \otimes dt)$ to the true optimal control.
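Schematically, the ordered-limit statement can be written as follows. The symbols $V^{L,M}$ and $V$ are shorthand introduced here for readability and are not taken from the paper.

```latex
\[
V^{L,M} := \inf_{\ell \in (T^M((\mathbb{R}^{D+1})^*))^K}
  \langle J^{L,M}(\ell), \mathbb{E}[S_T] \rangle,
\qquad
\lim_{M \to \infty} \, \lim_{L \to \infty} V^{L,M} = V := \inf_{u \text{ admissible}} J(u).
\]
```

The inner limit sends the state truncation level $L$ to infinity for a fixed control level $M$; only afterwards is $M$ increased.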

Numerical Experiments

The paper provides detailed numerical experiments to validate the approach. In the one-dimensional Brownian case, the method is benchmarked against the analytic HJB-based control, with all drift, volatility, and cost coefficients set to one except for $A(t)$, $C(t)$, $D(t)$, and $G$, which are set to zero. Even at low truncation levels ($L \leq 5$, $M \leq 3$), the approximate value and the estimated $L^2$ distance to the benchmark control converge rapidly.

Figure 1: Empirical estimation of cost $J(u)$ and $L^2(dP \otimes dt)$ discrepancies between the truncated signature-control and benchmark in the Brownian setting.

Critically, the results highlight a dependence on the order of limits: achieving high accuracy requires the state truncation level to exceed the control truncation, with increasing $M$ for fixed $L$ potentially resulting in overfitting and sub-optimality.

The methodology is also applied to a state process driven by fractional Brownian motion with Hurst index $H=1/4$, outside the classical Markovian setting. Here, expected signatures are estimated by Monte Carlo simulation, and the realized cost closely matches the benchmark as the truncation levels increase.
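A minimal sketch of such a Monte Carlo estimate is given below, under stated assumptions: fractional Brownian motion is sampled exactly on a grid by Cholesky factorisation of its covariance $\tfrac12\,(s^{2H} + t^{2H} - |t-s|^{2H})$, the sampled path is interpolated piecewise linearly, and only signature levels 1 and 2 of the time-augmented path are averaged. The function names `fbm_paths` and `expected_signature_level2` and all sample sizes are hypothetical; the paper's actual simulation scheme and integration convention may differ.

```python
import numpy as np

def fbm_paths(n_paths, n_steps, T, hurst, rng):
    """Sample fBm on a uniform grid via Cholesky factorisation of its covariance."""
    t = np.linspace(0.0, T, n_steps + 1)
    s = t[1:]
    cov = 0.5 * (s[:, None] ** (2 * hurst) + s[None, :] ** (2 * hurst)
                 - np.abs(s[:, None] - s[None, :]) ** (2 * hurst))
    chol = np.linalg.cholesky(cov)
    x = rng.standard_normal((n_paths, n_steps)) @ chol.T
    return t, np.concatenate([np.zeros((n_paths, 1)), x], axis=1)

def expected_signature_level2(t, x):
    """Monte Carlo mean of signature levels 1 and 2 of the piecewise-linear path (t, x_t)."""
    T = t[-1] - t[0]
    xT = x[:, -1] - x[:, 0]
    # Trapezoidal rule is exact for the integral of a piecewise-linear path.
    int_x_dt = np.sum(0.5 * (x[:, 1:] + x[:, :-1] - 2 * x[:, :1]) * np.diff(t), axis=1)
    level1 = np.stack([np.full_like(xT, T), xT], axis=1)
    level2 = np.stack([
        np.stack([np.full_like(xT, T ** 2 / 2), T * xT - int_x_dt], axis=1),  # (0,0), (0,1)
        np.stack([int_x_dt, xT ** 2 / 2], axis=1),                            # (1,0), (1,1)
    ], axis=1)
    return level1.mean(axis=0), level2.mean(axis=0)

rng = np.random.default_rng(1)
t, x = fbm_paths(n_paths=20000, n_steps=50, T=1.0, hurst=0.25, rng=rng)
m1, m2 = expected_signature_level2(t, x)
```

Index 0 denotes the time component and index 1 the noise component. Higher signature levels would be needed for the truncation levels used in the experiments; they can be accumulated with Chen's identity at a cost that grows geometrically with the level.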

Figure 2: Signature-based policy performance after cost minimization, with $J(u)$ and $L^2(dP \otimes dt)$ error versus benchmark control under fBm noise.

Despite theoretical gaps in universality and stability for fBm-driven dynamics, the empirical results mirror those of the Brownian regime, suggesting that the signature framework remains robust even in rough, non-Markovian contexts.

Implications, Extensions, and Future Directions

This methodology has several practical implications:

High-Dimensional Control: By reducing the stochastic dynamic problem to finite deterministic optimization over polynomial coefficients, the approach is scalable to much higher dimensions where PDE-based and backward SDE-based algorithms are computationally infeasible.
Non-Markovian and Rough Systems: The parameterization via path signatures generalizes, at least heuristically, to signals beyond Brownian motion, accommodating memory and long-range dependence in the driving noise.
Unified Algebraic Framework: The method connects stochastic control, rough path theory, and machine learning representations, suggesting a unified paradigm for control synthesis based on signature learning.

The central theoretical limitation stems from the absence of universality and growth estimates for path signatures under general non-semimartingale signals, and from the difficulty of transferring the algebraic cost representation to ‘robust’ signature settings, such as fBm, where only approximate universality results exist. Advances in universality theory for rough path signatures (e.g., via orthogonal polynomial bases on path space) could further broaden the applicability of this approach.

Given ongoing developments in the intersection of path signature theory and functional approximation in stochastic analysis and AI, future research may address high-order and model-free control tasks, joint learning of cost and control, and direct integration into neural signature architectures for reinforcement learning and data-driven control.

Conclusion

This study provides rigorous theory and effective algorithms for applying path signature representations to LQ stochastic control, including practical validation in both Brownian and fractional Brownian motion settings. The signature approach achieves strong numerical accuracy with low truncation, maintains convexity and uniqueness, and is supported by universality results in the admissible control space. While some open theoretical questions remain for more general noise models, the proposed methodology offers a promising direction for high-dimensional, non-Markovian, and data-driven stochastic control.
