Log-Signature (Signature Cumulant)
- Log-signature is the logarithm of a path's iterated integral signature, offering a compressed, coordinate-independent encoding of geometric features.
- It is computed via the Magnus expansion and BCH formula, placing the result in a free Lie algebra for efficient, hierarchical representation.
- Applications include generative modeling, time-series analysis, and writer identification, leveraging robustness to noise and reparameterization invariance.
A log-signature—also known as the signature cumulant—is the logarithm, in the completed tensor algebra, of the (iterated-integral) signature of a continuous path. The log-signature provides a compressed, coordinate-independent, and Lie-algebraic encoding of the geometric features of a path. It is fundamentally linked to the algebraic and analytic structure of rough paths, stochastic processes, and high-dimensional sequential data, acting simultaneously as a canonical non-commutative generalization of classical cumulants and as a central object in practical algorithms for generative modeling, inference, and dimension reduction.
1. Algebraic Structure and Definition
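The signature of a path $X:[0,T]\to\mathbb{R}^d$ is the sequence of all its iterated integrals, viewed as a group-like element in the completed tensor algebra $T((\mathbb{R}^d)) = \prod_{n\ge 0}(\mathbb{R}^d)^{\otimes n}$. Formally,
$$S(X)_{0,T} = 1 + \sum_{n\ge 1}\int_{0<t_1<\cdots<t_n<T} dX_{t_1}\otimes\cdots\otimes dX_{t_n}.$$
The log-signature, or signature cumulant, is defined as the non-commutative (tensor algebra) logarithm of the signature,
$$\Omega = \log S(X)_{0,T} = \sum_{m\ge 1}\frac{(-1)^{m+1}}{m}\big(S(X)_{0,T}-1\big)^{\otimes m} = \sum_{n\ge 1}\Omega^{(n)},\qquad \Omega^{(n)}\in(\mathbb{R}^d)^{\otimes n},$$
where the sum converges in the appropriate topology and each $\Omega^{(n)}$ is an explicit Lie polynomial in the increments of $X$. By Chen–Ree's theorem, $\Omega$ lies in the (completed) free Lie algebra $\mathcal{L}((\mathbb{R}^d))$ and can be expanded as
$$\Omega = \sum_{h\in\mathcal{H}}\kappa_h\,h,$$
with $\mathcal{H}$ a Hall (or Lyndon) basis for the Lie algebra and $\kappa_h$ the signature cumulants.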
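The definition can be made concrete for a piecewise-linear path. The following is a minimal sketch in pure Python, not the method of any cited paper or library. It assumes tensors are stored as dictionaries mapping words (tuples of letter indices) to coefficients, and all helper names (`tensor_mul`, `segment_signature`, `path_signature`, `tensor_log`) are hypothetical. It builds the truncated signature from Chen's identity and then applies the logarithm series, which is a finite sum after truncation.

```python
def tensor_mul(a, b, depth):
    """Truncated tensor product of two elements stored as {word: coefficient}."""
    out = {}
    for u, x in a.items():
        for v, y in b.items():
            if len(u) + len(v) <= depth:
                out[u + v] = out.get(u + v, 0.0) + x * y
    return out


def tensor_add(a, b, scale=1.0):
    """Return a + scale * b."""
    out = dict(a)
    for w, y in b.items():
        out[w] = out.get(w, 0.0) + scale * y
    return out


def segment_signature(dx, depth):
    """Signature of a straight segment: exp(dx) = sum_n dx^{(x)n} / n!."""
    lin = {(i,): v for i, v in enumerate(dx)}
    sig, term = {(): 1.0}, {(): 1.0}
    for n in range(1, depth + 1):
        term = {w: c / n for w, c in tensor_mul(term, lin, depth).items()}
        sig = tensor_add(sig, term)
    return sig


def path_signature(points, depth):
    """Chen's identity: the signature of a concatenation is the tensor product."""
    sig = {(): 1.0}
    for p, q in zip(points, points[1:]):
        dx = [b - a for a, b in zip(p, q)]
        sig = tensor_mul(sig, segment_signature(dx, depth), depth)
    return sig


def tensor_log(sig, depth):
    """log(1 + x) = sum_{m>=1} (-1)^{m+1} x^{(x)m} / m, with x = sig - 1."""
    x = {w: c for w, c in sig.items() if w != ()}
    out, power = {}, {(): 1.0}
    for m in range(1, depth + 1):
        power = tensor_mul(power, x, depth)
        out = tensor_add(out, power, scale=(-1.0) ** (m + 1) / m)
    return out


# L-shaped path in the plane: one unit along e1, then one unit along e2.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
logsig = tensor_log(path_signature(points, 2), 2)
for word in sorted(logsig, key=lambda w: (len(w), w)):
    if abs(logsig[word]) > 1e-12:
        print(word, logsig[word])
```

For this path the nonzero coefficients are 1 on the words (0,) and (1,), 0.5 on (0, 1), and -0.5 on (1, 0), that is, $e_1 + e_2 + \tfrac12[e_1,e_2]$. The symmetric level-2 parts cancel, as expected for a Lie element.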
2. Magnus Expansion and Lie Series
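The algebraic structure of the log-signature is made explicit via the Magnus expansion, a universal Lie series that recursively expresses the logarithm of the signature in terms of the path derivative and its nested commutators. For a smooth path, and with the convention $dS = S\otimes dX$, the log-signature $\Omega_t = \log S(X)_{0,t}$ satisfies
$$\dot\Omega_t = \sum_{k\ge 0}\frac{B_k}{k!}\,\mathrm{ad}_{\Omega_t}^{k}\big(\dot X_t\big),\qquad B_1 = +\tfrac12,$$
where $B_k$ are the Bernoulli numbers and $\mathrm{ad}_{\Omega}(Y) = [\Omega, Y]$, with the $n$th homogeneous piece ($n\ge 2$) given recursively by
$$\dot\Omega^{(n)}_t = \sum_{k=1}^{n-1}\frac{B_k}{k!}\sum_{i_1+\cdots+i_k = n-1}\mathrm{ad}_{\Omega^{(i_1)}_t}\cdots\mathrm{ad}_{\Omega^{(i_k)}_t}\big(\dot X_t\big),\qquad \Omega^{(1)}_t = X_t - X_0,$$
generalizing the cumulant structure of scalar random variables to the non-commutative, tensor-algebraic case (Friz et al., 9 Sep 2025, Friz et al., 2024, Friz et al., 2021). The log-signature thus provides an intrinsic, hierarchically ordered, and compact description of the path's geometry, with higher-order terms encoding complex geometric correlations such as areas and iterated areas.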
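As a worked instance of the lowest-order terms, consider a planar path $X = (X^1, X^2)$ with basis vectors $e_1, e_2$. The sketch below assumes the convention in which the signature solves $dS = S\otimes dX$; the symbol $\mathcal{A}_{0,T}$ for the signed (Lévy) area is notation introduced here for illustration.

```latex
\begin{aligned}
\log S(X)_{0,T} &= \underbrace{(X_T - X_0)}_{\text{level 1}}
  \;+\; \underbrace{\mathcal{A}_{0,T}\,[e_1, e_2]}_{\text{level 2}}
  \;+\; (\text{levels} \ge 3), \\
\mathcal{A}_{0,T} &= \tfrac12 \int_0^T \Big( (X^1_s - X^1_0)\, dX^2_s - (X^2_s - X^2_0)\, dX^1_s \Big).
\end{aligned}
```

For the path that moves one unit along $e_1$ and then one unit along $e_2$, this gives $X_T - X_0 = e_1 + e_2$ and $\mathcal{A}_{0,T} = \tfrac12$, the area of the triangle enclosed by the path and its chord.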
3. Computational and Algorithmic Methods
Efficient algorithms for computing log-signatures exploit:
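- The truncated tensor algebra at order $N$ and the corresponding projection onto a Hall or Lyndon basis for the free Lie algebra on $d$ generators (dimension $\sum_{k=1}^{N}\frac{1}{k}\sum_{j\mid k}\mu(j)\,d^{k/j}$ by Witt's formula, with $\mu$ the Möbius function) (Reizenstein, 2017, Barancikova et al., 2024, Lai et al., 2019).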
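- Baker–Campbell–Hausdorff (BCH) formula for concatenating segments: For log-signatures $a$ and $b$ of two consecutive path segments, the log-signature of their concatenation is
$$\log\big(e^{a}\otimes e^{b}\big) = a + b + \tfrac12[a,b] + \tfrac1{12}[a,[a,b]] + \tfrac1{12}[b,[b,a]] + \cdots,$$
with all terms in the chosen Lie basis (Reizenstein, 2017).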
- For piecewise linear or polygonal paths, the log-signature can be maintained incrementally under concatenation; widely used libraries such as iisignature provide optimized implementations (Reizenstein, 2017).
- In stochastic/semimartingale settings, the expected log-signature is computed via Magnus-type expansion or non-commutative Riccati-type functional equations, sometimes reducing to finite-dimensional PDEs/ODEs in Markov, affine, or Lévy-type models (Friz et al., 9 Sep 2025, Friz et al., 2021).
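The BCH step in the list above can be checked numerically. The following is a sketch under stated assumptions rather than a library implementation: Lie elements are stored as tensors (dictionaries from words to coefficients), the bracket is the tensor commutator, and the helper names (`mul`, `lin`, `bracket`, `bch3`, `tensor_exp`, `tensor_log`) are hypothetical. At truncation depth 3, the BCH series through its cubic terms is exact for elements with no constant term, so it should agree with the directly computed logarithm of the product of exponentials.

```python
def mul(a, b, depth):
    """Truncated tensor product of {word: coefficient} elements."""
    out = {}
    for u, x in a.items():
        for v, y in b.items():
            if len(u) + len(v) <= depth:
                out[u + v] = out.get(u + v, 0.0) + x * y
    return out


def lin(*terms):
    """Linear combination of (coefficient, tensor) pairs."""
    out = {}
    for c, t in terms:
        for w, x in t.items():
            out[w] = out.get(w, 0.0) + c * x
    return out


def bracket(a, b, depth):
    """Commutator [a, b] = a (x) b - b (x) a."""
    return lin((1.0, mul(a, b, depth)), (-1.0, mul(b, a, depth)))


def bch3(a, b, depth=3):
    """BCH series through degree 3: a + b + [a,b]/2 + [a,[a,b]]/12 + [b,[b,a]]/12."""
    ab = bracket(a, b, depth)
    ba = bracket(b, a, depth)
    return lin((1.0, a), (1.0, b), (0.5, ab),
               (1.0 / 12.0, bracket(a, ab, depth)),
               (1.0 / 12.0, bracket(b, ba, depth)))


def tensor_exp(a, depth):
    """exp(a) = sum_n a^{(x)n} / n!, truncated at the given depth."""
    out, term = {(): 1.0}, {(): 1.0}
    for n in range(1, depth + 1):
        term = lin((1.0 / n, mul(term, a, depth)))
        out = lin((1.0, out), (1.0, term))
    return out


def tensor_log(g, depth):
    """log(1 + x) = sum_m (-1)^{m+1} x^{(x)m} / m, with x = g - 1."""
    x = {w: c for w, c in g.items() if w != ()}
    out, power = {}, {(): 1.0}
    for m in range(1, depth + 1):
        power = mul(power, x, depth)
        out = lin((1.0, out), ((-1.0) ** (m + 1) / m, power))
    return out


# Log-signatures of two straight segments in the plane (level-1 elements).
a = {(0,): 1.0, (1,): 2.0}
b = {(0,): -0.5, (1,): 1.0}

direct = tensor_log(mul(tensor_exp(a, 3), tensor_exp(b, 3), 3), 3)
via_bch = bch3(a, b)
max_gap = max(abs(direct.get(w, 0.0) - via_bch.get(w, 0.0))
              for w in set(direct) | set(via_bch))
print("largest coefficient difference:", max_gap)
```

Working with BCH directly avoids forming the full signature of each segment, which is why it suits incremental updates; a production implementation would additionally project onto a Hall or Lyndon basis instead of keeping every tensor coordinate as this sketch does.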
4. Statistical and Machine Learning Applications
The log-signature enables powerful statistical summarization and inference on path-valued data:
- Truncated log-signature coordinates provide a geometrically rich, low-dimensional embedding for sequential data, invariant under time-reparameterization and robust to noise (Lai et al., 2019, Barancikova et al., 2024).
- In explicit generative modeling, e.g., SigDiffusions, score-based diffusion processes are trained directly on the space of log-signatures, enabling sample generation and closed-form inversion to recover the underlying path or its coefficients in suitable bases (e.g., Fourier, orthogonal polynomials) (Barancikova et al., 2024).
- In writer identification, log-signature embeddings of pathlets extracted from image contours yield compact, discriminative codes for clustering or retrieval, outperforming raw signatures in both computational efficiency and classification performance (Lai et al., 2019).
- For stochastic processes, the expectation of the signature (and hence its log) uniquely determines the law under growth conditions (Chevyrev–Lyons theorem), making log-signatures central in statistical modeling of time series, financial processes, and rough signals (Friz et al., 2024, Friz et al., 2021).
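The invariance properties that make these embeddings useful can be illustrated in the smallest nontrivial case. The sketch below is restricted to planar piecewise-linear paths at depth 2, where the log-signature has three coordinates: the two increments and the signed (Lévy) area. The function name `depth2_features` is hypothetical, and the example only demonstrates invariance under resampling along the same trace and under translation; it is not the feature pipeline of any cited paper.

```python
def depth2_features(points):
    """Depth-2 log-signature of a planar piecewise-linear path.

    Returns (dx, dy, area), where area is the coefficient on [e1, e2],
    i.e. half the sum of (x - x0) * dy - (y - y0) * dx over the segments.
    """
    x0, y0 = points[0]
    twice_area = 0.0
    for (px, py), (qx, qy) in zip(points, points[1:]):
        twice_area += (px - x0) * (qy - py) - (py - y0) * (qx - px)
    xn, yn = points[-1]
    return (xn - x0, yn - y0, 0.5 * twice_area)


base = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
# Same trace, sampled at extra points along each segment (a reparameterization).
resampled = [(0.0, 0.0), (0.25, 0.0), (1.0, 0.0), (1.0, 0.5), (1.0, 1.0)]
# Same shape, translated in the plane.
shifted = [(x + 3.0, y - 2.0) for x, y in resampled]

f_base = depth2_features(base)
f_resampled = depth2_features(resampled)
f_shifted = depth2_features(shifted)
print(f_base, f_resampled, f_shifted)
```

All three calls return the same triple, (1.0, 1.0, 0.5), because the features depend only on the increments of the path and on the order in which they occur, not on the sampling speed or the starting point.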
5. Analytic Properties: Decay, Convergence, and Rigidity
While signature coefficients decay factorially, log-signature coefficients typically decay only geometrically. The analytic radius of convergence of the log-signature's power series is finite except for straight-line paths, as formalized in the Lyons–Sidorova conjecture and subsequent results:
- For rough, tree-reduced paths, log-signature coefficients have strictly smaller radii of convergence than signature coefficients (Boedihardjo et al., 22 Jun 2025).
- Infinite radius of convergence for log-signature can only arise for line segments (or paths that are locally linear on all subintervals).
- Explicit vanishing (determinantal) identities on iterated integrals arise as geometric constraints in such “entire” cases, implying rigidity of the log-signature transform and reinforcing its discriminative power for generic paths (Boedihardjo et al., 22 Jun 2025).
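The exceptional role of straight lines can be seen by direct computation. The following is an illustration, not a statement of the cited results; it assumes a tensor norm satisfying $\|v^{\otimes n}\| = |v|^n$, and the vectors $v, w$ are notation introduced here.

```latex
\begin{aligned}
X_t = t\,v,\ t\in[0,1]: \qquad
S(X)_{0,1} &= \exp(v) = \sum_{n\ge 0}\frac{v^{\otimes n}}{n!},
\qquad \big\|S^{(n)}\big\| = \frac{|v|^{n}}{n!}, \\
\log S(X)_{0,1} &= v \quad (\text{all levels } n\ge 2 \text{ vanish}), \\[4pt]
\text{segment } v \text{ followed by segment } w: \qquad
\log S &= v + w + \tfrac12[v,w] + \tfrac1{12}[v,[v,w]] + \tfrac1{12}[w,[w,v]] + \cdots
\end{aligned}
```

The signature levels of the line decay factorially and its log-signature terminates at level 1, whereas adding a single corner with $v$ and $w$ not parallel already produces nonzero log-signature terms at higher levels.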
6. Numerical, Algorithmic, and Basis Considerations
- The dimension of the truncated signature up to level $N$ for a path in $\mathbb{R}^d$ is $\sum_{k=1}^{N} d^{k} = d\,(d^{N}-1)/(d-1)$ for $d\ge 2$ (excluding the constant level-0 term), while the dimension of the truncated log-signature (Hall/Lyndon basis for the free Lie algebra) is much smaller, allowing for more efficient representation and clustering in high dimensions (Lai et al., 2019, Barancikova et al., 2024).
- Closed-form inversion formulae permit explicit, basis-adapted reconstruction of paths from log-signature data (Fourier, polynomials), with controlled numerical properties and scalable preprocessing (Barancikova et al., 2024).
- Modern library implementations exploit precomputed Hall or Lyndon bases, BCH expansions, and vectorized computation for practical deployment in scientific computing and ML pipelines (Reizenstein, 2017).
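The size gap between the two representations can be computed exactly. The sketch below uses Witt's formula for the dimension of each level of the free Lie algebra on $d$ generators. The function names (`mobius`, `lie_level_dim`, `logsig_dim`, `sig_dim`) are hypothetical, and the signature count follows the convention of excluding the constant level-0 term.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def lie_level_dim(d, k):
    """Witt's formula: (1/k) * sum over divisors j of k of mu(j) * d^(k/j)."""
    total = sum(mobius(j) * d ** (k // j) for j in range(1, k + 1) if k % j == 0)
    return total // k


def logsig_dim(d, depth):
    """Number of log-signature coordinates in levels 1..depth."""
    return sum(lie_level_dim(d, k) for k in range(1, depth + 1))


def sig_dim(d, depth):
    """Number of signature coordinates in levels 1..depth (constant term excluded)."""
    return sum(d ** k for k in range(1, depth + 1))


for d, depth in [(2, 4), (3, 6)]:
    print(d, depth, sig_dim(d, depth), logsig_dim(d, depth))
```

For $d = 2$, $N = 4$ this gives 30 signature coordinates against 8 log-signature coordinates, and for $d = 3$, $N = 6$ it gives 1092 against 196.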
7. Domain-Specific Applications and Empirical Findings
Empirical studies demonstrate that log-signature features are highly effective for graph-based retrieval and identification tasks. In offline writer identification, log-signature codebooks coupled with codeword co-occurrence features yield top-1 accuracy exceeding 94% on the IAM dataset and 99% on CVL, outperforming raw signatures and many deep learning baselines (Lai et al., 2019). The structural and computational advantages (principled geometric compressiveness, scale-invariance, and reparametrization invariance) are central to these results, and similar architectures are leveraged in generative time-series models, rough path classification, and stochastic kernel learning (Lai et al., 2019, Barancikova et al., 2024, Friz et al., 2024).
References:
- (Lai et al., 2019)
- (Friz et al., 9 Sep 2025)
- (Boedihardjo et al., 22 Jun 2025)
- (Friz et al., 2024)
- (Barancikova et al., 2024)
- (Friz et al., 2021)
- (Reizenstein, 2017)