DALI: Derivative Approximation for Likelihoods

Updated 10 February 2026

DALI is a systematic framework that extends the Fisher matrix by incorporating third and higher derivatives to capture non-Gaussian features in likelihoods.
It reorganizes Taylor expansions to ensure each truncated approximation remains positive definite and normalizable, bridging fast Gaussian forecasts and full MCMC sampling.
Applications in cosmology, gravitational-wave astronomy, and machine learning demonstrate DALI's capability to capture skewness, kurtosis, and complex posterior geometries.

The Derivative Approximation for Likelihoods (DALI) is a formalism for systematically approximating posterior distributions and likelihood functions in parameter inference problems by incorporating higher-order derivative information beyond the canonical Fisher matrix approach. By including third and higher derivatives of the log-likelihood, DALI captures non-Gaussian characteristics such as skewness, kurtosis, deformed or curved posterior shapes, and extended degeneracies, while ensuring the resulting approximation remains positive definite and normalizable at each truncation order. DALI stands as a controlled, perturbative bridge between ultra-fast, purely Gaussian Fisher forecasts and computationally intensive sampling-based approaches such as full Markov Chain Monte Carlo runs, with demonstrated utility across cosmology, gravitational-wave astronomy, ODE-constrained inference, and machine learning contexts (Heavens, 2016, Sellentin et al., 2014, Sellentin, 2015, Glasserman et al., 4 Dec 2025, Ryan et al., 2022, Šarčević et al., 8 Feb 2026, Souza et al., 2023, Souza et al., 19 Oct 2025, Wang et al., 2022).

1. Mathematical Foundations and Taylor Expansion

The core idea of DALI is to expand the log-posterior or log-likelihood in a Taylor series about its maximum (usually the maximum-likelihood estimate $\hat{\theta}$): $\ln L(\theta) \approx \ln L(\hat{\theta}) - \frac{1}{2} F_{ij} \Delta\theta_i \Delta\theta_j + \frac{1}{6} D_{ijk} \Delta\theta_i \Delta\theta_j \Delta\theta_k - \frac{1}{24} Q_{ijkl} \Delta\theta_i \Delta\theta_j \Delta\theta_k \Delta\theta_l + \cdots$, where $\Delta\theta = \theta - \hat{\theta}$, $F_{ij}$ is the Fisher matrix (second derivatives), $D_{ijk}$ the third-derivative tensor, and $Q_{ijkl}$ the fourth-derivative tensor, all evaluated at the expansion point (Heavens, 2016, Sellentin et al., 2014, Röver et al., 2022). The Fisher approximation retains only the quadratic term, resulting in a symmetric, ellipsoidal (Gaussian) posterior. DALI systematically includes higher derivatives to capture deviations from Gaussianity.

Derivative Tensors

Fisher matrix: $F_{ij} = -\left.\partial_i \partial_j \ln L\right|_{\hat{\theta}}$
Skewness tensor: $D_{ijk} = \left.\partial_i \partial_j \partial_k \ln L\right|_{\hat{\theta}}$
Kurtosis tensor: $Q_{ijkl} = -\left.\partial_i \partial_j \partial_k \partial_l \ln L\right|_{\hat{\theta}}$

These tensors encode the shape corrections for the likelihood surface: the third-order term generates skewness, while the fourth modifies kurtosis and tail behavior (Sellentin, 2015, Röver et al., 2022).
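As a minimal sketch of how the successive terms act, consider a one-parameter toy log-likelihood $\ln L = a \ln\theta - b\,\theta$, for which every derivative at the peak is known in closed form. The toy likelihood, the function names, and the values of `a` and `b` are illustrative assumptions, not taken from the cited papers. The code uses the sign conventions of the tensors above and illustrates only the plain Taylor series in $\Delta\theta$, not the reorganized DALI expansion.

```python
import math

def log_like(theta, a=10.0, b=2.0):
    """Toy gamma-shaped log-likelihood (an assumption): ln L = a ln(theta) - b theta."""
    return a * math.log(theta) - b * theta

def derivative_tensors(a=10.0, b=2.0):
    """Peak position and derivative 'tensors' (scalars in 1-D) evaluated at the peak."""
    theta_hat = a / b                 # solves d lnL / d theta = 0
    F = a / theta_hat**2              # F = -d^2 lnL / d theta^2
    D = 2.0 * a / theta_hat**3        # D = +d^3 lnL / d theta^3
    Q = 6.0 * a / theta_hat**4        # Q = -d^4 lnL / d theta^4
    return theta_hat, F, D, Q

def taylor_log_like(theta, order, a=10.0, b=2.0):
    """Taylor series of ln L about the peak, truncated at the given order (2, 3 or 4)."""
    theta_hat, F, D, Q = derivative_tensors(a, b)
    d = theta - theta_hat
    value = log_like(theta_hat, a, b) - 0.5 * F * d**2
    if order >= 3:
        value += D * d**3 / 6.0
    if order >= 4:
        value -= Q * d**4 / 24.0
    return value

theta = 6.0                           # one unit away from the peak at theta_hat = 5
exact = log_like(theta)
errors = {order: abs(taylor_log_like(theta, order) - exact) for order in (2, 3, 4)}
for order, err in errors.items():
    print(f"order {order}: |error in ln L| = {err:.5f}")
```

In this toy case the error in $\ln L$ shrinks with each added order, which is the behavior the expansion relies on close to the peak.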

2. Positive-Definite and Normalizable Expansion

Direct Taylor expansion of the log-likelihood, followed by naively exponentiating, can yield density approximations that become negative or non-normalizable at large distances from the peak. The essential innovation within DALI is a reorganization of terms: grouping contributions by the order of model derivatives (rather than powers of $\Delta\theta$ or Taylor order) such that every truncation yields a manifestly positive-definite, normalizable approximate distribution. For instance, in multivariate problems the quartic terms are constructed as sums of squares, ensuring that the exponent remains dominated by negative-definite contributions at large $|\Delta\theta|$ (Sellentin et al., 2014, Röver et al., 2022).
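The difference between the two truncations can be sketched in one dimension. The sketch assumes a Gaussian likelihood with parameter-independent noise `sigma`, data equal to the fiducial model prediction, and a hypothetical model $\mu(\theta) = \theta^2$. The names `mu1` and `mu2` stand for the first and second model derivatives at the expansion point. Keeping model derivatives up to second order gives an exponent that is a perfect square, whereas cutting the same series after the cubic power of $\Delta\theta$ does not.

```python
def naive_cubic_exponent(delta, mu1, mu2, sigma):
    """Series of ln L in powers of delta, cut after the cubic term."""
    return (-0.5 * mu1**2 * delta**2 - 0.5 * mu1 * mu2 * delta**3) / sigma**2

def doublet_exponent(delta, mu1, mu2, sigma):
    """Model derivatives kept to second order: minus one half of a squared residual."""
    residual = mu1 * delta + 0.5 * mu2 * delta**2
    return -0.5 * residual**2 / sigma**2

# Hypothetical toy model mu(theta) = theta**2, expanded about theta0 = 1.
theta0, sigma = 1.0, 0.5
mu1, mu2 = 2.0 * theta0, 2.0          # d mu / d theta and d^2 mu / d theta^2 at theta0

for delta in (-3.0, -2.0, -0.5, 0.5, 2.0):
    naive = naive_cubic_exponent(delta, mu1, mu2, sigma)
    doublet = doublet_exponent(delta, mu1, mu2, sigma)
    print(f"delta = {delta:+.1f}  naive cubic = {naive:+9.2f}  doublet = {doublet:+9.2f}")
```

Expanding the square reproduces the quadratic and cubic terms of the naive series and adds the quartic term $-\frac{1}{8}\mu''^2 \Delta\theta^4 / \sigma^2$ that keeps the exponent non-positive. For this particular model the exponent also vanishes at $\Delta\theta = -2$, which is the true mirror solution $\theta = -1$ of $\mu = \theta^2$.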

3. Algorithmic Construction and Computational Complexity

Derivative Evaluation

Derivatives can be computed:

Analytically (if model structure allows)
Numerically (finite-difference central stencils with extrapolation, polynomial fitting, as in DerivKit or GWDALI) (Šarčević et al., 8 Feb 2026, Souza et al., 2023, Souza et al., 19 Oct 2025)
Automatic differentiation (JAX, PyTorch, etc., especially for high-dimensional problems or waveform models) (Souza et al., 19 Oct 2025)
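The numerical route can be sketched with standard central stencils and a single Richardson extrapolation step. This is a generic illustration and does not reproduce the DerivKit or GWDALI interfaces. The function names, the step size, and the test function are assumptions.

```python
import math

def second_derivative(f, x, h=1e-2):
    """Three-point central stencil, truncation error O(h^2)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

def third_derivative(f, x, h=1e-2):
    """Four-point central stencil, truncation error O(h^2)."""
    return (f(x + 2 * h) - 2.0 * f(x + h) + 2.0 * f(x - h) - f(x - 2 * h)) / (2.0 * h**3)

def richardson(stencil, f, x, h=1e-2):
    """Combine step sizes h and h/2 so that the leading O(h^2) error cancels."""
    return (4.0 * stencil(f, x, h / 2.0) - stencil(f, x, h)) / 3.0

def toy_log_like(theta):
    """Toy log-likelihood (an assumption) with known derivatives."""
    return 10.0 * math.log(theta) - 2.0 * theta

theta_hat = 5.0
fisher = -richardson(second_derivative, toy_log_like, theta_hat)   # analytic value: 0.4
skew = richardson(third_derivative, toy_log_like, theta_hat)       # analytic value: 0.16
print(fisher, skew)
```

The step size trades truncation error against round-off error. Higher derivatives amplify round-off by higher powers of $1/h$, which is one reason stabilized schemes or automatic differentiation are often preferred for real forward models.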

Cost Scaling

Memory and CPU requirements grow rapidly with both the number of parameters $N_p$ and the expansion order:

Fisher: $O(N_p^2)$
Third-order tensor: $O(N_p^3)$
Fourth-order tensor: $O(N_p^4)$

For most applications in cosmology and gravitational-wave astronomy, $N_p$ in the range $5$–$20$ is tractable for up to the cubic or quartic DALI expansions (Heavens, 2016, Souza et al., 19 Oct 2025).
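The scaling can be made concrete by counting tensor entries. Because partial derivatives commute, a fully symmetric rank-$k$ derivative tensor in $N_p$ dimensions has $\binom{N_p + k - 1}{k}$ independent components rather than $N_p^k$. The short sketch below (function names are illustrative) tabulates both counts.

```python
from math import comb

def full_components(n_params, order):
    """Entries of a dense rank-`order` tensor."""
    return n_params ** order

def independent_components(n_params, order):
    """Distinct entries of a fully symmetric tensor: C(n + k - 1, k)."""
    return comb(n_params + order - 1, order)

for n in (5, 10, 20):
    for k in (2, 3, 4):
        print(f"N_p = {n:2d}, rank {k}: dense = {full_components(n, k):6d}, "
              f"independent = {independent_components(n, k):5d}")
```

Exploiting the symmetry lowers the constant factor, by roughly $k!$ for large $N_p$, but leaves the polynomial scaling unchanged.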

4. Applications and Empirical Performance

4.1 Cosmological Inference and Forecasting

DALI enables rapid construction of non-Gaussian, high-fidelity analytic posteriors for cosmological parameters, including cases with severe degeneracies (e.g., ring-shaped, banana-shaped, or multi-modal posteriors arising from non-linear parameter dependence) (Sellentin et al., 2014, Ryan et al., 2022, Röver et al., 2022). Quantitatively, DALI corrections can shift credible regions by tens of percent in parameter space and accurately capture ring-shaped contours which the Fisher ellipse fails to reproduce (Sellentin, 2015, Röver et al., 2022).

DALI also provides a diagnostic for the breakdown of the Gaussian approximation: visual or extrinsic-curvature-based comparison of Fisher and DALI credible-level contours immediately reveals when Fisher ellipses are inadequate (Ryan et al., 2022).

4.2 Gravitational-Wave Parameter Estimation

In modern GW inference, DALI is used to extend Fisher-matrix predictions for parameter uncertainties to the highly non-Gaussian posteriors typical of detector networks or low-SNR signals (Souza et al., 19 Oct 2025, Wang et al., 2022, Souza et al., 2023). The computational cost of doublet-DALI is typically about two orders of magnitude lower than that of full MCMC sampling, while accuracy improves substantially: for example, the discrepancy from the true posteriors is reduced by a factor of five for certain parameters in real GW events (Souza et al., 19 Oct 2025, Wang et al., 2022). DALI regularizes flat or ill-conditioned Fisher matrices and handles strong degeneracies (e.g., face-on binaries at $\iota \to 0, \pi$), yielding credible intervals robust to the "turn-around" pathologies seen in Gaussian-only Fisher forecasts (Souza et al., 2023).

A comparative table highlights performance:

Method	1-D JSD (11D GW)	Typical Runtime	Notes
Fisher	0.08	0.19 h	Ellipsoidal, can diverge in flat directions
Singlet-DALI	0.04	0.19 h	Fastest, captures most marginals
Doublet-DALI	0.04	1.97 h	Robust, matches non-Gaussian features
Triplet-DALI	0.04	46.1 h	Highest-fidelity, higher cost
Full MCMC	—	109 h	Gold standard, slowest

(Souza et al., 19 Oct 2025)
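For orientation, a one-dimensional Jensen-Shannon divergence (JSD) between two binned marginal posteriors can be computed as sketched below. The binning and the logarithm base used for the table are not stated here, so the base-2 logarithm (which bounds the JSD between 0 and 1) and the function name are assumptions.

```python
import math

def jensen_shannon_divergence(p, q):
    """JSD in bits between two histograms given as equal-length lists of weights."""
    p_sum, q_sum = sum(p), sum(q)
    p = [x / p_sum for x in p]                  # normalize to probabilities
    q = [x / q_sum for x in q]
    m = [0.5 * (a + b) for a, b in zip(p, q)]   # mixture distribution

    def kl(a, b):
        # Bins with a == 0 contribute nothing; b > 0 wherever a > 0 by construction.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0.0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

identical = jensen_shannon_divergence([1, 2, 3], [1, 2, 3])   # identical histograms
disjoint = jensen_shannon_divergence([1, 0], [0, 1])          # non-overlapping histograms
print(identical, disjoint)
```

Identical histograms give zero and non-overlapping ones give the maximum of 1 bit, so values of a few hundredths indicate closely matching marginals.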

4.3 Parameter-Dependent Covariance and Ring-Shaped Likelihoods

In contexts where the data-covariance depends nonlinearly on parameters (e.g., power spectrum, cluster counts), the DALI formalism expands $C(\theta)$ and constructs the likelihood in a form that maintains the positive-definiteness and normalization. DALI reconstructs degenerate "ring" or "box-ring" posteriors exactly or to arbitrary accuracy at higher order, far beyond the reach of the Fisher method (Sellentin, 2015).

4.4 Differential Machine Learning and Sensitivity Estimation

In machine learning applied to financial derivatives pricing, the DALI framework provides unbiased sensitivity (Greek) labels via likelihood-ratio estimators, dramatically reducing test error and eliminating pathwise/adjoint estimator bias for discontinuous payouts, as in digital or barrier options (Glasserman et al., 4 Dec 2025).
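To illustrate what a likelihood-ratio sensitivity estimator looks like for a discontinuous payout, the sketch below estimates the delta of a cash-or-nothing digital call under Black-Scholes dynamics and compares it with the closed-form value. It is a textbook construction shown for orientation and does not reproduce the cited paper's method. All names and parameter values are assumptions.

```python
import math
import random

def digital_delta_lr(s0, strike, rate, vol, maturity, n_paths=200_000, seed=0):
    """Likelihood-ratio estimate of d(price)/d(s0) for a digital call paying 1 if S_T > K."""
    rng = random.Random(seed)
    discount = math.exp(-rate * maturity)
    drift = (rate - 0.5 * vol**2) * maturity
    scale = vol * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(drift + scale * z)
        payoff = 1.0 if s_t > strike else 0.0
        score = z / (s0 * scale)        # d ln p(S_T; s0) / d s0 for a lognormal S_T
        total += discount * payoff * score
    return total / n_paths

def digital_delta_exact(s0, strike, rate, vol, maturity):
    """Closed form: exp(-rT) * phi(d2) / (s0 * vol * sqrt(T))."""
    scale = vol * math.sqrt(maturity)
    d2 = (math.log(s0 / strike) + (rate - 0.5 * vol**2) * maturity) / scale
    pdf = math.exp(-0.5 * d2**2) / math.sqrt(2.0 * math.pi)
    return math.exp(-rate * maturity) * pdf / (s0 * scale)

estimate = digital_delta_lr(100.0, 100.0, 0.02, 0.2, 1.0)
exact = digital_delta_exact(100.0, 100.0, 0.02, 0.2, 1.0)
print(estimate, exact)
```

The estimator differentiates the density of the terminal price rather than the payoff, so the jump in the payoff causes no bias. A pathwise derivative of the indicator would be zero almost everywhere.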

4.5 ODE-Constrained and "Likelihood-Free" Inference

DALI-style expansions combined with Gaussian filtering or adjoint-state techniques yield efficient estimators for gradients and Hessians in inverse problems defined by dynamical systems (ODEs), enabling Newton-type optimization and Hamiltonian MCMC with orders-of-magnitude speedups (Melicher et al., 2016, Kersting et al., 2020).

5. Comparison to Other Approaches and Limitations

Fisher Matrix

The Fisher approach is efficient and valuable for approximately Gaussian or weakly non-Gaussian posteriors and for experimental design. It fails in the presence of:

Strong degeneracies
Highly non-linear parameter dependence
Regions with heavy tails or banana-shaped credible regions

DALI generalizes the Fisher matrix by including higher-order derivative tensors, remedying these limitations and providing a diagnostic for Fisher approximation breakdown (Heavens, 2016, Ryan et al., 2022).

MCMC and Sampling-Based Inference

Full MCMC remains the gold standard for sampling from non-Gaussian posteriors, but is orders of magnitude slower. DALI typically achieves $\sim 10^3 \times$ reduction in computational cost with credible-region errors at the 10% level or better relative to MCMC in the studied scenarios (Sellentin, 2015, Souza et al., 19 Oct 2025).

Limitations

DALI is a controlled, local expansion. Its accuracy degrades far in the wings of highly non-Gaussian posteriors or if the local higher-order expansion breaks down.
For very high-dimensional parameter spaces ($N_p \gtrsim 20$–$100$), storage and computational costs for high-order tensors become prohibitive (Heavens, 2016).
Analytic marginalization is generally not available: credible intervals require grid integration or MCMC over the DALI surrogate (Sellentin et al., 2014).
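The grid-integration step can be sketched for a single parameter. The surrogate below is a doublet-style exponent for a hypothetical two-point data vector $\mu = (\theta, \theta^2)$ expanded about $\theta_0 = 1$ with noise level `sigma`. The model, the grid limits, and the equal-tailed interval convention are illustrative assumptions.

```python
import math

def doublet_exponent(delta, mu1, mu2, sigma):
    """Minus one half of the summed squared residuals of the second-order model expansion."""
    total = 0.0
    for d1, d2 in zip(mu1, mu2):
        residual = d1 * delta + 0.5 * d2 * delta**2
        total += residual**2
    return -0.5 * total / sigma**2

def credible_interval(log_density, lo, hi, n=5001, level=0.68):
    """Equal-tailed interval from trapezoidal integration on a uniform grid."""
    step = (hi - lo) / (n - 1)
    xs = [lo + i * step for i in range(n)]
    ps = [math.exp(log_density(x)) for x in xs]
    cdf = [0.0]
    for i in range(1, n):
        cdf.append(cdf[-1] + 0.5 * (ps[i] + ps[i - 1]) * step)
    norm = cdf[-1]                      # normalization of the surrogate on the grid
    tail = 0.5 * (1.0 - level)
    lower = next(x for x, c in zip(xs, cdf) if c / norm >= tail)
    upper = next(x for x, c in zip(xs, cdf) if c / norm >= 1.0 - tail)
    return lower, upper

# Model derivatives of mu = (theta, theta**2) at theta0 = 1 (hypothetical example).
mu1, mu2, sigma = (1.0, 2.0), (0.0, 2.0), 0.5
lower, upper = credible_interval(lambda d: doublet_exponent(d, mu1, mu2, sigma), -3.0, 2.0)
fisher_sigma = sigma / math.sqrt(sum(d * d for d in mu1))
print(f"DALI 68% interval in delta: [{lower:.3f}, {upper:.3f}]; Fisher: +/- {fisher_sigma:.3f}")
```

The resulting interval is asymmetric about the peak, which a symmetric Fisher error bar cannot represent. In more than a few dimensions the grid becomes impractical and sampling the surrogate is the usual alternative.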

6. Software Implementations and Practical Workflow

Several Python packages implement DALI:

GWDALI: For gravitational-wave data, supporting "singlet," "doublet," and "triplet" orders; utilizes autodiff in JAX, links to the LAL suite and Bilby for waveform and sampling (Souza et al., 2023, Souza et al., 19 Oct 2025).
DerivKit: General-purpose, includes robust finite-difference and derivative-assembly utilities, bridging Fisher and DALI approaches for arbitrary (including black-box) forward models (Šarčević et al., 8 Feb 2026).
LNAsellentin/DALI: Reference implementation for cosmology and likelihoods with parameter-dependent covariance (Sellentin, 2015).

Typical workflow involves:

Identifying or computing the fiducial (best-fit) parameter vector.
Evaluating all required derivatives (possibly by auto-diff, finite differences, or polynomial fitting).
Assembling derivative tensors into the DALI expansion.
Using the resultant approximate log-likelihood analytically or as a surrogate in MCMC, grid, or variational samplers.
Comparing Fisher, DALI, and (if feasible) full posterior contours to assess the validity of the approximation (Sellentin et al., 2014, Šarčević et al., 8 Feb 2026, Souza et al., 2023).
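The derivative and assembly steps of this workflow can be sketched for a two-parameter problem. The sketch assumes Gaussian noise with fixed standard deviation `SIGMA`, data equal to the fiducial prediction, and a hypothetical forward model $\mu_k = a\,e^{b x_k}$. It assembles $F_{ij} = \mu_{,i}\cdot\mu_{,j}/\sigma^2$, $S_{ijk} = \mu_{,ij}\cdot\mu_{,k}/\sigma^2$ and $Q_{ijkl} = \mu_{,ij}\cdot\mu_{,kl}/\sigma^2$ and evaluates $-\frac{1}{2}F\Delta^2 - \frac{1}{2}S\Delta^3 - \frac{1}{8}Q\Delta^4$. None of the names correspond to GWDALI or DerivKit functions.

```python
import math
from itertools import product

X = (0.0, 0.5, 1.0)     # hypothetical sampling points
SIGMA = 0.1             # hypothetical, parameter-independent noise level

def model(theta):
    """Hypothetical forward model: mu_k = a * exp(b * x_k)."""
    a, b = theta
    return [a * math.exp(b * x) for x in X]

def shifted(theta, steps):
    return [t + s for t, s in zip(theta, steps)]

def first_derivs(theta, h=1e-4):
    """d mu_k / d theta_i by central differences; indexed as result[i][k]."""
    n, out = len(theta), []
    for i in range(n):
        up = model(shifted(theta, [h if j == i else 0.0 for j in range(n)]))
        down = model(shifted(theta, [-h if j == i else 0.0 for j in range(n)]))
        out.append([(p - m) / (2.0 * h) for p, m in zip(up, down)])
    return out

def second_derivs(theta, h=1e-3):
    """d^2 mu_k / d theta_i d theta_j by a four-point stencil; indexed as result[i][j][k]."""
    n = len(theta)
    out = [[None] * n for _ in range(n)]
    for i, j in product(range(n), repeat=2):
        def at(si, sj):
            steps = [0.0] * n
            steps[i] += si * h
            steps[j] += sj * h          # for i == j this gives a step of 2h or 0
            return model(shifted(theta, steps))
        pp, pm, mp, mm = at(1, 1), at(1, -1), at(-1, 1), at(-1, -1)
        out[i][j] = [(a - b - c + d) / (4.0 * h * h) for a, b, c, d in zip(pp, pm, mp, mm)]
    return out

def weighted_dot(u, v):
    return sum(a * b for a, b in zip(u, v)) / SIGMA**2

def assemble(theta0):
    """Build the F, S, Q tensors as dictionaries keyed by index tuples."""
    d1, d2 = first_derivs(theta0), second_derivs(theta0)
    idx = range(len(theta0))
    F = {(i, j): weighted_dot(d1[i], d1[j]) for i, j in product(idx, repeat=2)}
    S = {(i, j, k): weighted_dot(d2[i][j], d1[k]) for i, j, k in product(idx, repeat=3)}
    Q = {(i, j, k, l): weighted_dot(d2[i][j], d2[k][l]) for i, j, k, l in product(idx, repeat=4)}
    return F, S, Q

def contract(tensor, delta):
    """Sum of tensor[idx] times the product of delta over the indices in idx."""
    return sum(v * math.prod(delta[i] for i in idx) for idx, v in tensor.items())

def dali_log_like(delta, F, S, Q):
    return -0.5 * contract(F, delta) - 0.5 * contract(S, delta) - 0.125 * contract(Q, delta)

theta0 = [1.0, 0.5]                     # fiducial parameters (a, b)
F, S, Q = assemble(theta0)
delta = [0.05, -0.2]                    # offset from the fiducial point
fisher_value = -0.5 * contract(F, delta)
dali_value = dali_log_like(delta, F, S, Q)
print(f"Fisher ln L = {fisher_value:.3f}, doublet DALI ln L = {dali_value:.3f}")
```

The returned function can then be handed to a grid evaluator or sampler in place of the full likelihood. The dense dictionaries are chosen for readability, and a production code would exploit the index symmetries.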

7. Theoretical Structure and Generalizations

DALI is linked to cumulant expansions (e.g., Gram–Charlier), the Laplace approximation, and partition function formalisms. Cumulants of the posterior are obtained by derivatives of the log-partition function with respect to sources, and the DALI expansion builds a Gram–Charlier-like series for the posterior, with the Fisher matrix as leading order, and subsequent tensors encoding skewness, kurtosis, and higher moments (Röver et al., 2022, Sellentin et al., 2014).
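In standard notation (the symbols $Z$ and $J$ are introduced here for illustration and may differ from those of the cited papers), the partition function and the cumulants it generates read

$$Z[J] = \int \mathrm{d}^{N_p}\theta \; L(\theta)\, \exp\left(J_i \theta_i\right), \qquad \kappa_{i_1 \cdots i_n} = \left.\frac{\partial^n \ln Z[J]}{\partial J_{i_1} \cdots \partial J_{i_n}}\right|_{J=0}.$$

The first cumulant is the posterior mean and the second is the parameter covariance. For an exactly Gaussian likelihood with Fisher matrix $F$ and peak $\hat{\theta}$, $\ln Z[J] = \mathrm{const} + J_i \hat{\theta}_i + \frac{1}{2} J_i (F^{-1})_{ij} J_j$, so all cumulants beyond the second vanish. Non-zero higher cumulants therefore measure the non-Gaussianity that the higher-order tensors are meant to capture.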

In large-sample or asymptotic regimes, DALI connects to the implicit function theorem and higher-order expansions of the maximum-likelihood estimator, with each order linked to the corresponding moments of the score function (Lejay et al., 2022).

References

(Heavens, 2016) Generalisations of Fisher Matrices
(Sellentin et al., 2014) Breaking the spell of Gaussianity: forecasting with higher order Fisher matrices
(Sellentin, 2015) A fast, always positive definite and normalizable approximation of non-Gaussian likelihoods
(Röver et al., 2022) Partition function approach to non-Gaussian likelihoods: Formalism and expansions for weakly non-Gaussian cosmological inference
(Ryan et al., 2022) Beyond Fisher Forecasting for Cosmology
(Glasserman et al., 4 Dec 2025) Differential ML with a Difference
(Šarčević et al., 8 Feb 2026) DerivKit: stable numerical derivatives bridging Fisher forecasts and MCMC
(Souza et al., 2023) GWDALI: A Fisher-matrix based software for gravitational wave parameter-estimation beyond Gaussian approximation
(Souza et al., 19 Oct 2025) On the use of the Derivative Approximation for Likelihoods for Gravitational Wave Inference
(Wang et al., 2022) Extending the Fisher Information Matrix in Gravitational-wave Data Analysis
(Brümmer, 2014) The EM algorithm and the Laplace Approximation
(Melicher et al., 2016) Fast derivatives of likelihood functionals for ODE based models using adjoint-state method
(Kersting et al., 2020) Differentiable Likelihoods for Fast Inversion of 'Likelihood-Free' Dynamical Systems
(Lejay et al., 2022) Beyond the delta method