fANOVA Sensitivity Analysis
- Functional ANOVA (fANOVA) sensitivity analysis decomposes model output variance into main and interaction effects using orthogonal function components.
- It employs variance-based Sobol indices and Shapley values to measure input contributions, ensuring analytical tractability in high-dimensional and complex settings.
- Integration with kernel methods and Gaussian processes enables efficient sensitivity analysis for black-box and functional-output models.
Sensitivity analysis via the functional ANOVA (fANOVA) decomposition provides a principled framework for quantifying how input variables and their interactions contribute to the output variability of complex models. The fANOVA approach rigorously decomposes output variance, supports both classical and modern global sensitivity metrics, and offers tractable methodology for high-dimensional, black-box, or functional-output models when combined with kernel and Gaussian process techniques.
1. Mathematical Foundation: The fANOVA Decomposition
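The fANOVA (also known as Hoeffding or Sobol'–Hoeffding) decomposition expresses any square-integrable function $f$ of inputs $X = (X_1, \dots, X_d)$ with product law $\mu = \mu_1 \otimes \cdots \otimes \mu_d$ as a sum of orthogonal components:
$$f(x) = \sum_{u \subseteq D} f_u(x_u) = f_\emptyset + \sum_{i=1}^{d} f_i(x_i) + \sum_{i<j} f_{ij}(x_i, x_j) + \cdots + f_{1 \cdots d}(x_1, \dots, x_d),$$
where $D = \{1, \dots, d\}$, $f_\emptyset = \mathbb{E}[f(X)]$, and the terms correspond to main effects ($|u| = 1$) and higher-order interactions ($|u| \ge 2$). Each component satisfies strict centering and orthogonality: $\int f_u(x_u)\, d\mu_i(x_i) = 0$ for all $i \in u$ and $\mathbb{E}[f_u(X_u) f_v(X_v)] = 0$ for $u \neq v$. This construction yields a variance decomposition:
$$\mathrm{Var}(f(X)) = \sum_{\emptyset \neq u \subseteq D} \mathrm{Var}(f_u(X_u)),$$
with Sobol indices $S_u = \mathrm{Var}(f_u(X_u)) / \mathrm{Var}(f(X))$ quantifying the proportion of output variance attributable to each effect or interaction (Mohammadi et al., 20 Aug 2025, Ginsbourger et al., 2014, Huet et al., 2017). When the input law is not unique, e.g. under a mixture model, existence and uniqueness of the ANOVA components hold within cores of equivalent measures, and mixture expansions can be defined explicitly (Borgonovo et al., 2018).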
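As a minimal illustration of the decomposition and the resulting variance split, the sketch below works on a two-input function tabulated on a grid, treating the grid points as independent and equally weighted. The function names (`fanova_2d`, `sobol_from_components`) and the toy function are assumptions made for this example, not taken from the cited papers.

```python
import numpy as np

def fanova_2d(F):
    """Decompose a table F[i, j] = f(x1_i, x2_j) into fANOVA components.

    Assumes independent inputs whose laws put equal weight on each grid point,
    so that expectations reduce to plain averages over rows and columns.
    """
    f0 = F.mean()                                   # constant term E[f]
    f1 = F.mean(axis=1) - f0                        # main effect of x1
    f2 = F.mean(axis=0) - f0                        # main effect of x2
    f12 = F - f0 - f1[:, None] - f2[None, :]        # pure interaction
    return f0, f1, f2, f12

def sobol_from_components(f1, f2, f12):
    """Turn centered components into variances and Sobol indices."""
    v1, v2, v12 = np.mean(f1**2), np.mean(f2**2), np.mean(f12**2)
    total = v1 + v2 + v12
    return {"S1": v1 / total, "S2": v2 / total, "S12": v12 / total}, total

# Toy function (hypothetical): f(x1, x2) = x1 + 2*x2 + x1*x2 on [-1, 1]^2.
x1 = np.linspace(-1.0, 1.0, 201)
x2 = np.linspace(-1.0, 1.0, 201)
F = x1[:, None] + 2.0 * x2[None, :] + x1[:, None] * x2[None, :]

f0, f1, f2, f12 = fanova_2d(F)
indices, total_var = sobol_from_components(f1, f2, f12)
print(indices)
```

Because the components are orthogonal under the grid measure, the three variances add up to the variance of the table itself, which is the discrete counterpart of the variance decomposition stated above.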
2. Variance-based Sensitivity Indices: Sobol' and Generalizations
Sobol' sensitivity indices are standard metrics for quantifying the importance of input variables and their interactions. The first-order Sobol index $S_i$ measures the effect of varying input $x_i$ alone; higher-order indices capture joint and interaction effects. These indices are invariant to monotonic and orthogonal transformations of the input space when orthogonality conditions are enforced (Mohammadi et al., 20 Aug 2025, Tan et al., 15 Jun 2025). Advanced frameworks allow the definition of other, possibly non-variance-based, sensitivity indices via user-defined importance measures, such as quantiles, entropy, or exceedance probabilities (Mazo, 2024). In this context, main and interaction effects still admit systematic factorial decompositions beyond variance.
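When no structured surrogate is available, first-order and total Sobol indices are commonly estimated by Monte Carlo. The sketch below uses a pick-freeze design with the Saltelli-type first-order estimator and the Jansen total-effect estimator; it assumes independent inputs, and `toy_model`, `standard_normal`, and `sobol_pick_freeze` are hypothetical names introduced only for this illustration.

```python
import numpy as np

def sobol_pick_freeze(f, sampler, d, n, rng):
    """Estimate first-order and total Sobol indices with a pick-freeze design.

    f       : vectorized model mapping an (n, d) array to an (n,) array
    sampler : callable (rng, n, d) -> (n, d) array of independent inputs
    """
    A = sampler(rng, n, d)
    B = sampler(rng, n, d)
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))
    first = np.empty(d)
    total = np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]          # column i from B, all other columns from A
        fAB = f(AB)
        first[i] = np.mean(fB * (fAB - fA)) / var        # Var(E[f | x_i]) / Var(f)
        total[i] = 0.5 * np.mean((fA - fAB) ** 2) / var  # E[Var(f | x_-i)] / Var(f)
    return first, total

def toy_model(X):
    # Hypothetical test function with one interaction term.
    return X[:, 0] + 2.0 * X[:, 1] + X[:, 0] * X[:, 1]

def standard_normal(rng, n, d):
    return rng.standard_normal((n, d))

rng = np.random.default_rng(0)
first, total = sobol_pick_freeze(toy_model, standard_normal, d=2, n=200_000, rng=rng)
print(first, total)
```

For this toy model with independent standard normal inputs, the exact values are $S_1 = 1/6$, $S_2 = 2/3$ and total indices $1/3$ and $5/6$, so the gap between total and first-order indices equals the interaction share $1/6$.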
3. Shapley Values and Connection to Sensitivity Analysis
Shapley values, originating from cooperative game theory, provide a principled attribution scheme for assigning importance to each input variable by averaging over all possible feature coalitions. In the fANOVA context, the Shapley value for feature $i$ is
$$\phi_i = \sum_{u \subseteq \{1, \dots, d\},\; u \ni i} \frac{\sigma_u^2}{|u|},$$
where $\sigma_u^2 = \mathrm{Var}(f_u(X_u))$ are the Möbius-inverted pure interaction variances (Mohammadi et al., 20 Aug 2025, Mazo, 2024). Shapley indices enjoy efficiency ($\sum_{i=1}^{d} \phi_i = \mathrm{Var}(f(X))$), symmetry, and the null-player property. Unlike Sobol indices, Shapley values systematically average over all permutations of variable inclusion, yielding a fair allocation of variance contributions even under dependent or non-identically distributed inputs. Factorial and weighted versions generalize this to arbitrary user importance measures and higher-order interactions.
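The equal-split rule behind this formula is easy to check on a small example: each pure interaction variance is divided evenly among the variables that take part in it. The sketch below assumes the variance components are already available as a dictionary keyed by index tuples; the function name and the numbers are illustrative only.

```python
def shapley_from_variances(var_components, d):
    """Shapley effects from pure fANOVA variances.

    var_components maps a tuple of variable indices u to Var(f_u);
    each variance is shared equally among the |u| members of u.
    """
    phi = [0.0] * d
    for u, variance in var_components.items():
        for i in u:
            phi[i] += variance / len(u)
    return phi

# Hypothetical components: two main effects, one pairwise interaction,
# and a third variable that appears in no component (a null player).
components = {(0,): 1.0, (1,): 4.0, (0, 1): 1.0}
phi = shapley_from_variances(components, d=3)
print(phi)  # [1.5, 4.5, 0.0]
```

The attributions sum to the total variance (6.0 here), which is the efficiency property, and the unused third variable receives zero, which is the null-player property.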
4. Kernel, RKHS, and Gaussian Process Frameworks for fANOVA
Kernel-based approaches enable nonparametric implementation of fANOVA by constructing reproducing kernel Hilbert spaces (RKHS) that mirror the Sobol–Hoeffding decomposition. Specific RKHS are constructed with direct sum decompositions over all subsets of variables, enabling simultaneous sparsity and smoothness (Huet et al., 2017). The KANOVA decomposition of kernels provides a full tensor product projection of covariance functions, enabling Gaussian processes (GPs) and kriging models to emulate arbitrary main-plus-interaction structures (Ginsbourger et al., 2014). Orthogonal additive kernels and GP priors allow one to enforce zero-mean and mutual orthogonality of ANOVA components, ensuring analytic tractability of effect size estimations even for high-dimensional or functional-output models (Tan et al., 15 Jun 2025).
A key practical advance is the FANOVA-GP prior, which yields a GP whose ANOVA components are mutually orthogonal under both the prior and the posterior distribution. This structure allows closed-form computation of effect variances, expected conditional variances, and Sobol indices. The KANOVA machinery further allows projection kernels to be used to design GPs with tailored sparsity and independence across effects.
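One known way to obtain zero-mean, mutually orthogonal components is to center a one-dimensional base kernel with respect to the input measure and then take the product of $(1 + k_0)$ over dimensions. The sketch below does this numerically for uniform inputs on $[0, 1]$ with an RBF base kernel and Gauss-Legendre quadrature; it is an assumption-laden illustration (the kernel choice, length scale, and function names are all hypothetical), not the exact construction of any of the cited papers.

```python
import numpy as np

# Gauss-Legendre rule mapped from [-1, 1] to [0, 1] (uniform input measure).
nodes, weights = np.polynomial.legendre.leggauss(64)
nodes = 0.5 * (nodes + 1.0)
weights = 0.5 * weights

def rbf(x, y, ell):
    """Squared-exponential base kernel on 1-D arrays x and y."""
    return np.exp(-0.5 * ((x[:, None] - y[None, :]) / ell) ** 2)

def centered_kernel(x, y, ell=0.3):
    """Base kernel minus its projection onto constants.

    k0(x, y) = k(x, y) - [int k(x, s) ds] [int k(y, s) ds] / [int int k],
    so that int k0(x, s) ds = 0 for every x (up to quadrature error).
    """
    kx = rbf(x, nodes, ell) @ weights
    ky = rbf(y, nodes, ell) @ weights
    kk = weights @ rbf(nodes, nodes, ell) @ weights
    return rbf(x, y, ell) - np.outer(kx, ky) / kk

def fanova_kernel(X, Y, ell=0.3):
    """Product over dimensions of (1 + k0): one term per subset of inputs."""
    K = np.ones((X.shape[0], Y.shape[0]))
    for j in range(X.shape[1]):
        K *= 1.0 + centered_kernel(X[:, j], Y[:, j], ell)
    return K

X = np.random.default_rng(0).random((5, 3))
K = fanova_kernel(X, X)
print(K.shape)
```

Expanding the product gives one kernel term per subset of variables, which is what lets a GP with this covariance carry the main-plus-interaction structure described above.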
5. Shapley Attribution in Quadratic Time for FANOVA GPs
The standard computation of Shapley values involves a sum over all $2^{d-1}$ coalitions per variable, leading to exponential scaling in the input dimension $d$. For FANOVA GPs, the covariance structure and Möbius closure permit an exact, closed-form computation of Shapley indices via a recursion based on Newton's identities, yielding quadratic time complexity in input dimension (Mohammadi et al., 20 Aug 2025). The requisite sums over interaction sets for each variable reduce, via the elementary symmetric polynomial recursion, to an overall cost of $O(d^2)$ operations, enabling practical sensitivity analysis in moderate to large $d$ regimes for both global (variance-based) and local (instancewise) explanations. The method provides not only attributions for the predictive mean but also exact uncertainty quantification for local Shapley values, i.e. the mean and variance of the stochastic cooperative game.
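To see why a product structure removes the exponential cost, suppose (as an illustrative assumption) that the interaction variances factorize as $\sigma_u^2 = \prod_{j \in u} z_j$ for per-variable scalars $z_j \ge 0$. Then $\phi_i = z_i \sum_{k=0}^{d-1} e_k(z_{-i}) / (k+1)$, where $e_k$ is the $k$-th elementary symmetric polynomial of the remaining variables. The sketch below computes all $e_k$ once with the standard $O(d^2)$ recurrence and removes one variable at a time in $O(d)$; it swaps in this direct recurrence rather than Newton's identities, and it is not the algorithm of the cited paper.

```python
import numpy as np
from itertools import combinations

def elementary_symmetric(z):
    """Return e_0, ..., e_d of the entries of z using the O(d^2) recurrence."""
    e = np.zeros(len(z) + 1)
    e[0] = 1.0
    for m, zj in enumerate(z, start=1):
        # Right-hand side is evaluated before the in-place add,
        # so this implements e_k <- e_k + zj * e_{k-1} for k = 1..m.
        e[1:m + 1] += zj * e[0:m]
    return e

def shapley_product_variances(z):
    """Shapley effects when Var(f_u) = prod_{j in u} z_j, in O(d^2) total."""
    z = np.asarray(z, dtype=float)
    d = len(z)
    e = elementary_symmetric(z)
    inv_size = 1.0 / np.arange(1, d + 1)        # 1 / (k + 1) for k = 0..d-1
    phi = np.empty(d)
    for i in range(d):
        # Deflation: e_k(z without z_i) = e_k(z) - z_i * e_{k-1}(z without z_i).
        e_minus = np.empty(d)
        e_minus[0] = 1.0
        for k in range(1, d):
            e_minus[k] = e[k] - z[i] * e_minus[k - 1]
        phi[i] = z[i] * (inv_size @ e_minus)
    return phi

def shapley_brute_force(z):
    """Reference implementation summing over all subsets (exponential cost)."""
    d = len(z)
    phi = np.zeros(d)
    for r in range(1, d + 1):
        for u in combinations(range(d), r):
            v = np.prod([z[j] for j in u])
            for i in u:
                phi[i] += v / r
    return phi

z = [0.5, 1.2, 0.3, 2.0, 0.8]
print(shapley_product_variances(z))
print(shapley_brute_force(z))
```

The deflation step is a forward substitution and can lose accuracy when some $z_i$ is very large relative to the others, so a production implementation would need a more careful recursion than this sketch.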
6. Extensions: Functional Outputs, Mixtures, and Generalizations
Functional-output models require sensitivity analysis that captures how input variables affect not just scalar but functional (curve, surface) outputs. The FOAGP framework exploits separable additive GP priors to realize functional-output analogs of the Sobol–Hoeffding decomposition, deriving both local (at each output location) and global (expected conditional variance) Sobol indices in analytic closed form (Tan et al., 15 Jun 2025). All indices are computed without additional Monte Carlo sampling, exploiting kernel and Hadamard algebra for scaling. Under mixtures of input distributions, the ANOVA expansion and sensitivity indices are reformulated via structural variance averaging and explicit variance-of-mean corrections, maintaining orthogonality and dimension distribution consistency (Borgonovo et al., 2018).
Recent generalizations define sensitivity indices with respect to arbitrary user-chosen importance measures, not restricted to variance, and extend the factorial-Sobol (Möbius) and Shapley decompositions accordingly. This reframes and unifies classical and novel sensitivity metrics, and provides systematic definitions for main and multiway interaction effects under arbitrary input dependence and measure (Mazo, 2024).
7. Practical and Computational Guidance
Modern fANOVA-based sensitivity analysis in statistical emulation follows a sequence of steps: (1) model fitting via kernel-based or Gaussian process surrogates with built-in ANOVA structure, (2) effect extraction and orthogonal variance decomposition, (3) sensitivity index computation using analytic trace formulas or, for Shapley values, efficient recursions, (4) result interpretation, including exhaustive identification of dominant main and interaction effects, and (5) uncertainty quantification where the predictive model is stochastic (Mohammadi et al., 20 Aug 2025, Tan et al., 15 Jun 2025, Ginsbourger et al., 2014). High-dimensional applications are facilitated by projection or kernel design to restrict effect sets, and by leveraging sparsity patterns. In simulation studies, GP-based fANOVA surrogates yield near-true Sobol and Shapley indices with modest data, outperforming classical basis expansions, particularly under nonuniform or unknown input distributions (Tan et al., 15 Jun 2025).
Practical implementation recommendations include prescaling inputs, exploiting Kronecker structure for gridded functional outputs, and adapting kernels to the physics (e.g., periodic kernels for angular variables). For arbitrary user importance, estimation proceeds via conditional resampling and pick–freeze or nested Monte Carlo schemes, with computational acceleration via screening, sparse designs, or randomized weighting.
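Two of these recommendations can be sketched in a few lines: rescaling inputs to the unit cube and using a periodic kernel for an angular input. The helper names, bounds, and length scale below are hypothetical, and the kernel is the standard exp-sine-squared form rather than anything prescribed by the cited papers.

```python
import numpy as np

def minmax_scale(X, lower, upper):
    """Map each input column from [lower, upper] to [0, 1]."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return (X - lower) / (upper - lower)

def periodic_kernel(a, b, period=2.0 * np.pi, ell=1.0):
    """Exp-sine-squared kernel: depends on a - b only modulo the period."""
    diff = a[:, None] - b[None, :]
    return np.exp(-2.0 * np.sin(np.pi * diff / period) ** 2 / ell**2)

# Hypothetical design: a length in [10, 50] and an angle in [0, 2*pi].
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(10.0, 50.0, 8), rng.uniform(0.0, 2.0 * np.pi, 8)])
X_scaled = minmax_scale(X, lower=[10.0, 0.0], upper=[50.0, 2.0 * np.pi])

theta = X[:, 1]
K_angle = periodic_kernel(theta, theta)
print(X_scaled.min(), X_scaled.max(), K_angle.shape)
```

With this kernel, angles that differ by a full turn are treated as identical inputs, so the surrogate does not see an artificial discontinuity at the wrap-around point.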
For rigorous development and detailed algorithms, see "Exact Shapley Attributions in Quadratic-time for FANOVA Gaussian Processes" (Mohammadi et al., 20 Aug 2025), "Effect Decomposition of Functional-Output Computer Experiments via Orthogonal Additive Gaussian Processes" (Tan et al., 15 Jun 2025), "A new paradigm for global sensitivity analysis" (Mazo, 2024), "Metamodel construction for sensitivity analysis" (Huet et al., 2017), "On ANOVA decompositions of kernels and Gaussian random field paths" (Ginsbourger et al., 2014), and "Functional ANOVA with Multiple Distributions" (Borgonovo et al., 2018).