DD-GPCE-Kriging Surrogates
- DD-GPCE-Kriging surrogates are advanced metamodels that combine dimensionally decomposed polynomial chaos and Gaussian Process Regression for efficient uncertainty quantification in high-dimensional systems.
- They employ dimensional truncation and measure-consistent orthonormal polynomial whitening to manage computational complexity with potentially dependent random inputs.
- Their applications span surrogate-assisted optimization and multifidelity simulations, achieving significant accuracy improvements and dramatic computational savings.
Dimensionally Decomposed Generalized Polynomial Chaos Expansion–Kriging (DD-GPCE-Kriging) surrogates constitute an advanced family of metamodels for efficiently and accurately approximating nonlinear and potentially nonsmooth functionals in high-dimensional stochastic systems. These surrogates merge the tractability and orthogonality properties of dimensionally decomposed generalized polynomial chaos expansions (DD-GPCE) with the nonparametric flexibility and uncertainty quantification capabilities of Gaussian-process (Kriging) regression. DD-GPCE-Kriging surrogates are routinely used in stochastic simulation, reliability analysis, multifidelity inference, and surrogate-assisted evolutionary optimization, particularly where high dimensionality and dependent random inputs render classical polynomial or GPR models infeasible or inaccurate (Lee et al., 2022).
1. DD-GPCE Construction and Dimensional Decomposition
The DD-GPCE framework generalizes the classical polynomial chaos expansion (PCE) to numerical models with arbitrary dimension $N$ and possibly dependent random inputs $\mathbf{X} = (X_1, \dots, X_N)$ distributed with joint density $f_{\mathbf{X}}(\mathbf{x})$. A square-integrable output $y(\mathbf{X})$ admits a PCE of the form

$$y(\mathbf{X}) = \sum_{\mathbf{j} \in \mathbb{N}_0^N} C_{\mathbf{j}}\,\Psi_{\mathbf{j}}(\mathbf{X}),$$

where the $\Psi_{\mathbf{j}}$ are polynomials indexed by multi-indices $\mathbf{j} = (j_1, \dots, j_N)$ of total order $|\mathbf{j}| = j_1 + \cdots + j_N$, orthonormal under $f_{\mathbf{X}}$, i.e. $\mathbb{E}[\Psi_{\mathbf{i}}(\mathbf{X})\,\Psi_{\mathbf{j}}(\mathbf{X})] = \delta_{\mathbf{i}\mathbf{j}}$. In high dimensions, the full set of basis functions grows combinatorially with $N$, posing computational challenges.
DD-GPCE introduces dimensional truncation: only basis polynomials whose support involves at most $S$ input variables ($S \ll N$) and whose total degree satisfies $|\mathbf{j}| \le m$ are retained. The reduced multi-index set $\mathcal{J}_{S,m}$ has cardinality

$$L_{N,S,m} = 1 + \sum_{s=1}^{S} \binom{N}{s} \binom{m}{s},$$

counting, for each choice of $s$ active variables, the multi-indices with $s$ positive entries summing to at most $m$. This yields the surrogate

$$y_{S,m}(\mathbf{X}) = \sum_{\mathbf{j} \in \mathcal{J}_{S,m}} C_{\mathbf{j}}\,\Psi_{\mathbf{j}}(\mathbf{X}),$$

where each coefficient $C_{\mathbf{j}}$ is solved via moment inversion, which by orthonormality reduces to

$$C_{\mathbf{j}} = \mathbb{E}\!\left[y(\mathbf{X})\,\Psi_{\mathbf{j}}(\mathbf{X})\right].$$
Measure-consistent orthonormal polynomials are constructed using a whitening procedure: assemble the monomial vector $\mathbf{P}(\mathbf{x})$ indexed by $\mathcal{J}_{S,m}$, compute the Gram matrix $\mathbf{G} = \mathbb{E}[\mathbf{P}(\mathbf{X})\,\mathbf{P}^{\top}(\mathbf{X})]$, factor $\mathbf{G} = \mathbf{Q}\mathbf{Q}^{\top}$ via Cholesky decomposition, set $\mathbf{W} = \mathbf{Q}^{-1}$, and define $\boldsymbol{\Psi}(\mathbf{x}) = \mathbf{W}\,\mathbf{P}(\mathbf{x})$, so that $\mathbb{E}[\boldsymbol{\Psi}(\mathbf{X})\,\boldsymbol{\Psi}^{\top}(\mathbf{X})] = \mathbf{I}$ (Lee et al., 2022).
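The whitening procedure above can be sketched numerically: estimate the Gram matrix of a monomial basis from samples, Cholesky-factor it, and apply the inverse factor. The sketch below is a minimal, univariate-terms-only illustration with sample-moment estimates of $\mathbf{G}$; the basis layout and function names are illustrative, not the authors' implementation.

```python
import numpy as np

def whitened_basis(samples, degree):
    """Build measure-consistent orthonormal polynomials by whitening monomials.

    samples : (n, N) array of draws from the (possibly dependent) input density
    degree  : maximum degree of the univariate monomials used in this sketch
    """
    n, N = samples.shape
    # Assemble the monomial vector P(x): constant term + univariate powers
    cols = [np.ones(n)]
    for k in range(N):
        for d in range(1, degree + 1):
            cols.append(samples[:, k] ** d)
    P = np.column_stack(cols)        # rows are P(x^(i)), shape (n, L)
    # Sample estimate of the Gram (moment) matrix G = E[P P^T]
    G = P.T @ P / n
    # Cholesky factor G = Q Q^T; whitening matrix W = Q^{-1}
    Q = np.linalg.cholesky(G)
    W = np.linalg.inv(Q)
    Psi = P @ W.T                    # rows are Psi(x^(i)) = W P(x^(i))
    return Psi, W

rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 3))
Psi, W = whitened_basis(X, degree=2)
gram = Psi.T @ Psi / len(Psi)        # identity w.r.t. the sample measure
```

Because $\mathbf{W}$ is built from the same sample Gram matrix, the whitened basis is exactly orthonormal with respect to the empirical measure, which is the property the DD-GPCE construction relies on.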
2. Gaussian Process (Kriging) Foundation and Universal Kriging Formulation
The Kriging (Gaussian Process Regression) surrogate augments a parametric trend (here, the DD-GPCE model) with a zero-mean Gaussian process to capture nonlinear residuals. The output field is modeled as

$$y(\mathbf{x}) = \mu + Z(\mathbf{x}),$$

where $\mu$ is the mean and $Z(\mathbf{x})$ is a zero-mean Gaussian process with covariance $\mathrm{Cov}[Z(\mathbf{x}), Z(\mathbf{x}')] = \sigma^2 K(\mathbf{x}, \mathbf{x}')$; $K$ is a valid correlation kernel.
In universal Kriging, the mean is no longer assumed constant but is modeled as the trend $\boldsymbol{\Psi}^{\top}(\mathbf{x})\,\mathbf{c}$ from DD-GPCE, leading to:

$$y(\mathbf{x}) = \boldsymbol{\Psi}^{\top}(\mathbf{x})\,\mathbf{c} + Z(\mathbf{x}).$$
The predictor mean for a training set $\{(\mathbf{x}^{(i)}, y^{(i)})\}_{i=1}^{n}$, with observation vector $\mathbf{y}$, basis matrix $\mathbf{F}$ (entries $F_{ij} = \Psi_j(\mathbf{x}^{(i)})$), and correlation matrix $\mathbf{K}$, is evaluated as

$$\hat{y}(\mathbf{x}) = \boldsymbol{\Psi}^{\top}(\mathbf{x})\,\hat{\mathbf{c}} + \mathbf{k}^{\top}(\mathbf{x})\,\mathbf{K}^{-1}\!\left(\mathbf{y} - \mathbf{F}\hat{\mathbf{c}}\right),$$

where $\mathbf{k}(\mathbf{x})$ collects kernel correlations to the training points, and $\hat{\mathbf{c}}$ solves the Kriging (generalized least-squares) normal equations:

$$\hat{\mathbf{c}} = \left(\mathbf{F}^{\top}\mathbf{K}^{-1}\mathbf{F}\right)^{-1}\mathbf{F}^{\top}\mathbf{K}^{-1}\mathbf{y}.$$
The Kriging variance at $\mathbf{x}$ is given by:

$$s^{2}(\mathbf{x}) = \sigma^{2}\left(1 - \mathbf{k}^{\top}(\mathbf{x})\,\mathbf{K}^{-1}\mathbf{k}(\mathbf{x}) + \mathbf{u}^{\top}(\mathbf{x})\left(\mathbf{F}^{\top}\mathbf{K}^{-1}\mathbf{F}\right)^{-1}\mathbf{u}(\mathbf{x})\right),$$

with

$$\mathbf{u}(\mathbf{x}) = \mathbf{F}^{\top}\mathbf{K}^{-1}\mathbf{k}(\mathbf{x}) - \boldsymbol{\Psi}(\mathbf{x}).$$
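The universal-Kriging predictor can be sketched directly from these formulas. This is a minimal NumPy sketch under assumed names: `trend` stands in for the DD-GPCE basis (a linear trend is used for brevity), and the nugget term is a numerical-stability addition not present in the formulas.

```python
import numpy as np

def gaussian_kernel(A, B, theta):
    # Squared-exponential correlation with per-input correlation lengths theta
    d2 = (((A[:, None, :] - B[None, :, :]) / theta) ** 2).sum(-1)
    return np.exp(-0.5 * d2)

def universal_kriging_fit(X, y, trend, theta, nugget=1e-8):
    """Universal Kriging with a supplied trend basis (stand-in for DD-GPCE)."""
    F = trend(X)                                   # (n, L) trend matrix
    K = gaussian_kernel(X, X, theta) + nugget * np.eye(len(X))
    Kinv = np.linalg.inv(K)
    A = F.T @ Kinv @ F
    c = np.linalg.solve(A, F.T @ Kinv @ y)         # GLS normal equations
    resid = Kinv @ (y - F @ c)
    sigma2 = float((y - F @ c) @ resid / len(X))   # process-variance estimate
    def predict(Xs):
        k = gaussian_kernel(X, Xs, theta)          # (n, ns) cross-correlations
        f = trend(Xs)                              # (ns, L)
        mean = f @ c + k.T @ resid                 # trend + residual correction
        u = F.T @ Kinv @ k - f.T                   # (L, ns), the u(x) vectors
        var = sigma2 * (1.0
                        - np.einsum('ij,jk,ki->i', k.T, Kinv, k)
                        + np.einsum('ij,jk,ki->i', u.T, np.linalg.inv(A), u))
        return mean, np.maximum(var, 0.0)
    return predict

rng = np.random.default_rng(0)
X = rng.uniform(size=(30, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1]
trend = lambda Z: np.column_stack([np.ones(len(Z)), Z])   # linear stand-in trend
predict = universal_kriging_fit(X, y, trend, theta=np.array([0.3, 0.3]))
mean, var = predict(X)    # interpolates the training data (up to the nugget)
```

At the training points the predictor reproduces the observations and the variance collapses toward zero, which is the interpolation property that makes Kriging attractive as a residual model on top of the polynomial trend.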
3. Hyperparameter Estimation and Algorithmic Process
The DD-GPCE-Kriging surrogate involves both polynomial coefficients $\mathbf{c}$ and kernel hyperparameters (a correlation length $\theta_k$ for each input), estimated via data-driven procedures:
- Polynomial basis selection and whitening are done via sample or quadrature-based moment estimation.
- Kernel parameters are tuned by leave-one-out cross-validation (LOOCV), minimizing squared prediction errors when each training point is omitted.
- The LOOCV objective can be computed in closed form from $\mathbf{K}^{-1}$ and the residual vector (cf. Bachoc 2013), avoiding a full retraining for each held-out point.
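The closed-form shortcut can be sketched for the zero-trend (simple Kriging) case via Dubrule's identity $e_i = [\mathbf{K}^{-1}\mathbf{y}]_i / [\mathbf{K}^{-1}]_{ii}$; extending it to the universal-Kriging trend follows Bachoc (2013). The grid search and all names below are illustrative.

```python
import numpy as np

def loocv_sse(K, y):
    """Closed-form leave-one-out squared error for zero-mean Kriging:
    the held-out error at point i is [K^{-1} y]_i / [K^{-1}]_{ii}."""
    Kinv = np.linalg.inv(K)
    e = (Kinv @ y) / np.diag(Kinv)
    return float(e @ e)

def kernel(X, theta):
    # Squared-exponential correlation plus a small stabilizing nugget
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1) / theta**2
    return np.exp(-0.5 * d2) + 1e-6 * np.eye(len(X))

# Tune the correlation length by minimizing the LOOCV error over a grid
rng = np.random.default_rng(1)
X = rng.uniform(size=(40, 1))
y = np.sin(4 * X[:, 0])
thetas = np.linspace(0.05, 1.0, 20)
best_theta = min(thetas, key=lambda t: loocv_sse(kernel(X, t), y))
```

Each candidate $\theta$ costs one matrix inversion instead of $n$ refits, which is what makes LOOCV practical as a tuning objective here.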
Algorithmic workflow:
| Step | Operation | Output |
|---|---|---|
| 1 | Monomial index selection ($\mathcal{J}_{S,m}$) | Monomial vector $\mathbf{P}(\mathbf{x})$ |
| 2 | Gram/moment computation ($\mathbf{G} = \mathbb{E}[\mathbf{P}\mathbf{P}^{\top}]$) | Orthonormal-polynomial whitening matrix $\mathbf{W}$ |
| 3 | Polynomial whitening ($\boldsymbol{\Psi} = \mathbf{W}\mathbf{P}$) | Measure-consistent basis $\boldsymbol{\Psi}(\mathbf{x})$ |
| 4 | LOOCV over kernel family | Optimal kernel hyperparameters $\hat{\boldsymbol{\theta}}$ |
| 5 | Universal Kriging fit | Surrogate mean $\hat{y}(\mathbf{x})$, variance $s^{2}(\mathbf{x})$ |
Practical guidelines: The computational cost of DD-GPCE is dominated by assembling and inverting the moment matrix, roughly $O(L_{N,S,m}^{3})$, while the Kriging fit requires an $O(n^{3})$ factorization of the correlation matrix. For high-dimensional problems, keeping $S$ and $m$ small (1–3) is critical for tractability. The number of training samples $n$ should be roughly 2–4 times $L_{N,S,m}$ for reliable fits (Lee et al., 2022).
4. Fusion Methodologies: SMBO and Multifidelity Simulation
DD-GPCE-Kriging is applied in multiple research contexts. In sequential model-based optimization (SMBO), as detailed in symbolic regression benchmarks, surrogate models are iteratively trained and leveraged to guide expected improvement (EI) acquisition, trading off global exploration and local refinement (Zaefferer et al., 2018). Each iteration selects the candidate input configuration predicted to maximize EI under the surrogate, executes an expensive evaluation, augments the training set, and refits the surrogate.
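The EI acquisition at the heart of this loop is standard; below is a minimal sketch for minimization, with illustrative candidate predictions standing in for a real surrogate's output.

```python
import numpy as np
from math import erf, sqrt

def expected_improvement(mean, var, y_best):
    """EI for minimization: E[max(y_best - Y, 0)] with Y ~ N(mean, var)."""
    s = np.sqrt(np.maximum(var, 1e-12))
    z = (y_best - mean) / s
    pdf = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)      # standard normal pdf
    cdf = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2)))  # standard normal cdf
    return (y_best - mean) * cdf + s * pdf

# One SMBO step: score candidate configurations by EI, pick the maximizer
mean = np.array([0.2, -0.1, 0.5])    # surrogate means at candidate points
var = np.array([0.30, 0.05, 0.40])   # surrogate variances at candidate points
pick = int(np.argmax(expected_improvement(mean, var, y_best=0.0)))
```

EI rewards both a low predicted mean (exploitation) and high predictive variance (exploration), which is exactly the tradeoff the text describes.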
In multifidelity conditional value-at-risk (CVaR) estimation, DD-GPCE-Kriging is utilized for both standard Monte Carlo simulation (MCS) and as the bias density generator for multifidelity importance sampling (MFIS). MFIS constructs the biasing density using a cheap low-fidelity model and draws a few high-fidelity samples to achieve unbiased CVaR estimates and dramatic computational savings, particularly when modeling nonsmooth random outputs.
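The division of labor in MFIS can be conveyed with a deliberately simplified sketch: the cheap model screens for tail candidates, and the expensive model re-evaluates only those. The real MFIS estimator additionally reweights each high-fidelity sample by the ratio of nominal to biasing density so the CVaR estimate stays unbiased; that reweighting is omitted here, and all names are illustrative.

```python
import numpy as np

def cvar_mc(y, alpha):
    """Plain Monte Carlo CVaR: mean of the worst (1 - alpha) tail."""
    v = np.quantile(y, alpha)          # value-at-risk estimate
    return float(y[y >= v].mean())     # conditional tail mean

def cvar_screened(x, y_lo, high_fidelity, alpha, n_hi, rng):
    """Low-fidelity outputs y_lo locate the tail; n_hi expensive calls refine it."""
    thresh = np.quantile(y_lo, alpha)
    cand = x[y_lo >= thresh]                       # surrogate-flagged tail inputs
    pick = rng.choice(len(cand), size=min(n_hi, len(cand)), replace=False)
    y_hi = np.array([high_fidelity(xi) for xi in cand[pick]])
    return float(y_hi.mean())                      # tail mean from refined samples

rng = np.random.default_rng(2)
x = rng.normal(size=(2000, 1))
lo = lambda xi: xi[0] ** 2    # stand-in low-fidelity model
hi = lambda xi: xi[0] ** 2    # stand-in high-fidelity model (identical here)
y_lo = np.array([lo(xi) for xi in x])
est = cvar_screened(x, y_lo, hi, alpha=0.9, n_hi=200, rng=rng)
```

The efficiency gain comes from concentrating the expensive evaluations where the surrogate predicts the risk actually lives, instead of spreading them uniformly as plain MCS does.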
5. Performance Benchmarks and Empirical Insights
Numerical results substantiate the efficacy of DD-GPCE-Kriging:
- Accuracy: DD-GPCE-Kriging MCS yields relative errors and mean relative differences (MRD) of 0.2–2.3%, versus roughly 45% for pure DD-GPCE and 4% for PCE-Kriging, on highly oscillatory and nonsmooth test functions. MFIS with DD-GPCE-Kriging achieves errors as low as 0.98–1.15% using an order of magnitude fewer high-fidelity evaluations than standard MCS.
- Scalability: For complex engineering problems with up to 28 random inputs (2D composite laminate), MFIS with DD-GPCE-Kriging showed 104× speedup and accurate estimation (1.15% error) using only 250 simulations, compared to 10,000 for MCS (CPU times: 8.2 h vs. 859 h). In 3D T-joint (20 inputs), similar acceleration and robust accuracy were observed (Lee et al., 2022).
- Optimization: Surrogate-assisted SMBO with Kriging kernels based on linear combinations of tree genotype and phenotype distances consistently outperforms model-free GP and random search on symbolic regression benchmarks; phenotypic distances dominate in data-scarce regimes, genotypic (esp. tree-edit) distances grow in importance as data accumulate (Zaefferer et al., 2018).
6. Kernel Design: Linear Combination of Tree Distances (GP Context)
In genetic programming and symbolic regression, Kriging kernels are constructed via a linear combination of multiple tree distances:
- Phenotypic distance ($d_{\mathrm{phe}}$): the distance between the output vectors of the two programs evaluated on the training inputs, with all numeric constants set to 1 before evaluation.
- Tree-edit distance ($d_{\mathrm{TED}}$): the minimal number of edit operations required to transform one tree into the other, computed with the APTED implementation.
- Structural Hamming distance ($d_{\mathrm{SHD}}$): recursively compares node labels and child alignments, minimizing the aggregate mismatch.
The composite distance supplied to the kernel is $d(t_1, t_2) = \sum_{k} \beta_k\, d_k(t_1, t_2)$, with nonnegative weights $\beta_k$ learned by maximum likelihood.
Kernels are formulated as

$$k(t_1, t_2) = \exp\!\left(-\theta\, d(t_1, t_2)\right),$$

and hyperparameters (including the weights $\beta_k$) are fit via global optimization (DIRECT algorithm; up to 1,000 likelihood evaluations) (Zaefferer et al., 2018).
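Assembling the composite-distance kernel is mechanical once the individual distances are in hand. The sketch below uses two toy distances (a flat token mismatch count standing in for the structural tree distances, plus a phenotypic output-vector gap) rather than real SHD/TED implementations:

```python
import numpy as np

def mismatch_dist(t1, t2):
    """Toy structural distance on flat token tuples (stand-in for SHD/TED)."""
    mismatches = sum(a != b for a, b in zip(t1, t2))
    return mismatches + abs(len(t1) - len(t2))

def phenotypic_dist(y1, y2):
    """Phenotypic distance: gap between program outputs on training inputs."""
    return float(np.linalg.norm(np.asarray(y1) - np.asarray(y2)))

def composite_kernel(trees, outputs, betas, theta):
    """K_ij = exp(-theta * sum_k beta_k d_k(t_i, t_j)) over the two toy distances."""
    n = len(trees)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            d = (betas[0] * mismatch_dist(trees[i], trees[j])
                 + betas[1] * phenotypic_dist(outputs[i], outputs[j]))
            K[i, j] = np.exp(-theta * d)
    return K

trees = [("add", "x", "1"), ("add", "x", "x"), ("mul", "x", "1")]
outs = [[1.0, 2.0], [0.0, 2.0], [0.5, 1.0]]
K = composite_kernel(trees, outs, betas=(0.5, 1.0), theta=0.2)
```

With nonnegative weights and valid distances the matrix is symmetric with unit diagonal; whether it is positive definite depends on the chosen tree distances, which is part of what the likelihood-based fitting of $\beta_k$ and $\theta$ must contend with.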
7. Practical Guidelines, Limitations, and Interpretive Context
Construction of DD-GPCE-Kriging surrogates requires careful selection of the truncation order $S$ and polynomial degree $m$, attention to the tradeoff between expressivity and overfitting, and judicious kernel selection (squared-exponential for smooth residuals; exponential or Matérn for nonsmooth or rough residuals).
Empirical evidence demonstrates that:
- DD-GPCE-Kriging dramatically accelerates stochastic simulations and reliability studies under high-dimensional, dependent, and nonsmooth random inputs.
- Multifidelity frameworks leveraging DD-GPCE-Kriging can deliver unbiased risk estimates with an order-of-magnitude reduction in expensive model evaluations (Lee et al., 2022).
- Surrogate-based optimization using composite tree-distance kernels enables enhanced discovery performance in symbolic regression with explicit distance importance profiling: phenotypic similarity is informative in early iterations, genotypic measures such as tree-edit distance accrue utility as more data is observed (Zaefferer et al., 2018).
A plausible implication is that adaptive weighting schemes for kernel components, informed by the evolving data regime, may further improve surrogate efficacy in both simulation and search contexts.
References
- "Multifidelity conditional value-at-risk estimation by dimensionally decomposed generalized polynomial chaos-Kriging" (Lee et al., 2022)
- "Linear Combination of Distance Measures for Surrogate Models in Genetic Programming" (Zaefferer et al., 2018)