Residual DeepONet Scheme
- Residual DeepONet is a neural network scheme that learns operator residuals to correct fast reduced-order models in high-dimensional parametric PDEs.
- The approach employs hybrid architectures, including multi-fidelity, physics-informed, and ResUNet-DeepONet variants, to boost predictive accuracy and stability.
- It enables efficient error correction and out-of-distribution detection, proving essential for sensitivity analysis, optimization, and uncertainty quantification.
The Residual DeepONet scheme encompasses a family of operator-learning neural network architectures and methodologies for enhancing surrogate modeling of high-dimensional parametric systems, particularly those governed by partial differential equations (PDEs), by explicitly learning and correcting the residual error between a fast surrogate (e.g., a reduced order model, ROM) and a high-fidelity reference. This paradigm is implemented in several forms, notably for stationary reduced-order modeling via multi-fidelity DeepONet, for time-dependent PDE surrogates via physics-informed time-integrated DeepONet, and for encoding geometric and parametric complexity in elastoplastic systems using residual U-Net–DeepONet hybrids. The unifying principle is to combine classical projection-based or data-driven surrogates with a DeepONet architecture designed to predict, correct, or monitor model residuals, thereby achieving dramatic improvements in predictive accuracy, stability, and generalization—often at negligible additional online computational cost (Demo et al., 2023, Mandl et al., 7 Aug 2025, He et al., 2023).
1. Mathematical Formulation of Residual Learning in DeepONet
Let $\Omega$ denote the spatial domain and $\mathcal{P}$ the parameter domain. For a parametric PDE, denote by $u(x;\mu)$ the high-fidelity ("truth") solution at parameter $\mu \in \mathcal{P}$ and spatial location $x \in \Omega$. A reduced-order model (ROM) provides a fast, low-dimensional approximation $u_{\mathrm{ROM}}(x;\mu)$. The pointwise residual is defined as
$$r(x;\mu) = u(x;\mu) - u_{\mathrm{ROM}}(x;\mu).$$
Residual DeepONet seeks to learn a neural approximation $r_{\mathrm{NN}}(x;\mu) \approx r(x;\mu)$, so that the corrected output
$$\hat{u}(x;\mu) = u_{\mathrm{ROM}}(x;\mu) + r_{\mathrm{NN}}(x;\mu)$$
approaches the true solution with much reduced error. For time-dependent models, the Physics-Informed Time-Integrated (PITI) DeepONet learns to approximate the time derivative $\partial u / \partial t$ and integrates it forward via explicit or implicit numerical schemes, while monitoring residuals as a proxy for out-of-distribution detection (Demo et al., 2023, Mandl et al., 7 Aug 2025).
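The correction step above can be illustrated with a minimal NumPy sketch; the high-fidelity solver and the ROM here are hypothetical closed-form stand-ins chosen only so the residual is easy to compute.

```python
import numpy as np

# Hypothetical stand-ins: in practice u_hifi comes from a fine solver and
# u_rom from a reduced-order model; here both are toy closed-form fields.
def u_hifi(x, mu):
    return np.sin(mu * x)                     # "truth" solution

def u_rom(x, mu):
    return mu * x - (mu * x) ** 3 / 6.0       # cheap Taylor-style surrogate

def residual(x, mu):
    # Pointwise residual r(x; mu) = u(x; mu) - u_ROM(x; mu): this is the
    # target a residual DeepONet is trained to reproduce.
    return u_hifi(x, mu) - u_rom(x, mu)

x = np.linspace(0.0, 1.0, 64)
mu = 1.3
r = residual(x, mu)
u_corrected = u_rom(x, mu) + r   # corrected output recovers the truth exactly here
```

In practice the network only approximates `r`, so the corrected output is close to, rather than equal to, the high-fidelity field.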
2. DeepONet Architectures for Residual Learning
2.1 Standard and Multi-Fidelity DeepONet
DeepONet decomposes the operator learning task into two neural sub-networks:
- Branch Network: encodes the parametric input $\mu$ (e.g., the PDE parameters, or sensor projections in gappy-POD) into output coefficients $b_k(\mu)$.
- Trunk Network: encodes the location $x$ into basis functions $t_k(x)$.
The residual is expressed as an inner product:
$$r_{\mathrm{NN}}(x;\mu) = \sum_{k=1}^{p} b_k(\mu)\, t_k(x).$$
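A minimal NumPy sketch of this branch–trunk inner product follows; the random weight matrices are hypothetical placeholders for trained sub-networks, and the scalar inputs are chosen only to keep the shapes obvious.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 16  # number of branch/trunk output coefficients (latent basis size)

# Hypothetical random weights stand in for trained networks; a real
# implementation would use small MLPs for both sub-networks.
W_branch = rng.normal(size=(1, p))   # maps scalar parameter mu -> p coefficients
W_trunk = rng.normal(size=(1, p))    # maps scalar location x -> p basis values

def branch(mu):
    return np.tanh(np.atleast_2d(mu).T @ W_branch)   # shape (n_mu, p)

def trunk(x):
    return np.tanh(np.atleast_2d(x).T @ W_trunk)     # shape (n_x, p)

def deeponet_residual(mu, x):
    # r_NN(x; mu) = sum_k b_k(mu) * t_k(x): the DeepONet inner product.
    return branch(mu) @ trunk(x).T                   # shape (n_mu, n_x)

out = deeponet_residual(np.array([0.5, 1.0]), np.linspace(0.0, 1.0, 8))
```

The evaluation factorizes: branch outputs depend only on $\mu$ and trunk outputs only on $x$, so a single branch pass serves arbitrarily many query locations.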
2.2 Physics-Informed Dual-Output DeepONet (Time Integration)
PITI-DeepONet employs a dual-output architecture:
- Branch: takes the discretized current field $u(\cdot, t_n)$.
- Two Trunks: output the state $u$ and its time derivative $\partial u / \partial t$, each assembled as a branch–trunk inner product.
- Time-Stepping: Network-predicted is advanced using schemes such as Euler, RK4, or Adams–Bashforth–Moulton (Mandl et al., 7 Aug 2025).
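The time-stepping loop can be sketched with explicit Euler; the right-hand-side function below is a hypothetical stand-in for the network's time-derivative output (here the toy decay ODE $du/dt = -u$), not the trained PITI-DeepONet itself.

```python
import numpy as np

def rhs_net(u):
    # Stand-in for the trained time-derivative output of PITI-DeepONet;
    # here: du/dt = -u, a toy linear decay model.
    return -u

def rollout_euler(u0, dt, n_steps):
    """Advance the network-predicted du/dt with explicit Euler steps."""
    u = np.asarray(u0, dtype=float).copy()
    trajectory = [u.copy()]
    for _ in range(n_steps):
        u = u + dt * rhs_net(u)   # u_{n+1} = u_n + dt * f_theta(u_n)
        trajectory.append(u.copy())
    return np.stack(trajectory)

traj = rollout_euler(np.ones(4), dt=0.01, n_steps=100)
```

Swapping the Euler update for an RK4 or Adams–Bashforth–Moulton stage changes only the stepping rule, not the network interface.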
2.3 ResUNet-DeepONet for Complex Geometries
For prediction in elastoplasticity under complex geometries, a residual U-Net (ResUNet) trunk encodes 2D spatial patterns, while a fully-connected branch network encodes scalar loads. Element-wise multiplication fuses the branch and trunk in the latent space, and the final output is obtained by channel-wise reduction.
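The fusion step can be sketched with broadcast arithmetic; the latent tensors below are random placeholders with plausible shapes (the true trunk is a ResUNet over the 2D geometry and the branch a small MLP over scalar loads).

```python
import numpy as np

rng = np.random.default_rng(1)
H, W, C = 8, 8, 32   # spatial grid and latent channel count (32 branch channels)

# Hypothetical latent tensors standing in for trained network outputs.
trunk_latent = rng.normal(size=(H, W, C))    # per-pixel geometric features
branch_latent = rng.normal(size=(C,))        # per-channel load encoding

# Element-wise (broadcast) multiplication fuses geometry and load...
fused = trunk_latent * branch_latent         # shape (H, W, C)
# ...and a channel-wise reduction yields the scalar output field.
field = fused.sum(axis=-1)                   # shape (H, W)
```

The channel-wise sum plays the same role as the branch–trunk inner product in the standard DeepONet, applied per pixel.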
3. Multi-Fidelity Training and Loss Functions
3.1 Training Objectives
Residual DeepONet is trained to minimize the mean squared residual:
$$\mathcal{L} = \frac{1}{N_\mu N_x} \sum_{i=1}^{N_\mu} \sum_{j=1}^{N_x} \bigl( r(x_j;\mu_i) - r_{\mathrm{NN}}(x_j;\mu_i) \bigr)^2,$$
where $i$ and $j$ index the parameter and spatial samples. In gappy-POD, the branch takes measurements from sensor placements as input.
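The training objective is an ordinary mean squared error over parameter and spatial samples, as in this small sketch (the toy residual arrays are illustrative only):

```python
import numpy as np

def residual_mse(r_true, r_pred):
    """Mean squared residual over parameter samples i and spatial samples j:
    L = (1 / (N_mu * N_x)) * sum_ij (r(x_j; mu_i) - r_NN(x_j; mu_i))^2."""
    r_true = np.asarray(r_true)
    r_pred = np.asarray(r_pred)
    return np.mean((r_true - r_pred) ** 2)

# Toy check with N_mu = 3 parameters and N_x = 5 spatial points.
r_true = np.zeros((3, 5))
r_pred = np.full((3, 5), 0.1)
loss = residual_mse(r_true, r_pred)
```

Because the target is the (typically small, smooth) projection error rather than the full field, the same loss is markedly easier to drive down than a full-field regression loss.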
PITI-DeepONet is trained with a hybrid loss combining the PDE residual, initial-condition matching, boundary consistency, and optional data-driven terms on the state and time derivative, schematically a weighted sum
$$\mathcal{L} = \lambda_{\mathrm{PDE}} \mathcal{L}_{\mathrm{PDE}} + \lambda_{\mathrm{IC}} \mathcal{L}_{\mathrm{IC}} + \lambda_{\mathrm{BC}} \mathcal{L}_{\mathrm{BC}} + \lambda_{\mathrm{data}} \mathcal{L}_{\mathrm{data}}$$
(Demo et al., 2023, Mandl et al., 7 Aug 2025).
3.2 Implementation and Hyperparameters
Empirically, effective configurations include:
- Shallow branch/trunk networks (2–3 layers, 20–50 neurons, softplus or PReLU activations) with up to 30 output coefficients for standard DeepONet (Demo et al., 2023).
- ResUNet trunk with four-level encoder–decoder, 3.5 million parameters, with small MLP branch (32 output channels) for geometry–load fusion in elastoplasticity (He et al., 2023).
- Training with the Adam optimizer; learning rates and epoch counts are tuned per problem.
4. Validation, Performance, and Benchmark Results
Numerical results consistently show Residual DeepONet significantly outperforms both standalone ROMs and vanilla operator networks. Representative findings:
| Problem | Baseline (ROM / full rollout, FR) | DeepONet | MFDeepONet / PITI |
|---|---|---|---|
| 1D toy (POD) | 0.324 | 0.270 | 0.247 |
| 2D Navier–Stokes (POD) | 0.105 | 0.072 | 0.022 |
| 1D Heat Eq. (FR) | 1.5 | – | 0.24 |
| 1D Burgers’ (FR) | 0.13 | – | 0.018 |
| 2D Allen–Cahn (FR) | 0.20 | – | 0.12 |
| Elastoplastic ResUNet-DeepONet | 8.2% | 27.5% (fully-connected trunk) | 8.5% |
Mean errors for MFDeepONet are reduced by up to an order of magnitude (e.g., in gappy-POD Navier–Stokes). Computational speed-up is substantial: online costs are simply the sum of the ROM and DeepONet forward evaluations, each a small fraction of the cost of a full finite-element simulation (Demo et al., 2023, Mandl et al., 7 Aug 2025, He et al., 2023).
5. Critical Analysis and Scope of Application
Strengths
- Data Efficiency: No extra high-fidelity queries are required beyond initial ROM construction.
- General-Purpose Compatibility: Can be applied to any non-intrusive ROM (POD-RBF, gappy-POD, DEIM, manifold methods).
- Projection-Focused Learning: Targets the easier sub-task of learning projection error rather than the entire solution field.
- Substantial Error Correction: Error reductions are most pronounced when baseline surrogates are only moderately accurate.
- Physics-Informed Generalization: In time-dependent settings, explicit residual monitoring serves as a zero-cost out-of-distribution detector and quality estimator, with high correlation to the true error.
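The residual-based detector can be sketched as a simple threshold test; the monitored quantity below is a hypothetical steady-state heat-equation residual on a periodic grid, and the threshold value is illustrative rather than calibrated.

```python
import numpy as np

def pde_residual(u, dx, nu=0.1):
    # Hypothetical monitored residual for a 1D heat equation at steady state:
    # R = nu * u_xx; a trustworthy prediction should make this small.
    u_xx = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
    return nu * u_xx

def is_out_of_distribution(u, dx, threshold=1.0):
    # Flag a predicted state whose PDE-residual norm exceeds a threshold --
    # a zero-extra-cost quality estimate available during rollout.
    return np.linalg.norm(pde_residual(u, dx)) > threshold

x = np.linspace(0, 2 * np.pi, 128, endpoint=False)
dx = x[1] - x[0]
smooth = 0.01 * np.sin(x)                  # physically plausible state
noisy = smooth + 0.5 * np.random.default_rng(2).normal(size=x.size)
```

Smooth, physically consistent states pass the check, while corrupted or out-of-sample states produce large residual norms and are flagged.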
Limitations
- Marginal utility if the baseline ROM is already highly accurate (e.g., with a high POD energy threshold).
- Offline costs for network training and hyperparameter optimization can be nontrivial.
- Extension to unsteady or history-dependent systems requires additional modeling (e.g., recurrent or convolutional trunk/branch).
- No guaranteed a priori error bounds—entirely data-driven.
Applicability
Residual DeepONet frameworks have demonstrated efficacy in:
- Parametric PDE surrogates for real-time inference.
- Outer-loop tasks such as sensitivity analysis, optimization, and uncertainty quantification, especially with variable and complex geometric domains (Demo et al., 2023, He et al., 2023).
- Time-dependent multi-physics models where traditional rollout methods (full rollout, autoregressive) fail due to error accumulation or instability; PITI-DeepONet offers significant improvements for long-horizon prediction (Mandl et al., 7 Aug 2025).
6. Architectures and Mechanisms: Extensions and Future Directions
Intrusive Variants
Learning residual source-terms for projection-based Galerkin ROMs offers a route to intrusive versions of the method, with the potential for enforcing hard constraints (e.g., conservation).
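A minimal sketch of this intrusive idea: a toy Galerkin ROM $\dot{a} = A a$ is augmented with a learned residual source term; the closure function below is a hypothetical placeholder, not a trained network.

```python
import numpy as np

# Toy reduced operator for a two-mode Galerkin ROM: da/dt = A a + f_NN(a).
A = np.array([[-1.0, 0.0],
              [0.0, -2.0]])

def f_nn(a):
    # Hypothetical learned closure/source term correcting the truncated model.
    return -0.1 * a**3

def step(a, dt):
    # Explicit Euler on the corrected reduced dynamics.
    return a + dt * (A @ a + f_nn(a))

a = np.array([1.0, 1.0])
for _ in range(200):
    a = step(a, dt=0.01)
```

Because the correction enters the reduced dynamics directly, hard constraints (e.g., conservation) could in principle be imposed on the source term itself.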
Hybrid Fusion and Advanced Networks
Elementwise fusion in latent space (ResUNet-DeepONet) enables effective encoding of rich geometric context and parameter coupling. This approach is extendable to recurrent, attention-based, or physics-constrained trunk/branch networks, and prospective work includes:
- Time-sequence and path-dependent loading via RNN/CNN branch extensions.
- Plastic-strain or joint field prediction for elastoplastic modeling.
- Physics-informed regularization within the residual network to enforce conservation laws (Demo et al., 2023, Mandl et al., 7 Aug 2025, He et al., 2023).
Adaptive Sampling and Quality Estimation
Residual-based error indicators from DeepONet enable active learning strategies that systematically enrich the training set where model confidence is low—a plausible direction for more robust outer-loop deployments.
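The enrichment selection can be sketched as a ranking over candidate parameters by a residual-based indicator; the indicator function below is a hypothetical stand-in that pretends the surrogate degrades near one region of parameter space.

```python
import numpy as np

def select_enrichment_points(candidates, residual_indicator, k=3):
    """Rank candidate parameters by a residual-based error indicator and
    return the k worst -- the points where surrogate confidence is lowest."""
    scores = np.array([residual_indicator(mu) for mu in candidates])
    worst = np.argsort(scores)[-k:]
    return candidates[worst]

# Hypothetical indicator: pretend the surrogate degrades near mu = 2.
def indicator(mu):
    return np.inf if mu == 2.0 else 1.0 / abs(mu - 2.0)

candidates = np.linspace(0.0, 4.0, 9)
picked = select_enrichment_points(candidates, indicator, k=3)
```

The selected parameters would then be queried with the high-fidelity solver and added to the training set, closing the active-learning loop.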
7. Significance in Scientific Machine Learning
Residual DeepONet constitutes a key advance in operator learning for scientific machine learning, enabling the correction of model-form errors in real-time surrogates while leveraging deep neural networks’ flexibility for generalizing across high-dimensional, parametrically rich inputs and outputs. Its multi-fidelity, physics-informed, or hybrid instantiations systematically reduce surrogate error and offer practical reliability benefits (e.g., fail-safe detection of out-of-sample states), making it a foundational tool in the numerical simulation and design of complex physical systems (Demo et al., 2023, Mandl et al., 7 Aug 2025, He et al., 2023).