Heteroscedastic Bayesian PINNs
- Heteroscedastic B-PINN is a probabilistic model that solves PDEs while estimating input-dependent noise and uncertainties.
- It employs Bayesian inference methods like HMC and VI, coupled with physics-informed residuals, to enforce physical laws and quantify both aleatoric and epistemic uncertainties.
- Empirical results indicate that heteroscedastic B-PINNs achieve lower RMSE and better-calibrated uncertainty estimates than traditional PINNs in noisy, data-scarce scenarios.
A heteroscedastic Bayesian Physics-Informed Neural Network (B-PINN) is a probabilistic deep learning framework that unifies the learning of solutions to partial differential equations (PDEs), spatially or temporally varying noise estimation, enforcement of physical laws, and comprehensive uncertainty quantification. The model incorporates a Bayesian neural network (BNN) architecture, along with physics-based residual likelihood, to infer both the PDE solution and the point-wise, input-dependent noise level in the presence of heteroscedastic (location-dependent) noise. This enables rigorous quantification of both aleatoric (data-driven) and epistemic (model-driven) uncertainties, supporting robust predictions and decision-making in scenarios with noisy, incomplete, or scattered observations (Yang et al., 2020, Ramirez et al., 7 Jan 2026, Flores et al., 9 May 2025).
1. Probabilistic Model Specification
The heteroscedastic B-PINN models the PDE solution and the noise field with separate, parameterized neural network surrogates:
- u_θ(x): predicts the PDE solution u(x) with neural network parameters θ.
- s_ψ(x): predicts the log-variance log σ²(x) with parameters ψ.
- The standard deviation prediction is σ(x) = exp(s_ψ(x)/2).
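As a concrete illustration, the two surrogates can be sketched as a pair of small tanh networks (a minimal numpy sketch; the layer widths, the `init_mlp`/`mlp` helper names, and the toy inputs are illustrative, not from the cited implementations):

```python
import numpy as np

def init_mlp(sizes, rng):
    """Initialize a tanh MLP as a list of (weight, bias) pairs."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass; x has shape (batch, in_dim)."""
    h = x
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b  # linear output layer

rng = np.random.default_rng(0)
u_net = init_mlp([1, 50, 50, 1], rng)  # solution surrogate u_theta(x)
s_net = init_mlp([1, 50, 50, 1], rng)  # log-variance surrogate s_psi(x)

x = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
u_hat = mlp(u_net, x)                  # predicted solution values
sigma = np.exp(0.5 * mlp(s_net, x))    # sigma(x) = exp(s_psi(x) / 2), always > 0
```

Exponentiating half the log-variance output guarantees a strictly positive standard deviation without any constrained optimization.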
Noisy data measurements are modeled as y_i = u_θ(x_i) + ε_i with ε_i ∼ N(0, σ²(x_i)), yielding the data likelihood p(y_i | x_i, θ, ψ) = N(y_i; u_θ(x_i), σ²(x_i)),
with the per-datapoint negative log-likelihood (NLL) −log p(y_i | x_i, θ, ψ) = ½ log(2π σ²(x_i)) + (y_i − u_θ(x_i))² / (2σ²(x_i)). Physical consistency is enforced through a physics-informed residual loss, introducing a Gaussian pseudo-likelihood for the PDE residual at collocation points x_j: p(r_j | θ) = N(R[u_θ](x_j); 0, σ_r²), where R is the (possibly nonlinear) PDE operator and σ_r is a small, user-chosen scale.
The joint unnormalized posterior over all unknown parameters is then p(θ, ψ | D) ∝ p(θ) p(ψ) ∏_i N(y_i; u_θ(x_i), σ²(x_i)) ∏_j N(R[u_θ](x_j); 0, σ_r²), with either Gaussian (Yang et al., 2020) or Laplace (Ramirez et al., 7 Jan 2026) priors over network parameters.
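The data and residual likelihood terms can be evaluated directly (a hedged numpy sketch; the toy measurements, residual values, and σ_r = 0.05 are illustrative choices, not values from the cited papers):

```python
import numpy as np

def hetero_nll(y, u_hat, log_var):
    """Per-point Gaussian NLL with input-dependent (heteroscedastic) variance."""
    return 0.5 * (np.log(2.0 * np.pi) + log_var
                  + (y - u_hat) ** 2 / np.exp(log_var))

def residual_nll(residual, sigma_r=0.05):
    """Gaussian pseudo-likelihood NLL for a PDE residual at a collocation point."""
    return 0.5 * (np.log(2.0 * np.pi * sigma_r ** 2)
                  + residual ** 2 / sigma_r ** 2)

y = np.array([0.10, 0.50])             # toy measurements
u_hat = np.array([0.00, 0.40])         # predicted solution at the data points
log_var = np.array([-2.0, -1.0])       # predicted log sigma^2 at the data points
data_term = hetero_nll(y, u_hat, log_var).sum()
phys_term = residual_nll(np.array([0.01, -0.02])).sum()
neg_log_post = data_term + phys_term   # prior terms omitted for brevity
```

Note that the learned log-variance appears both as a penalty (log σ²) and as a weight on the squared error, which is what lets the model trade off noise estimation against fit.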
2. Bayesian Inference and Posterior Approximation
Posterior inference in heteroscedastic B-PINNs is typically performed using one of:
- Hamiltonian Monte Carlo (HMC): Constructs a Hamiltonian dynamics system over the parameter space to efficiently sample from the posterior, using leap-frog integration and Metropolis-Hastings acceptance. Tuning of the step size (ε), number of leap-frog steps (L), and mass matrix (M) is crucial. HMC provides asymptotically exact posterior samples; experiments find it produces more reliable uncertainty estimates and avoids overfitting in high-noise settings (Yang et al., 2020).
- Variational Inference (VI): Approximates the posterior with a tractable variational family q(w | φ) over all network weights w, often a fully-factorized Gaussian or a richer deep normalizing flow (DNF) transformation, minimizing the KL divergence to the true posterior via the evidence lower bound (ELBO) (Yang et al., 2020, Ramirez et al., 7 Jan 2026). The “reparameterization trick” is applied to generate unbiased stochastic gradient estimates for optimization with Adam or SGD.
A notable extension is the use of deep invertible flows for q(w | φ) to capture complex, non-Gaussian posteriors.
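For the factorized Gaussian case, the reparameterization trick and the KL term against a standard-normal prior both have simple closed forms (illustrative numpy code; the four-dimensional weight vector is a stand-in for the full network parameterization):

```python
import numpy as np

def sample_q(mu, log_sigma, rng):
    """Reparameterization trick: w = mu + sigma * eps with eps ~ N(0, I),
    so gradients flow through mu and log_sigma rather than the sampler."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(log_sigma) * eps

def kl_to_standard_normal(mu, log_sigma):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ) for a factorized Gaussian."""
    sigma2 = np.exp(2.0 * log_sigma)
    return 0.5 * np.sum(sigma2 + mu ** 2 - 1.0 - 2.0 * log_sigma)

rng = np.random.default_rng(0)
mu, log_sigma = np.zeros(4), np.zeros(4)    # q(w) = N(0, I)
w = sample_q(mu, log_sigma, rng)            # one posterior draw of the weights
kl = kl_to_standard_normal(mu, log_sigma)   # exactly 0 when q equals the prior
```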
3. Heteroscedastic Uncertainty Quantification
The heteroscedastic B-PINN explicitly models both aleatoric and epistemic uncertainties:
- Aleatoric uncertainty reflects input-dependent, irreducible data noise. It is estimated directly by the network's variance output σ̂²(x) (Ramirez et al., 7 Jan 2026).
- Epistemic uncertainty quantifies posterior variance arising from model parameter uncertainty and can be estimated by the variance across Monte Carlo samples of the predictive mean.
- Total uncertainty is the sum of the epistemic and aleatoric variances: σ²_total(x) = σ²_epi(x) + σ²_alea(x), where σ²_epi(x) = Var_{w∼p(w|D)}[u_w(x)] and σ²_alea(x) = E_{w∼p(w|D)}[σ̂²(x; w)].
Empirical studies report that aleatoric variances typically dominate, especially in data-scarce or high-noise regimes, while epistemic variances are smaller and respond to model/data limitations (Ramirez et al., 7 Jan 2026).
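Given Monte Carlo draws from the posterior, this decomposition is a two-line computation (a sketch; the sample arrays are synthetic stand-ins for posterior predictive draws):

```python
import numpy as np

def decompose_uncertainty(mean_draws, var_draws):
    """Split total predictive variance into epistemic and aleatoric parts.

    mean_draws: (K, N) Monte Carlo draws of the predictive mean u_w(x)
    var_draws:  (K, N) matching draws of the predicted noise variance sigma^2(x; w)
    """
    epistemic = mean_draws.var(axis=0)   # spread of the mean across posterior draws
    aleatoric = var_draws.mean(axis=0)   # posterior-averaged predicted noise
    return epistemic, aleatoric, epistemic + aleatoric

rng = np.random.default_rng(0)
mean_draws = rng.normal(0.0, 0.1, size=(200, 5))   # small epistemic spread
var_draws = np.full((200, 5), 0.25)                # constant aleatoric variance
epi, alea, total = decompose_uncertainty(mean_draws, var_draws)
```

With these synthetic draws the aleatoric term dominates, mirroring the empirical pattern reported above.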
4. Training Procedures, Network Architectures, and Diagnostic Best Practices
Training proceeds via joint optimization of the negative log-posterior (MAP), stochastic variational inference, or HMC sampling. Diagnostic practices include monitoring the HMC acceptance rate, effective sample size, Gelman-Rubin statistics, and the convergence of the ELBO in VI (Yang et al., 2020). Recommended architectures for 1D–2D settings use 2–4 hidden layers of width 50–100 (Yang et al., 2020, Ramirez et al., 7 Jan 2026), with separate or joint heads for the mean and (log-)variance; smooth activations such as tanh are standard.
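For reference, a single HMC transition with leap-frog integration and an identity mass matrix might look like this (a minimal numpy sketch on a toy Gaussian target, not the cited implementation; ε = 0.1 and L = 20 are illustrative tuning choices):

```python
import numpy as np

def hmc_step(w, log_prob, grad_log_prob, eps=0.1, n_leapfrog=20, rng=None):
    """One HMC transition: leap-frog integration plus Metropolis-Hastings
    acceptance, with an identity mass matrix M = I for simplicity."""
    rng = rng or np.random.default_rng()
    p0 = rng.normal(size=w.shape)                        # resample momentum
    w_new, p = w.copy(), p0 + 0.5 * eps * grad_log_prob(w)  # initial half-step
    for i in range(n_leapfrog):
        w_new = w_new + eps * p                          # full position step
        step = eps if i < n_leapfrog - 1 else 0.5 * eps  # final half-step
        p = p + step * grad_log_prob(w_new)
    h_old = -log_prob(w) + 0.5 * p0 @ p0                 # Hamiltonian = U + K
    h_new = -log_prob(w_new) + 0.5 * p @ p
    accept = np.log(rng.uniform()) < h_old - h_new
    return (w_new, True) if accept else (w, False)

# toy "posterior": standard normal over two parameters
log_prob = lambda w: -0.5 * w @ w
grad_log_prob = lambda w: -w
rng = np.random.default_rng(0)
w, draws = np.zeros(2), []
for _ in range(500):
    w, _ = hmc_step(w, log_prob, grad_log_prob, rng=rng)
    draws.append(w)
draws = np.array(draws)
```

In practice `log_prob` would be the negative of the joint data, residual, and prior terms from Section 1, with gradients obtained by automatic differentiation.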
The selection of collocation points for enforcing physics is critical. Sensitivity analyses indicate that total predictive calibration (as measured by CRPS and NLL) is especially sensitive to the number and distribution of residual and boundary points, while over-constrained settings can lead to excessive aleatoric uncertainty (Ramirez et al., 7 Jan 2026).
A pseudocode summary of the B-PINN training loop, using stochastic VI, is as follows (Ramirez et al., 7 Jan 2026):
Algorithm: Heteroscedastic B-PINN Training
Input: Data (IC, BC, collocation), prior p(w), weighting λ's.
Initialize variational parameters φ = (μ, σ).
while not converged do
for k = 1..K do
Sample w_k ~ q(w|φ) via reparameterization: w_k = μ + σ⊙ε, ε∼N(0,I)
Evaluate outputs μ̂(x,t;w_k), σ̂²(x,t;w_k)
Compute residual R[u], data NLLs (IC, BC, residual), KL term
Accumulate gradient of 𝓛^{(k)} w.r.t. φ
end for
Update φ ← φ − η ∇_φ (1/K)∑ₖ𝓛^{(k)} via Adam
end while
Return: q(w;φ)
5. Quantitative Performance, Calibration, and Empirical Findings
Comprehensive empirical comparisons against deterministic PINNs, PINNs with MC dropout, and homoscedastic B-PINNs yield consistent improvements in several uncertainty calibration and accuracy metrics. For instance, heteroscedastic B-PINNs have attained 57.3% lower RMSE and significant reductions in CRPS and NLL compared to vanilla PINNs, and more accurate quantile calibration than dropout-based PINNs (Ramirez et al., 7 Jan 2026). Using error bounds derived from deterministic PINNs to define a heteroscedastic variance provides further improvements in calibration, with reported reductions in miscalibration area from 0.49 (homoscedastic) to ≈0.05 (Flores et al., 9 May 2025).
Standard metrics for evaluation include median relative error (MRE), mean residual, negative log-likelihood (NLL), CRPS, miscalibration area, and sharpness (mean predictive standard deviation).
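Of these metrics, CRPS has a convenient closed form for Gaussian predictive distributions and can be implemented directly (a sketch; the example predictions are synthetic):

```python
import math
import numpy as np

def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS for a Gaussian predictive N(mu, sigma^2):
    CRPS = sigma * ( z*(2*Phi(z) - 1) + 2*phi(z) - 1/sqrt(pi) ),
    with z = (y - mu) / sigma; lower is better."""
    z = (y - mu) / sigma
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

y = np.array([0.0, 0.0])
mu = np.array([0.0, 1.0])      # second prediction is badly biased
sigma = np.array([0.1, 0.1])
scores = crps_gaussian(y, mu, sigma)
```

Unlike NLL, CRPS stays finite for overconfident misses, which makes it a useful complement when comparing sharp but biased predictive distributions.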
6. Alternative Surrogates and Error-Bound Integration
Where suitable, heteroscedasticity can be directly connected to explicit PINN error bounds. In such workflows, a deterministic PINN is first trained to approximate the PDE solution and its residual errors are used to construct tight, theoretically justified heteroscedastic variances (Flores et al., 9 May 2025). The resulting bounds then define the local variance in a Bayesian surrogate, ensuring that the noise level in the data likelihood reflects the worst-case PINN solution error. This “two-step” approach demonstrably enhances the calibration of uncertainty intervals without sacrificing median solution accuracy.
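One way to sketch this two-step mapping from local residuals to a heteroscedastic variance (the proportionality constant `c_stab` and the variance floor are purely illustrative stand-ins for a rigorous PINN error bound):

```python
import numpy as np

def variance_from_residual_bound(residuals, c_stab=1.0, floor=1e-6):
    """Map local PDE residual magnitudes to a heteroscedastic variance.

    Illustrative stand-in for a theoretically derived error bound: the local
    standard deviation is taken proportional to the residual magnitude, so
    high-residual regions receive wide uncertainty bands while well-resolved
    regions collapse to the small variance floor."""
    sigma = c_stab * np.abs(residuals)
    return np.maximum(sigma ** 2, floor)

res = np.array([0.0, 0.01, 0.1])       # residuals of a pre-trained deterministic PINN
var = variance_from_residual_bound(res)
```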
Additionally, Karhunen–Loève (KL) expansions may be used as lower-dimensional surrogates for specific classes of PDEs, but these approaches do not scale favorably with input dimension and are thus more limited in application (Yang et al., 2020).
7. Practical Applications and Future Directions
B-PINNs have been applied in forward and inverse PDE problems under various noisy settings, including canonical elliptic and parabolic equations (Yang et al., 2020), real-world PHM scenarios such as insulation aging in transformer assets (Ramirez et al., 7 Jan 2026), and evidence-based parameter inference in cosmology (Flores et al., 9 May 2025). The framework enables not only point-wise prediction under uncertain data but also supports end-to-end uncertainty-aware optimization and risk-informed decision-making.
Ongoing research addresses the efficient scaling of posterior inference to higher-dimensional PDEs, improved flexibility of variational families (e.g., deep normalizing flows), integration of a priori or adaptive error bounds, and automated diagnostics for uncertainty quantification quality.
For an overview of the empirical validation of these models, see Table 1 summarizing key metrics (as directly reported):
| Method Variant | RMSE | CRPS | Miscalibration Area | Sharpness |
|---|---|---|---|---|
| Vanilla PINN | 2.55 | — | — | — |
| MC Dropout-PINN (hetero) | — | — | — | — |
| Heteroscedastic B-PINN | 1.09 | — | 0.05 | 0.10–0.22 (AU) |
| Homoscedastic B-PINN | ≈1.09 | — | 0.49 | — |
All values are as reported in (Ramirez et al., 7 Jan 2026, Flores et al., 9 May 2025).
Heteroscedastic B-PINNs synthesize advances in Bayesian learning, error-bound theory, and deep physics-informed modeling. They offer robust and interpretable uncertainty quantification across a broad array of noisy PDE learning tasks, establishing a foundation for future developments in data-driven scientific computing under uncertainty.