
Deep Ritz Method for PDEs

Updated 12 January 2026
  • Deep Ritz Method is a neural network approach that solves PDEs by minimizing energy functionals, enabling mesh-free and flexible solutions.
  • It employs Wachspress coordinates and transfinite interpolation to exactly impose Dirichlet boundary conditions on convex polygonal domains.
  • The technique demonstrates superior numerical stability and accuracy, achieving errors as low as O(10⁻⁹) compared to traditional methods.

The Deep Ritz Method is a neural network-based framework for solving partial differential equations (PDEs) using variational principles. By parameterizing trial functions with neural networks and minimizing energy functionals, this method enables mesh-free and geometry-flexible solution of boundary-value problems, especially those arising in computational physics and engineering. Recent advances leverage transfinite interpolation and generalized barycentric coordinates—such as Wachspress coordinates—to ensure the exact enforcement of Dirichlet boundary conditions on convex polygonal domains, yielding kinematically admissible neural trial spaces with advantageous regularity and numerical conditioning (Sukumar et al., 5 Jan 2026).

1. Mathematical Foundation and Variational Formulation

The foundational principle of the Deep Ritz Method is the variational formulation of elliptic PDEs, where the solution $u$ is characterized as a minimizer of an energy functional over a suitable Sobolev space $H^1(P)$. For Poisson-type equations:

$$-\Delta u = f \ \text{in } P, \qquad u = B \ \text{on } \partial P,$$

the Ritz method seeks $u$ minimizing

$$\mathcal{J}[u] = \int_P \left( \frac{1}{2} |\nabla u|^2 - f u \right) dx$$

subject to Dirichlet data. By parameterizing $u(\cdot;\theta)$ as a neural network trial function and enforcing boundary conditions in the trial space, the minimization becomes an unconstrained optimization over the network parameters $\theta$.
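As a concrete check of this variational principle, the following sketch (illustrative, not from the paper) evaluates $\mathcal{J}$ by midpoint-rule quadrature for the manufactured minimizer $u = \sin(\pi x)\sin(\pi y)$ of $-\Delta u = f$ on the unit square, where the exact minimum value is $-\pi^2/4$:

```python
import numpy as np

# Midpoint-rule evaluation of the Ritz energy J[u] on the unit square for the
# manufactured minimizer u = sin(pi x) sin(pi y) of -Laplace(u) = f, with
# f = 2 pi^2 sin(pi x) sin(pi y); the exact minimum is J = -pi^2/4.
n = 400
x = (np.arange(n) + 0.5) / n                          # midpoint nodes in (0, 1)
X, Y = np.meshgrid(x, x, indexing="ij")
u  = np.sin(np.pi * X) * np.sin(np.pi * Y)
ux = np.pi * np.cos(np.pi * X) * np.sin(np.pi * Y)    # analytic du/dx
uy = np.pi * np.sin(np.pi * X) * np.cos(np.pi * Y)    # analytic du/dy
f  = 2 * np.pi**2 * u
# each quadrature cell has area 1/n^2, so the mean over cells equals the integral
J = np.mean(0.5 * (ux**2 + uy**2) - f * u)
print(J)  # close to -pi^2/4, approximately -2.4674
```

In the Deep Ritz method proper, $u$ would be the neural trial function, its gradient would come from automatic differentiation, and the integral would typically be estimated by Monte Carlo sampling rather than a tensor-product grid.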

2. Wachspress Coordinates and Transfinite Interpolation

Wachspress coordinates provide a generalized barycentric coordinate system for convex $n$-gons $P \subset \mathbb{R}^2$: they are rational functions in the interior of $P$ and reduce to linear (affine) interpolation along each edge. Given vertices $v_1, \ldots, v_n$ and associated weights:

$$w_i(x) = \frac{B_i}{A_{i-1}(x)\,A_i(x)}, \qquad \lambda_i(x) = \frac{w_i(x)}{\sum_{j=1}^{n} w_j(x)},$$

with $A_{i-1}(x)$ and $A_i(x)$ denoting the signed areas of the triangles $(x, v_{i-1}, v_i)$ and $(x, v_i, v_{i+1})$, and $B_i$ the area of the triangle $(v_{i-1}, v_i, v_{i+1})$. The coordinate vector $\boldsymbol{\lambda}(x)$ forms a partition of unity, reproduces the vertex positions, and has exactly two nonzero barycentric coordinates on each edge. These properties make Wachspress coordinates suitable as neural network inputs, encoding both geometry and boundary structure (Sukumar et al., 5 Jan 2026).
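A minimal sketch of this evaluation (assuming counterclockwise vertex ordering and an interior query point; the helper names are illustrative, not from the paper):

```python
import numpy as np

def tri_area(a, b, c):
    """Signed area of the triangle (a, b, c); positive for counterclockwise order."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

def wachspress(verts, x):
    """Wachspress coordinates of an interior point x in a convex polygon.

    verts: (n, 2) array of vertices in counterclockwise order.
    Uses w_i = B_i / (A_{i-1}(x) A_i(x)) with A_i(x) = area(x, v_i, v_{i+1})
    and B_i = area(v_{i-1}, v_i, v_{i+1}), then normalizes to sum to one.
    """
    n = len(verts)
    w = np.empty(n)
    for i in range(n):
        vm, vi, vp = verts[(i - 1) % n], verts[i], verts[(i + 1) % n]
        w[i] = tri_area(vm, vi, vp) / (tri_area(x, vm, vi) * tri_area(x, vi, vp))
    return w / w.sum()

square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
lam = wachspress(square, np.array([0.3, 0.6]))
print(lam)           # on a square these reduce to the bilinear coordinates
print(lam @ square)  # linear reproduction recovers the point itself
```

On the unit square the computed coordinates coincide with the bilinear ones, e.g. $\lambda_1 = (1-0.3)(1-0.6) = 0.28$, and $\sum_i \lambda_i v_i$ returns the query point, illustrating the partition-of-unity and reproduction properties.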

The transfinite interpolant $g: \bar{P} \to \mathbb{R}$, with $g \in C^0(\bar{P})$, is constructed to lift a prescribed boundary function $B$ into the interior while exactly reproducing $B$ on $\partial P$. The piecewise Dirichlet data, specified on edge $i$ by a function $\alpha_i(t)$, is interpolated via:

$$g(\boldsymbol{\lambda}) = \sum_{i=1}^{n} \lambda_i \left[ \alpha_i(\lambda_{i+1}) + \alpha_{i-1}(1-\lambda_{i-1}) - \alpha_i(0) \right],$$

or more generally as a Boolean-sum style formula summing over faces incident on each vertex.
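A small sketch of this interpolant on a quadrilateral (indices taken mod $n$; the data $\sin(\pi t)$ on one edge and zero elsewhere is an illustrative choice, not from the paper). On edge $i$ the coordinate vector reduces to $\lambda_i = 1-t$, $\lambda_{i+1} = t$, which lets us verify that $g$ reproduces the prescribed data there:

```python
import numpy as np

n = 4  # a quadrilateral: edge i runs from vertex i to vertex i+1 (mod n)

def alpha(i, t):
    """Edge-wise Dirichlet data: sin(pi t) on edge 0, zero on the others.
    Corner values are compatible since sin(0) = sin(pi) = 0."""
    return np.sin(np.pi * t) if i == 0 else 0.0

def transfinite_g(lam):
    """g(lam) = sum_i lam_i [a_i(lam_{i+1}) + a_{i-1}(1 - lam_{i-1}) - a_i(0)]."""
    return sum(lam[i] * (alpha(i, lam[(i + 1) % n])
                         + alpha((i - 1) % n, 1.0 - lam[(i - 1) % n])
                         - alpha(i, 0.0))
               for i in range(n))

def edge_lambda(i, t):
    """Barycentric coordinates of the point at parameter t along edge i."""
    lam = np.zeros(n)
    lam[i], lam[(i + 1) % n] = 1.0 - t, t
    return lam

print(transfinite_g(edge_lambda(0, 0.3)))  # reproduces sin(0.3 pi) on edge 0
print(transfinite_g(edge_lambda(2, 0.5)))  # 0 on an edge with zero data
```

Expanding the sum on edge $i$ gives $g = \alpha_i(t) + (1-t)\,[\alpha_{i-1}(1) - \alpha_i(0)]$, so the data is reproduced exactly whenever adjacent edge functions agree at the shared corner.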

3. Neural Network Ansatz with Exact Dirichlet Imposition

To ensure "hard" or exact compliance with Dirichlet boundary conditions, the neural trial function is formulated as:

$$u(x;\theta) = N(\lambda(x);\theta) - L[N](\lambda(x);\theta) + g(\lambda(x)),$$

where $N(\cdot;\theta)$ is a multilayer perceptron (MLP) receiving Wachspress coordinates as input, $g$ is the transfinite interpolant of $B$, and $L[N]$ is an extension operator applying the same transfinite structure to $N$ projected onto boundary faces and vertices. The operator $L[N]$ is constructed analogously to $g$ by projecting the input $\lambda$ onto boundary subspaces, yielding a trial function in which $N - L[N]$ identically vanishes on $\partial P$, guaranteeing $u|_{\partial P} = B$.

This construction replaces previously used approximate distance function (ADF) techniques, addressing their limitations by ensuring that $\Delta u$ remains bounded everywhere in $P$, including at polygonal corners, a property verified both analytically and numerically (Sukumar et al., 5 Jan 2026).
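The cancellation of $N - L[N]$ on the boundary can be demonstrated on a toy quadrilateral. In the sketch below (an illustration under assumed helper names, not the paper's implementation), a random untrained MLP stands in for $N$, and its edge traces are lifted with the same transfinite formula; the trial function then matches the prescribed data on every edge regardless of the network weights:

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, n)), rng.normal(size=8)  # untrained toy MLP weights
w2, b2 = rng.normal(size=8), rng.normal()

def N(lam):
    """Tiny MLP taking the barycentric coordinate vector as input."""
    return w2 @ np.tanh(W1 @ lam + b1) + b2

def edge_lambda(i, t):
    """Coordinates of the point at parameter t along edge i (vertex i to i+1)."""
    lam = np.zeros(n)
    lam[i], lam[(i + 1) % n] = 1.0 - t, t
    return lam

def alpha(i, t):
    """Prescribed Dirichlet data B: sin(pi t) on edge 0, zero elsewhere."""
    return np.sin(np.pi * t) if i == 0 else 0.0

def transfinite(trace, lam):
    """Transfinite lift of the edge traces trace(i, t) into the interior."""
    return sum(lam[i] * (trace(i, lam[(i + 1) % n])
                         + trace((i - 1) % n, 1.0 - lam[(i - 1) % n])
                         - trace(i, 0.0))
               for i in range(n))

def u(lam):
    """Trial function u = N - L[N] + g; the first two terms cancel on the boundary."""
    L_N = transfinite(lambda i, t: N(edge_lambda(i, t)), lam)
    g = transfinite(alpha, lam)
    return N(lam) - L_N + g

# On the boundary, u equals the data for any network weights:
print(u(edge_lambda(0, 0.3)) - np.sin(0.3 * np.pi))  # ~ 0
print(u(edge_lambda(1, 0.5)))                        # ~ 0
print(u(np.full(n, 0.25)))                           # interior value, network-dependent
```

Because the boundary terms cancel identically, training only moves the interior values, which is exactly the "kinematically admissible trial space" property described above.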

4. Algorithmic Implementation

The Deep Ritz method with Wachspress-based transfinite formulation is operationalized as follows:

  • Preprocessing: Compute the polygon vertex locations $\{x_i\}$. For any spatial point $x$, evaluate the Wachspress coordinates $\lambda(x)$.
  • Boundary Lifting: Construct the transfinite interpolant $g(\lambda)$ for the given edge-based Dirichlet data $\alpha_i(t)$.
  • Neural Network Forward Pass:
    • Input $\lambda$ into the MLP: $\lambda \mapsto N(\lambda;\theta)$.
    • Compute the extension operator $L[N](\lambda;\theta)$ using the transfinite interpolation scheme.
    • Form the trial solution $u(\lambda;\theta) = N - L[N] + g$.
  • Variational or Collocation Loss:
    • Deep Ritz: Evaluate the energy $\int_P \left( \frac{1}{2}|\nabla u|^2 - f u \right) dx$.
    • PINN (Physics-Informed Neural Networks): Evaluate the collocation loss $L(\theta) = \frac{1}{M} \sum_{k=1}^{M} \left| \Delta_x u(x_k;\theta) + f(x_k) \right|^2$.
  • Optimization: Update θ\theta using first- or second-order optimizers (Adam, L-BFGS).
  • Parametric Domains: For families of shapes, append the domain parameters $p$ to the input, recompute $\lambda(x, p)$, and train a single network over a range of geometries.
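The collocation branch above can be sanity-checked numerically. The sketch below (not from the paper) evaluates the collocation loss with a central-difference Laplacian for a manufactured solution of $-\Delta u = f$ on the unit square; with the exact solution plugged in, only the $O(h^2)$ finite-difference truncation error remains:

```python
import numpy as np

def u(x, y):
    """Manufactured solution on the unit square."""
    return np.sin(np.pi * x) * np.sin(np.pi * y)

def f(x, y):
    """Source term chosen so that -Laplace(u) = f holds exactly."""
    return 2 * np.pi**2 * u(x, y)

h = 1e-3                                   # finite-difference step
rng = np.random.default_rng(0)
xk = rng.uniform(0.1, 0.9, size=200)       # interior collocation points
yk = rng.uniform(0.1, 0.9, size=200)

# five-point central-difference Laplacian at the collocation points
lap = ((u(xk + h, yk) - 2 * u(xk, yk) + u(xk - h, yk))
       + (u(xk, yk + h) - 2 * u(xk, yk) + u(xk, yk - h))) / h**2

loss = np.mean((lap + f(xk, yk)) ** 2)     # residual of -Laplace(u) = f
print(loss)                                # tiny: only FD truncation error remains
```

In an actual PINN or Deep Ritz implementation, $u$ would be the neural trial function and its Laplacian would be computed by automatic differentiation rather than finite differences.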

5. Numerical Performance and Comparative Results

Comparative studies demonstrate the efficacy of the Wachspress-based transfinite Deep Ritz method versus traditional techniques, such as the ADF-based trial. Numerical results include:

  • On a unit square with boundary data $\sin \pi x$ on one edge, transfinite interpolation yields bounded $\Delta u$ with training loss $O(10^{-9})$, compared to $O(10^{-6})$ for ADF, and maximal solution error $O(10^{-6})$ versus $O(10^{-3})$ for ADF near corners.
  • For highly oscillatory Dirichlet data ($\sin(10\pi x)$), the method resolves thin boundary layers robustly across activation functions.
  • On non-rectangular quadrilaterals, loss terms converge to $10^{-12}$, with maximal solution errors $O(10^{-9})$.
  • On pentagonal domains, $O(10^{-4})$ solution error is observed relative to finite element (Abaqus) benchmarks.
  • Inverse problems, such as heat-source reconstruction, achieve $\leq 0.5\%$ relative error.
  • For parametrically varying quadrilaterals, the single-network approach maintains $O(10^{-6})$ maximum error over $p \in [0, 1]$.

A summary table of prominent numerical results follows:

| Problem class | Max. error achieved | Notable features |
| --- | --- | --- |
| Unit-square Laplace, smooth BC | $O(10^{-6})$ | Bounded $\Delta u$; outperforms ADF |
| Oscillatory boundary (PINN + TFI) | $O(10^{-3})$ | Robust to activation choice; resolves boundary layers |
| Non-rectangular quadrilateral | $O(10^{-9})$ | Training loss down to $10^{-12}$ |
| Pentagonal Laplace vs. FEM | $O(10^{-4})$ | Agreement with Abaqus FE solution |
| Inverse heat-source | $\lesssim 0.5\%$ | Accurate source-parameter recovery |
| Parametric quadrilaterals | $O(10^{-6})$ | Solution uniform over shape family |

6. Boundary Condition Enforcement and Regularity

A central contribution is the exact (hard) enforcement of Dirichlet boundary conditions for neural network trial spaces on convex polygonal domains. The trial function is kinematically admissible by construction, with the property that $\Delta u$ remains bounded in $P$ and up to the boundary, including the vertices, thereby averting the singularities typical of R-function or ADF-based methods. This leads to improved convergence and numerical stability, especially when collocation points lie near geometric singularities (Sukumar et al., 5 Jan 2026).

On rectangles, the Wachspress-based transfinite interpolant specializes to the classical bilinear Coons patch, but generalizes seamlessly to arbitrary convex $n$-gons, enabling mesh-free enforcement of complex geometric boundary data.

7. Extensions and Applications

The Wachspress-based Deep Ritz methodology is applicable to forward, inverse, and parametric geometric PDE problems, including nonlinear elliptic equations. Extensions encompass:

  • Families of parametrized domains, where the geometric parameter is treated as part of the neural network input, enabling rapid solution and design space exploration.
  • Inverse problems (e.g., source identification) by combining PDE and data-driven losses.
  • Nonlinear equations, demonstrated on problems of the form $\Delta u - e^u + f = 0$.
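Residual losses for such nonlinear equations can be validated with manufactured solutions, just as in the linear case. The sketch below (an illustrative choice of data, not from the paper) verifies $\Delta u - e^u + f = 0$ pointwise for $u = x^2 + y^2$, where $\Delta u = 4$ and hence $f = e^u - 4$:

```python
import numpy as np

def u(x, y):
    """Manufactured solution with Laplacian equal to 4 everywhere."""
    return x**2 + y**2

def f(x, y):
    """Chosen so that Laplace(u) - exp(u) + f = 0 holds exactly."""
    return np.exp(u(x, y)) - 4.0

h = 1e-3
rng = np.random.default_rng(0)
xk = rng.uniform(0.1, 0.9, size=100)
yk = rng.uniform(0.1, 0.9, size=100)

# central-difference Laplacian; exact (up to rounding) since u is quadratic
lap = ((u(xk + h, yk) - 2 * u(xk, yk) + u(xk - h, yk))
       + (u(xk, yk + h) - 2 * u(xk, yk) + u(xk, yk - h))) / h**2

residual = lap - np.exp(u(xk, yk)) + f(xk, yk)
print(np.max(np.abs(residual)))   # near rounding level
```

A trained network would be substituted for the manufactured $u$, with the same residual driving the collocation loss.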

This framework enhances the accuracy, conditioning, and generality of PINN and Deep Ritz solvers for a wide spectrum of PDE boundary-value problems in computational physics, geometry, and engineering design (Sukumar et al., 5 Jan 2026).
