Deep Ritz Method for PDEs
- The Deep Ritz Method is a neural network approach that solves PDEs by minimizing energy functionals, enabling mesh-free and flexible solutions.
- It employs Wachspress coordinates and transfinite interpolation to exactly impose Dirichlet boundary conditions on convex polygonal domains.
- The technique shows better numerical stability and accuracy than traditional methods, with reported errors as low as O(10⁻⁹).
The Deep Ritz Method is a neural network-based framework for solving partial differential equations (PDEs) using variational principles. By parameterizing trial functions with neural networks and minimizing energy functionals, this method enables mesh-free and geometry-flexible solution of boundary-value problems, especially those arising in computational physics and engineering. Recent advances leverage transfinite interpolation and generalized barycentric coordinates—such as Wachspress coordinates—to ensure the exact enforcement of Dirichlet boundary conditions on convex polygonal domains, yielding kinematically admissible neural trial spaces with advantageous regularity and numerical conditioning (Sukumar et al., 5 Jan 2026).
1. Mathematical Foundation and Variational Formulation
The foundational principle of the Deep Ritz Method is the variational formulation of elliptic PDEs, where the solution is characterized as a minimizer of an energy functional over a suitable Sobolev space $H^1(\Omega)$. For Poisson-type equations:

$$-\nabla^2 u = f \ \text{in } \Omega, \qquad u = \bar{u} \ \text{on } \partial\Omega,$$

the Ritz method seeks $u$ minimizing

$$\Pi[u] = \int_\Omega \left( \tfrac{1}{2}\,|\nabla u|^2 - f\,u \right) d\mathbf{x}$$

subject to the Dirichlet data $\bar{u}$. By parameterizing $u$ as a neural network trial function $u_\theta$ and enforcing boundary conditions in the trial space, the minimization becomes an unconstrained optimization over the network parameters $\theta$.
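The energy-minimization idea can be illustrated with a deliberately small classical Ritz example. This is a sketch under stated assumptions, not the Deep Ritz implementation: the neural network is swapped for a two-parameter polynomial trial function, the problem is the 1D analogue $-u'' = 1$ on $(0, 1)$ with homogeneous Dirichlet data, and the names `basis`, `energy_and_grad`, and `minimize_energy` are hypothetical.

```python
# Classical Ritz sketch for -u'' = f on (0, 1) with u(0) = u(1) = 0.
# Assumption: the trial function u_c(x) = x (1 - x) (c0 + c1 x) stands in
# for a neural network; it satisfies the boundary conditions for every c,
# so the energy can be minimized without constraints.

def basis(x):
    """Trial basis functions and their derivatives at x."""
    phi = (x * (1 - x), x * x * (1 - x))
    dphi = (1 - 2 * x, 2 * x - 3 * x * x)
    return phi, dphi

def energy_and_grad(c, f, n_quad=200):
    """Midpoint-rule energy Pi[u_c] and its gradient with respect to c."""
    h = 1.0 / n_quad
    energy, grad = 0.0, [0.0, 0.0]
    for k in range(n_quad):
        x = (k + 0.5) * h
        phi, dphi = basis(x)
        u = c[0] * phi[0] + c[1] * phi[1]
        du = c[0] * dphi[0] + c[1] * dphi[1]
        energy += (0.5 * du * du - f(x) * u) * h
        for a in range(2):
            grad[a] += (du * dphi[a] - f(x) * phi[a]) * h
    return energy, grad

def minimize_energy(f, steps=2000, lr=2.0):
    """Plain gradient descent on the Ritz energy, starting from c = 0."""
    c = [0.0, 0.0]
    for _ in range(steps):
        _, grad = energy_and_grad(c, f)
        c = [ca - lr * ga for ca, ga in zip(c, grad)]
    return c

def unit_source(x):
    return 1.0

c_opt = minimize_energy(unit_source)
print(c_opt)  # close to [0.5, 0.0], i.e. the exact solution u = x (1 - x) / 2
```

Because the exact solution lies in this trial space, the minimizer recovers it up to quadrature error. In the Deep Ritz setting the same loop applies with network parameters in place of `c` and automatic differentiation in place of the hand-written gradient.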
2. Wachspress Coordinates and Transfinite Interpolation
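Wachspress coordinates provide a generalized barycentric coordinate system for convex $n$-gons $P$, allowing affine interpolation on polygonal domains and rational interpolation in the interior. Given vertices $\{\mathbf{v}_i\}_{i=1}^{n}$ and associated weights $w_i$:

$$\lambda_i(\mathbf{x}) = \frac{w_i(\mathbf{x})}{\sum_{j=1}^{n} w_j(\mathbf{x})}, \qquad w_i(\mathbf{x}) = \frac{B_i}{A_{i-1}(\mathbf{x})\,A_i(\mathbf{x})},$$

with $B_i = A(\mathbf{v}_{i-1}, \mathbf{v}_i, \mathbf{v}_{i+1})$ and $A_i(\mathbf{x}) = A(\mathbf{x}, \mathbf{v}_i, \mathbf{v}_{i+1})$ denoting signed triangle areas (indices taken cyclically), $\boldsymbol{\lambda} = (\lambda_1, \dots, \lambda_n)$ forms a partition of unity, reproduces vertex positions, and yields exactly two nonzero barycentric coordinates on each edge. These properties make Wachspress coordinates suitable as neural network inputs, encoding both geometry and boundary structure (Sukumar et al., 5 Jan 2026).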
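The weights built from signed triangle areas translate directly into code. The sketch below is an illustration rather than the authors' implementation: the function names are hypothetical, vertices are assumed to be listed counter-clockwise, and each weight is multiplied by the product of all areas $A_j(\mathbf{x})$ (which cancels in the normalization) so that evaluation on edges and at vertices avoids division by zero.

```python
import math

def signed_area(a, b, c):
    """Signed area of triangle (a, b, c); positive if counter-clockwise."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))

def wachspress(verts, x):
    """Wachspress coordinates of x in a convex polygon with CCW vertices."""
    n = len(verts)
    # A[i]: area of (x, v_i, v_{i+1}); B[i]: area of (v_{i-1}, v_i, v_{i+1}).
    A = [signed_area(x, verts[i], verts[(i + 1) % n]) for i in range(n)]
    B = [signed_area(verts[i - 1], verts[i], verts[(i + 1) % n]) for i in range(n)]
    weights = []
    for i in range(n):
        # Scaled weight: B_i times every A_j except A_{i-1} and A_i.
        w = B[i]
        for j in range(n):
            if j != i and j != (i - 1) % n:
                w *= A[j]
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

# Regular pentagon inscribed in the unit circle, vertices counter-clockwise.
pentagon = [(math.cos(2 * math.pi * k / 5), math.sin(2 * math.pi * k / 5))
            for k in range(5)]
lam = wachspress(pentagon, (0.1, 0.2))
print(sum(lam))                                       # partition of unity
print(sum(l * v[0] for l, v in zip(lam, pentagon)),   # reproduces x
      sum(l * v[1] for l, v in zip(lam, pentagon)))   # reproduces y
```

Evaluating `wachspress` at a point on an edge returns (up to round-off) only the two coordinates of that edge's end vertices, which is the property the boundary construction relies on.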
The transfinite interpolant $g$ is constructed to lift a prescribed boundary function $\bar{u}$ into the interior while exactly reproducing $\bar{u}$ on $\partial P$. The piecewise Dirichlet data, specified on each edge $e_i$ by a function $\bar{u}_i$, is blended with the Wachspress coordinates in a Boolean-sum style formula that sums over the faces incident on each vertex.
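For concreteness, one Boolean-sum expression with these properties can be written down. It is an assumption consistent with the description above, not necessarily the paper's exact formula. The notation is chosen here: $\lambda_i$ are the Wachspress coordinates, $\bar{u}_i(t)$ is the Dirichlet data on edge $e_i$ parametrized by $t \in [0, 1]$ from $\mathbf{v}_i$ to $\mathbf{v}_{i+1}$, and the data is assumed continuous at the vertices, $\bar{u}_{i-1}(1) = \bar{u}(\mathbf{v}_i) = \bar{u}_i(0)$.

```latex
\begin{align*}
g(\mathbf{x}) &= \sum_{i=1}^{n} (\lambda_i + \lambda_{i+1})\,
   \bar{u}_i\!\left( \frac{\lambda_{i+1}}{\lambda_i + \lambda_{i+1}} \right)
   - \sum_{i=1}^{n} \lambda_i\, \bar{u}(\mathbf{v}_i). \\
\intertext{On edge $e_k$ only $\lambda_k = 1 - t$ and $\lambda_{k+1} = t$ are nonzero, so}
g &= \bar{u}_k(t) + \lambda_k\, \bar{u}_{k-1}(1) + \lambda_{k+1}\, \bar{u}_{k+1}(0)
   - \lambda_k\, \bar{u}(\mathbf{v}_k) - \lambda_{k+1}\, \bar{u}(\mathbf{v}_{k+1})
   = \bar{u}_k(t).
\end{align*}
```

The first sum blends the edge data, and the second removes the vertex values that the two edges meeting at each vertex would otherwise count twice.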
3. Neural Network Ansatz with Exact Dirichlet Imposition
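To ensure "hard" or exact compliance with Dirichlet boundary conditions, the neural trial function is formulated as:

$$u_\theta(\mathbf{x}) = g(\mathbf{x}) + N_\theta(\boldsymbol{\lambda}(\mathbf{x})) - \mathcal{L}[N_\theta](\mathbf{x}),$$

where $N_\theta$ is a multilayer perceptron (MLP) receiving the Wachspress coordinates $\boldsymbol{\lambda}(\mathbf{x})$ as input, $g$ is the transfinite interpolant of the Dirichlet data $\bar{u}$, and $\mathcal{L}[N_\theta]$ is an extension operator applying the same transfinite structure to $N_\theta$ projected onto boundary faces and vertices. The operator $\mathcal{L}$ is constructed analogously to $g$ by projecting the input $\boldsymbol{\lambda}$ onto boundary subspaces, yielding a trial function in which $N_\theta - \mathcal{L}[N_\theta]$ identically vanishes on $\partial P$, guaranteeing $u_\theta = \bar{u}$ on $\partial P$.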
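A small numerical check makes the exact-imposition property tangible. The sketch below is an illustration under explicit assumptions, not the paper's code: the Boolean-sum blend in `blend` is one formula consistent with the description (edge terms weighted by the sum of the two end-vertex coordinates, minus vertex terms), the MLP is an untrained random single-hidden-layer network, and all function names are hypothetical.

```python
import math
import random

def signed_area(a, b, c):
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))

def wachspress(verts, x):
    """Wachspress coordinates (CCW convex polygon), division-free on edges."""
    n = len(verts)
    A = [signed_area(x, verts[i], verts[(i + 1) % n]) for i in range(n)]
    B = [signed_area(verts[i - 1], verts[i], verts[(i + 1) % n]) for i in range(n)]
    w = []
    for i in range(n):
        wi = B[i]
        for j in range(n):
            if j != i and j != (i - 1) % n:
                wi *= A[j]
        w.append(wi)
    total = sum(w)
    return [wi / total for wi in w]

def blend(lam, edge_value, vertex_value):
    """Assumed Boolean-sum blend: edge terms minus doubly counted vertex terms."""
    n, out = len(lam), 0.0
    for i in range(n):
        j = (i + 1) % n
        s = lam[i] + lam[j]
        if s > 1e-14:  # an edge with zero weight contributes nothing
            out += s * edge_value(i, lam[j] / s)
        out -= lam[i] * vertex_value(i)
    return out

def make_mlp(n_in, width=8, seed=0):
    """Untrained random tanh network acting on barycentric coordinates."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(width)]
    b = [rng.uniform(-1, 1) for _ in range(width)]
    c = [rng.uniform(-1, 1) for _ in range(width)]
    def net(lam):
        return sum(ck * math.tanh(sum(wk * l for wk, l in zip(Wk, lam)) + bk)
                   for Wk, bk, ck in zip(W, b, c))
    return net

def trial(verts, x, boundary_fn, net):
    """u = g + N(lambda) - L[N]; equals boundary_fn on the polygon boundary."""
    n = len(verts)
    lam = wachspress(verts, x)
    def edge_point(i, t):
        a, b = verts[i], verts[(i + 1) % n]
        return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
    def edge_lam(i, t):  # coordinates projected onto edge i
        e = [0.0] * n
        e[i], e[(i + 1) % n] = 1.0 - t, t
        return e
    g = blend(lam, lambda i, t: boundary_fn(edge_point(i, t)),
              lambda i: boundary_fn(verts[i]))
    ext = blend(lam, lambda i, t: net(edge_lam(i, t)),
                lambda i: net(edge_lam(i, 0.0)))
    return g + net(lam) - ext

def boundary_data(p):
    return p[0] ** 2 - p[1] ** 2

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
net = make_mlp(4)
samples = [(0.3, 0.0), (1.0, 0.6), (0.25, 1.0), (0.0, 0.5), (1.0, 1.0)]
max_gap = max(abs(trial(square, p, boundary_data, net) - boundary_data(p))
              for p in samples)
print(max_gap)  # boundary mismatch for an untrained network
```

The boundary values match the prescribed data for any network weights, so training only has to reduce the interior loss.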
This construction replaces the previously used approximate distance function (ADF) techniques and addresses their limitations by ensuring that the Laplacian of the trial function, $\nabla^2 u_\theta$, remains bounded everywhere in the closed polygon $\bar{P}$, including at polygonal corners, a property verified both analytically and numerically (Sukumar et al., 5 Jan 2026).
4. Algorithmic Implementation
The Deep Ritz method with Wachspress-based transfinite formulation is operationalized as follows:
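- Preprocessing: Compute polygon vertex locations $\{\mathbf{v}_i\}_{i=1}^{n}$. For any spatial point $\mathbf{x}$, evaluate Wachspress coordinates $\boldsymbol{\lambda}(\mathbf{x})$.
- Boundary Lifting: Construct the transfinite interpolant $g$ for given edge-based Dirichlet data $\bar{u}_i$.
- Neural Network Forward Pass:
  - Input $\boldsymbol{\lambda}(\mathbf{x})$ into MLP: $N_\theta(\boldsymbol{\lambda}(\mathbf{x}))$.
  - Compute extension operator $\mathcal{L}[N_\theta]$ using the transfinite interpolation scheme.
  - Form the trial solution $u_\theta = g + N_\theta - \mathcal{L}[N_\theta]$.
- Variational or Collocation Loss:
  - Deep Ritz: Integrate $\Pi[u_\theta] = \int_P \left( \tfrac{1}{2}\,|\nabla u_\theta|^2 - f\,u_\theta \right) d\mathbf{x}$.
  - PINN (Physics-Informed Neural Networks): Collocation loss $\frac{1}{N} \sum_{k=1}^{N} \left( \nabla^2 u_\theta(\mathbf{x}_k) + f(\mathbf{x}_k) \right)^2$.
- Optimization: Update $\theta$ using first- or second-order optimizers (Adam, L-BFGS).
- Parametric Domains: For families of shapes, append domain parameters $\boldsymbol{\mu}$ to input, recompute $\boldsymbol{\lambda}$, and train a single network for a range of geometries.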
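The two loss choices in the steps above can be compared on a toy problem. This is a hedged sketch, not the paper's implementation: it assumes the Poisson form $-\nabla^2 u = f$ on the unit square, replaces automatic differentiation with central finite differences, evaluates plain Python functions instead of a trained network, and uses hypothetical function names.

```python
import random

def grad_fd(u, x, y, h=1e-5):
    """Central-difference gradient of u at (x, y)."""
    return ((u(x + h, y) - u(x - h, y)) / (2 * h),
            (u(x, y + h) - u(x, y - h)) / (2 * h))

def laplacian_fd(u, x, y, h=1e-3):
    """Five-point finite-difference Laplacian of u at (x, y)."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
            - 4 * u(x, y)) / (h * h)

def ritz_energy(u, f, n=40):
    """Midpoint-rule approximation of the energy on the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            ux, uy = grad_fd(u, x, y)
            total += (0.5 * (ux * ux + uy * uy) - f(x, y) * u(x, y)) * h * h
    return total

def pinn_loss(u, f, points):
    """Mean squared residual of -lap(u) = f at the collocation points."""
    return sum((laplacian_fd(u, x, y) + f(x, y)) ** 2
               for x, y in points) / len(points)

def harmonic(x, y):
    """Exact solution of the Laplace problem with data x^2 - y^2."""
    return x * x - y * y

def perturbed(x, y):
    """Same boundary values, plus a bubble that vanishes on the boundary."""
    return harmonic(x, y) + x * (1 - x) * y * (1 - y)

def zero_source(x, y):
    return 0.0

rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]
print(pinn_loss(harmonic, zero_source, points),
      pinn_loss(perturbed, zero_source, points))
print(ritz_energy(harmonic, zero_source),
      ritz_energy(perturbed, zero_source))
```

Both candidates satisfy the same Dirichlet data, and the exact solution has the smaller residual and the lower energy. This is why either loss can drive training once the boundary conditions are built into the trial space.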
5. Numerical Performance and Comparative Results
Comparative studies demonstrate the efficacy of the Wachspress-based transfinite Deep Ritz method versus traditional techniques, such as the ADF-based trial. Numerical results include:
- On a unit square with Dirichlet data prescribed on one edge, the transfinite formulation keeps the Laplacian of the trial function bounded and reaches a lower training loss than the ADF-based trial, with a smaller maximal solution error near corners.
- For highly oscillatory Dirichlet data, the method resolves thin boundary layers robustly across activation functions.
- On non-rectangular quadrilaterals, the loss terms converge, with correspondingly low maximal solution errors.
- On pentagonal domains, the solution agrees with finite element (Abaqus) benchmarks.
- Inverse problems, such as heat-source reconstruction, recover the source parameters accurately.
- For parametrically varying quadrilaterals, the single-network approach maintains a uniform maximum error over the shape family.
A summary table of prominent numerical results follows:
| Problem Class | Notable Features |
|---|---|
| Unit square Laplace, smooth BC | Bounded Laplacian, outperforms ADF |
| Oscillatory boundary (PINN+TFI) | Robust to activation, resolves boundary layers |
| Non-rectangular quadrilateral | Training loss converges |
| Pentagonal Laplace vs. FEM | Agreement with Abaqus FE solution |
| Inverse heat-source | Accurate source parameter recovery |
| Parametric quadrilaterals | Solution uniform over shape family |
6. Boundary Condition Enforcement and Regularity
A central contribution is the exact (hard) enforcement of Dirichlet boundary conditions for neural network trial spaces on convex polygonal domains. The trial function is kinematically admissible by construction, with the property that its Laplacian, $\nabla^2 u_\theta$, remains bounded in $P$ and up to the boundary (including vertices), thereby averting the singularities typical of R-function or ADF-based methods. This leads to improved convergence and numerical stability, especially when using collocation points near geometric singularities (Sukumar et al., 5 Jan 2026).
On rectangles, the Wachspress-based transfinite interpolant specializes to the classical bilinear Coons patch, but generalizes seamlessly to arbitrary convex $n$-gons, enabling mesh-free enforcement of complex geometric boundary data.
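The rectangular special case can be checked by hand. The worked computation below takes the unit square with vertices $(0,0)$, $(1,0)$, $(1,1)$, $(0,1)$ and the standard Wachspress weights $w_i = B_i / (A_{i-1} A_i)$ built from signed triangle areas. The edge-function names $b$, $r$, $t$, $l$ (bottom, right, top, left) are notation introduced here for illustration.

```latex
\begin{align*}
A_1 &= \tfrac{y}{2}, \quad A_2 = \tfrac{1-x}{2}, \quad A_3 = \tfrac{1-y}{2}, \quad
A_4 = \tfrac{x}{2}, \qquad B_i = \tfrac{1}{2}, \\
w_1 &= \frac{2}{x\,y}, \quad w_2 = \frac{2}{(1-x)\,y}, \quad
w_3 = \frac{2}{(1-x)(1-y)}, \quad w_4 = \frac{2}{x\,(1-y)}, \qquad
\sum_{j} w_j = \frac{2}{x(1-x)\,y(1-y)}, \\
\lambda_1 &= (1-x)(1-y), \quad \lambda_2 = x\,(1-y), \quad
\lambda_3 = x\,y, \quad \lambda_4 = (1-x)\,y, \\
C(x, y) &= (1-y)\,b(x) + x\,r(y) + y\,t(x) + (1-x)\,l(y)
   - \sum_{i=1}^{4} \lambda_i\, \bar{u}(\mathbf{v}_i).
\end{align*}
```

The coordinates reduce to the bilinear shape functions, and the last line is the bilinearly blended Coons patch, whose corner correction is exactly the Wachspress-weighted sum of vertex values.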
7. Extensions and Applications
The Wachspress-based Deep Ritz methodology is applicable to forward, inverse, and parametric geometric PDE problems, including nonlinear elliptic equations. Extensions encompass:
- Families of parametrized domains, where the geometric parameter is treated as part of the neural network input, enabling rapid solution and design space exploration.
- Inverse problems (e.g., source identification) by combining PDE and data-driven losses.
- Nonlinear equations, demonstrated on nonlinear elliptic problems.
This framework enhances the accuracy, conditioning, and generality of PINN and Deep Ritz solvers for a wide spectrum of PDE-boundary value problems in computational physics, geometry, and engineering design (Sukumar et al., 5 Jan 2026).