DP-FETA: Dual-Primal FETI-SQP Solver
- DP-FETA is a domain decomposition framework that integrates dual-primal FETI-DP preconditioners with SQP methods using quasi-Newton Hessian approximations to robustly solve nonlinear elliptic and structural problems.
- The algorithm partitions the global degrees of freedom into interior, primal, and dual sets and enforces interface continuity via Lagrange multipliers, facilitating efficient parallelism.
- Numerical experiments show condition numbers that grow only polylogarithmically and fewer Hessian factorizations, supporting scalable performance even on heterogeneous, irregular meshes.
DP-FETA (Dual-Primal Finite Element Tearing and Interconnecting Algorithm) refers to a class of domain decomposition schemes that integrate dual-primal finite element tearing and interconnecting (FETI-DP) preconditioners with sequential quadratic programming (SQP) frameworks, including quasi-Newton Hessian approximations. These methods are designed for the scalable, robust parallel solution of (possibly nonlinear) elliptic and structural mechanics problems, particularly those discretized using finite element or virtual element methods on highly irregular meshes (Prada et al., 2018, Köhler et al., 15 Aug 2025).
1. Problem Formulation and Domain Decomposition
The foundational setup of DP-FETA is a (possibly nonlinear) PDE, for example the model diffusion problem:

$$-\nabla\cdot(\rho\,\nabla u) = f$$

posed on a domain $\Omega$. Here, the diffusion coefficient $\rho$ is piecewise bounded and positive. Discretization relies on a mesh of heterogeneously shaped elements (e.g., arbitrary polygons in the Virtual Element Method (VEM), or bricks/tetrahedra in standard FEM) (Prada et al., 2018, Köhler et al., 15 Aug 2025).
Domain decomposition splits the domain $\Omega$ into non-overlapping subdomains $\Omega_i$ of diameter $H$, and partitions the global degrees of freedom (dofs) into:
- Interior dofs: supported strictly inside subdomains,
- Interface dofs: located on the skeleton between subdomains, further categorized as:
  - Primal dofs: nodes where three or more subdomains meet,
  - Dual dofs: nodes where exactly two subdomains meet.
This splitting is crucial for parallelization, as interior solves and primal constraints can be processed independently across subdomains, while interface continuity is managed by Lagrange multipliers (dual variables).
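The multiplicity rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the cited papers: the function name `classify_dofs`, the input dictionaries, and the 2x2 example mesh are all hypothetical, and physical boundary conditions are ignored.

```python
from collections import defaultdict

def classify_dofs(element_nodes, element_subdomain):
    """Label each node by how many subdomains touch it (hypothetical helper)."""
    owners = defaultdict(set)
    for elem, nodes in element_nodes.items():
        for node in nodes:
            owners[node].add(element_subdomain[elem])
    labels = {}
    for node, subs in owners.items():
        if len(subs) == 1:
            labels[node] = "interior"   # seen by a single subdomain
        elif len(subs) == 2:
            labels[node] = "dual"       # exactly two subdomains meet
        else:
            labels[node] = "primal"     # three or more subdomains meet
    return labels

# Assumed example: a 3x3 node grid (numbered row by row), four quadrilateral
# elements, each element forming its own subdomain.
element_nodes = {
    0: (0, 1, 4, 3),
    1: (1, 2, 5, 4),
    2: (3, 4, 7, 6),
    3: (4, 5, 8, 7),
}
element_subdomain = {0: 0, 1: 1, 2: 2, 3: 3}
labels = classify_dofs(element_nodes, element_subdomain)
print(labels[4], labels[1], labels[0])
```

In this example the centre node is the only crosspoint, the four edge midpoints are shared by two subdomains, and the corners belong to one subdomain each.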
2. Lagrangian Framework and KKT System
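For nonlinear variants, the global variational problem with interface continuity constraints is written as:

$$\min_{x}\; J(x) \quad \text{subject to} \quad B\,x = 0, \qquad J(x) = \sum_{i=1}^{N} J_i\bigl(u^{(i)}, \tilde u_\Pi\bigr),$$

where $x = (u^{(1)},\dots,u^{(N)},\tilde u_\Pi)$ and $J$ sums local nonlinear energy functionals $J_i$ for subdomain variables $u^{(i)}$ (interior and dual) and shared primal unknowns $\tilde u_\Pi$. Continuity across subdomains is enforced by a Boolean jump matrix $B$, acting nontrivially only on dual dofs (Köhler et al., 15 Aug 2025).

The associated Lagrangian reads:

$$\mathcal{L}(x,\lambda) = J(x) + \lambda^T B\,x,$$

leading to KKT conditions:
- Gradient: $\nabla_x \mathcal{L}(x,\lambda) = \nabla J(x) + B^T\lambda = 0$,
- Constraint: $B\,x = 0$.

The Newton step for $(x,\lambda)$ is given by the block system:

$$\begin{pmatrix} \nabla^2 J(x) & B^T \\ B & 0 \end{pmatrix}\begin{pmatrix}\delta x \\ \delta\lambda\end{pmatrix} = -\begin{pmatrix}\nabla J(x) + B^T\lambda \\ B\,x\end{pmatrix}.$$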
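As a minimal sketch of such a Newton step, the snippet below assembles and solves the saddle-point system for a toy problem. Everything in it is an assumption made for illustration: the energy is taken to be quadratic, $J(x) = \tfrac12 x^T A x - f^T x$, so that the Hessian is simply `A`; the matrices `A`, `f`, `B` and the function `newton_kkt_step` are hypothetical; and a dense solve stands in for the structured solvers a real implementation would use.

```python
import numpy as np

# Hypothetical toy data: two 2-dof subdomains with quadratic energies,
# torn at one interface node.
A = np.kron(np.eye(2), np.array([[2.0, -1.0], [-1.0, 2.0]]))
f = np.array([1.0, 0.0, 0.0, 1.0])
B = np.array([[0.0, 1.0, -1.0, 0.0]])   # jump constraint: x[1] - x[2] = 0

def newton_kkt_step(x, lam):
    """Solve the block KKT system for the increments (dx, dlam)."""
    n, m = A.shape[0], B.shape[0]
    K = np.block([[A, B.T], [B, np.zeros((m, m))]])
    # Right-hand side: negative Lagrangian gradient and constraint residual.
    rhs = -np.concatenate([A @ x - f + B.T @ lam, B @ x])
    step = np.linalg.solve(K, rhs)
    return step[:n], step[n:]

x, lam = np.zeros(4), np.zeros(1)
dx, dlam = newton_kkt_step(x, lam)
x, lam = x + dx, lam + dlam
print(np.linalg.norm(B @ x), np.linalg.norm(A @ x - f + B.T @ lam))
```

Because the toy energy is quadratic, a single step lands on the KKT point; for a genuinely nonlinear energy the step would be repeated with an updated Hessian.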
3. DP-FETA Algorithmic Framework
DP-FETA solves the nonlinear, constrained saddle-point system using a sequential quadratic programming (SQP) loop:
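- At each outer iterate $k$, with current unknowns $x_k$ and multipliers $\lambda_k$, formulate a quadratic program (QP) with current Hessian approximation $H_k$:

$$\min_{\delta x}\; \tfrac{1}{2}\,\delta x^T H_k\,\delta x + g_k^T\,\delta x \quad \text{subject to} \quad B\,(x_k + \delta x) = 0,$$

where $g_k = \nabla J(x_k)$ and $H_k$ is the Hessian or a quasi-Newton surrogate.
- Solve the KKT system (possibly via block Gaussian elimination):

$$\begin{pmatrix} H_k & B^T \\ B & 0 \end{pmatrix}\begin{pmatrix}\delta x \\ \delta\lambda\end{pmatrix} = -\begin{pmatrix} g_k + B^T\lambda_k \\ B\,x_k\end{pmatrix}.$$

- Schur complement reduction yields a dual problem for the Lagrange multiplier increment $\delta\lambda$:

$$\bigl(B\,H_k^{-1} B^T\bigr)\,\delta\lambda = B\,x_k - B\,H_k^{-1}\bigl(g_k + B^T\lambda_k\bigr),$$

solved by preconditioned Krylov methods (CG or MINRES) with a Dirichlet preconditioner precomputed from the initial Hessian (Köhler et al., 15 Aug 2025).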
- Update primal variables using local solves, exploiting the block-diagonal structure of the Hessian approximation in subdomain variables.
- Use an Armijo line search on an exact penalty objective to enforce global convergence. The penalty parameter is dynamically updated.
- Update the Hessian approximation:
  - If sufficient decrease in the penalty objective and the KKT gradient norm is not achieved, recompute the true Hessian and its factorization (restart).
  - Otherwise, perform a BFGS quasi-Newton update using secant pairs $(s_k, y_k)$ of step and gradient differences (Köhler et al., 15 Aug 2025).
- Iterate the process until convergence to a KKT point.
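The loop above can be sketched on a small toy problem. This is an illustration under stated assumptions rather than the implementation from the cited work. The energy, the data `A`, `f`, `B`, the helper names `cg` and `sqp_bfgs`, the fixed penalty parameter `mu`, and the restart threshold `theta` are all hypothetical. A dense inverse stands in for factorized subdomain solves, the dual solve is unpreconditioned, and the restart test looks only at the KKT norm.

```python
import numpy as np

def cg(apply_S, b, rtol=1e-10, maxiter=50):
    """Plain conjugate gradients for an SPD operator given as a callable."""
    x, r = np.zeros_like(b), b.copy()
    p, rr = r.copy(), r @ r
    stop = (rtol * np.linalg.norm(b)) ** 2
    for _ in range(maxiter):
        if rr <= stop:
            break
        Sp = apply_S(p)
        alpha = rr / (p @ Sp)
        x += alpha * p
        r -= alpha * Sp
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

# Hypothetical toy data: three 2-dof "subdomains" torn at two interface nodes.
A = np.kron(np.eye(3), np.array([[2.0, -1.0], [-1.0, 2.0]]))
f = np.array([1.0, 0.5, -0.3, 0.8, 0.2, -1.0])
B = np.array([[0.0, 1.0, -1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0, -1.0, 0.0]])  # x[1] = x[2], x[3] = x[4]

def J(x):
    return 0.5 * x @ A @ x - f @ x + 0.25 * np.sum(x**4)

def grad(x):
    return A @ x - f + x**3

def hess(x):
    return A + np.diag(3.0 * x**2)

def kkt_norm(x, lam):
    return np.linalg.norm(np.concatenate([grad(x) + B.T @ lam, B @ x]))

def sqp_bfgs(x, lam, mu=10.0, tol=1e-8, maxit=100, theta=0.5):
    H = hess(x)                                   # start from the exact Hessian
    for k in range(maxit):
        res = kkt_norm(x, lam)
        if res < tol:
            return x, lam, k
        g = grad(x)
        Hinv = np.linalg.inv(H)                   # stand-in for factorized solves
        rhs = B @ x - B @ Hinv @ (g + B.T @ lam)  # dual right-hand side
        dlam = cg(lambda v: B @ (Hinv @ (B.T @ v)), rhs)
        d = -Hinv @ (g + B.T @ (lam + dlam))      # back-substitute for the step
        # Armijo backtracking on an l1 exact-penalty merit function.
        merit = lambda z: J(z) + mu * np.abs(B @ z).sum()
        slope = g @ d - mu * np.abs(B @ x).sum()
        alpha = 1.0
        while merit(x + alpha * d) > merit(x) + 1e-4 * alpha * slope and alpha > 1e-10:
            alpha *= 0.5
        s = alpha * d
        y = grad(x + s) - g
        x, lam = x + s, lam + alpha * dlam
        weak_curvature = y @ s <= 1e-12 * np.linalg.norm(s) * np.linalg.norm(y)
        if kkt_norm(x, lam) > theta * res or weak_curvature:
            H = hess(x)                           # restart with the true Hessian
        else:
            Hs = H @ s                            # BFGS update with pair (s, y)
            H = H - np.outer(Hs, Hs) / (s @ Hs) + np.outer(y, y) / (y @ s)
    return x, lam, maxit

x, lam, its = sqp_bfgs(np.array([0.5, 0.0, 0.5, 0.0, 0.5, 0.0]), np.zeros(2))
print(its, np.linalg.norm(B @ x), kkt_norm(x, lam))
```

The starting point deliberately violates the continuity constraint so that the penalty term in the merit function is active in the first step. In a parallel setting the products with `Hinv` would be replaced by independent subdomain solves plus a coarse primal solve.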
The full process preserves the parallelism and scalability of FETI-DP, and can be applied on highly heterogeneous, possibly nonlinear or high-order discretizations.
4. Preconditioning and Scalability Properties
The FETI-DP preconditioner applied to the saddle-point system is assembled from three operators (Prada et al., 2018):
- a block-diagonal operator consisting of subdomain Neumann problems;
- a scaled version of the Boolean jump matrix;
- the projector onto the primal constraint space.
Under a quasi-uniform fine mesh of size $h$ and shape-regular decompositions into subdomains of diameter $H$, the condition number estimate is:

$$\kappa \le C\,\bigl(1 + \log(H/h)\bigr)^2,$$

with $C$ independent of the number of subdomains, mesh size, coefficient jumps, and element geometry. Higher-order VEM discretizations contribute an additional logarithmic growth of the condition number in the polynomial degree $k$. This polylogarithmic dependence keeps the method robust under severe diffusion contrast and geometric irregularity (Prada et al., 2018).
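To get a feel for how slowly a polylogarithmic bound grows, the following sketch tabulates it for a few ratios. It assumes the bound has the standard FETI-DP form $C\,(1+\log(H/h))^2$; the function name `feti_dp_bound` and the constant `C = 1` are chosen purely for illustration and say nothing about the actual constant.

```python
import math

def feti_dp_bound(H_over_h, C=1.0):
    """Evaluate C * (1 + log(H/h))**2 for a given subdomain-to-mesh ratio."""
    return C * (1.0 + math.log(H_over_h)) ** 2

ratios = (4, 16, 64, 256)
bounds = [feti_dp_bound(r) for r in ratios]
for r, b in zip(ratios, bounds):
    print(f"H/h = {r:4d}  bound = {b:6.2f}")
```

Raising $H/h$ by a factor of 64 (from 4 to 256) increases this bound by less than a factor of 8, which is why iteration counts stay moderate as subdomains are refined.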
5. Numerical Performance and Practical Recommendations
Numerical experiments on Voronoi and conformally partitioned polygonal meshes, including close-to-degenerate geometries and heterogeneous coefficients over hundreds of subdomains, demonstrate:
- Iteration counts and preconditioned condition numbers scale polylogarithmically with $H/h$ and the polynomial degree $k$.
- The performance is robust with respect to both jumps in the diffusion coefficient and mesh-element shapes, verified by nearly identical results in both homogeneous and heterogeneous test cases.
- For fixed dof counts, higher-order schemes see only logarithmic growth in iteration count as $k$ increases.
- Parallel scalability is near-optimal as the dominant cost is in subdomain-local factorizations and solves, which are fully independent (Prada et al., 2018, Köhler et al., 15 Aug 2025).
For nonlinear problems, incorporating quasi-Newton Hessian approximations via BFGS within the SQP loop:
- Reduces the frequency of expensive Hessian factorizations by roughly a factor of two compared to Newton-Krylov methods, with only moderate increase in Krylov iterations and outer steps.
- Achieves substantial overall speed-up (up to 27% in 2D and >50% in 3D), maintaining scalability on up to several hundred thousand cores (Köhler et al., 15 Aug 2025).
5.1. Summary Table: Numerical Results (as reported in Köhler et al., 15 Aug 2025)
| Subdomains | 2D Solve Time (s) | SQP+BFGS Steps | Krylov Iterations | Hessian Recomputations |
|---|---|---|---|---|
| 40 | 321.1 | 25 | 589 | 8 |
| 160 | 421.0 | 26 | 624 | 8 |
| 360 | 381.0 | 24 | 578 | 7 |
| 640 | 422.4 | 25 | 615 | 8 |
A similar pattern holds in 3D, with SQP+BFGS requiring fewer Hessian updates and factorizations per solve.
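Two derived quantities make the 2D table easier to read: Krylov iterations per outer step and the share of outer steps that recompute the Hessian. The sketch below computes both directly from the four rows above; the variable and key names are illustrative only.

```python
# Rows copied from the 2D table:
# (subdomains, solve_time_s, sqp_steps, krylov_its, hessian_recomputations)
rows = [
    (40, 321.1, 25, 589, 8),
    (160, 421.0, 26, 624, 8),
    (360, 381.0, 24, 578, 7),
    (640, 422.4, 25, 615, 8),
]

def derived(row):
    """Per-step Krylov cost and fraction of steps with a Hessian recomputation."""
    n_sub, _time, steps, krylov, recomp = row
    return {
        "subdomains": n_sub,
        "krylov_per_step": krylov / steps,
        "recomp_fraction": recomp / steps,
    }

summary = [derived(r) for r in rows]
for s in summary:
    print(f"{s['subdomains']:4d}  {s['krylov_per_step']:.2f}  {s['recomp_fraction']:.2f}")
```

Across all four rows the Krylov count stays close to 24 iterations per outer step and a Hessian is recomputed in roughly 30% of the steps, consistent with the flat iteration counts reported as the number of subdomains grows.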
6. Application Scope and Limitations
DP-FETA frameworks, encompassing both linear and nonlinear FETI-DP with primal–dual decomposition and quasi-Newton SQP strategies, are suited to large-scale, possibly nonlinear, and highly heterogeneous PDEs discretized by modern (VEM or FEM) schemes. They support general, even non-conforming, mesh geometries via conformal partitioning and are independent of the particular coarse subdomain shapes. The robust preconditioner tolerates extreme diffusion contrasts and mesh-geometry irregularities while preserving scalability and parallel efficiency (Prada et al., 2018, Köhler et al., 15 Aug 2025).
A plausible implication is that for challenging problems in computational mechanics or geoscience with complex geometry and material contrast, DP-FETA provides an effective parallel solver not only for linear but also for nonlinear scenarios, while suitably managing both computational load and memory constraints via inexact Hessian updates (BFGS).
7. Concluding Remarks
DP-FETA algorithms, integrating FETI-DP preconditioners with SQP and quasi-Newton logic, represent a robust, theoretically grounded, and empirically validated approach to large-scale domain-decomposed PDE solution. They inherit the polylogarithmic condition number scaling and full coefficient/morphology robustness of FETI-DP, and extend these properties to nonlinear and higher-order settings with modest additional cost by exploiting inexact Hessian updates and restart logic. The method’s structure ensures optimal exploitation of parallel architectures, with subdomain solves and primal assembly processes naturally distributed. For practical use, recommendations include enforcing strong continuity only at crosspoints, employing diagonal scaling for coefficient balance, and performing conformal mesh partitioning as needed (Prada et al., 2018, Köhler et al., 15 Aug 2025).