- The paper introduces a robust task-parallel solver that uses dynamic scaling to protect against overflow when solving the triangular Sylvester equation.
- It presents a scalar and a tiled algorithm whose updates are made robust by building blocks such as ProtectUpdate and ProtectDivision, bringing task-parallel solvers up to the robustness of LAPACK.
- Experimental results show that the robust solver incurs minimal overhead when no scaling is required and reliably prevents overflow when scaling is necessary.
Robust Task-Parallel Solution of the Triangular Sylvester Equation
This paper addresses the robust solution of the triangular Sylvester equation, a critical step within the Bartels-Stewart algorithm. The authors introduce a novel task-parallel solver equipped with overflow protection, thereby expanding the range of problems solvable by existing task-parallel methods to match that of LAPACK. This enhancement is achieved through dynamic scaling of the solution matrix, preventing overflow and ensuring reliable results.
Background and Motivation
The Bartels-Stewart algorithm reduces the general Sylvester equation to a triangular form, which is then solved using backward substitution. A potential issue arises during this substitution process: the entries of the solution matrix, Y, can grow excessively, potentially exceeding the limits of floating-point representation. LAPACK mitigates this by employing a scaling factor, α, to dynamically downscale the solution, ensuring robustness. This paper bridges the gap between existing non-robust task-parallel solvers and LAPACK by incorporating overflow protection into a task-parallel implementation.
Robust Algorithms for Triangular Sylvester Equations
The paper details the development of two robust algorithms designed to solve the triangular Sylvester equation while preventing overflow. The core idea is to dynamically compute a scaling factor α∈(0,1] such that the solution Y of the scaled triangular Sylvester equation:
$\bm{\tilde{A}}\bm{Y} + \bm{Y}\bm{\tilde{B}} = \alpha \bm{\tilde{C}}$
can be obtained without exceeding the overflow threshold Ω>0.
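To make the role of α concrete, consider the 1×1 case, where the scaled equation reduces to ã·y + y·b̃ = α·c̃ and therefore y = α·c̃ / (ã + b̃). The Python sketch below is only an illustration: it assumes IEEE double precision with Ω taken as the largest finite double, and the helper `solve_scalar` and all numeric values are hypothetical rather than taken from the paper.

```python
import math
import sys

# Assumption: the overflow threshold is the largest finite double.
OMEGA = sys.float_info.max

def solve_scalar(a, b, c, alpha=1.0):
    """Solve a*y + y*b = alpha*c for a scalar y (hypothetical helper)."""
    return (alpha * c) / (a + b)

a, b = 0.125, 0.125   # a + b = 0.25
c = OMEGA / 2         # the right-hand side itself is representable

# Without scaling, the exact result 2*OMEGA cannot be represented.
y_unscaled = solve_scalar(a, b, c)

# With alpha = 0.25 the computed quantity is OMEGA/2, which is finite.
y_scaled = solve_scalar(a, b, c, alpha=0.25)

print(math.isinf(y_unscaled))   # True
print(y_scaled == OMEGA / 2)    # True
```

The pair (α, y) = (0.25, Ω/2) still describes the unscaled solution α⁻¹y, even though that value cannot be stored as a double.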
Scalar Robust Algorithm
The first algorithm is a scalar approach that enhances LAPACK's dtrsyl routine by incorporating overflow protection into the linear updates. This is achieved using the ProtectUpdate building block, which computes a scaling factor ζ to prevent overflow during matrix updates. By applying ζ to both left and right updates in the triangular Sylvester equation, the algorithm ensures that the solution remains within representable bounds. Small Sylvester equations are solved robustly via Gaussian elimination with complete pivoting, utilizing ProtectUpdate for linear updates and ProtectDivision to guard against divisions that could lead to overflow. The complete process is outlined in Algorithm 1.
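The summary does not spell out how ProtectUpdate and ProtectDivision compute their scaling factors. The following Python sketch shows one possible way to meet the stated goals, assuming all inputs are non-negative magnitudes bounded by Ω; the function names, signatures, and case analysis are illustrative assumptions and may differ from the paper's definitions. The bounds in the docstrings hold in exact arithmetic.

```python
import sys

# Assumption: default overflow threshold is the largest finite double.
OMEGA = sys.float_info.max

def protect_update(ynorm, tnorm, xnorm, omega=OMEGA):
    """Return zeta in (0, 1] such that zeta * (ynorm + tnorm * xnorm) <= omega.

    Assumes 0 <= ynorm, tnorm, xnorm <= omega. No intermediate
    quantity computed here can exceed omega.
    """
    if xnorm <= 1.0:
        # tnorm * xnorm <= tnorm <= omega, so the product is safe to form.
        if tnorm * xnorm > omega - ynorm:
            return 0.5
    elif tnorm > (omega - ynorm) / xnorm:
        # Dividing by xnorm > 1 is safe; scale so that zeta * xnorm = 0.5.
        return 0.5 / xnorm
    return 1.0

def protect_division(babs, tabs, omega=OMEGA):
    """Return xi in (0, 1] such that xi * babs / tabs <= omega.

    Assumes 0 <= babs <= omega and tabs > 0.
    """
    if tabs < 1.0 and babs > tabs * omega:
        return (tabs * omega) / babs
    return 1.0
```

In a robust update y ← y − t·x, a caller would multiply both y and x by ζ before updating and fold ζ into the global scaling factor α. Likewise, ξ would be applied to the numerator before dividing.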
Tiled Robust Algorithm
The second algorithm is a tiled approach, which restructures the scalar algorithm to leverage efficient level-3 BLAS operations. Matrices are divided into tiles, and local scaling factors are introduced for each tile of Y. This leads to the concept of "augmented tiles," which consist of a scalar α and a matrix X, representing the scaled matrix Y = α⁻¹X. Algorithm 2, RobustUpdate, performs robust updates on these augmented tiles. By combining RobustUpdate with the RobustSyl algorithm, the paper introduces a tiled solver, drsylv (Algorithm 3), for the triangular Sylvester equation.
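The idea of updating augmented tiles can be sketched in plain Python as follows. This is a minimal illustration of a left update Y ← Y − T·X on tiles stored as lists of rows, not the paper's Algorithm 2: the helper names, the choice of norms, and the rule used to pick ζ are all assumptions made for this example. A right update Y ← Y − X·T would follow the same pattern.

```python
def max_abs(M):
    """Largest absolute entry of a matrix stored as a list of rows."""
    return max(abs(v) for row in M for v in row)

def row_sum_norm(M):
    """Infinity norm: largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)

def scale(M, s):
    """Return s * M as a new matrix."""
    return [[s * v for v in row] for row in M]

def protect_update(ynorm, tnorm, xnorm, omega):
    """Assumed rule: zeta in (0, 1] with zeta*(ynorm + tnorm*xnorm) <= omega."""
    if xnorm <= 1.0:
        if tnorm * xnorm > omega - ynorm:
            return 0.5
    elif tnorm > (omega - ynorm) / xnorm:
        return 0.5 / xnorm
    return 1.0

def robust_update(alpha_y, Y, T, alpha_x, X, omega):
    """Return (alpha, Z) with alpha^-1 * Z = alpha_y^-1 * Y - T @ (alpha_x^-1 * X).

    (alpha_y, Y) and (alpha_x, X) are augmented tiles; T is an unscaled tile.
    """
    # Step 1: bring both tiles to the smaller scaling factor (downscaling only).
    gamma = min(alpha_y, alpha_x)
    Y = scale(Y, gamma / alpha_y)
    X = scale(X, gamma / alpha_x)
    # Step 2: pick zeta so every entry of the updated tile stays within omega.
    zeta = protect_update(max_abs(Y), row_sum_norm(T), max_abs(X), omega)
    Y, X = scale(Y, zeta), scale(X, zeta)
    # Step 3: ordinary matrix update on the consistently scaled tiles.
    Z = [[Y[i][j] - sum(T[i][k] * X[k][j] for k in range(len(X)))
          for j in range(len(X[0]))]
         for i in range(len(T))]
    return zeta * gamma, Z

# Toy example with an artificially small threshold so scaling is triggered.
omega = 1024.0
alpha, Z = robust_update(1.0, [[512.0, 0.0], [0.0, 512.0]],
                         [[-4.0, 0.0], [0.0, -4.0]],
                         0.5, [[256.0, 0.0], [0.0, 256.0]], omega)
```

In this toy example the unscaled result has diagonal entries 512 + 4·512 = 2560, which exceeds the threshold of 1024. The sketch returns α = 2⁻¹⁰ and a tile Z with diagonal entries 2.5, so α⁻¹Z reproduces 2560 without any stored value exceeding the threshold. The choice of ζ here is conservative; a production implementation would aim to scale as little as necessary.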
Experimental Results
The performance of the proposed robust solver (drsylv) is evaluated against existing non-robust (FLA_Sylv, FLASH_Sylv) and robust (dtrsyl, recsy) solvers. The experiments were conducted on an Intel Xeon E5-2690v4 node with 28 cores. The test matrices were designed to control growth during the solve, enabling assessment of the cost of robustness.
Sequential Comparison


Figure 1: Sequential runtime comparison on systems that do not require scaling.
The sequential performance comparison (Figure 1) demonstrates that drsylv is slightly slower than the non-robust solvers FLA_Sylv and FLASH_Sylv when scaling is unnecessary. This is attributed to the overhead introduced by the robustness mechanisms.
Strong Scalability

Figure 2: Strong scalability on systems that do not require scaling.
The strong scalability analysis (Figure 2) reveals that drsylv maintains scalability comparable to FLASH_Sylv on systems where scaling is not required. This indicates that the robustness features do not significantly hinder parallel performance.
Cost of Robustness

Figure 3: Cost of robustness for m = n = 10000.
The cost of robustness is examined by varying the parameters μ and ν, which control the amount of scaling needed. The results (Figure 3) show that the cost of RobustUpdate increases when scaling is required for any of the input tiles. However, even with scaling, the algorithm achieves a reasonable fraction of the peak performance.
Conclusion
This paper introduces a task-parallel solver with overflow protection for the triangular Sylvester equation. By incorporating dynamic scaling, the solver expands the range of solvable problems to match that of LAPACK. The experimental results indicate that the overhead of overflow protection is minimal when scaling is not needed. When scaling becomes necessary, the algorithm applies it automatically, so the computed solution and its scaling factor remain representable. This is an advantage over non-robust solvers: the result is always a finite scaled solution, which can then be interpreted within the context of the specific application.