
Trust-Region Memory Updates

Updated 2 February 2026
  • Trust-region memory updates are algorithmic strategies that combine trust-region methods with memory-efficient quasi-Newton techniques to achieve robust second-order optimization in high dimensions.
  • They employ limited-memory representations like L-BFGS and spectral projections to solve shifted linear systems efficiently while maintaining positive definiteness and convergence guarantees.
  • These updates are vital for applications in reinforcement learning, PINNs, and sparse recovery, offering enhanced numerical stability and computational efficiency over full Hessian methods.

Trust-region memory updates refer to algorithmic strategies that couple trust-region optimization methods with memory-efficient quasi-Newton or curvature-aggregation techniques, enabling large-scale and robust second-order optimization. These mechanisms are critical in large-dimensional unconstrained and constrained optimization, reinforcement learning, and scientific machine learning settings where forming or inverting the full Hessian is computationally prohibitive. Trust-region memory updates facilitate the efficient solution of trust-region subproblems by utilizing limited-memory representations and update rules, while maintaining stability, accuracy, and convergence guarantees.

1. Foundations of Trust-Region Memory Updates

Classical trust-region methods iteratively solve subproblems of the form

$$\min_{s}\ q(s) = g_k^T s + \tfrac{1}{2}\, s^T B_k s \quad \text{subject to} \quad \|s\| \leq \Delta_k,$$

where $g_k = \nabla f(x_k)$, $B_k$ is a Hessian or an approximation, and $\Delta_k$ is the trust-region radius. Due to the cost of assembling and manipulating $B_k$, especially for large $n$, memory-efficient approximations such as limited-memory BFGS (L-BFGS) or low-rank-plus-shift formats are employed.
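For intuition, the boundary case of this subproblem can be solved by searching over the shift $\sigma$ in $(B_k + \sigma I)s = -g_k$; below is a minimal dense sketch, using bisection on $\sigma$ rather than the Newton iteration typically used in practice, and assuming $B_k$ is positive definite:

```python
import numpy as np

def tr_subproblem_dense(B, g, Delta, tol=1e-10):
    """Sketch: minimize g^T s + 0.5 s^T B s subject to ||s|| <= Delta for a
    small dense positive-definite B, by bisecting on the shift sigma >= 0
    in (B + sigma*I) s = -g until ||s|| = Delta (Moré–Sorensen conditions)."""
    n = len(g)
    s = np.linalg.solve(B, -g)
    if np.linalg.norm(s) <= Delta:           # interior solution: sigma = 0
        return s, 0.0
    lo, hi = 0.0, 1.0
    while np.linalg.norm(np.linalg.solve(B + hi*np.eye(n), -g)) > Delta:
        hi *= 2.0                            # bracket the boundary shift
    for _ in range(200):                     # bisection on sigma
        sigma = 0.5*(lo + hi)
        s = np.linalg.solve(B + sigma*np.eye(n), -g)
        if np.linalg.norm(s) > Delta:
            lo = sigma
        else:
            hi = sigma
        if hi - lo < tol:
            break
    return s, sigma
```

When the unconstrained Newton step already lies inside the region, $\sigma = 0$ suffices; otherwise the step lands on the boundary $\|s\| = \Delta_k$.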

Trust-region memory updates augment these schemes by (i) deriving compact recursions for shifted linear systems $(B_k + \sigma I) x = y$ (Erway et al., 2011), (ii) resetting or projecting curvature information through spectral/nearest-matrix projections (Berglund et al., 2024), or (iii) utilizing memory banks of previous policies or iterates as in policy optimization (Le et al., 2022).

2. Limited-Memory Recursive Solvers for Trust-Region Subproblems

The shifted linear systems central to the trust-region subproblem arise from the Moré–Sorensen optimality conditions. When $B_k$ is an L-BFGS matrix, the canonical two-loop recursion efficiently computes $B_k^{-1} z$ for arbitrary $z$ in $O(Mn)$, but does not address shifted systems $$(B_k + \sigma I) s = -g_k,$$ where $\sigma$ is a parameter determined (e.g., by Newton's method) such that $\|s\| = \Delta_k$.
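For reference, the unshifted two-loop recursion mentioned above can be sketched as follows (a standard implementation; `gamma` here scales the initial inverse approximation $H_0 = \gamma I$, and `S`, `Y` hold the stored pairs oldest first):

```python
import numpy as np

def lbfgs_two_loop(z, S, Y, gamma):
    """Standard L-BFGS two-loop recursion: applies the inverse Hessian
    approximation H_k = B_k^{-1} to a vector z in O(Mn), given the last M
    curvature pairs (s_i, y_i) and the initial matrix H_0 = gamma * I."""
    q = z.copy()
    rhos = [1.0/np.dot(y, s) for s, y in zip(S, Y)]
    alphas = []
    for s, y, rho in reversed(list(zip(S, Y, rhos))):   # newest to oldest
        a = rho*np.dot(s, q)
        alphas.append(a)
        q -= a*y
    r = gamma*q                                          # apply H_0
    for (s, y, rho), a in zip(zip(S, Y, rhos), reversed(alphas)):
        b = rho*np.dot(y, r)                             # oldest to newest
        r += (a - b)*s
    return r
```

A quick sanity check: with a single stored pair, the result satisfies the secant condition $H y = s$ for that pair.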

Erway & Marcia developed a diagonal-update recursion for $(B_k + \sigma I)^{-1} z$ that views $B_k + \sigma I$ as a base matrix plus a sequence of rank-one updates, enabling a recursive application of the Sherman–Morrison–Woodbury formula (Erway et al., 2011). This approach maintains $O(M^2 n)$ complexity (optimal for $n \gg M$) and preserves positive-definiteness under mild conditions. The recursion only involves vector inner products and can be implemented efficiently in high-level or low-level languages using standard linear algebra primitives.
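The flavor of such shifted solves can be illustrated with a single Sherman–Morrison–Woodbury step applied to a generic compact representation $B = \gamma I + \Psi M \Psi^T$ (illustrative only; Erway & Marcia instead apply SMW recursively to the sequence of rank-one L-BFGS updates):

```python
import numpy as np

def shifted_compact_solve(Psi, M, gamma, sigma, y):
    """Sketch: solve (B + sigma*I) x = y for B in compact low-rank-plus-shift
    form B = gamma*I + Psi @ M @ Psi.T via one Woodbury step; the inner
    matrix M is m x m for m stored columns, so the dense solve is cheap
    and the total cost is O(m^2 n)."""
    d = gamma + sigma
    K = d*np.linalg.inv(M) + Psi.T @ Psi     # small m x m system
    return (y - Psi @ np.linalg.solve(K, Psi.T @ y)) / d
```

Only the small inner system grows with the memory size; all operations in the ambient dimension are matrix-vector products.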

3. Memory Update Strategies and Representations

Several trust-region memory update paradigms emerge in the literature:

  1. L-BFGS Memory with Trust-Region Control: L-BFGS stores the latest $M$ pairs $(s_i, y_i)$, and recursive strategies enable maintaining and applying approximate inverse Hessians during subproblem solves. Successful steps update the memory; rejected steps do not (Adhikari et al., 2016, Luo et al., 2020, Aravkin et al., 2021).
  2. Projection-Based Low-Rank Updates: Instead of canonical L-BFGS, the Hessian approximation can be constructed via a two-stage process: (i) perform a Broyden-class update; (ii) project the result onto the class of limited-memory (low-rank + shift) matrices via a nearest-matrix problem in a unitarily invariant norm (e.g., Frobenius or $\ell^2$), or a Stein divergence (Berglund et al., 2024). Storage and update are performed in terms of eigenvalue decompositions, allowing for efficient solutions of the trust-region subproblem via the spectral representation.
  3. Memory in Reinforcement Learning Trust Regions: In memory-constrained policy optimization (MCPO), memory buffers of previous policies are used to define a “virtual trust region,” with the update objective incorporating KL-divergence to both the latest policy and a convex combination (via a learned attention mechanism) of stored prior policies (Le et al., 2022). The weighting between current and virtual trust regions is dynamically adjusted based on advantage-weighted returns, enhancing robustness when recent policies perform poorly.
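The memory policy in strategy 1 can be sketched as a small helper (hypothetical names; the thresholds `eta` for step acceptance and `eps` for the curvature test are illustrative defaults):

```python
import numpy as np

def update_memory(S, Y, s, y, ratio, eta=1e-4, eps=1e-8, M=5):
    """Sketch: a (s, y) pair enters the L-BFGS history only when the
    trust-region step was accepted (actual/predicted reduction ratio > eta)
    and the pair passes the curvature test y^T s > eps; beyond M stored
    pairs the oldest is dropped. S, Y are lists of past pairs."""
    if ratio > eta and np.dot(y, s) > eps:   # successful step, usable curvature
        S, Y = S + [s], Y + [y]
        if len(S) > M:                       # limited memory: keep last M pairs
            S, Y = S[1:], Y[1:]
    return S, Y
```

Rejected steps shrink the trust-region radius in the outer loop but leave the curvature history untouched, which is what keeps the approximation positive definite across failed trials.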

4. Algorithmic Procedures and Recursion Details

Below is a tabulation of core update mechanisms:

| Method/Reference | Curvature Update | Memory Format | Trust-Region Step |
|---|---|---|---|
| Erway & Marcia (Erway et al., 2011) | Diagonal-update L-BFGS (SMW recursion) | Last $M$ pairs $(s_i, y_i)$ | Recursively solve $(B + \sigma I) x = y$ |
| Projected Quasi-Newton (Berglund et al., 2024) | Broyden-class + spectral projection | Shift + eigenbasis + eigenvalues | Direct spectral solution to TR subproblem |
| MCPO (Le et al., 2022) | Policy memory buffer | $N$ past policies, attention weights | KL divergence to memory-derived "virtual" policy |

The recursion from (Erway et al., 2011) computes $(B_k + \sigma I)^{-1} z$ as:

  1. Initialize $C_0^{-1} = (\gamma + \sigma)^{-1} I$, $r = C_0^{-1} z$.
  2. For $j = 0, \dots, 2k-1$, recursively update intermediate vectors and correct $r$ using inner products and low-rank terms.
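A simplified version of this recursive structure chains one Sherman–Morrison correction per rank-one term onto the diagonal base $C_0 = (\gamma + \sigma) I$, so each solve costs only inner products and vector updates (a sketch of the idea, not the paper's exact recursion):

```python
import numpy as np

def sm_solve(base_solve, u, v):
    """Return a solver for (A + u v^T) x = z, given base_solve for A x = y,
    via one Sherman–Morrison correction; A^{-1} u is cached once."""
    Au = base_solve(u)
    denom = 1.0 + v @ Au
    def solve(z):
        Az = base_solve(z)
        return Az - Au * ((v @ Az) / denom)
    return solve

def build_shifted_solver(gamma, sigma, rank_one_terms):
    """Sketch: start from the shifted diagonal base C_0 = (gamma+sigma)*I
    and fold in the rank-one terms (u_j, v_j) of the quasi-Newton matrix
    one at a time; the returned closure solves the full shifted system."""
    solve = lambda z: z / (gamma + sigma)
    for u, v in rank_one_terms:
        solve = sm_solve(solve, u, v)
    return solve
```

Each added term contributes one base solve and a handful of inner products, mirroring the $O(M^2 n)$ cost noted above for $2k$ stored terms.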

For projection-based updates (Berglund et al., 2024), the eigenstructure is updated after a Broyden step and projected back into the limited-memory constraint set, allowing for efficient subproblem solutions and memory resets compatible with curvature and trust-region constraints.
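A deliberately simplified sketch of such a projection, assuming the shift $\gamma$ is taken as the mean of the discarded eigenvalues and the $r$ retained eigenvalues are those deviating most from the overall mean (Berglund et al. treat these choices more carefully):

```python
import numpy as np

def project_limited_memory(A, r):
    """Simplified sketch: project a symmetric matrix A onto the
    'shift + rank-r' class gamma*I + low-rank by eigendecomposing,
    keeping the r eigenvalues that deviate most from the mean, and
    collapsing the rest onto their own mean gamma."""
    w, Q = np.linalg.eigh(A)
    keep = np.argsort(np.abs(w - w.mean()))[-r:]   # most 'informative' eigenvalues
    rest = np.setdiff1d(np.arange(len(w)), keep)
    gamma = w[rest].mean()                          # shift for collapsed spectrum
    w_proj = np.full_like(w, gamma)
    w_proj[keep] = w[keep]
    return (Q * w_proj) @ Q.T, gamma                # Q diag(w_proj) Q^T
```

Because the result is stored as a shift plus $r$ eigenpairs, the trust-region subproblem can then be solved directly in the spectral representation.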

5. Integration with Nonlinear Constraints and Nonsmooth Terms

Recent advances incorporate trust-region memory update strategies into constrained and nonsmooth optimization:

  • trSQP-PINN applies a trust-region Sequential Quadratic Programming algorithm to PINN problems, where a quasi-Newton memory is used to approximate the Lagrangian Hessian, and a trust-region radius is adaptively updated via a soft-penalty merit function (Cheng et al., 2024). Quasi-Newton memory facilitates efficient curvature updates (using damped BFGS or SR1), ensuring regularization of search directions in ill-conditioned regions.
  • Proximal Trust-Region Quasi-Newton methods for nonsmooth composite problems maintain limited-memory curvature for the smooth term and couple it with a proximal term (for the nonsmooth part) within the trust-region subproblem, updating memory only upon successful steps and employing Powell-type damping in nonconvex regimes (Aravkin et al., 2021).
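The Powell-type damping mentioned in the second bullet replaces a pair that fails the curvature test with a convex combination of $y$ and $Bs$; a standard sketch:

```python
import numpy as np

def powell_damp(B, s, y, c=0.2):
    """Powell-type damping: when the raw pair fails the curvature test
    s^T y >= c * s^T B s, blend y with B s so the damped pair always
    satisfies it, keeping the BFGS update well defined in nonconvex
    regions."""
    Bs = B @ s
    sBs = s @ Bs
    sy = s @ y
    if sy >= c*sBs:
        return y                          # pair already has enough curvature
    theta = (1.0 - c)*sBs/(sBs - sy)      # Powell's interpolation factor
    return theta*y + (1.0 - theta)*Bs
```

By construction the damped pair satisfies $s^T \bar{y} = c\, s^T B s > 0$, so the subsequent update preserves positive definiteness even when the raw curvature $s^T y$ is negative.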

6. Stability Guarantees and Practical Performance

Trust-region memory updates possess key stability and efficiency properties:

  • Positive definiteness: The use of a trust-region shift ($\sigma > 0$) and the conditions on inner products ($y_i^T s_i > \varepsilon$) maintain positive-definiteness throughout the update sequence (Erway et al., 2011).
  • Numerical stability: Recursive updates use only vector operations, minimizing roundoff error and permitting robust implementation for moderate $M$.
  • Empirical efficiency: Across large-scale regression, sparse recovery, and control problems, limited-memory trust-region schemes outperform classical line-search and full-memory BFGS, requiring fewer stored vectors and trust-region iterations (Adhikari et al., 2016, Luo et al., 2020, Berglund et al., 2024); in policy optimization and PINNs, memory-based trust-region updates yield superior sample efficiency, resilience against poor local minima, and tolerance to ill-conditioning (Le et al., 2022, Cheng et al., 2024).

7. Extensions and Domain-Specific Adaptations

Memory update strategies have been adapted to diverse domains:

  • Sparse relaxation: Efficient removal of spurious solutions and better computational scaling in LASSO-type problems (Adhikari et al., 2016).
  • Physics-informed neural networks: trSQP-PINN leverages hard-constrained trust-region updates and quasi-Newton memory to overcome ill-conditioning endemic to penalty-based losses, showing two to three orders of magnitude error improvements (Cheng et al., 2024).
  • Deep RL: Memory-constrained policy optimization dynamically constructs a trust region from historical policy memory, enabling robust progress in sparse-reward and challenging environments (Le et al., 2022).

A plausible implication is the emergence of hybrid algorithms that combine projection-based curvature resetting, trust-region subproblem structure, and dynamic memory management to achieve scalability and stability across increasingly complex optimization landscapes.
