
Recursive Least Squares with Forgetting Factor

Updated 14 February 2026
  • RLS with Forgetting Factor is an adaptive algorithm that recursively estimates parameters using exponential weighting of past data to handle time-varying systems.
  • It balances rapid tracking with reduced steady-state estimation variance by adjusting the forgetting factor, whether through static or adaptive schemes.
  • Advanced variants incorporate variable-rate, partial, and directional forgetting to optimize performance in nonstationary and sparse environments.

Recursive Least Squares (RLS) with Forgetting Factor is a foundational family of algorithms for online identification, parameter estimation, and adaptive filtering in nonstationary environments. By progressively discounting the impact of old data, these algorithms adapt to time-varying systems while preserving the computational advantages of exact recursive least squares. The forgetting factor—whether scalar, vector, direction-dependent, or dynamically optimized—controls the balance between rapid tracking and estimation variance. This article reviews the unified mathematical structures, algorithmic variants, adaptation strategies, stability guarantees, and key application domains of RLS with forgetting, referencing recent arXiv research.

1. Exponentially Weighted Least Squares and Classic RLS Recursions

The prototype RLS with forgetting factor minimizes the exponentially weighted cost

$$J_k(\theta) = \sum_{i=0}^{k} \lambda^{k-i}\,(y_i - \phi_i^\top\theta)^2,$$

where $0 < \lambda \le 1$ is the forgetting factor. The scalar $\lambda$ controls the exponential rate at which past data are down-weighted. This yields the hallmark matrix-vector recursions:

$$\begin{aligned}
K_k &= \frac{P_{k-1}\phi_k}{\lambda + \phi_k^\top P_{k-1}\phi_k},\\
\theta_k &= \theta_{k-1} + K_k\,[y_k - \phi_k^\top\theta_{k-1}],\\
P_k &= \frac{1}{\lambda}\left[P_{k-1} - K_k\phi_k^\top P_{k-1}\right].
\end{aligned}$$

Here $P_k$ is the error covariance ("inverse information") matrix, and the gain $K_k$ scales the update based on current data novelty (Lai et al., 2024, Lai et al., 2023).

For vector-output or multi-output systems, the updates carry over via matrix analogues, with the forgetting factor entering the weighted information matrix and all major terms (Brüggemann et al., 2020).
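The recursions above translate directly into a few lines of NumPy. The following is a minimal sketch (function and variable names are illustrative, not taken from any cited implementation):

```python
import numpy as np

def rls_forgetting_step(theta, P, phi, y, lam=0.98):
    """One update of RLS with exponential forgetting factor lam."""
    K = P @ phi / (lam + phi @ P @ phi)    # gain K_k
    theta = theta + K * (y - phi @ theta)  # parameter update
    P = (P - np.outer(K, phi) @ P) / lam   # covariance update
    return theta, P

# Identify a static 2-parameter model y = phi^T theta_true (noise-free).
rng = np.random.default_rng(0)
theta_true = np.array([1.5, -0.7])
theta, P = np.zeros(2), 1e3 * np.eye(2)
for _ in range(200):
    phi = rng.standard_normal(2)
    theta, P = rls_forgetting_step(theta, P, phi, phi @ theta_true)
print(np.round(theta, 3))  # converges to [1.5, -0.7]
```

With noise-free data and persistently exciting regressors, the recursion recovers the exact weighted least-squares solution at each step.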

2. Role and Effects of the Forgetting Factor

A small $\lambda$ aggressively discounts older data, providing fast tracking but amplifying steady-state estimation error and sensitivity to noise. A large $\lambda$ (approaching one) yields slower adaptation but reduces variance and improves noise immunity (Lai et al., 2024; Boya et al., 2014; Yuan et al., 2019). Dynamic trade-offs are captured quantitatively: in online learning, a static $\lambda$ yields a regret bound interpolating between $O(\log T)$ (static environment) and $O(\sqrt{TV})$ (bounded path length $V$ of drift), with explicit control via $\lambda$ selection (Yuan et al., 2019).
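The tracking-versus-variance trade-off is visible in a small noise-free experiment (an illustrative sketch, not from the cited works): after an abrupt parameter jump, λ < 1 recovers quickly, while λ = 1 averages over the entire history and recovers slowly.

```python
import numpy as np

def rls_run(lam, thetas_true, seed=1):
    """Run RLS with forgetting factor lam over a sequence of true parameters."""
    rng = np.random.default_rng(seed)
    theta, P = np.zeros(2), 1e3 * np.eye(2)
    errs = []
    for theta_true in thetas_true:
        phi = rng.standard_normal(2)
        y = phi @ theta_true                   # noise-free measurement
        K = P @ phi / (lam + phi @ P @ phi)
        theta = theta + K * (y - phi @ theta)
        P = (P - np.outer(K, phi) @ P) / lam
        errs.append(np.linalg.norm(theta - theta_true))
    return np.array(errs)

# True parameter vector jumps halfway through the data stream.
thetas = [np.array([1.0, 2.0])] * 150 + [np.array([-2.0, 0.5])] * 150
err_fast = rls_run(0.90, thetas)  # strong forgetting: quick recovery
err_slow = rls_run(1.00, thetas)  # no forgetting: sluggish after the jump
print(err_fast[-1] < err_slow[-1])  # True: lambda < 1 tracks the change
```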

3. Generalizations: Variable-Rate, Directional, and Multi-Scheme Forgetting

Beyond scalar, constant forgetting, recent research details several structured extensions:

  • Variable-rate forgetting (VRF): The forgetting factor sequence $\lambda_k$ (or $\beta_k = 1/\lambda_k$) adapts online, often as a function of residuals, yielding rapid responsiveness during abrupt parameter changes and noise rejection in stationary periods (Bruce et al., 2020).
  • Multiple/partial forgetting: A vector $\lambda = (\lambda_1,\dots,\lambda_p)$ allows each parameter or direction to evolve with its own forgetting profile. Generalized mapping schemes (e.g., Tuned/Correlated, Cubic-Spline-inspired) are designed for problems with heterogeneous rates of change in system subcomponents (Fraccaroli et al., 2015). This approach preserves positive definiteness of the information matrix and is computationally comparable to standard RLS.
  • Directional or subspace forgetting (SIFt-RLS, VDF-RLS): Forgetting is applied only in directions excited by new data (i.e., “information subspaces”). This prevents information loss and parameter drift in unexcited directions, bounding the covariance without persistent excitation and ensuring robust operation in low-rank or poorly excited environments (Lai et al., 2024, Park et al., 2024).
  • Segmented forgetting profiles: Designing a composite forgetting function with piecewise segments—fast (recent data), constant (plateau), and slow (distant past)—supports control over tracking speed, condition number, and estimator robustness. This allows encoding prior knowledge of system time scales and periodicities (Stotsky, 19 Nov 2025).
| Scheme | Adaptation Target | Typical Use Case |
|---|---|---|
| Scalar $\lambda$ | Uniform weight | Generic time-varying systems |
| Time-varying $\lambda_k$ | Residual/adaptive | Abrupt or context-dependent changes |
| Vector multi-forgetting | Parameter/direction-selective | Heterogeneous subsystem variation |
| Directional forgetting | Information subspace | Sparse excitation, low-rank data, stability |
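The motivation for directional forgetting can be reproduced in a few lines: with uniform λ < 1 and a persistently unexcited direction, the covariance in that direction grows without bound ("estimator windup"). A minimal sketch of this effect, with the setup chosen purely for illustration:

```python
import numpy as np

lam = 0.95
P = np.eye(2)
# Regressors excite only the first coordinate: phi = [1, 0] every step.
for _ in range(100):
    phi = np.array([1.0, 0.0])
    K = P @ phi / (lam + phi @ P @ phi)
    P = (P - np.outer(K, phi) @ P) / lam

print(P[0, 0] < 1.0)    # True: excited direction stays bounded
print(P[1, 1] > 100.0)  # True: unexcited direction grows like (1/lam)^k
```

Directional/subspace variants avoid this by discounting information only along excited directions, leaving the unexcited covariance entry fixed.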

4. Robustness, Stability, and Optimality Results

The underlying quadratic cost structure of RLS with forgetting allows direct analysis via Lyapunov techniques. Explicit global exponential stability and robustness (global uniform ultimate boundedness) are established for several variants, including with noise, drift, or errors-in-variables—assuming appropriate excitation and boundedness conditions (Lai et al., 2023).

In the generalized forgetting RLS (GF-RLS) framework, all major RLS extensions (exponential, variable-rate, resetting, directional/partial forgetting) are unified as specific choices of a per-step forgetting matrix $F_k$:

$$P_{k+1}^{-1} = P_k^{-1} - F_k + \phi_k\phi_k^\top.$$

Selecting $F_k = (1-\lambda)P_k^{-1}$ recovers exponential forgetting; $F_k = (1-\lambda_k)P_k^{-1}$ yields adaptive/variable-rate forms; directional and partial forgetting correspond to more general $F_k$ structures (Lai et al., 2023).
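The exponential-forgetting special case can be checked numerically: substituting F_k = (1 − λ)P_k⁻¹ into the GF-RLS information update reproduces the standard exponentially weighted update P⁻¹ ← λP⁻¹ + φφᵀ. A small sketch of that check:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 0.97
P = np.eye(3)
phi = rng.standard_normal(3)

# Standard exponential forgetting, information (inverse-covariance) form:
P_exp_inv = lam * np.linalg.inv(P) + np.outer(phi, phi)

# GF-RLS update with the forgetting matrix F_k = (1 - lam) * P^{-1}:
F = (1 - lam) * np.linalg.inv(P)
P_gf_inv = np.linalg.inv(P) - F + np.outer(phi, phi)

print(np.allclose(P_exp_inv, P_gf_inv))  # True: identical updates
```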

In the context of impulsive or non-Gaussian noise, robust RLS generalizations employ M-estimators and sparsity regularizers. The jointly optimized S-RRLS (JO-S-RRLS) algorithm extends this by adaptively optimizing both $\lambda_k$ and the sparsity weighting $\rho_k$ at each step, achieving superior tracking and misadjustment trade-offs in sparse estimation under impulsive perturbations (Yu et al., 2022).

5. Practical Algorithms and Adaptive Mechanisms

Several adaptive mechanisms for online $\lambda$ tuning have emerged:

  • Error-correlation driven VFF: The forgetting factor is set via a running average of error energy, with explicit bounding to avoid instability (CTVFF). This technique outperforms both fixed-$\lambda$ and gradient-based VFF schemes in rapid adaptation and steady-state noise performance, while incurring minimal additional computation (Cai et al., 2013).
  • Criterion-aware VFF for blind adaptive filtering: Error metric violation (e.g., constant modulus) directly modulates $\lambda$, yielding optimal steady-state MSE and faster nonstationarity tracking (Boya et al., 2014).
  • Augmented regressor and two-layered forgetting: Outer-loop (exponential) and inner-loop (directional) forgetting are combined to guarantee parameter convergence even under finite excitation, with global exponential stability established via Lyapunov arguments (Tsuruhara et al., 28 Apr 2025).
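A minimal sketch of a residual-driven variable forgetting factor in this spirit; the error-energy smoothing, the linear mapping from error energy to λ_k, and all constants below are illustrative assumptions, not the published CTVFF rule:

```python
import numpy as np

def vff_rls(phis, ys, lam_min=0.90, lam_max=0.999, alpha=0.95, c=1.0):
    """RLS where lam_k shrinks when smoothed residual energy is high.
    The error-energy-to-lambda mapping is an illustrative choice."""
    n = phis.shape[1]
    theta, P, e_bar = np.zeros(n), 1e3 * np.eye(n), 0.0
    lams = []
    for phi, y in zip(phis, ys):
        e = y - phi @ theta                          # a priori residual
        e_bar = alpha * e_bar + (1 - alpha) * e**2   # running error energy
        lam = np.clip(lam_max - c * e_bar, lam_min, lam_max)  # bounded lam_k
        lams.append(lam)
        K = P @ phi / (lam + phi @ P @ phi)
        theta = theta + K * e
        P = (P - np.outer(K, phi) @ P) / lam
    return theta, np.array(lams)

rng = np.random.default_rng(2)
phis = rng.standard_normal((300, 2))
theta_true = np.array([0.5, -1.0])
theta, lams = vff_rls(phis, phis @ theta_true)
print(np.round(theta, 3))  # converges; lam_k returns to lam_max at steady state
```

During transients the large residuals pull λ_k down (fast adaptation); once residual energy decays, λ_k is driven back to its upper bound for low steady-state variance.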

6. State-of-the-Art Directions and Domain Applications

Recent RLS with forgetting variants address context-specific challenges:

  • Sliding window RLS with rank-two/low-rank updates: These variants employ rank-two updates and composite forgetting, allowing precise trade-offs between transient adaptation, memory length, and numerical conditioning. The segmented-forgetting-profile RLS exemplifies this by partitioning memory into regions tailored for rapid estimation or condition-number control (Stotsky, 15 Jul 2025, Stotsky, 19 Nov 2025).
  • Robustness under nonstationarity and noise: Theoretical and experimental analyses confirm that carefully constructed forgetting profiles, variable-rate mechanisms, and sparsity-promoting penalties yield exponential convergence, accurate tracking, and bounded estimator variance in both time-invariant and abruptly changing environments (Yu et al., 2022, Bruce et al., 2020).
  • Online learning and regret guarantees: Forgetting-factor RLS achieves order-optimal dynamic regret bounds in nonstationary data streams, rigorously balancing “static” performance with adaptation to time-varying targets, matching the best achievable rates up to logarithmic factors (Yuan et al., 2019).
  • Connections to Kalman filtering: RLS with (generalized) forgetting is a special case of adaptive Kalman filtering for static or slow parameter dynamics, and extensions to combined RLS/Kalman filters adopt richer forgetting structures for improved estimation in systems with unmodeled or abrupt dynamics (Lai et al., 2024).
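The Kalman connection can be verified in one step: RLS with forgetting factor λ coincides with a Kalman measurement update for a static state whose prior covariance is inflated to P/λ, with unit measurement noise R = 1. A numerical sketch of this standard identity:

```python
import numpy as np

rng = np.random.default_rng(3)
lam = 0.95
phi = rng.standard_normal(2)
theta = rng.standard_normal(2)
P = np.array([[2.0, 0.3], [0.3, 1.0]])
y = 0.7

# One step of RLS with forgetting:
K_rls = P @ phi / (lam + phi @ P @ phi)
theta_rls = theta + K_rls * (y - phi @ theta)
P_rls = (P - np.outer(K_rls, phi) @ P) / lam

# Kalman filter for a static state: the predict step inflates the
# covariance, P_pred = P / lam, and the update uses R = 1.
P_pred = P / lam
K_kf = P_pred @ phi / (1.0 + phi @ P_pred @ phi)
theta_kf = theta + K_kf * (y - phi @ theta)
P_kf = (np.eye(2) - np.outer(K_kf, phi)) @ P_pred

print(np.allclose(theta_rls, theta_kf) and np.allclose(P_rls, P_kf))  # True
```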

7. Summary Table: Key RLS Forgetting Variants

| Variant | Key Reference | Distinctive Feature(s) |
|---|---|---|
| Classic exponential forgetting | (Lai et al., 2024) | Uniform time-discounting |
| Variable-rate forgetting | (Bruce et al., 2020) | Residual-adapted $\lambda_k$, proven convergence |
| Multiple/partial forgetting | (Fraccaroli et al., 2015) | Parameter/group-wise tuning of $\lambda_i$ |
| Subspace/directional forgetting | (Lai et al., 2024; Park et al., 2024) | Forgetting only in excited directions |
| Jointly optimized robust/sparse RLS | (Yu et al., 2022) | $\lambda_k$ and $\rho_k$ adapted via closed-form formulas |
| Segmented forgetting profile | (Stotsky, 19 Nov 2025) | Piecewise composite exponential/plateau decay |
| Two-layered (outer+inner) forgetting | (Tsuruhara et al., 28 Apr 2025) | FE→PE lift, global exponential stability |
| Adaptive Kalman-RLS fusion | (Lai et al., 2024) | Generalized forgetting as structural design |

References

  • Lai & Bernstein, “Generalized Forgetting Recursive Least Squares: Stability and Robustness Guarantees” (Lai et al., 2023)
  • Glushchenko et al., “Robust method to provide exponential convergence of model parameters…” (Glushchenko et al., 2020)
  • Stotsky, “Performance Enhancement of the Recursive Least Squares Algorithms with Rank Two Updates” (Stotsky, 15 Jul 2025)
  • Stotsky, “RLS Framework with Segmentation of the Forgetting Profile and Low Rank Updates” (Stotsky, 19 Nov 2025)
  • Xian, “SIFt-RLS: Subspace of Information Forgetting Recursive Least Squares” (Lai et al., 2024)
  • Uehara et al., “Discrete-time Two-Layered Forgetting RLS Identification under Finite Excitation” (Tsuruhara et al., 28 Apr 2025)
  • Wu et al., “Inverter Output Impedance Estimation in Power Networks: A Variable Direction Forgetting Recursive-Least-Square Algorithm Based Approach” (Park et al., 2024)
  • Roman et al., “Study of Robust Sparsity-Aware RLS algorithms with Jointly-Optimized Parameters…” (Yu et al., 2022)
  • Yu & Bernstein, “A New Recursive Least-Squares Method with Multiple Forgetting Schemes” (Fraccaroli et al., 2015)
  • Paleologu et al., “Low-Complexity Variable Forgetting Factor Techniques…” (Cai et al., 2013)
  • de Lamare & Sampaio-Neto, “Low-Complexity Variable Forgetting Factor… for Adaptive Beamforming” (Boya et al., 2014)
  • Hazan & Luo, “Trading-Off Static and Dynamic Regret in Online Least-Squares and Beyond” (Yuan et al., 2019)
  • Bernstein, “Adaptive Kalman Filtering Developed from Recursive Least Squares Forgetting Algorithms” (Lai et al., 2024)
  • Qiu et al., “Exponential convergence of recursive least squares with forgetting factor for multiple-output systems” (Brüggemann et al., 2020)

These works represent the current technical landscape and provide the rigorous foundations for design, implementation, and theoretical guarantees for RLS with various forgetting factor methodologies.
