Relaxed Mean Squared Error
- Relaxed Mean Squared Error is a risk-aware extension of the classical MMSE that introduces a constraint on the error's fourth moment to control volatility.
- It employs a Lagrangian framework to trade off the mean squared error with higher-order risks, leading to a nonlinear estimator adjusted for skewness and heavy tails.
- Practical illustrations show that the RMSE criterion reduces the fourth-moment risk by 25–30% while incurring a modest increase in the mean squared error, enhancing robustness under unstable conditions.
The relaxed mean squared error (RMSE), also termed risk-aware mean squared error, generalizes the standard minimum mean squared error (MMSE) criterion by explicitly introducing a constraint on the volatility of the squared error. Unlike traditional MMSE, which solely minimizes the expected squared deviation between prediction and target, RMSE penalizes the higher-order moments of the error, thereby controlling not only the average performance but also the risk associated with large deviations. This approach is particularly valuable in scenarios where error distributions are skewed or heavy-tailed, and the conventional MMSE estimator lacks stability due to unconstrained error variance or higher moments (Kalogerias et al., 2019).
1. Standard MMSE Estimator
The MMSE estimator seeks a measurable function $\hat{x}(\cdot)$ that minimizes the objective
$$\inf_{\hat{x}(\cdot)} \; \mathbb{E}\big[(X - \hat{x}(Y))^2\big],$$
where $X$ is the ground truth and $Y$ is the observed variable. By the orthogonality principle, the pointwise optimal estimate is
$$\hat{x}_{\mathrm{MMSE}}(Y) = \mathbb{E}[X \mid Y].$$
The MMSE estimator is risk-neutral in the sense that it optimizes only the mean squared error, and does not account for higher-order moments, such as the variance of the squared error (Kalogerias et al., 2019).
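As a minimal numerical sketch of this risk-neutral optimality, the conditional mean can be approximated by binning and checked against simpler estimators. The three-state prior, noise level, and bin count below are illustrative assumptions, not from the source:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative model (assumed): skewed discrete state X, observed as Y = X + N(0, 1).
n = 100_000
x = rng.choice([0.0, 1.0, 4.0], size=n, p=[0.5, 0.4, 0.1])
y = x + rng.normal(0.0, 1.0, size=n)

# Approximate the MMSE estimator E[X | Y] by averaging X within Y-bins.
bins = np.linspace(y.min(), y.max(), 60)
idx = np.digitize(y, bins)
cond_mean = np.array([x[idx == k].mean() if (idx == k).any() else 0.0
                      for k in range(len(bins) + 1)])
x_hat_mmse = cond_mean[idx]

mse_mmse = np.mean((x - x_hat_mmse) ** 2)
mse_identity = np.mean((x - y) ** 2)        # naive estimator x_hat(Y) = Y
mse_const = np.mean((x - x.mean()) ** 2)    # best constant estimator (prior mean)

# The (approximate) conditional mean dominates both alternatives in MSE.
assert mse_mmse < mse_identity and mse_mmse < mse_const
```

The binned approximation is crude, but the ordering of the three mean squared errors already reflects the orthogonality principle.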
2. Relaxed (Risk-Aware) MSE Criterion
To address the lack of stability inherent in the MMSE estimator under high error volatility, the RMSE criterion introduces a constraint or penalty on the conditional variance of the squared error (a fourth-moment quantity):
$$R(\hat{x}) \triangleq \mathbb{E}\big[\operatorname{Var}\big((X - \hat{x}(Y))^2 \mid Y\big)\big].$$
For a prescribed risk budget $\varepsilon > 0$, the associated optimization is
$$\inf_{\hat{x}(\cdot)} \; \mathbb{E}\big[(X - \hat{x}(Y))^2\big] \quad \text{subject to} \quad R(\hat{x}) \le \varepsilon.$$
This leads naturally to a Lagrangian penalized formulation:
$$\mathcal{L}_{\mu}(\hat{x}) = \mathbb{E}\big[(X - \hat{x}(Y))^2\big] + \mu\, R(\hat{x}),$$
where $\mu \ge 0$ is the Lagrange multiplier controlling the trade-off between mean performance and risk-averse regularization (Kalogerias et al., 2019).
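The penalized objective can be estimated by Monte Carlo for any candidate estimator, approximating the conditional variance by binning on $Y$. A minimal sketch; the Exp(1) state, noise level, and bin count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def lagrangian(estimator, mu, x, y, n_bins=60):
    """Monte Carlo estimate of E[(X - xhat(Y))^2] + mu * E[Var((X - xhat(Y))^2 | Y)],
    with the conditional variance approximated within Y-bins."""
    e2 = (x - estimator(y)) ** 2
    edges = np.linspace(y.min(), y.max(), n_bins)
    idx = np.digitize(y, edges)
    cond_var = np.zeros_like(e2)
    for k in np.unique(idx):
        m = idx == k
        cond_var[m] = e2[m].var()
    return e2.mean() + mu * cond_var.mean()

# Assumed skewed model: X ~ Exp(1), Y = X + N(0, 0.5^2).
n = 100_000
x = rng.exponential(1.0, size=n)
y = x + rng.normal(0.0, 0.5, size=n)

# With mu = 0 the criterion reduces to the plain MSE; mu > 0 adds the risk term.
mse_only = lagrangian(lambda y_: y_, 0.0, x, y)
penalized = lagrangian(lambda y_: y_, 2.0, x, y)
assert penalized >= mse_only  # the variance penalty is nonnegative
```

Because the penalty term is a mean of conditional variances, it can only increase the objective, which is what makes $\mu$ act as a risk-aversion knob.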
3. Optimal Estimator under Risk-Aware MSE
Assuming mild moment boundedness, the minimization can be performed pointwise by the interchangeability principle. Defining the conditional central moments
$$m_k(Y) \triangleq \mathbb{E}\big[(X - \mathbb{E}[X \mid Y])^k \mid Y\big], \quad k = 2, 3,$$
the optimal risk-aware estimator solves, for each realization of $Y$,
$$\hat{x}^{*}(Y) = \operatorname*{arg\,min}_{c \in \mathbb{R}} \; \mathbb{E}\big[(X - c)^2 \mid Y\big] + \mu \operatorname{Var}\big((X - c)^2 \mid Y\big),$$
yielding the explicit expression:
$$\hat{x}^{*}(Y) = \mathbb{E}[X \mid Y] + \frac{2\mu\, m_3(Y)}{1 + 4\mu\, m_2(Y)}.$$
This estimator is a nonlinear, regularized variant of the standard MMSE: it introduces a bias proportional to the third conditional central moment, and a shrinkage factor inversely related to the conditional variance. Notably, in terms of $\mu$: as $\mu \to 0^{+}$ the estimator reduces to the MMSE conditional mean, while as $\mu \to \infty$ the correction saturates at $m_3(Y)/(2\, m_2(Y))$.
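With $\mu$ the penalty weight and $m_2, m_3$ the second and third conditional central moments, the pointwise objective $\mathbb{E}[(X-c)^2 \mid Y] + \mu \operatorname{Var}((X-c)^2 \mid Y)$ is quadratic in $c$ and is minimized at $c = \mathbb{E}[X \mid Y] + 2\mu m_3/(1 + 4\mu m_2)$. This can be cross-checked against a brute-force grid search; the Exp(1) conditional law below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(2)

def risk_aware_estimate(cond_mean, m2, m3, mu):
    """Closed-form pointwise minimizer of E[(X-c)^2 | Y] + mu * Var((X-c)^2 | Y)."""
    return cond_mean + 2.0 * mu * m3 / (1.0 + 4.0 * mu * m2)

# Assumed skewed conditional law for one observation: X | Y=y ~ Exp(1).
x = rng.exponential(1.0, size=50_000)
cm = x.mean()
z = x - cm
m2, m3 = (z ** 2).mean(), (z ** 3).mean()

mu = 0.7
closed_form = risk_aware_estimate(cm, m2, m3, mu)

# Brute force: grid-search the penalized pointwise objective over c.
grid = np.linspace(cm - 1.0, cm + 2.0, 601)
obj = [((x - c) ** 2).mean() + mu * ((x - c) ** 2).var() for c in grid]
brute = grid[int(np.argmin(obj))]

assert abs(closed_form - brute) < 1e-2               # formula matches the grid minimum
assert risk_aware_estimate(cm, m2, m3, 0.0) == cm    # mu = 0 recovers the MMSE mean
```

Since the skewness of Exp(1) makes $m_3 > 0$, the risk-aware estimate sits above the conditional mean, in line with the over-estimation behavior discussed below.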
4. Existence, Uniqueness, and Theoretical Guarantees
The explicit construction above is justified under the following moment and regularity conditions:
- $\mathbb{E}[X^4] < \infty$ ensures finiteness of all required conditional moments.
- $\mathbb{E}[X^6] < \infty$ guarantees square-integrability of the third-moment filter $m_3(Y)$ (via Jensen's inequality applied to the conditional expectation).
- A Slater-type strict feasibility condition (some estimator meets the risk budget with strict inequality) ensures that a dual-optimal Lagrange multiplier exists, guaranteeing zero duality gap and uniqueness.
Under these hypotheses, the pointwise minimization is strictly convex in the estimate $c$ and possesses a unique solution almost surely with respect to the distribution of $Y$ (Kalogerias et al., 2019).
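The strict-convexity claim is easy to sanity-check numerically: expanding $\operatorname{Var}((X-c)^2 \mid Y)$ shows the quartic terms in $c$ cancel, so the pointwise objective is an exact quadratic with positive leading coefficient, and its second finite differences are a positive constant. A sketch under an assumed Exp(1) conditional law:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.exponential(1.0, size=100_000)   # assumed posterior samples (illustration)

def objective(c, mu=0.5):
    """Pointwise penalized objective E[(X-c)^2] + mu * Var((X-c)^2), empirically."""
    e2 = (x - c) ** 2
    return e2.mean() + mu * e2.var()

# Second differences of a strictly convex quadratic are a positive constant.
cs = np.linspace(-2.0, 3.0, 11)
h = 1e-2
d2 = [(objective(c + h) - 2 * objective(c) + objective(c - h)) / h ** 2 for c in cs]
assert all(v > 0 for v in d2)            # positive curvature everywhere
assert np.allclose(d2, d2[0], rtol=1e-3) # curvature is constant: the objective is quadratic
```

The constant curvature is $2(1 + 4\mu m_2)$, which also explains why the minimizer is unique for any $\mu \ge 0$ whenever $m_2 > 0$.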
5. Practical Illustrations and Regime-Specific Behavior
The risk-aware estimator demonstrates distinct advantages over risk-neutral MMSE in models characterized by high skewness or heavy tails:
- Skewed State-Dependent Noise:
In a model where the prior on $X$ is skewed and the observation noise scales with the state, the standard MMSE estimate $\mathbb{E}[X \mid Y]$, being a risk-neutral summary of the posterior, can yield large estimation errors for small observed $Y$, where rare large values of $X$ may be observed through significant noise. The risk-aware estimator hedges against variance in the tails by over-estimating $X$ for small $Y$, increasingly so as $\mu$ grows. This results in a 25–30% reduction in the conditional fourth-moment risk, at the expense of a 10–20% increase in the mean squared error.
- Heavy-Tailed Priors:
For a Student-$t$ prior on $X$ (with enough degrees of freedom that the required moments exist) observed in additive noise, the posterior inherits heavy tails. The standard MMSE estimate can be shifted substantially by outlying observations $Y$. The risk-aware estimator shrinks large-magnitude estimates toward zero, sharply reducing the fourth-moment risk with only a moderate increase in mean squared error.
In these examples, the RMSE estimator achieves improved robustness to outlier-induced volatility, supporting its application where error stability is a concern (Kalogerias et al., 2019).
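The qualitative trade-off in these regimes can be reproduced in a small experiment. The sketch below uses an assumed Exp(1) posterior for a single observation (so the exact percentages will differ from the figures quoted above), comparing the conditional MSE and the variance of the squared error at the MMSE and risk-aware estimates:

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed skewed posterior for one fixed observation: X | Y=y ~ Exp(1).
x = rng.exponential(1.0, size=200_000)
cm = x.mean()
z = x - cm
m2, m3 = (z ** 2).mean(), (z ** 3).mean()

def mse_and_risk(c):
    """Conditional MSE and variance of the squared error at estimate c."""
    e2 = (x - c) ** 2
    return e2.mean(), e2.var()

mu = 0.7
c_mmse = cm                                     # risk-neutral conditional mean
c_risk = cm + 2 * mu * m3 / (1 + 4 * mu * m2)   # risk-aware closed form

mse0, risk0 = mse_and_risk(c_mmse)
mse1, risk1 = mse_and_risk(c_risk)

# Hedging behavior: lower fourth-moment risk, bought with a higher MSE,
# and over-estimation relative to the conditional mean (m3 > 0).
assert risk1 < risk0
assert mse1 > mse0
assert c_risk > c_mmse
```

The direction of all three inequalities is what the regimes above describe; the magnitudes depend on the assumed posterior and on $\mu$.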
6. Tuning and Trade-off Interpretation
The Lagrange multiplier $\mu$ (or equivalently, the risk budget $\varepsilon$) directly parameterizes the trade-off between average squared error and higher-order predictive risk. As $\mu$ increases, the penalty on error volatility becomes more pronounced, biasing the estimator toward risk-averse predictions. Conversely, $\mu$ can be chosen so that the constraint on the fourth-moment risk is satisfied with equality. This formulation enables explicit control over the tail behavior of the estimation error, making the RMSE criterion suitable for applications requiring stability against rare but significant deviations (Kalogerias et al., 2019).
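Operationally, a multiplier meeting the budget with equality can be found by bisection, since the attained risk is nonincreasing in $\mu$. A sketch, again under an assumed Exp(1) posterior and an illustrative risk budget:

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed posterior samples for one observation (illustration): X | Y=y ~ Exp(1).
x = rng.exponential(1.0, size=200_000)
cm = x.mean()
z = x - cm
m2, m3 = (z ** 2).mean(), (z ** 3).mean()

def attained_risk(mu):
    """Variance of the squared error at the closed-form risk-aware estimate."""
    c = cm + 2 * mu * m3 / (1 + 4 * mu * m2)
    return ((x - c) ** 2).var()

budget = 6.0            # illustrative risk budget (epsilon), chosen to be feasible
lo, hi = 0.0, 100.0     # bracket: attained_risk(0) > budget > attained_risk(100)
for _ in range(60):     # bisect: attained risk decreases monotonically in mu
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if attained_risk(mid) > budget else (lo, mid)
mu_star = 0.5 * (lo + hi)

assert attained_risk(0.0) > budget          # MMSE violates the budget here
assert abs(attained_risk(mu_star) - budget) < 1e-3  # constraint met with equality
```

In practice the conditional moments would come from the model at hand rather than raw posterior samples, but the monotone search over $\mu$ is the same.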