Repeated Risk Minimization (RRM)
- Repeated Risk Minimization is a dynamic framework that formalizes iterative risk minimization where model deployments influence subsequent data distributions.
- It integrates fairness and robust optimization by decomposing loss functions and employing techniques like DRO and ARM to mitigate representation disparities.
- Practical applications span sequential decision making in credit, hiring, and recommender systems, highlighting its role in ensuring performative stability and fairness over time.
Repeated Risk Minimization (RRM) is a foundational framework for performative prediction, fairness-aware machine learning, and adaptive systems facing feedback-induced distribution shifts. RRM formalizes the iterative retraining of predictive models in contexts where the data-generating process changes in response to model deployments, requiring risk minimization over a dynamic, model-dependent data distribution. RRM arises in the study of sequential decision making, performative stability, and fairness mitigation, and it is closely linked to both classical risk minimization and emerging robust optimization paradigms.
1. Formalization of Repeated Risk Minimization
Let $\Theta$ denote the model parameter space and $h_\theta$ a prediction function. In performative settings, deployment of $\theta$ induces a new data distribution $\mathcal{D}(\theta)$. The performative risk is defined as
$$\mathrm{PR}(\theta) = \mathbb{E}_{Z \sim \mathcal{D}(\theta)}\big[\ell(Z; \theta)\big],$$
where $\ell$ is a prescribed loss function. A parameter $\theta_{\mathrm{PS}}$ is performatively stable if it is optimal for its own induced distribution, i.e.,
$$\theta_{\mathrm{PS}} \in \operatorname*{arg\,min}_{\theta \in \Theta} \, \mathbb{E}_{Z \sim \mathcal{D}(\theta_{\mathrm{PS}})}\big[\ell(Z; \theta)\big].$$
Repeated Risk Minimization is the fixed-point iteration
$$\theta_{t+1} = \operatorname*{arg\,min}_{\theta \in \Theta} \, \mathbb{E}_{Z \sim \mathcal{D}(\theta_t)}\big[\ell(Z; \theta)\big],$$
where at each step the risk is minimized with respect to the distribution induced by the deployment of the previous iterate (Khorsandi et al., 2024, Hu et al., 2022).
When additional social utility terms, such as fairness constraints, are integrated, the loss decomposes as
$$\ell = \lambda_{\mathrm{u}}\,\ell_{\mathrm{u}} + \lambda_{\mathrm{lf}}\,\ell_{\mathrm{lf}} + \lambda_{\mathrm{sf}}\,\ell_{\mathrm{sf}},$$
with a utility component, a long-term fairness component, and a short-term fairness component, each generally an expectation over a $\theta$-dependent distribution (Hu et al., 2022). This enables problem formulations such as path-specific long-term fairness or short-term fairness in sequential decisions.
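The fixed-point iteration can be made concrete with a minimal simulation. The sketch below assumes a Gaussian location family $\mathcal{D}(\theta) = \mathcal{N}(\mu_0 + \varepsilon\theta, 1)$ and squared loss, an illustrative model chosen here (not one from the cited papers) because the risk minimizer at each round is simply the sample mean:

```python
import numpy as np

def deploy(theta, mu0=1.0, eps=0.5, n=50_000):
    """Sample from the induced distribution N(mu0 + eps*theta, 1); eps plays
    the role of the sensitivity of the distribution map."""
    rng = np.random.default_rng(0)
    return mu0 + eps * theta + rng.standard_normal(n)

def rrm(theta0=0.0, steps=30):
    """Repeated Risk Minimization with squared loss l(z; theta) = (z - theta)^2:
    the exact risk minimizer over a sample is its mean."""
    theta = theta0
    for _ in range(steps):
        z = deploy(theta)
        theta = z.mean()     # argmin_theta of the empirical squared loss
    return theta

theta_ps = rrm()             # approaches mu0 / (1 - eps) = 2
```

For $\varepsilon = 0.5$ the iterates converge toward the performatively stable point $\mu_0/(1-\varepsilon) = 2$, up to sampling noise.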
2. Applications and Motivating Scenarios
RRM was developed and analyzed in domains where model outputs influence future data. Representative contexts include:
- Perpetual sequential decision making: Repeated application of models to individuals whose future states or retention depend on model decisions (e.g., credit, hiring, recommendation systems) (Hu et al., 2022, Hashimoto et al., 2018).
- Fairness without demographics: Demonstrations that naïve empirical risk minimization (ERM) under RRM can exacerbate representation disparity for minority groups, as subpopulations experiencing high loss shrink due to selective feedback (user drop-out), amplifying unfairness (Hashimoto et al., 2018).
- Performative prediction and stability: Analysis of iterative learning dynamics and convergence to stable solutions when data shifts are induced by model deployments, with extensions to multi-agent (competitive) and game-theoretic settings (Khorsandi et al., 2024).
The implications are critical for both the fairness and stability of deployed machine learning systems under performativity.
3. Convergence Guarantees and Lower Bounds
Convergence analysis of RRM leverages the regularity of the performative mapping $\theta \mapsto \mathcal{D}(\theta)$ and the loss $\ell$:
- Sensitivity condition: The map $\mathcal{D}(\cdot)$ is $\varepsilon$-sensitive if, with respect to, e.g., the $1$-Wasserstein or Pearson distance $d$, it satisfies $d(\mathcal{D}(\theta), \mathcal{D}(\theta')) \le \varepsilon \|\theta - \theta'\|_2$ for all $\theta, \theta' \in \Theta$ (Khorsandi et al., 2024, Hu et al., 2022).
- Loss properties: If the loss is $\gamma$-strongly convex in $\theta$ and $\beta$-jointly smooth, and the sensitivity satisfies $\varepsilon < \gamma/\beta$, the RRM update mapping becomes a contraction, yielding linear convergence to a unique fixed point (Hu et al., 2022, Khorsandi et al., 2024).
- Tightness: Lower bounds demonstrate that when the contraction constant is $c = \varepsilon\beta/\gamma < 1$, the distance to the performatively stable solution decays as $\Theta(c^t)$ and cannot be improved for standard RRM (Khorsandi et al., 2024).
- Relaxed convergence with historical averaging: Reusing historical data via Affine Risk Minimizers (ARM) strictly enlarges the regime of sensitivities $\varepsilon$ for which convergence to a performatively stable point is guaranteed, surpassing the last-iterate threshold $\varepsilon < \gamma/\beta$ of RRM (Khorsandi et al., 2024).
Failure to satisfy the contraction condition can result in non-convergence or oscillatory iterates, particularly under high feedback sensitivity (Hu et al., 2022).
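These regimes can be checked numerically in an illustrative Gaussian location family $\mathcal{D}(\theta) = \mathcal{N}(\mu_0 + \varepsilon\theta, 1)$ with squared loss, where the exact RRM update is the affine map $T(\theta) = \mu_0 + \varepsilon\theta$, the curvature constants are $\gamma = \beta = 1$, and the contraction condition reduces to $\varepsilon < 1$ (a toy model, not the general setting):

```python
def rrm_map(theta, mu0=1.0, eps=0.5):
    """Exact RRM update for the location family with squared loss."""
    return mu0 + eps * theta

def errors(eps, steps=20, mu0=1.0):
    """Distance of each iterate to the fixed point mu0 / (1 - eps)."""
    theta, star = 0.0, mu0 / (1.0 - eps)
    errs = []
    for _ in range(steps):
        theta = rrm_map(theta, mu0, eps)
        errs.append(abs(theta - star))
    return errs

good = errors(eps=0.5)   # eps < 1: error contracts by exactly eps per step
bad = errors(eps=1.5)    # eps > 1: iterates move away from the fixed point
```

In the first run the error decays geometrically at rate $c = \varepsilon = 0.5$, matching the $\Theta(c^t)$ rate; in the second the contraction condition fails and the iterates diverge.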
4. Algorithmic Implementation and Variants
The core RRM procedure consists of repeated retraining via risk minimization over the distribution induced by deploying the current model. A canonical pseudocode for the fairness-constrained case (Hu et al., 2022):
```
Input:  dataset D, causal graph G, tolerance δ
1. Initialize θ₀ by solving risk minimization on the observed D
2. i ← 0
3. repeat
     a. For each step t, sample post-intervention points via G under θ_i
     b. Compute empirical estimates of the loss components
        (utility, long-term fairness, short-term fairness)
     c. θ_{i+1} ← argmin_θ weighted sum of losses (e.g., with Adam)
     d. Δ ← ‖θ_{i+1} − θ_i‖₂;  i ← i + 1
   until Δ < δ
Output: h_{θ_i}
```
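A runnable sketch of this loop follows, with a toy sampler standing in for the causal graph G, a single demographic-gap penalty standing in for the two fairness components, and central-difference gradient descent in place of Adam; all of these are hypothetical stand-ins, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_post_intervention(theta, n=4000):
    """Step (a) stand-in: sample points whose distribution shifts with theta
    (hypothetical data model, not the authors' causal graph)."""
    g = rng.integers(0, 2, n)                          # binary group label
    x = rng.standard_normal(n) + np.where(g == 0, 0.3 * theta, -0.3 * theta)
    y = (x > 0).astype(float)                          # ground-truth outcome
    return x, y, g

def weighted_loss(theta, x, y, g, lam):
    """Step (b): utility term plus one demographic-gap fairness penalty."""
    score = 1.0 / (1.0 + np.exp(-(x - theta)))         # sigmoid decision score
    util = np.mean((score - y) ** 2)                   # Brier-style utility loss
    gap = score[g == 0].mean() - score[g == 1].mean()  # group score disparity
    return util + lam * gap ** 2

def retrain(theta, x, y, g, lam, lr=0.2, inner=80, h=1e-4):
    """Step (c): approximate argmin by central-difference gradient descent
    (the paper uses Adam; this keeps the sketch dependency-free)."""
    for _ in range(inner):
        grad = (weighted_loss(theta + h, x, y, g, lam)
                - weighted_loss(theta - h, x, y, g, lam)) / (2 * h)
        theta -= lr * grad
    return theta

def rrm_fair(lam=5.0, delta=1e-3, max_rounds=50):
    theta = 0.0
    for _ in range(max_rounds):
        x, y, g = sample_post_intervention(theta)
        new = retrain(theta, x, y, g, lam)
        if abs(new - theta) < delta:                   # step (d): stopping rule
            return new
        theta = new
    return theta

theta_hat = rrm_fair()
```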
For performative prediction, RRM can be extended to the Affine Risk Minimizers (ARM) family (Khorsandi et al., 2024):
- Store historical parameter-distribution pairs $(\theta_s, \mathcal{D}(\theta_s))$ for $s \le t$
- Aggregate: form a convex combination $\bar{\mathcal{D}}_t = \sum_{s} w_s\, \mathcal{D}(\theta_s)$ with $w_s \ge 0$, $\sum_s w_s = 1$ (e.g., uniform mixing over the most recent snapshots)
- Update: $\theta_{t+1} = \operatorname*{arg\,min}_{\theta} \, \mathbb{E}_{Z \sim \bar{\mathcal{D}}_t}\big[\ell(Z; \theta)\big]$
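The three steps above can be sketched on an illustrative Gaussian location family $\mathcal{D}(\theta) = \mathcal{N}(\mu_0 + \varepsilon\theta, 1)$ with squared loss, where minimizing risk over any mixture of induced distributions amounts to taking the mixture mean; the window size $k$ and uniform weights are illustrative choices:

```python
from collections import deque

def arm(eps=0.9, mu0=1.0, k=5, steps=400):
    """ARM sketch: store the last k deployed parameters, aggregate their
    induced distributions with uniform weights, and minimize over the
    mixture (whose squared-loss minimizer is the mixture mean)."""
    history = deque(maxlen=k)          # most recent deployed parameters
    theta = 0.0
    for _ in range(steps):
        history.append(theta)
        theta = mu0 + eps * sum(history) / len(history)   # argmin over mixture
    return theta

theta_arm = arm()                      # same fixed point: mu0 / (1 - eps) = 10
```

Historical mixing leaves the fixed point unchanged; its benefit lies in the enlarged convergence regime and damped oscillations discussed above.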
A DRO-based RRM variant replaces ERM at each round with distributionally robust minimization over $\chi^2$-divergence ambiguity sets, protecting minority risks without requiring demographic labels (Hashimoto et al., 2018).
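The robust inner step can be sketched via a dual form of the $\chi^2$-constrained worst case, $\min_\eta\, C\,\sqrt{\mathbb{E}[(\ell-\eta)_+^2]} + \eta$; the exact mapping from the ball radius to the constant $C \ge 1$ is treated here as an assumption, and the one-dimensional minimization over $\eta$ is done by grid search:

```python
import numpy as np

def chi2_dro_risk(losses, C=2.0):
    """Worst-case expected loss over a chi-square ambiguity ball, via the dual
    form min_eta C * sqrt(mean(relu(losses - eta)^2)) + eta. C >= 1 encodes
    the ball radius (the radius-to-C mapping is an assumption here)."""
    etas = np.linspace(losses.min() - 1.0, losses.max(), 2001)
    vals = [C * np.sqrt(np.mean(np.maximum(losses - e, 0.0) ** 2)) + e
            for e in etas]
    return float(min(vals))

rng = np.random.default_rng(0)
# majority group with low loss, minority group with high loss
losses = np.concatenate([rng.normal(0.2, 0.05, 900), rng.normal(1.0, 0.05, 100)])
avg = losses.mean()             # plain ERM objective, dominated by the majority
robust = chi2_dro_risk(losses)  # implicitly upweights the high-loss minority
```

The robust objective sits between the average and the maximum loss, so minimizing it cannot ignore the high-loss subpopulation the way plain ERM does.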
5. Fairness and Robustness in RRM
A critical aspect of RRM is its effect on disparate impact and the mitigation of representation disparity.
- Amplification under ERM: When RRM is paired with standard ERM, iterative deployment reduces minority subpopulation presence via feedback, resulting in unfairness even from initially fair models (Hashimoto et al., 2018).
- Distributionally robust RRM: Mitigates disparity amplification by minimizing the worst-case risk over $\chi^2$-divergence balls, ensuring all latent groups maintain bounded loss and representation over time. The group-wise risk at every round $t$ and for each group $k$ can be bounded above by a constant, provided the robustness radius is chosen large enough to encompass the relevant subpopulations (Hashimoto et al., 2018).
The fairness trade-off can be explicitly tuned via the $\lambda$-weights in the loss decomposition, modulating accuracy versus the various fairness objectives (Hu et al., 2022).
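The amplification mechanism under plain ERM can be reproduced in a toy model: one shared parameter, two groups with different optimal targets, and a hypothetical retention rule $\exp(-\alpha \cdot \text{loss})$ under which the worse-served group shrinks each round:

```python
import numpy as np

def erm_feedback(rounds=10, alpha=0.1):
    """Disparity amplification sketch: ERM fits the population-weighted mean,
    so the minority carries high loss; retention exp(-alpha * loss) is a
    hypothetical drop-out model that then erodes its representation."""
    means = np.array([0.0, 4.0])          # group-optimal targets
    frac = np.array([0.8, 0.2])           # initial population fractions
    minority_share = [frac[1]]
    for _ in range(rounds):
        theta = frac @ means              # ERM with squared loss: weighted mean
        loss = (theta - means) ** 2       # per-group risk
        frac = frac * np.exp(-alpha * loss)
        frac = frac / frac.sum()          # surviving user fractions
        minority_share.append(frac[1])
    return minority_share

shares = erm_feedback()
```

The minority share decreases every round: the fitted parameter drifts further toward the majority, which raises minority loss and accelerates its drop-out, exactly the feedback loop described above.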
6. Empirical Behavior and Practical Considerations
Empirical studies across synthetic and semi-synthetic domains highlight the practical effects and tuning parameters for RRM:
- Fairness-constrained case: RRM reduces both short-term and long-term fairness gaps to zero across the decision horizon, in contrast to static fairness baselines where long-term disparity can grow rapidly. The accuracy/fairness trade-off is directly adjustable via hyperparameters (Hu et al., 2022).
- Performance with historical mixing: Increasing the historical averaging window in ARM leads to faster and smoother convergence with reduced oscillatory loss shifts; further gains are subject to diminishing returns (Khorsandi et al., 2024).
- Robustness benefits of DRO: RRM+DRO (versus RRM+ERM) avoids minority population extinction and reduces worst-group loss with little degradation of average risk. This is verified in both simulated and real-world datasets (Hashimoto et al., 2018).
Practical points include the choice of loss function (logistic or hinge surrogate), optimizer (e.g., Adam), regularization for convexity, and appropriate setting of the convergence tolerance (Hu et al., 2022). Computational overhead for robust variants is moderate: on the order of 2-5x the per-iteration cost of ERM (Hashimoto et al., 2018).
7. Extensions, Open Problems, and Implications
Recent advances generalize RRM and address its limitations:
- Affine Risk Minimizers (ARM): Using convex combinations of historical distributions, ARM provides strictly improved convergence guarantees and enables faster stabilization in performative environments (Khorsandi et al., 2024).
- Lower bounds for convergence: Tight matching upper and lower bounds have been established, quantifying the optimal contraction regime under given loss curvature, sensitivity, and feedback strength (Khorsandi et al., 2024).
- Open challenges: For high feedback sensitivity, the contraction criterion may be violated, and analysis or algorithmic adaptation is needed for stability. Model design and intervention strategies remain critical for balancing performative utility, robustness, and fairness guarantees.
RRM and its variants underpin ongoing research in dynamic and feedback-prone learning systems. They formalize and rigorously address risks arising in longitudinal deployment scenarios, enabling principled advances in fairness, robustness, and performative stability (Hashimoto et al., 2018, Hu et al., 2022, Khorsandi et al., 2024).