Minimax optimality of RaT beyond kernel-based settings

Determine whether the residual-as-teacher (RaT) estimator achieves minimax-optimal convergence rates outside the kernel ridge regression student–teacher framework analyzed in the paper, particularly for more general student–teacher pairs and non-kernel model classes under covariate shift.

Background

The paper proves that, for kernel ridge regression (KRR) student–teacher pairs under covariate shift with a biased teacher, RaT achieves the minimax-optimal rate while standard student soft-matching is inconsistent. This is established via exact mean-squared error formulas and a separation theorem.
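For reference, the notion of minimax optimality at stake can be written out as follows. This is a generic textbook-style sketch, not the paper's exact statement: the function class $\mathcal{F}$, the target covariate distribution $Q$, the sample size $n$, and the constant $C$ are notation assumed here for illustration only.

```latex
% Minimax risk over an assumed function class \mathcal{F}, measured under the
% (assumed) target covariate distribution Q; the infimum is over all estimators.
\[
  \mathfrak{M}_n(\mathcal{F})
  \;=\;
  \inf_{\hat f}\;
  \sup_{f^\ast \in \mathcal{F}}\;
  \mathbb{E}\,\bigl\| \hat f - f^\ast \bigr\|_{L^2(Q)}^2 .
\]
% An estimator \hat f_n is minimax-rate-optimal over \mathcal{F} if, for some
% constant C not depending on n,
\[
  \sup_{f^\ast \in \mathcal{F}}\;
  \mathbb{E}\,\bigl\| \hat f_n - f^\ast \bigr\|_{L^2(Q)}^2
  \;\le\;
  C \,\mathfrak{M}_n(\mathcal{F}) .
\]
```

In this notation, the open question is whether the second inequality continues to hold for RaT when $\mathcal{F}$ and the student–teacher pair are no longer tied to the kernel ridge regression construction.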

The authors explicitly raise the question of whether this optimality persists when moving beyond the specific kernel-based construction to broader student–teacher interactions and non-kernel settings.

References

Whether RaT remains minimax-optimal beyond this class, particularly for more general student–teacher pairs and non-kernel settings, is an important open question.

Residual-as-Teacher: Mitigating Bias Propagation in Student–Teacher Estimation (2603.25466 - Yamamoto et al., 26 Mar 2026) in Discussion