Robust retrieval‑based inference under extreme imbalance and distribution shift in clinical settings

Develop retrieval‑augmented inference methods for structured electronic health record (EHR) prediction that remain robust under extreme outcome imbalance and distribution shift, two open challenges for retrieval‑based inference in clinical settings.

Background

Retrieval‑augmented tabular in‑context learning depends on selecting informative neighbors at inference time, a process that can be destabilized by minority‑class scarcity and cohort shifts across institutions or time.
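The failure mode described above can be illustrated with a minimal sketch. It is not the AWARE method or any procedure taken from the paper: the synthetic data, the plain Euclidean nearest‑neighbor lookup, and the names retrieve_neighbors and stratified_retrieve are all assumptions made for illustration. The sketch builds a bank with roughly 0.5% positive cases, retrieves neighbors for one query, and then repeats the lookup for a shifted copy of that query to mimic a cohort shift. It also includes one hypothetical mitigation, reserving a few retrieval slots for the nearest positive cases, purely to show what a guard on the retrieved context could look like.

```python
import numpy as np

def retrieve_neighbors(query, bank, k):
    """Return indices of the k rows of `bank` closest to `query` (Euclidean)."""
    dists = np.linalg.norm(bank - query, axis=1)
    return np.argsort(dists, kind="stable")[:k]

def stratified_retrieve(query, bank, labels, k, min_pos):
    """Hypothetical variant: reserve up to `min_pos` of the k slots for the
    nearest positive rows, then fill the rest with the nearest remaining rows."""
    dists = np.linalg.norm(bank - query, axis=1)
    order = np.argsort(dists, kind="stable")
    pos = [i for i in order if labels[i] == 1][:min_pos]
    pos_set = set(pos)
    rest = [i for i in order if i not in pos_set][: k - len(pos)]
    return np.array(pos + rest)

# Synthetic bank (assumption): 2000 rows, 8 features, 10 positives (0.5%).
rng = np.random.default_rng(0)
n_rows, n_feat, n_pos, k = 2000, 8, 10, 16
bank = rng.normal(size=(n_rows, n_feat))
labels = np.zeros(n_rows, dtype=int)
labels[:n_pos] = 1
bank[:n_pos] += 1.0  # positives sit slightly off the majority cloud

# A query near the positive region, and a shifted copy standing in for a
# patient from a different institution or time period.
query = np.full(n_feat, 1.0)
query_shifted = query + 3.0

plain_idx = retrieve_neighbors(query, bank, k)
strat_idx = stratified_retrieve(query, bank, labels, k, min_pos=4)
shift_idx = retrieve_neighbors(query_shifted, bank, k)

# How many retrieved neighbors are positive under each strategy.
plain_pos = int(labels[plain_idx].sum())
strat_pos = int(labels[strat_idx].sum())

# Mean distance to the retrieved neighbors: a rough proxy for how well the
# bank covers the region around the query.
dist_in = float(np.linalg.norm(bank[plain_idx] - query, axis=1).mean())
dist_shifted = float(np.linalg.norm(bank[shift_idx] - query_shifted, axis=1).mean())

print(f"positives retrieved (plain):      {plain_pos}/{k}")
print(f"positives retrieved (stratified): {strat_pos}/{k}")
print(f"mean neighbor distance, in-distribution: {dist_in:.2f}")
print(f"mean neighbor distance, shifted query:   {dist_shifted:.2f}")
```

In this toy setup the number of positives in the plain retrieval depends entirely on where the few positive rows happen to fall, while the shifted query is matched to neighbors that are much farther away. Whether slot reservation or any other guard helps on real EHR data is an open question, not something this sketch establishes.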

The paper notes that, even with task‑aligned embeddings, retrieval‑induced instability persists under extreme outcome rarity and distribution shift. It identifies this as an open challenge that requires methodological advances beyond the proposed AWARE framework.

References

Extreme imbalance and distribution shift remain open challenges for retrieval-based inference in clinical settings.