Risk-surface protocols for HITL-to-automated validation transitions
Develop protocols that manage the risk surface opened by moving from human-in-the-loop (HITL) validation to automated validation. Three failure modes need coverage: shared-mode error, mitigated by diversity across base LLMs; collusive calibration, checked against held-out physical measurements; and the gap between autonomy in production and autonomy in liability, handled by retaining sign-off at federation level.
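To make the three checks concrete, the following is a minimal sketch of a release gate, not a protocol taken from the source. Every name in it (`Verdict`, `diversity_ok`, `calibration_ok`, `release_failures`), the mean-absolute-error criterion, and the thresholds are assumptions introduced for illustration only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Verdict:
    """One automated validator's decision (hypothetical structure)."""
    validator_id: str
    base_model: str  # label for the underlying base LLM (assumed field)
    accepts: bool


def diversity_ok(verdicts, min_base_models=2):
    """Shared-mode error check: accepting verdicts must span several base LLMs."""
    accepting_models = {v.base_model for v in verdicts if v.accepts}
    return len(accepting_models) >= min_base_models


def calibration_ok(predicted, held_out, tolerance):
    """Collusive-calibration check against measurements the validators never saw.

    Uses mean absolute error as an illustrative criterion (an assumption).
    """
    if not held_out or len(predicted) != len(held_out):
        return False
    mae = mean(abs(p - m) for p, m in zip(predicted, held_out))
    return mae <= tolerance


def release_failures(verdicts, predicted, held_out, tolerance, federation_signoff):
    """Return the list of failed checks; an empty list means release may proceed."""
    failures = []
    if not diversity_ok(verdicts):
        failures.append("shared-mode error")
    if not calibration_ok(predicted, held_out, tolerance):
        failures.append("collusive calibration")
    if not federation_signoff:
        # Automation never substitutes for the retained human sign-off.
        failures.append("missing federation sign-off")
    return failures
```

In this sketch the sign-off is an explicit input rather than something the automated validators can produce, which mirrors the idea that autonomy in production does not transfer liability. A real protocol would need agreed definitions of "distinct base LLM" and of how held-out measurements are kept out of the validators' reach.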
References
Nine open questions will determine whether instrumented data matures into a recognised substrate for scientific machine learning. HITL → automated validation as a risk surface. Three failure modes need protocols: shared-mode error (diversity across base LLMs), collusive calibration (held-out physical measurements), and autonomy in production is not autonomy in liability (sign-off retained at federation level).
— Instrumented data for causal scientific machine learning
(2606.07865 - Wilke, 5 Jun 2026) in Section 7, Methodological questions for the community, Item 8