Extending GUI-Libra to fully online interactive reinforcement learning
Investigate how to extend the GUI-Libra post-training framework for native GUI agents from offline optimization on static datasets to a fully online, interactive reinforcement learning scheme that trains through environment interaction. Systematically characterize the design choices, training stability, and performance implications of such an extension.
References
We train on a relatively limited amount of data and do not explore how to extend the framework to fully online, interactive training. We leave a systematic study of extending our framework to fully online scheme as future work.
— GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL
(2602.22190 - Yang et al., 25 Feb 2026) in Limitations (unnumbered section, after Conclusion)