Human-in-the-loop: Real-time Preference Optimization

Published 2 Jun 2025 in math.OC | (2506.02225v1)

Abstract: Human-aware controllers play an important role in engineering systems by improving productivity, efficiency, and sustainability. Such a controller must optimize user utility while respecting the plant dynamics. Most online optimization algorithms rely on first-order or zeroth-order oracles, whereas human feedback often arrives as pairwise comparisons. In this work, we propose an online feedback optimization algorithm that leverages such preference feedback. We design a controller that estimates the gradient from the binary outcome of a pairwise comparison between two consecutive points, and we study its coupled behavior with a nonlinear plant. Under mild assumptions on both the utility and the plant dynamics, we establish explicit stability criteria and quantify the sub-optimality. The theoretical findings are supported by numerical experiments.
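
To make the idea of optimizing from comparisons alone more concrete, the following is a minimal sketch of a generic comparison-based random-direction ascent step. It is not the algorithm of the paper: it ignores plant dynamics entirely (the plant is treated as a static identity map), and the quadratic `utility`, the `TARGET` optimum, the `prefers` oracle, and the step parameters `eta` and `delta` are all hypothetical choices made for illustration. The controller only ever sees the binary answer returned by `prefers`, never a utility value or gradient.

```python
import math
import random

TARGET = (1.0, -2.0)  # hypothetical optimum of the illustrative utility


def utility(x):
    """Hypothetical concave user utility; the controller never reads its value."""
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


def prefers(candidate, incumbent):
    """Binary preference oracle: +1 if candidate is preferred, otherwise -1."""
    return 1 if utility(candidate) > utility(incumbent) else -1


def run_preference_ascent(steps=2000, eta=0.05, delta=0.1, seed=0):
    """Ascend using only pairwise comparisons (static plant assumed)."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(steps):
        # Draw a random unit direction.
        v = [rng.gauss(0.0, 1.0) for _coord in x]
        norm = math.sqrt(sum(vi * vi for vi in v)) or 1.0
        v = [vi / norm for vi in v]
        # Compare the perturbed point against the current one.
        probe = [xi + delta * vi for xi, vi in zip(x, v)]
        b = prefers(probe, x)
        # b * v serves as a crude ascent direction; take a fixed-size step.
        x = [xi + eta * b * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    x_final = run_preference_ascent()
    print(x_final, utility(x_final))
```

In this toy setting, the fixed step size `eta` means the iterate settles in a small neighborhood of `TARGET` rather than converging exactly. Coupling such an update with a dynamic nonlinear plant, as the abstract describes, would require the stability analysis developed in the paper.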