
Agent-in-the-Loop: A Data Flywheel for Continuous Improvement in LLM-based Customer Support

Published 8 Oct 2025 in cs.AI (arXiv:2510.06674v2)

Abstract: We introduce an Agent-in-the-Loop (AITL) framework that implements a continuous data flywheel for iteratively improving an LLM-based customer support system. Unlike standard offline approaches that rely on batch annotations, AITL integrates four key types of annotations directly into live customer operations: (1) pairwise response preferences, (2) agent adoption and rationales, (3) knowledge relevance checks, and (4) identification of missing knowledge. These feedback signals feed seamlessly into model updates, reducing retraining cycles from months to weeks. Our production pilot involving US-based customer support agents demonstrated significant improvements in retrieval accuracy (+11.7% recall@75, +14.8% precision@8), generation quality (+8.4% helpfulness), and agent adoption rates (+4.5%). These results underscore the effectiveness of embedding human feedback loops directly into operational workflows to continuously refine LLM-based customer support systems.

Summary

  • The paper introduces a continuous learning framework that integrates real-time agent feedback to reduce retraining cycles and enhance response accuracy.
  • It employs a Unified Knowledge Base and an Agent Annotation Interface to centralize resources and capture diverse feedback for iterative model updates.
  • Experimental results demonstrate significant improvements in retrieval accuracy and agent adoption, validating the framework's practical impact on customer support.

Introduction

The paper introduces an Agent-in-the-Loop (AITL) framework designed to enhance LLM-based customer support systems through a continuous data flywheel. This approach directly integrates live feedback from customer support operations, focusing on four annotation types: pairwise response preferences, agent adoption decisions and rationales, knowledge relevance checks, and identification of missing knowledge. These annotations feed into an iterative model update cycle, reducing retraining times from months to weeks and improving performance across several key metrics.

System Architecture

Figure 1: Overview of the agent-in-the-loop architecture.

The AITL framework involves a structured workflow that begins with customer interactions and integrates human annotations into the model training pipeline. This system is supported by a Unified Knowledge Base that combines diverse resources, facilitating effective response generation.

Methodology

The AITL system captures real-time feedback from agents during customer interactions. Key components include:

  • Unified Knowledge Base: Centralizes resources such as FAQs, policies, and historical data to support real-time knowledge retrieval.
  • Agent Annotation Interface: Allows agents to provide instant feedback on response preferences, rationales for their selections, and the relevance of retrieved knowledge, and to flag missing information. This feedback feeds directly into model training datasets.

    Figure 2: Example of selecting knowledge references.

  • Continuous Learning Pipeline: Integrates online annotations into a training pipeline (Figure 3), updating models iteratively and shortening update cycles.

    Figure 3: Data flow in continuous learning pipeline.
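The routing of the four feedback signals into training datasets can be sketched as follows. This is a minimal illustration, not the paper's implementation; the type names, `payload` fields, and dataset labels are assumptions for exposition.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class AnnotationType(Enum):
    """The four feedback signals collected during live operations."""
    PAIRWISE_PREFERENCE = "pairwise_preference"
    ADOPTION_RATIONALE = "adoption_rationale"
    KNOWLEDGE_RELEVANCE = "knowledge_relevance"
    MISSING_KNOWLEDGE = "missing_knowledge"

@dataclass
class AgentAnnotation:
    """One feedback record captured during a customer interaction."""
    interaction_id: str
    annotation_type: AnnotationType
    payload: dict                     # e.g. {"preferred": "resp_a", "rejected": "resp_b"}
    rationale: Optional[str] = None   # free-text agent rationale, if given

def route_annotation(ann: AgentAnnotation) -> str:
    """Route each signal to the dataset of the component it retrains
    (hypothetical dataset names)."""
    if ann.annotation_type is AnnotationType.PAIRWISE_PREFERENCE:
        return "generation_preference_dataset"  # e.g. preference pairs for tuning
    if ann.annotation_type is AnnotationType.ADOPTION_RATIONALE:
        return "generation_quality_dataset"
    # relevance checks and missing-knowledge reports both inform retrieval
    return "retrieval_dataset"

ann = AgentAnnotation("case-001", AnnotationType.KNOWLEDGE_RELEVANCE,
                      {"doc_id": "faq-42", "relevant": True})
print(route_annotation(ann))  # retrieval_dataset
```

The point of the sketch is that preference and rationale signals update the generation model, while relevance checks and missing-knowledge reports update the retrieval side, which is why a single annotation interface can feed both training loops.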

Experimental Results

The AITL system was evaluated in a production pilot with US-based customer support agents and showed significant gains over the baseline system. Improvements were seen in:

  • Retrieval Accuracy: Recall@75 improved by 11.7% and Precision@8 by 14.8%, demonstrating the effectiveness of real-time agent feedback in enhancing response accuracy.
  • Generation Quality: Notable increases in response helpfulness and citation correctness highlight the impact of immediate human feedback on generation models.
  • Agent Adoption Rates: Real-time annotation led to a 4.5% increase in adoption rates, confirming the system’s efficacy in aligning model outputs with agent needs.
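The retrieval metrics cited above follow the standard recall@k and precision@k definitions, which can be computed as below. This is a generic sketch of the metrics, not code from the paper.

```python
def recall_at_k(retrieved: list, relevant: list, k: int) -> float:
    """Fraction of all relevant documents that appear in the top-k retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def precision_at_k(retrieved: list, relevant: list, k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    relevant_set = set(relevant)
    return sum(1 for doc in top_k if doc in relevant_set) / len(top_k)

# Toy example with hypothetical document IDs
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d9"]
print(recall_at_k(retrieved, relevant, 3))     # 0.5 (one of two relevant docs found)
print(precision_at_k(retrieved, relevant, 3))  # one of the top three is relevant
```

The paper's choice of recall@75 (did the relevant knowledge make it into a large candidate pool?) and precision@8 (how clean is the short list shown to the agent?) reflects a two-stage retrieval setup where a broad first pass is reranked into a small display set.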

Annotation Timing and Quality

An ablation study on annotation timing showed that immediate, in-workflow annotations improved the identification of missing knowledge without degrading preference judgments, suggesting that different annotation strategies may best balance annotation workload across different customer support channels.

Conclusions and Future Work

AITL represents a significant step forward in integrating human-in-the-loop feedback within LLM-driven systems. Future work could focus on scaling the annotation framework, integrating product-embedded AITL for efficiency gains, and exploring automation in dataset curation. Continued adaptation to multilingual support contexts and the long-term sustainability of annotation workflows remain areas of interest.
