SHaRe-RL: Structured, Interactive Reinforcement Learning for Contact-Rich Industrial Assembly Tasks

Published 17 Sep 2025 in cs.RO | (2509.13949v1)

Abstract: High-mix low-volume (HMLV) industrial assembly, common in small and medium-sized enterprises (SMEs), requires the same precision, safety, and reliability as high-volume automation while remaining flexible to product variation and environmental uncertainty. Current robotic systems struggle to meet these demands. Manual programming is brittle and costly to adapt, while learning-based methods suffer from poor sample efficiency and unsafe exploration in contact-rich tasks. To address this, we present SHaRe-RL, a reinforcement learning framework that leverages multiple sources of prior knowledge. By (i) structuring skills into manipulation primitives, (ii) incorporating human demonstrations and online corrections, and (iii) bounding interaction forces with per-axis compliance, SHaRe-RL enables efficient and safe online learning for long-horizon, contact-rich industrial assembly tasks. Experiments on the insertion of industrial Harting connector modules with 0.2-0.4 mm clearance demonstrate that SHaRe-RL achieves reliable performance within practical time budgets. Our results show that process expertise, without requiring robotics or RL knowledge, can meaningfully contribute to learning, enabling safer, more robust, and more economically viable deployment of RL for industrial assembly.

Abstract PDF Upgrade to Chat

Summary

The paper introduces SHaRe-RL, a human-in-the-loop reinforcement learning framework that significantly improves learning efficiency in industrial assembly tasks.
It employs task-frame formalism and adaptive manipulation primitives to dynamically adjust control settings, ensuring safe exploration in contact-rich environments.
Experimental results show a 95% success rate, outperforming skilled human operators and reducing the need for continuous human intervention.

This paper proposes SHaRe-RL, an innovative reinforcement learning framework tailored for high-mix low-volume (HMLV) industrial assembly applications. By embedding structured, interactive reinforcement learning techniques, SHaRe-RL aims to bridge the gap between the stringent requirements of HMLV tasks and the flexibility offered by RL solutions in robotic applications.

Introduction and Motivation

High-mix low-volume manufacturing environments require systems that can perform with precision and reliability while adapting to a diverse range of products and unpredictable conditions. Current robotic solutions often fall short, being either too brittle or prone to inefficiencies in sample requirements, especially in contact-heavy scenarios. SHaRe-RL innovatively merges structured skill sets, human demonstrations, and real-time operator adjustments to enhance learning efficiency and safety in industrial tasks.

Figure 1: System overview of SHaRe-RL. Structure (blue), interaction (yellow), and safety (red) combine to make industrial RL both practical and reliable.

Methodology

Human-in-the-Loop RL

A human-in-the-loop approach is central to SHaRe-RL, incorporating operator expertise directly into the learning process. Skilled operators provide critical demonstrations and can intervene during training to correct robot actions, accelerating convergence and preventing unsafe states. This integration creates a synergistic loop between human knowledge and machine learning capabilities, enabling the training to be both rapid and adaptable without compromising safety.

Task-Level Priors: TFF and MP-Nets

SHaRe-RL uses task frame formalism (TFF) and manipulation primitives (MPs) to guide exploration and learning. MPs, which represent fundamental manipulation skills, are composed into MP-Nets that outline state transitions based on specific stop conditions. By introducing adaptive MPs, the framework adapts control variables dynamically in response to task requirements, focusing learning efforts on the most relevant phases of the task.

Figure 3: MP-Net for the HanDD connector insertion. The controller setpoints for each direction of the MPs are shown in the table below the diagram. Adaptive MPs, learned stop conditions, and variable setpoints are colored green.

Adaptive Safety Limits

Safety in contact-rich environments is achieved through adaptive force limits that govern interaction during exploration. These limits are dynamically adjusted based on the measured forces in real-time, striking a balance between allowing flexibility in free space and ensuring safety during contact. This approach prevents hardware damage and enhances sample efficiency by minimizing the impact of large, uncontrolled force applications.

Figure 4: Adaptive force limits under ideal conditions. Left: Time response of a stable adaptive limit. Right: Phase-plane cobweb plot illustrating the recurrence.

Experimental Setup and Results

The evaluation was conducted using Harting Han-Modular connectors with stringent tolerance requirements. The experimental setup included a UR3e robot equipped with sensors and a force control spine optimized for SHaRe-RL tasks.

Figure 2: Experimental setup and task. Top: UR3e robot and Harting HanDD industrial connector used for evaluation. Bottom: External view and corresponding policy input over time.

The results showcased SHaRe-RL's enhanced learning capabilities, with a significant success rate of 95% on challenging tasks, eventually surpassing even skilled human operators' efficiency. The methodology effectively reduced the necessary human guidance over time, demonstrating a steady shift towards increased autonomy.

Figure 5: Main comparison of learning approaches. Success rate, cycle time, and intervention rate vs. wall-clock time.

Conclusion

SHaRe-RL represents a significant advancement in structured reinforcement learning, combining human expertise with automated learning to enhance the adaptability and safety of robotic systems in industrial settings. The framework's ability to integrate various sources of prior knowledge optimizes its learning trajectory, paving the way for more flexible and robust robotic solutions suitable for SMEs without extensive robotics expertise.

The future trajectory of SHaRe-RL includes extending scalability across longer and more complex assembly processes, improving generalization to various tasks, and reducing engineering overhead through integration with planning tools. These advancements will further the adoption of RL in industrial applications, contributing to increased operational efficiencies and reduced dependency on highly specialized human intervention.

Markdown Report Issue