SPI-BoTER: Error Compensation for Industrial Robots via Sparse Attention Masking and Hybrid Loss with Spatial-Physical Information

Published 28 Jun 2025 in cs.RO | (2506.22788v1)

Abstract: The widespread application of industrial robots in fields such as cutting and welding has imposed increasingly stringent requirements on the trajectory accuracy of end-effectors. However, current error compensation methods face several critical challenges, including overly simplified mechanism modeling, a lack of physical consistency in data-driven approaches, and substantial data requirements. These issues make it difficult to achieve both high accuracy and strong generalization simultaneously. To address these challenges, this paper proposes a Spatial-Physical Informed Attention Residual Network (SPI-BoTER). This method integrates the kinematic equations of the robotic manipulator with a Transformer architecture enhanced by sparse self-attention masks. A parameter-adaptive hybrid loss function incorporating spatial and physical information is employed to iteratively optimize the network during training, enabling high-precision error compensation under small-sample conditions. Additionally, inverse joint angle compensation is performed using a gradient descent-based optimization method. Experimental results on a small-sample dataset from a UR5 robotic arm (724 samples, with a train:test:validation split of 8:1:1) demonstrate the superior performance of the proposed method. It achieves a 3D absolute positioning error of 0.2515 mm with a standard deviation of 0.15 mm, representing a 35.16% reduction in error compared to conventional deep neural network (DNN) methods. Furthermore, the inverse angle compensation algorithm converges to an accuracy of 0.01 mm within an average of 147 iterations. This study presents a solution that combines physical interpretability with data adaptability for high-precision control of industrial robots, offering promising potential for the reliable execution of precision tasks in intelligent manufacturing.


Summary

The paper introduces SPI-BoTER, a dual-stream framework that merges transformer-based sparse self-attention with kinematic modeling for error compensation in industrial robots.
It employs a novel hybrid loss function combining data residuals with spatial-physical information to optimize prediction accuracy under limited data conditions.
The model achieves a 3D absolute positioning error of 0.2515 mm, 35.16% lower than a conventional deep neural network (DNN) baseline, and is reported to be computationally efficient.


Introduction to Error Compensation in Industrial Robots

Industrial robots are widely used for tasks such as cutting and welding, which place increasingly strict requirements on end-effector trajectory accuracy. Existing compensation methods are limited by oversimplified mechanism models, by data-driven models that lack physical consistency, and by large data requirements, which makes it hard to achieve high accuracy and strong generalization together. SPI-BoTER addresses this by combining the manipulator's kinematic equations with a Transformer that uses sparse self-attention masks, trained with a hybrid loss function that incorporates spatial and physical information. The result is high-precision error compensation under small-sample conditions.

Figure 1: Schematic diagram of the complete error compensation process for industrial robots in this study.

BoTER Model Architecture

Dual-Stream Architecture

BoTER uses a dual-stream architecture that combines physical modeling with data-driven prediction. The design decouples theoretical coordinate prediction, computed from a Denavit-Hartenberg (DH) kinematic model, from a Transformer-based error compensation pathway.

Figure 2: Schematic Diagram of the BoTER Model Architecture.

The forward branch is based on robot kinematics. It takes the DH parameters and joint angles and computes the theoretical end-effector position by multiplying the per-joint transformation matrices in sequence.
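The forward branch can be illustrated with a minimal NumPy sketch. It assumes the standard DH convention; the names `dh_transform`, `forward_position`, and `UR5_LIKE_DH` are hypothetical, and the parameter table uses commonly published nominal UR5-style values rather than values taken from the paper.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform of one joint (standard DH convention, assumed)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ])

def forward_position(joint_angles, dh_params):
    """Chain the per-joint transforms and return the end-effector position."""
    T = np.eye(4)
    for theta, (d, a, alpha) in zip(joint_angles, dh_params):
        T = T @ dh_transform(theta, d, a, alpha)
    return T[:3, 3]

# Illustrative (d, a, alpha) rows in metres and radians.
# Assumption: nominal UR5-style values, not the paper's calibrated parameters.
UR5_LIKE_DH = [
    (0.089159, 0.0, np.pi / 2),
    (0.0, -0.425, 0.0),
    (0.0, -0.39225, 0.0),
    (0.10915, 0.0, np.pi / 2),
    (0.09465, 0.0, -np.pi / 2),
    (0.0823, 0.0, 0.0),
]

theoretical_xyz = forward_position(np.zeros(6), UR5_LIKE_DH)
```

In the dual-stream design described here, a position computed this way would serve as the theoretical prediction that the learned branch then corrects.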

Sparse Self-Attention Masking and Residual Networks

BoTER introduces a sparse self-attention mask designed specifically for six-axis serial manipulators. The mask makes feature extraction more efficient by encoding the constraints on inter-joint interactions. Residual networks are integrated alongside it to stabilize the predicted values.
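The exact mask pattern is not given in this summary. As a rough illustration only, the sketch below assumes a banded mask in which each of the six joint tokens attends to itself and its immediate neighbours in the serial chain; `banded_joint_mask`, `masked_self_attention`, and the `bandwidth` parameter are hypothetical names, not the paper's.

```python
import numpy as np

def banded_joint_mask(n_joints=6, bandwidth=1):
    """Boolean mask: True where joint i may attend to joint j (assumed pattern)."""
    idx = np.arange(n_joints)
    return np.abs(idx[:, None] - idx[None, :]) <= bandwidth

def masked_self_attention(q, k, v, mask):
    """Scaled dot-product attention with disallowed pairs set to -inf."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    # Subtract the row maximum for numerical stability; the diagonal is
    # always allowed, so every row has at least one finite entry.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 8))  # one 8-dimensional token per joint
mask = banded_joint_mask()
output, attn = masked_self_attention(tokens, tokens, tokens, mask)
```

Masked entries receive exactly zero attention weight, so each joint's features are mixed only with those of the joints the mask permits.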

Figure 3: Inverse Angle Solving Algorithm Pipeline.

Hybrid Loss Function Design

SPI Loss Components

The BoTER framework is trained with a hybrid loss function that combines data residuals with spatial-physical information (SPI) terms, so that predictions are both precise and physically consistent.
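The individual SPI terms are not spelled out in this summary, so the following is only a sketch of the general shape such a loss might take. It assumes one data-residual term and one spatial-consistency term, and it swaps in uncertainty-style weighting with learnable log-variances as a stand-in for the paper's adaptive weighting; `hybrid_loss` and `log_vars` are hypothetical names.

```python
import numpy as np

def hybrid_loss(pred_err, true_err, nominal_pos, log_vars):
    """Weighted sum of a data term and an assumed spatial-consistency term."""
    # Data residual: mean squared error on the predicted error vectors.
    data_term = np.mean((pred_err - true_err) ** 2)

    # Assumed spatial term: compensated and measured positions should lie
    # at the same Euclidean distance from the base origin.
    compensated = nominal_pos + pred_err
    measured = nominal_pos + true_err
    dist_gap = (np.linalg.norm(compensated, axis=1)
                - np.linalg.norm(measured, axis=1))
    spatial_term = np.mean(dist_gap ** 2)

    terms = np.array([data_term, spatial_term])
    # Stand-in adaptive weighting: exp(-s) scales each term and +s keeps
    # the weights from collapsing to zero when s is trained.
    weights = np.exp(-log_vars)
    total = float(np.sum(weights * terms + log_vars))
    return total, terms

rng = np.random.default_rng(0)
nominal = rng.normal(size=(8, 3))          # theoretical positions (toy data)
true_err = 1e-3 * rng.normal(size=(8, 3))  # measured error vectors (toy data)
loss, terms = hybrid_loss(np.zeros((8, 3)), true_err, nominal, np.zeros(2))
```

In a trainable version, `log_vars` would be optimized together with the network weights, which is one common way to let the balance between terms adapt during training.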

Figure 4: Simulation of one batch of training samples optimized using the SPI loss function.

A dynamic weighting mechanism adaptively balances the SPI constraint terms against the patterns learned from data. The method also includes a gradient-based inverse angle compensation algorithm, which corrects the joint angles so that the robot reaches a specified target position. According to the abstract, this algorithm converges to an accuracy of 0.01 mm within an average of 147 iterations.
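The inverse compensation step can be sketched as plain gradient descent on the squared position error. This is a toy under stated assumptions: a 2-link planar arm (`planar_fk`) stands in for the full six-axis predictor, the Jacobian is estimated by finite differences, and the step size `lr` is hand-picked for this arm; none of these names or values come from the paper.

```python
import numpy as np

def planar_fk(q, links=(0.425, 0.392)):
    """Toy 2-link planar arm in metres; a stand-in for the real predictor."""
    l1, l2 = links
    x = l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1])
    y = l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def numerical_jacobian(fk, q, eps=1e-6):
    """Forward-difference Jacobian of fk with respect to the joint angles."""
    base = fk(q)
    J = np.zeros((base.size, q.size))
    for i in range(q.size):
        dq = q.copy()
        dq[i] += eps
        J[:, i] = (fk(dq) - base) / eps
    return J

def compensate_angles(fk, q_init, target, lr=2.0, tol=1e-5, max_iters=1000):
    """Gradient descent on 0.5 * ||fk(q) - target||^2 starting from q_init."""
    q = np.asarray(q_init, dtype=float).copy()
    for it in range(max_iters):
        residual = fk(q) - target
        if np.linalg.norm(residual) < tol:  # tol = 1e-5 m, i.e. 0.01 mm
            return q, it
        q -= lr * numerical_jacobian(fk, q).T @ residual
    return q, max_iters

q_nominal = np.array([0.3, 0.8])
# Target displaced from the nominal pose by a sub-millimetre offset.
target = planar_fk(q_nominal) + np.array([0.8e-3, -0.5e-3])
q_comp, n_iters = compensate_angles(planar_fk, q_nominal, target)
```

In the setting described here, `fk` would be the compensated position predictor (kinematic model plus predicted error) rather than this toy arm, and the iteration count would depend on that model and the chosen step size.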

Experiments and Results

Dataset Acquisition and Model Training

The study used a six-axis UR5 robotic arm together with a TrackScan Sharp system for data collection. 800 position records were acquired at random, each mapping a set of robot joint angles to a measured real-world position, and were used to build the training, validation, and test sets. The abstract reports a dataset of 724 samples with a train:test:validation split of 8:1:1.
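An 8:1:1 split of the 724 samples reported in the abstract could be produced as in the sketch below. The shuffling seed, the rounding rule, and the helper name `split_indices` are assumptions; the paper's actual partitioning procedure is not described here.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.8, test_frac=0.1, seed=0):
    """Shuffle sample indices and cut them into train, test, and validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(train_frac * n_samples)
    n_test = int(test_frac * n_samples)
    train = order[:n_train]
    test = order[n_train:n_train + n_test]
    val = order[n_train + n_test:]  # the remainder goes to validation
    return train, test, val

train_idx, test_idx, val_idx = split_indices(724)
```

Because 724 is not divisible by 10, the three parts cannot be in an exact 8:1:1 ratio; this sketch gives the leftover samples to the validation set.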

Figure 5: Experiments of error data collection.

The SPI-BoTER model was trained iteratively on these datasets. It achieved a 3D absolute positioning error of 0.2515 mm with a standard deviation of 0.15 mm, a 35.16% reduction relative to a conventional deep neural network (DNN) baseline. The model is also reported to be computationally efficient.
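As a back-of-the-envelope check, and assuming the 35.16% reduction is measured on the same 3D absolute positioning error metric, the implied baseline error $e_{\mathrm{DNN}}$ follows from the two reported numbers. The baseline value itself is an inference, not a figure quoted in this summary.

$$
e_{\mathrm{DNN}} \approx \frac{0.2515\ \text{mm}}{1 - 0.3516} = \frac{0.2515\ \text{mm}}{0.6484} \approx 0.388\ \text{mm}
$$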

Comparative Analysis and Performance Metrics

Model performance was evaluated with MAE, RMSE, MSE, and $R^2$. On all four metrics, SPI-BoTER showed substantially lower prediction error than traditional approaches.
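For reference, the four metrics can be computed as in the short sketch below, using their standard definitions. The function name `regression_metrics` and the toy values are illustrative and are not results from the paper.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R^2 for paired true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    mae = np.mean(np.abs(resid))
    mse = np.mean(resid ** 2)
    rmse = np.sqrt(mse)
    # R^2 compares the residual sum of squares with the variance around the mean.
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Toy values chosen so the results are easy to verify by hand.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Lower MAE, MSE, and RMSE indicate smaller prediction error, while $R^2$ closer to 1 indicates that more of the variance in the measured values is explained.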

Figure 6: Performance Comparison between DNN and SPI-BoTER across MAE, MSE, RMSE, and $R^2$.

The DNN baseline deviated more from the measured values than SPI-BoTER did, which points to the benefit of SPI-BoTER's dual-stream design for error prediction. Experimental validation confirmed that SPI-BoTER mitigates error accumulation, particularly in complex, high-precision tasks.

Figure 7: Position Error of 50 Randomly Sampled Test Points.

Conclusion

SPI-BoTER addresses industrial robot error compensation by combining a physical kinematic model with a Transformer-based learned component. Open challenges remain: extending the sparse attention mask to other robot types and improving performance in dynamic motion tasks. Future research will address these, in particular by incorporating temporal models to improve response in operationally diverse environments. Practical applications in high-end manufacturing systems will continue to inform further refinements of the model.
