Papers
Topics
Authors
Recent
Search
2000 character limit reached

Detecting the Root Cause Code Lines in Bug-Fixing Commits by Heterogeneous Graph Learning

Published 2 May 2025 in cs.SE | (2505.01022v3)

Abstract: With the continuous growth in the scale and complexity of software systems, defect remediation has become increasingly difficult and costly. Automated defect prediction tools can proactively identify software changes prone to defects within software projects, thereby enhancing software development efficiency. However, existing work in heterogeneous and complex software projects continues to face challenges, such as struggling with heterogeneous commit structures and ignoring cross-line dependencies in code changes, which ultimately reduce the accuracy of defect identification. To address these challenges, we propose an approach called RC_Detector. RC_Detector comprises three main components: the bug-fixing graph construction component, the code semantic aggregation component, and the cross-line semantic retention component. The bug-fixing graph construction component identifies the code syntax structures and program dependencies within bug-fixing commits and transforms them into heterogeneous graph formats by converting the source code into vector representations. The code semantic aggregation component adapts to heterogeneous data by using heterogeneous attention to learn the hidden semantic representation of target code lines. The cross-line semantic retention component regulates propagated semantic information by using attenuation and reinforcement gates derived from old and new code semantic representations, effectively preserving cross-line semantic relationships. Extensive experiments were conducted to evaluate the performance of our model by collecting data from 87 open-source projects, including 675 bug-fixing commits. The experimental results demonstrate that our model outperforms state-of-the-art approaches, achieving significant improvements of 83.15%,96.83%,78.71%,74.15%,54.14%,91.66%,91.66%, and 34.82% in MFR, respectively, compared with the state-of-the-art approaches.

Summary

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 2 tweets with 1 like about this paper.