Leveraging Large Language Model for Information Retrieval-based Bug Localization
Abstract: Information Retrieval-based Bug Localization aims to identify buggy source files for a given bug report. While existing approaches -- ranging from vector space models to deep learning models -- have shown potential in this domain, their effectiveness is often limited by the vocabulary mismatch between bug reports and source code. To address this issue, we propose a novel LLM based bug localization approach, called GenLoc. Given a bug report, GenLoc leverages an LLM equipped with code-exploration functions to iteratively analyze the code base and identify potential buggy files. To gather better context, GenLoc may optionally retrieve semantically relevant files using vector embeddings. GenLoc has been evaluated on over 9,000 real-world bug reports from six large-scale Java projects. Experimental results show that GenLoc outperforms five state-of-the-art bug localization techniques across multiple metrics, achieving an average improvement of more than 60\% in Accuracy@1.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.