
ExpandR: Teaching Dense Retrievers Beyond Queries with LLM Guidance

Published 24 Feb 2025 in cs.IR and cs.AI (arXiv:2502.17057v3)

Abstract: LLMs have demonstrated significant potential in enhancing dense retrieval through query augmentation. However, most existing methods treat the LLM and the retriever as separate modules, overlooking the alignment between generation and ranking objectives. In this work, we propose ExpandR, a unified LLM-augmented dense retrieval framework that jointly optimizes both the LLM and the retriever. ExpandR employs the LLM to generate semantically rich query expansions, which are leveraged to enhance the retriever's training. Simultaneously, the LLM is trained using Direct Preference Optimization (DPO), guided by a carefully designed reward function that balances retrieval effectiveness and generation consistency. This joint optimization paradigm enables mutual adaptation between the LLM and the retriever, resulting in query expansions that are both informative and well-suited for retrieval. Experimental results on multiple benchmarks show that ExpandR consistently outperforms strong baselines, achieving more than a 5% improvement in retrieval performance. All codes are available at https://github.com/NEUIR/ExpandR.

Summary

  • The paper presents LLM-QE, a novel method using rank-based and answer-based reward models to align LLM-guided query expansions with dense retrieval ranking preferences, mitigating hallucinations.
  • Evaluated on the BEIR dataset, LLM-QE significantly improves dense retrieval effectiveness, achieving over 8% improvement on Contriever by aligning LLMs for better query expansion.
  • LLM-QE provides a potent strategy for enhancing information retrieval by bridging the query-document semantic gap and offers a framework for aligning generative models with ranking paradigms.

LLM-QE: Enhancing Query Expansion with Ranking-Aligned LLMs

The paper presents LLM-QE, a novel approach to query expansion in information retrieval that leverages the capabilities of LLMs. The method aims to bridge the semantic gap between query terms and document content, a persistent challenge for dense retrieval models. Unlike traditional methods, LLM-QE incorporates rank-based and answer-based reward models to align LLMs with the ranking preferences of both the retriever and the LLM itself. By mitigating LLM hallucinations, this alignment substantially improves dense retrieval performance: on the zero-shot dense retriever Contriever, LLM-QE achieves over 8% improvement in retrieval effectiveness, underscoring the benefits of its reward modeling strategy.
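To make the core idea concrete, here is a minimal sketch of query expansion for retrieval. The toy bag-of-words "embedding" stands in for a dense encoder such as Contriever, and the example query and documents are invented for illustration; the point is only that appending the LLM-generated expansion to the query bridges a vocabulary mismatch that the bare query cannot.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a dense
    # encoder such as Contriever.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query: str, expansion: str, docs: list[str]) -> list[str]:
    # Query expansion: append the generated expansion to the query
    # before encoding, then rank documents by similarity.
    q_vec = embed(query + " " + expansion)
    return sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)

docs = [
    "symptoms of seasonal influenza include fever and cough",
    "stock market indices fell sharply on monday",
]
# The bare query "flu" shares no token with either document;
# the expansion supplies the missing vocabulary.
top = rank("flu", "influenza fever cough symptoms", docs)
```

With the expansion, the influenza document ranks first even though the original query term never appears in it.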

The architecture of LLM-QE diverges from standard generative relevance feedback methods, which primarily use LLMs to generate pseudo-relevant documents directly or to reason through steps such as Chain-of-Thought (CoT). The authors instead integrate Direct Preference Optimization (DPO) to fine-tune the LLM, yielding more precise and contextually relevant expansions. This reduces the risk of introducing irrelevant content, a common shortfall when LLMs generate query-related text without any retrieval-side feedback.
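The DPO objective used for this fine-tuning can be sketched per preference pair as follows. This is the standard DPO loss (Rafailov et al., 2023) operating on sequence log-probabilities, not the paper's exact training code; the beta value is a typical default, not one reported in the paper.

```python
import math

def dpo_loss(policy_logp_w: float, policy_logp_l: float,
             ref_logp_w: float, ref_logp_l: float,
             beta: float = 0.1) -> float:
    # Direct Preference Optimization: push the policy to prefer the
    # "winning" expansion y_w over the "losing" one y_l, measured
    # relative to a frozen reference model.
    margin = beta * ((policy_logp_w - ref_logp_w)
                     - (policy_logp_l - ref_logp_l))
    # Loss is -log sigmoid(margin); it shrinks as the policy's
    # preference for y_w over y_l grows beyond the reference's.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A policy that has learned to favor the preferred expansion
# incurs a lower loss than an indifferent one.
loss_trained = dpo_loss(-1.0, -3.0, -2.0, -2.0)
loss_neutral = dpo_loss(-2.0, -2.0, -2.0, -2.0)
```

In LLM-QE's setting, the "winning" and "losing" expansions in each pair would be chosen by the reward model described below, so the retriever's preferences flow back into generation.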

Another critical component of LLM-QE is its reward model. Rank-based rewards emulate the dense retriever's document-ranking preferences, while answer-based rewards assess the relevance of generated expansions against reference LLM outputs, together aligning expansions with effective ranking behavior. The results show that this keeps query expansions not only relevant but also concise, avoiding the generation of excessive, unnecessary tokens.
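A minimal sketch of how such a two-part reward might be combined, under stated assumptions: the rank-based term is taken as the retrieval-similarity gain from expansion, the answer-based term as Jaccard token overlap with a reference answer, and `alpha` is a hypothetical mixing weight. The paper's exact reward formulation may differ; this only illustrates the balancing idea.

```python
def rank_reward(sim_expanded: float, sim_plain: float) -> float:
    # Retrieval-side signal: how much the expansion improves the
    # query-document similarity assigned by the retriever.
    return sim_expanded - sim_plain

def answer_reward(expansion: str, reference_answer: str) -> float:
    # Generation-side signal: token overlap with a reference answer,
    # a crude proxy for consistency that penalizes hallucinated content.
    exp_tokens = set(expansion.lower().split())
    ref_tokens = set(reference_answer.lower().split())
    union = exp_tokens | ref_tokens
    return len(exp_tokens & ref_tokens) / len(union) if union else 0.0

def combined_reward(sim_expanded: float, sim_plain: float,
                    expansion: str, reference_answer: str,
                    alpha: float = 0.5) -> float:
    # alpha is a hypothetical mixing weight balancing retrieval
    # effectiveness against generation consistency.
    return (alpha * rank_reward(sim_expanded, sim_plain)
            + (1 - alpha) * answer_reward(expansion, reference_answer))
```

A relevant expansion that also lifts retrieval similarity scores above an off-topic one, which is exactly the preference signal DPO needs.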

The evaluation conducted on the BEIR dataset reveals significant improvements in both unsupervised and supervised settings, with marked performance boosts noted in complex query-response environments, particularly in enhancing the training of dense retrievers. The robust performance across numerous datasets solidifies LLM-QE as an essential tool for query expansion, especially in scenarios deficient in high-quality training signals.

The implications of this research are multifaceted. Practically, LLM-QE gives information retrieval systems a stronger tool for addressing semantic mismatches between queries and large-scale document collections. Theoretically, the framework offers a strategy for aligning generative models with ranking objectives, paving the way for tighter integration of LLMs in diverse retrieval contexts. Future work could explore the framework's adaptability to other retrieval models and further refine the interaction between generation and retrieval components. The open-source release of the code allows LLM-QE to serve as a foundation for subsequent work on query expansion and information retrieval.


GitHub

  • GitHub - NEUIR/LLM-QE
