
Pseudo Relevance Feedback is Enough to Close the Gap Between Small and Large Dense Retrieval Models

Published 19 Mar 2025 in cs.IR and cs.LG | arXiv:2503.14887v2

Abstract: Scaling dense retrievers to larger LLM backbones has been a dominant strategy for improving their retrieval effectiveness. However, this has substantial cost implications: larger backbones require more expensive hardware (e.g. GPUs with more memory) and lead to higher indexing and querying costs (latency, energy consumption). In this paper, we challenge this paradigm by introducing PromptPRF, a feature-based pseudo-relevance feedback (PRF) framework that enables small LLM-based dense retrievers to achieve effectiveness comparable to much larger models. PromptPRF uses LLMs to extract query-independent, structured and unstructured features (e.g., entities, summaries, chain-of-thought keywords, essay) from top-ranked documents. These features are generated offline and integrated into dense query representations via prompting, enabling efficient retrieval without additional training. Unlike prior methods such as GRF, which rely on online, query-specific generation and sparse retrieval, PromptPRF decouples feedback generation from query processing and supports dense retrievers in a fully zero-shot setting. Experiments on TREC DL and BEIR benchmarks demonstrate that PromptPRF consistently improves retrieval effectiveness and offers favourable cost-effectiveness trade-offs. We further present ablation studies to understand the role of positional feedback and analyse the interplay between feature extractor size, PRF depth, and model performance. Our findings demonstrate that with effective PRF design, scaling the retriever is not always necessary, narrowing the gap between small and large models while reducing inference cost.

Summary

  • The paper introduces PromptPRF, an offline pseudo-relevance feedback method leveraging LLMs to improve zero-shot dense retrieval.
  • PromptPRF extracts features like keywords and summaries from top documents using LLMs to refine query representations, enhancing retrieval accuracy.
  • Experimental results show PromptPRF significantly improves smaller dense retrievers, enabling them to match the performance of larger models without PRF on benchmarks like TREC DL'19.

Pseudo-Relevance Feedback in Zero-Shot Dense Retrieval Using LLMs

This paper investigates the impact and efficacy of pseudo-relevance feedback (PRF) within the field of zero-shot dense retrieval facilitated by LLMs. The authors propose an innovative approach named "PromptPRF," which builds upon the PromptReps method to enhance query representations and improve retrieval performance.

Methodology Overview

The core of the approach lies in leveraging LLMs to extract salient features from the top-ranked documents of an initial retrieval pass. These features range from keywords and summaries to more nuanced constructs such as entities and essays. The extracted features are then used to refine the query representation in a dense retrieval setting, all within a zero-shot paradigm. Crucially, feature extraction is performed offline, which optimizes resource usage and adds no query-time latency.
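To make the offline step concrete, here is a minimal sketch of query-independent feature extraction via LLM prompting. The prompt templates, feature names, and the `llm` callable are illustrative assumptions, not the paper's exact templates or API:

```python
# Hypothetical sketch of PromptPRF-style offline feature extraction.
# Templates and feature names below are illustrative, not the paper's own.

FEATURE_PROMPTS = {
    "keywords": "Extract up to 10 keywords from the passage below.\n\nPassage: {passage}",
    "summary": "Write a one-sentence summary of the passage below.\n\nPassage: {passage}",
    "entities": "List the named entities mentioned in the passage below.\n\nPassage: {passage}",
}

def build_feature_prompt(feature: str, passage: str) -> str:
    """Fill the template for one query-independent feature of one passage."""
    return FEATURE_PROMPTS[feature].format(passage=passage)

def extract_features_offline(corpus: dict, llm) -> dict:
    """Run the LLM once per (passage, feature) pair and cache the outputs.

    `llm` is any callable mapping a prompt string to generated text. Because
    this loop runs at indexing time, it adds no query-time latency.
    """
    cache = {}
    for doc_id, passage in corpus.items():
        cache[doc_id] = {
            feature: llm(build_feature_prompt(feature, passage))
            for feature in FEATURE_PROMPTS
        }
    return cache
```

Because the features are query-independent, this cache can be built once per corpus and reused across all queries.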

PromptPRF integrates the following critical components into its framework:

  1. Initial Retrieval: Queries are encoded with LLM-based embeddings and used for dense retrieval, without any additional training.
  2. Feature Extraction: LLMs generate passage-level features based on pre-defined prompt templates, enhancing context without introducing excessive noise.
  3. Query Refinement: The refined query incorporates the features from pseudo-relevant documents, thus improving the retrieval accuracy in subsequent stages.
  4. Second-Stage Retrieval: The refined query representation is engaged for improved passage ranking, leveraging the contextualized information from PRF.
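The four stages above can be sketched end to end as follows. The `encode`, `index`, and `feature_cache` interfaces are assumptions made for illustration; the paper itself builds on PromptReps-style LLM embeddings and integrates feedback via prompting:

```python
# Minimal sketch of the four-stage PromptPRF pipeline described above.
# Interfaces (encode, index, feature_cache) are assumed for illustration.

def prompt_prf_search(query, encode, index, feature_cache, prf_depth=3, k=10):
    # 1. Initial retrieval: embed the query and fetch a first ranking.
    first_pass = index.search(encode(query), k=k)

    # 2. Feature lookup: features were generated offline, so this is a
    #    cache read rather than an LLM call at query time.
    feedback = [feature_cache[doc_id] for doc_id, _ in first_pass[:prf_depth]]

    # 3. Query refinement: fold pseudo-relevant features into the query
    #    before re-encoding (the paper does this via prompting).
    feedback_text = " ".join(f["keywords"] for f in feedback)
    refined = encode(f"{query} Relevant context: {feedback_text}")

    # 4. Second-stage retrieval with the refined representation.
    return index.search(refined, k=k)
```

Note that the only extra query-time cost over plain dense retrieval is one re-encoding and one additional index search; all LLM generation happens offline.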

Experimental Findings

The experiments comprehensively utilize the TREC DL 2019 and 2020 benchmarks. Key observations include:

  • Incorporating PRF significantly improves retrieval effectiveness, particularly for smaller dense retrievers, which can match the efficacy of larger models without PRF.
  • On TREC DL'19 tasks, PromptPRF enhances nDCG@10 from 0.3695 to 0.5013 for Llama3.2 3B dense retrievers, nearly achieving parity with larger models.
  • Smaller models benefit notably from larger feature extractors, indicating the importance of context-rich feature generation. However, diminishing returns are apparent when scaling extractor size for already large dense retrieval models.

Implications and Future Directions

The practical implications of this approach are substantial in scenarios where computational resources are constrained. PRF allows for reduced hardware requirements in production, which benefits real-time applications such as conversational search. Because PRF feature generation is conducted offline, the approach is also well suited to deployment in latency-sensitive environments.

Theoretically, the study challenges the common scaling assumption that dense retrieval effectiveness is driven primarily by model size. By leveraging retrieval-time feedback more intelligently, this research outlines a path that allows smaller models to bridge the gap traditionally occupied by larger, more resource-intensive configurations.

Future work posited by the authors involves fine-tuning various aspects of the approach, including the examination of optimal PRF depth and combining multiple PRF features to refine effectiveness further.

Overall, this research advances the landscape of dense retrieval systems by innovatively harnessing LLM capabilities to deliver enhanced query representations through strategic pseudo-relevance feedback utilization.
