A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models

Published 14 Oct 2023 in cs.IR and cs.AI | arXiv:2310.09497v2

Abstract: We propose a novel zero-shot document ranking approach based on LLMs: the Setwise prompting approach. Our approach complements existing prompting approaches for LLM-based zero-shot ranking: Pointwise, Pairwise, and Listwise. Through the first-of-its-kind comparative evaluation within a consistent experimental framework and considering factors like model size, token consumption, latency, among others, we show that existing approaches are inherently characterised by trade-offs between effectiveness and efficiency. We find that while Pointwise approaches score high on efficiency, they suffer from poor effectiveness. Conversely, Pairwise approaches demonstrate superior effectiveness but incur high computational overhead. Our Setwise approach, instead, reduces the number of LLM inferences and the amount of prompt token consumption during the ranking procedure, compared to previous methods. This significantly improves the efficiency of LLM-based zero-shot ranking, while also retaining high zero-shot ranking effectiveness. We make our code and results publicly available at \url{https://github.com/ielab/LLM-rankers}.


Summary

  • The paper introduces a novel setwise prompting method that reduces computational cost while maintaining effectiveness in zero-shot ranking.
  • It employs sorting algorithms and benchmarks from TREC DL and BEIR to demonstrate efficiency improvements over pointwise, pairwise, and listwise strategies.
  • The approach enables cost-effective deployment of large language models in low-resource settings and scalable applications.

A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with LLMs

Introduction

In the domain of document ranking, LLMs such as GPT-3, FlanT5, and PaLM have shown exceptional capabilities, especially under zero-shot settings. Traditional approaches to utilizing LLMs for zero-shot document ranking have been categorized into pointwise, listwise, and pairwise prompting strategies. These strategies differ in their prompting methods and consequently impact the efficiency and effectiveness of the ranking process. The study introduces a novel "Setwise" prompting approach aimed at optimizing zero-shot document ranking by balancing computational cost and effectiveness.

Evaluation of Existing Approaches

The paper presents a thorough evaluation of pointwise, listwise, and pairwise methods for zero-shot document ranking using a common experimental framework. Key parameters like model size, token usage, and latency were considered. It was found that:

  • Pointwise Strategies: These are highly efficient, requiring only one inference per document, but suffer from poor effectiveness; each document is scored for relevance to the query independently of the others.
  • Pairwise Strategies: These compare documents two at a time, yielding superior effectiveness but incurring high computational overhead from the large number of comparisons.
  • Listwise Strategies: These prompt the model with a list of candidate documents and ask it to order them, potentially balancing effectiveness and efficiency, though results vary significantly with configuration and dataset.

The comprehensive evaluation helped elucidate trade-offs intrinsic to each method, providing a clearer pathway for practitioners selecting ranking strategies.
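The trade-offs above can be made concrete by counting LLM inferences. The counts below are a rough sketch, not the paper's exact accounting: pointwise needs one call per document, naive all-pairs pairwise needs one call per pair, and a setwise call that compares a set of c documents eliminates c − 1 candidates at a time when selecting the best of n.

```python
import math

def pointwise_calls(n: int) -> int:
    """One LLM inference per candidate document."""
    return n

def pairwise_allpairs_calls(n: int) -> int:
    """Naive all-pairs comparison: n-choose-2 inferences."""
    return n * (n - 1) // 2

def setwise_tournament_calls(n: int, c: int) -> int:
    """Knockout selection of the single best document with set size c:
    each inference eliminates c - 1 candidates, so roughly
    (n - 1) / (c - 1) calls are needed."""
    return math.ceil((n - 1) / (c - 1))

# For 100 candidates: pointwise 100 calls, all-pairs pairwise 4950,
# a pairwise tournament (c = 2) 99, and a setwise tournament (c = 4) 33.
print(pointwise_calls(100), pairwise_allpairs_calls(100),
      setwise_tournament_calls(100, 2), setwise_tournament_calls(100, 4))
```

Note how setwise recovers the pairwise tournament as the special case c = 2; increasing the set size directly divides the number of inferences, which is the source of the efficiency gain discussed in the paper.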

Setwise Prompting: A Novel Approach

The Setwise prompting method is introduced to enhance efficiency while maintaining or even improving effectiveness. It reduces LLM inferences by comparing multiple documents within a single prompt, rather than sequential pairs or full lists. This approach leverages sorting algorithms such as heap sort and bubble sort to decrease computational costs and prompt token consumption significantly.

Figure 1: Different prompting strategies. (a) Pointwise, (b) Listwise, (c) Pairwise and (d) our proposed Setwise.

Through this method, a relevance estimation is made across sets of documents, thereby providing an efficient mechanism for zero-shot ranking. The empirical tests conducted demonstrate that this method achieves reductions in computational overhead without sacrificing the quality of the ranking results.
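The idea can be sketched as follows. This is a simplified tournament-style top-k selection, not the paper's heap-sort implementation, and `pick_best` is a hypothetical stand-in for one setwise LLM inference that returns the index of the most relevant document in a small set.

```python
def setwise_top_k(query, docs, k, pick_best, set_size=4):
    """Return the top-k documents using a setwise comparison oracle.

    pick_best(query, subset) stands in for one LLM call: it returns
    the position of the most relevant document within `subset`.
    Each call eliminates set_size - 1 candidates at once.
    """
    remaining = list(docs)
    ranked = []
    for _ in range(min(k, len(remaining))):
        # Run a knockout tournament over indices into `remaining`.
        pool = list(range(len(remaining)))
        while len(pool) > 1:
            group = pool[:set_size]
            winner = group[pick_best(query, [remaining[i] for i in group])]
            pool = [winner] + pool[set_size:]
        ranked.append(remaining.pop(pool[0]))
    return ranked

# Toy usage: documents are stand-in relevance scores, and the "LLM"
# simply picks the highest-scoring item in each set.
docs = [3, 1, 4, 1, 5]
pick = lambda q, subset: max(range(len(subset)), key=lambda i: subset[i])
print(setwise_top_k("query", docs, 2, pick, set_size=3))  # [5, 4]
```

Because only the top-k documents are typically needed for re-ranking, partial sorting of this kind avoids fully ordering the candidate list, which is where the inference savings compound.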

Empirical Results

The efficacy of the Setwise approach was validated using the TREC DL datasets and the BEIR benchmarks. Results showed notable reductions in computational costs, specifically in terms of the number of inferences and prompt tokens required per query. Furthermore, Setwise prompting exhibited strong robustness against variations in initial ranking quality, unlike existing methods which are sensitive to initial rankings.

Figure 2: Heapify with Pairwise prompting (comparing 2 documents at a time).

The experiments demonstrated the superiority of the Setwise method, especially in balancing the trade-off between effectiveness and computational efficiency. Notably, the use of open-source Flan-T5 LLMs showed that the approach scales without relying heavily on expensive, closed-source models.

Implications and Future Directions

The introduction of Setwise prompting into zero-shot document ranking presents practical implications for real-world applications where computational resources and response times are critical. By using fewer inferences and shorter prompts, the approach helps manage costs while maintaining high ranking effectiveness. It opens pathways for deploying LLMs in low-resource settings without compromising on performance.

Future research could explore the application of Setwise prompting with other LLMs, including open models such as LLaMA and proprietary APIs from OpenAI. Furthermore, enhancements in self-supervised learning and optimization techniques could be integrated to refine the Setwise method further.

Conclusion

This paper offers a significant leap forward in the efficient application of LLMs for zero-shot document ranking by introducing Setwise prompting. By achieving a balance between effectiveness and computational efficiency, the study provides insights and tools essential for leveraging the power of LLMs in scalable and cost-effective ways. The results underscore the method's robustness, making it a valuable addition to existing ranking strategies.
