Dense Sparse Retrieval: Using Sparse Language Models for Inference Efficient Dense Retrieval

Published 31 Mar 2023 in cs.IR, cs.AI, and cs.CL | arXiv:2304.00114v1

Abstract: Vector-based retrieval systems have become a common staple for academic and industrial search applications because they provide a simple and scalable way of extending the search to leverage contextual representations for documents and queries. As these vector-based systems rely on contextual language models, their usage commonly requires GPUs, which can be expensive and difficult to manage. Given recent advances in introducing sparsity into language models for improved inference efficiency, in this paper we study how sparse language models can be used for dense retrieval to improve inference efficiency. Using the popular retrieval library Tevatron and the MSMARCO, NQ, and TriviaQA datasets, we find that sparse language models can be used as direct replacements with little to no drop in accuracy and up to 4.3x improved inference speeds.
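
To make the setting concrete, the sketch below shows the standard dense-retrieval scoring step that the paper targets: a transformer encoder maps queries and passages to vectors, and passages are ranked by dot-product similarity. This is a minimal illustration, not the authors' Tevatron pipeline; the checkpoint name is a placeholder, and the paper's idea is that a pruned/sparse encoder can be dropped in at this point with little accuracy loss.

```python
# Minimal dense-retrieval sketch (assumption: a BERT-style bi-encoder with [CLS] pooling).
# "bert-base-uncased" is a placeholder; a sparsified/pruned checkpoint would be swapped in here.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-uncased"  # placeholder, not a checkpoint released with the paper

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)
encoder.eval()

def encode(texts):
    """Encode a list of texts into dense vectors using the [CLS] representation."""
    batch = tokenizer(texts, padding=True, truncation=True, max_length=128,
                      return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    return out.last_hidden_state[:, 0]  # [CLS] token embedding per text

query_vec = encode(["what is dense retrieval"])
passage_vecs = encode([
    "Dense retrieval encodes queries and documents into vectors and ranks by similarity.",
    "Sparse language models prune weights to speed up inference.",
])

# Rank passages by dot-product similarity, as in typical dense retrievers.
scores = query_vec @ passage_vecs.T
print(scores)
```

Because the encoder is the only component touched, the surrounding index and search code stays unchanged; the reported speedups come from running a sparsity-aware model at this encoding step.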
