BERT-LSH: Reducing Absolute Compute For Attention

Published 12 Apr 2024 in cs.CL, cs.AI, and cs.LG | (2404.08836v1)

Abstract: This study introduces a novel BERT-LSH model that incorporates Locality Sensitive Hashing (LSH) to approximate the attention mechanism in the BERT architecture. We examine the computational efficiency and performance of this model compared to a standard baseline BERT model. Our findings reveal that BERT-LSH significantly reduces computational demand for the self-attention layer while unexpectedly outperforming the baseline model in pretraining and fine-tuning tasks. These results suggest that the LSH-based attention mechanism not only offers computational advantages but also may enhance the model's ability to generalize from its training data. For more information, visit our GitHub repository: https://github.com/leo4life2/algoml-final
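The abstract does not say which hash family or bucketing scheme the model uses, so the following is only a minimal sketch of the general idea rather than the authors' implementation. It assumes random-hyperplane (sign) hashing, one common LSH family for angular similarity: queries and keys are hashed into buckets, and dot products are computed only for query-key pairs that land in the same bucket. All names here (`random_hyperplanes`, `simhash`, `candidate_pairs`) and the toy sizes are hypothetical.

```python
import random


def random_hyperplanes(num_bits, dim, rng):
    """Draw num_bits random Gaussian hyperplane normals of dimension dim."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def simhash(vec, planes):
    """Sign-of-projection hash: one bit per hyperplane, packed into an int."""
    code = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def candidate_pairs(queries, keys, planes):
    """Map each query index to the key indices that share its hash bucket."""
    buckets = {}
    for j, key in enumerate(keys):
        buckets.setdefault(simhash(key, planes), []).append(j)
    return {i: buckets.get(simhash(q, planes), []) for i, q in enumerate(queries)}


# Toy sizes, chosen only for illustration (assumptions, not from the paper).
rng = random.Random(0)
dim, seq_len, num_bits = 16, 64, 4
planes = random_hyperplanes(num_bits, dim, rng)
queries = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(seq_len)]
keys = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(seq_len)]

pairs = candidate_pairs(queries, keys, planes)
sparse_dot_products = sum(len(js) for js in pairs.values())
full_dot_products = seq_len * seq_len
print(f"dot products: {sparse_dot_products} (bucketed) vs {full_dot_products} (full)")
```

Full attention scores every query against every key, while the bucketed version scores only colliding pairs, which is where a reduction in absolute compute would come from. The trade-off is that pairs which hash to different buckets are never scored, so the result is an approximation of full attention.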

Authors (2)

Collections

Sign up for free to add this paper to one or more collections.