
Scalable Zero-shot Entity Linking with Dense Entity Retrieval

Published 10 Nov 2019 in cs.CL (arXiv:1911.03814v3)

Abstract: This paper introduces a conceptually simple, scalable, and highly effective BERT-based entity linking model, along with an extensive evaluation of its accuracy-speed trade-off. We present a two-stage zero-shot linking algorithm, where each entity is defined only by a short textual description. The first stage does retrieval in a dense space defined by a bi-encoder that independently embeds the mention context and the entity descriptions. Each candidate is then re-ranked with a cross-encoder, that concatenates the mention and entity text. Experiments demonstrate that this approach is state of the art on recent zero-shot benchmarks (6 point absolute gains) and also on more established non-zero-shot evaluations (e.g. TACKBP-2010), despite its relative simplicity (e.g. no explicit entity embeddings or manually engineered mention tables). We also show that bi-encoder linking is very fast with nearest neighbour search (e.g. linking with 5.9 million candidates in 2 milliseconds), and that much of the accuracy gain from the more expensive cross-encoder can be transferred to the bi-encoder via knowledge distillation. Our code and models are available at https://github.com/facebookresearch/BLINK.


Summary

  • The paper introduces a novel two-stage approach that combines a BERT-based bi-encoder for dense entity retrieval with a cross-encoder for re-ranking, significantly boosting linking accuracy.
  • The bi-encoder efficiently links millions of candidate entities in milliseconds while knowledge distillation transfers improvements from the cross-encoder.
  • The model sets new state-of-the-art benchmarks by achieving higher accuracy on zero-shot datasets and TACKBP-2010 without relying on additional entity type cues.

The paper presents a straightforward yet effective approach to zero-shot entity linking using BERT-based models, showcasing impressive performance improvements over existing methods. The proposed model employs a two-stage linking algorithm, leveraging dense entity retrieval and re-ranking, which achieves state-of-the-art results on multiple benchmarks.

Methodology

The approach consists of two distinct phases:

  1. Bi-encoder for Dense Retrieval: This stage uses a bi-encoder architecture in which BERT independently embeds the mention context and each candidate entity description into dense vectors. Similarity is scored with a dot product, so retrieval reduces to fast nearest-neighbour search over precomputed entity vectors. The bi-encoder is exceptionally fast, retrieving from 5.9 million candidates in roughly 2 milliseconds per mention.
  2. Cross-encoder for Re-ranking: Candidates retrieved in the first stage are re-ranked by a cross-encoder, which jointly encodes the concatenated mention and entity text. This allows richer token-level interactions between mention and entity, yielding higher accuracy at higher computational cost.
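
The two stages above can be sketched as follows. The `encode` function is a deterministic toy stand-in for the paper's BERT encoders (so the example runs without model weights), and the entity strings and cross-encoder scoring rule are invented for illustration:

```python
import hashlib
import numpy as np

def encode(text: str, dim: int = 8) -> np.ndarray:
    """Deterministic toy embedding standing in for a BERT encoder."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(dim)

# Stage 1: bi-encoder -- embed mentions and entities independently,
# score with a dot product, and retrieve the top-k candidates.
entities = [
    "Paris: capital city of France",
    "Paris Hilton: American media personality",
    "Paris, Texas: 1984 film",
]
entity_vecs = np.stack([encode(e) for e in entities])  # precomputed offline

mention = "She flew to Paris for the climate summit."
scores = entity_vecs @ encode(mention)                 # dot-product similarity
candidates = np.argsort(-scores)[:2]                   # top-k shortlist

# Stage 2: cross-encoder -- jointly encode the concatenated mention and
# entity text, then rescore each shortlisted candidate (toy stand-in here).
def cross_score(mention: str, entity: str) -> float:
    return float(encode(mention + " [SEP] " + entity).sum())

best = max(candidates, key=lambda i: cross_score(mention, entities[i]))
print(entities[int(best)])
```

The design point is that stage 1 never looks at mention and entity together, so entity vectors can be computed once and indexed; only the small shortlist pays the cross-encoder's joint-encoding cost.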

The paper also applies knowledge distillation: the cross-encoder's candidate scores serve as soft targets for fine-tuning the bi-encoder, transferring much of the accuracy gain back to the faster model.
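
A minimal sketch of a distillation objective of this kind, assuming the teacher's and student's scores over one mention's candidate set are compared as softmax distributions (the score values below are invented for illustration):

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy scores over one mention's candidate set.
teacher_scores = np.array([4.0, 1.5, 0.2])   # cross-encoder (teacher)
student_scores = np.array([2.0, 1.8, 0.5])   # bi-encoder dot products (student)

p = softmax(teacher_scores)                  # soft targets from the teacher
q = softmax(student_scores)                  # student's candidate distribution
kl = float(np.sum(p * (np.log(p) - np.log(q))))  # KL(p || q)
print(f"distillation loss: {kl:.4f}")
```

Minimizing this loss with respect to the student's parameters pushes the bi-encoder's ranking of candidates toward the cross-encoder's, without paying the cross-encoder's cost at inference time.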

Empirical Evaluation

The proposed model is evaluated on a range of benchmarks, including zero-shot entity linking datasets and TACKBP-2010. Key findings include:

  • Zero-shot Dataset: The approach improves unnormalized accuracy by nearly 6 points over previous methods, underscoring its efficacy in scenarios with unseen entities.
  • TACKBP-2010: It surpasses existing state-of-the-art systems with a significant reduction in error rates, achieving high accuracy without relying on additional cues such as entity type information.
  • Retrieval Speed: The bi-encoder's retrieval efficiency is highlighted, with large-scale entity linking accomplished in negligible time, an essential feature for practical applications.
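
The retrieval step behind these speed numbers is maximum-inner-product search over precomputed entity vectors. A brute-force numpy sketch of the operation (the paper's released code uses FAISS for this at the 5.9-million-entity scale; the sizes here are scaled down and the data is random):

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, dim = 10_000, 64                 # stand-in for millions of BERT vectors
entity_index = rng.standard_normal((n_entities, dim)).astype(np.float32)

query = rng.standard_normal(dim).astype(np.float32)  # one mention embedding
scores = entity_index @ query                # a single matrix-vector product
top10 = np.argpartition(-scores, 10)[:10]    # unordered top-10 candidate ids
top10 = top10[np.argsort(-scores[top10])]    # sort the shortlist by score
print(top10)
```

Exact search is a single matrix-vector product; at millions of entities, approximate nearest-neighbour indexes trade a small amount of recall for the millisecond-scale latency the paper reports.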

Implications and Future Directions

This research significantly contributes to entity linking, particularly in zero-shot contexts where pre-existing knowledge of entities isn't available. The dual-model setup not only advances accuracy but also offers a robust solution that balances efficiency and performance.

The insights drawn from this study suggest several future research avenues:

  • Incorporation of Additional Information: Enriching the model with entity types, graph data, and other metadata could provide further accuracy improvements.
  • Coherence Modeling: Developing strategies to address multiple mentions simultaneously could refine context utilization.
  • Cross-lingual Extension: Adapting these methods to other languages would broaden their applicability and utility across multilingual datasets.

Conclusion

The integration of dense retrieval with pre-trained models positions this research to handle large-scale entity linking challenges effectively. By eschewing external knowledge such as alias tables and pre-built entity embeddings, the method streamlines the linking process while setting a new benchmark for entity resolution. The open-source availability of the model and code further invites replication and extension, fostering continued exploration in the field.
