
Contextualization Distillation from Large Language Model for Knowledge Graph Completion

Published 28 Jan 2024 in cs.CL and cs.AI | arXiv:2402.01729v3

Abstract: While textual information significantly enhances the performance of pre-trained language models (PLMs) in knowledge graph completion (KGC), the static and noisy nature of existing corpora, collected from Wikipedia articles or synset definitions, often limits the potential of PLM-based KGC models. To surmount these challenges, we introduce the Contextualization Distillation strategy, a versatile plug-and-play approach compatible with both discriminative and generative KGC frameworks. Our method begins by instructing large language models (LLMs) to transform compact, structural triplets into context-rich segments. Subsequently, we introduce two tailored auxiliary tasks, reconstruction and contextualization, allowing smaller KGC models to assimilate insights from these enriched triplets. Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach, revealing consistent performance gains regardless of the underlying pipeline or architecture. Moreover, our analysis makes the method more explainable and provides insight into the selection of generation paths and the choice of suitable distillation tasks. All code and data in this work will be released at https://github.com/David-Li0406/Contextulization-Distillation
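To make the two auxiliary tasks concrete, the following is a minimal sketch of how reconstruction and contextualization could be wired into a generative KGC model. It assumes a T5 backbone via Hugging Face `transformers`; the example triplet, the LLM-generated context string, the task prefixes, and the loss weights are all hypothetical illustrations rather than the authors' released implementation (see the linked repository for that).

```python
# Minimal sketch of the Contextualization Distillation auxiliary tasks,
# assuming a T5 backbone. The triplet, context text, task prefixes, and
# loss weights are hypothetical placeholders, not the authors' code.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Step 1 (done offline): a teacher LLM rewrites the compact triplet into
# a context-rich passage. A fixed string stands in for the LLM output here.
head, relation, tail = "Barack Obama", "born in", "Honolulu"
context = ("Barack Obama was born in Honolulu, the capital of the U.S. "
           "state of Hawaii, in 1961.")

def seq2seq_loss(source: str, target: str):
    """Return the cross-entropy loss for generating `target` from `source`."""
    inputs = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    return model(**inputs, labels=labels).loss

# Main KGC objective: predict the tail entity from the head and relation.
loss_kgc = seq2seq_loss(f"predict tail: {head} | {relation}", tail)

# Auxiliary task 1, reconstruction: recover the descriptive context from
# a corrupted copy of itself (a denoising-style objective).
corrupted = context.replace(tail, "<extra_id_0>")
loss_rec = seq2seq_loss(f"reconstruct: {corrupted}", context)

# Auxiliary task 2, contextualization: generate the descriptive context
# directly from the bare triplet.
loss_ctx = seq2seq_loss(f"contextualize: {head} | {relation} | {tail}", context)

# Total loss: the auxiliary terms are added to the main objective
# (the 0.1 weights are illustrative, not tuned values).
loss = loss_kgc + 0.1 * loss_rec + 0.1 * loss_ctx
loss.backward()
```

Per the abstract, which auxiliary task (or combination) suits a given pipeline is itself a design choice that the paper's analysis examines.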
