
Injecting linguistic knowledge into BERT for Dialogue State Tracking

Published 27 Nov 2023 in cs.CL, cs.AI, and cs.LG | arXiv:2311.15623v3

Abstract: Dialogue State Tracking (DST) models often employ intricate neural network architectures, necessitating substantial training data, and their inference process lacks transparency. This paper proposes a method that extracts linguistic knowledge via an unsupervised framework and subsequently utilizes this knowledge to augment BERT's performance and interpretability in DST tasks. The knowledge extraction procedure is computationally economical and does not require annotations or additional training data. The injection of the extracted knowledge can be achieved by the addition of simple neural modules. We employ the Convex Polytopic Model (CPM) as a feature extraction tool for DST tasks and illustrate that the acquired features correlate with syntactic and semantic patterns in the dialogues. This correlation facilitates a comprehensive understanding of the linguistic features influencing the DST model's decision-making process. We benchmark this framework on various DST tasks and observe a notable improvement in accuracy.
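The core idea in the abstract — represent each utterance as a point inside a convex polytope fitted to the corpus, and use its convex-combination coordinates as interpretable features to inject alongside BERT's representation — can be illustrated with a minimal NumPy sketch. This is not the paper's actual CPM fitting algorithm: the vertex selection below (two anchor utterances) is a placeholder for the unsupervised polytope fitting the paper describes, and the toy bag-of-words data is invented for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def convex_coordinates(x, vertices, steps=200, lr=0.1):
    """Find weights a (a >= 0, sum(a) = 1) such that vertices.T @ a
    approximates x, via projected gradient descent."""
    k = vertices.shape[0]
    a = np.full(k, 1.0 / k)
    for _ in range(steps):
        grad = vertices @ (vertices.T @ a - x)
        a = project_to_simplex(a - lr * grad)
    return a

# Toy bag-of-words matrix: 4 utterances over a 5-word vocabulary.
X = np.array([
    [2., 0., 1., 0., 0.],
    [0., 2., 0., 1., 0.],
    [1., 1., 0., 0., 1.],
    [0., 0., 2., 0., 1.],
])

# Placeholder polytope vertices: two anchor utterances. The paper
# instead derives vertices with its unsupervised CPM procedure.
V = X[[0, 1]]

feats = np.stack([convex_coordinates(x, V) for x in X])
# Each row of `feats` is a low-dimensional, nonnegative feature vector
# summing to 1 — the kind of interpretable feature a small fusion
# module could concatenate with BERT's [CLS] embedding.
```

Because the coordinates are constrained to the simplex, each feature value reads directly as "how much this utterance resembles vertex k", which is what makes the injected features inspectable.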



Authors (3)
