Discourse Structure Extraction from Pre-Trained and Fine-Tuned Language Models in Dialogues
Abstract: Discourse processing suffers from data sparsity, especially for dialogues. As a result, we explore approaches to build discourse structures for dialogues, based on attention matrices from pre-trained language models (PLMs). We investigate multiple tasks for fine-tuning and show that the dialogue-tailored Sentence Ordering task performs best. To locate and exploit discourse information in PLMs, we propose an unsupervised and a semi-supervised method. Our proposals achieve encouraging results on the STAC corpus, with F1 scores of 57.2 and 59.3 for the unsupervised and semi-supervised methods, respectively. When restricted to projective trees, our scores improve to 63.3 and 68.1.
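To make the idea of reading a discourse structure off an attention matrix more concrete, here is a minimal sketch. It is not the method of the paper, whose details the abstract does not give. It assumes a hypothetical square matrix `scores`, where `scores[h][d]` is some aggregated attention score for "unit `h` is the head of unit `d`", and it uses a simple greedy rule (each unit attaches to its best-scoring earlier unit) in place of a proper tree decoding algorithm. The helper names `greedy_backward_tree`, `is_projective`, and `link_f1` are illustrative only.

```python
from typing import List, Tuple

Edge = Tuple[int, int]  # (head index, dependent index)


def greedy_backward_tree(scores: List[List[float]]) -> List[Edge]:
    """Attach every unit d > 0 to its highest-scoring earlier unit.

    Assumption: scores[h][d] is a unit-level score for "h heads d".
    Since every head precedes its dependent, the edges form a tree
    rooted at unit 0. The tree is not guaranteed to be projective.
    """
    edges = []
    for dep in range(1, len(scores)):
        head = max(range(dep), key=lambda h: scores[h][dep])
        edges.append((head, dep))
    return edges


def is_projective(edges: List[Edge]) -> bool:
    """Return True if no two edges cross when drawn above the sequence."""
    spans = [(min(h, d), max(h, d)) for h, d in edges]
    return not any(a < c < b < e for a, b in spans for c, e in spans)


def link_f1(pred: List[Edge], gold: List[Edge]) -> float:
    """F1 over unlabeled links, comparing predicted and gold edge sets."""
    pred_set, gold_set = set(pred), set(gold)
    true_pos = len(pred_set & gold_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy 4-unit dialogue with made-up scores (upper triangle only).
toy_scores = [
    [0.0, 0.9, 0.2, 0.6],
    [0.0, 0.0, 0.7, 0.1],
    [0.0, 0.0, 0.0, 0.3],
    [0.0, 0.0, 0.0, 0.0],
]
toy_pred = greedy_backward_tree(toy_scores)
toy_gold = [(0, 1), (1, 2), (2, 3)]
print(toy_pred, is_projective(toy_pred), link_f1(toy_pred, toy_gold))
```

The greedy rule is chosen only because it is short and always yields a tree. Producing the best-scoring projective tree, as in the restricted setting the abstract reports, would need a dedicated dynamic-programming decoder instead.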