Can we obtain significant success in RST discourse parsing by using Large Language Models?
Abstract: Recently, decoder-only pre-trained LLMs with tens of billions of parameters have significantly impacted a wide range of NLP tasks. While encoder-only and encoder-decoder pre-trained language models have already proved effective in discourse parsing, the extent to which decoder-only LLMs can perform this task remains an open research question. This paper therefore explores how beneficial such LLMs are for Rhetorical Structure Theory (RST) discourse parsing. The parsing process for both fundamental strategies, top-down and bottom-up, is converted into prompts that LLMs can work with. We employ Llama 2 and fine-tune it with QLoRA, which reduces the number of trainable parameters. Experimental results on three benchmark datasets, RST-DT, Instr-DT, and the GUM corpus, demonstrate that Llama 2 with 70 billion parameters under the bottom-up strategy achieves state-of-the-art (SOTA) results by a significant margin. Furthermore, our parsers generalize well: despite being trained on the GUM corpus, they obtain performance on RST-DT comparable to that of existing parsers trained on RST-DT.
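To make the idea of "converting the parsing process into prompts" concrete, the sketch below shows one way a single bottom-up parsing step could be rendered as an instruction prompt for a decoder-only LLM. This is an illustrative assumption, not the paper's exact prompt format: the function name, prompt wording, and label inventory are all hypothetical.

```python
# Illustrative sketch (not the paper's exact prompt template): casting one
# bottom-up RST parsing step as an instruction prompt. At each step the model
# is asked which adjacent pair of spans to merge and with what nuclearity and
# rhetorical relation; repeating this until one span remains yields a tree.

def build_merge_prompt(spans):
    """Format the current list of adjacent text spans as a prompt asking
    the model for the next merge decision."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(spans, start=1))
    instruction = (
        "You are an RST discourse parser. Below are adjacent text spans.\n"
        "Choose the pair of neighbouring spans to merge next, and output\n"
        "their indices, the nuclearity (nucleus-satellite, satellite-nucleus,\n"
        "or nucleus-nucleus), and the rhetorical relation between them.\n\n"
    )
    return instruction + numbered + "\nAnswer:"

# Example with three elementary discourse units (EDUs):
edus = [
    "The company reported record profits,",
    "although analysts had predicted a loss,",
    "and its stock rose sharply.",
]
prompt = build_merge_prompt(edus)
print(prompt)
```

A top-down variant would instead present one span and ask for the best split point, but the same prompt-per-decision framing applies.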
- Polyglot: Distributed word representations for multilingual NLP. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 183–192, Sofia, Bulgaria. Association for Computational Linguistics.
- Better document-level sentiment analysis from RST discourse parsing. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 2212–2218, Lisbon, Portugal. Association for Computational Linguistics.
- Multi-view and multi-task training of RST discourse parsers. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1903–1913, Osaka, Japan. The COLING 2016 Organizing Committee.
- Language models are few-shot learners. In Advances in Neural Information Processing Systems, volume 33, pages 1877–1901. Curran Associates, Inc.
- Modeling discourse structure for document-level neural machine translation. In Proceedings of the First Workshop on Automatic Simultaneous Translation, pages 30–36, Seattle, Washington. Association for Computational Linguistics.
- QLoRA: Efficient finetuning of quantized LLMs. CoRR, abs/2305.14314.
- BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
- Discern: Discourse-aware entailment reasoning network for conversational machine reading. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2439–2449, Online. Association for Computational Linguistics.
- Grigorii Guz and Giuseppe Carenini. 2020. Coreference for discourse parsing: A neural approach. In Proceedings of the First Workshop on Computational Approaches to Discourse, pages 160–167, Online. Association for Computational Linguistics.
- DeBERTa: Decoding-enhanced BERT with disentangled attention. In Proceedings of the International Conference on Learning Representations.
- LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations.
- Xinyu Hu and Xiaojun Wan. 2023. RST discourse parsing as text-to-text generation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31:3278–3289.
- SpanBERT: Improving pre-training by representing and predicting spans. Transactions of the Association for Computational Linguistics, 8:64–77.
- Discourse structure in machine translation evaluation. Computational Linguistics, 43(4):683–722.
- Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
- Top-down RST parsing utilizing granularity levels in documents. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 8099–8106.
- A simple and strong baseline for end-to-end neural RST-style discourse parsing. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 6725–6737, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Considering nested tree structure in sentence extractive summarization with pre-trained transformer. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4039–4044, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- A unified linear-time framework for sentence-level discourse parsing. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4190–4200, Florence, Italy. Association for Computational Linguistics.
- Yang Janet Liu and Amir Zeldes. 2023. Why can’t discourse parsing generalize? a thorough investigation of the impact of data diversity. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 3112–3130, Dubrovnik, Croatia. Association for Computational Linguistics.
- DMRST: A joint framework for document-level multilingual RST discourse segmentation and parsing. In Proceedings of the 2nd Workshop on Computational Approaches to Discourse, pages 154–164, Punta Cana, Dominican Republic and Online. Association for Computational Linguistics.
- Lynn Carlson, Daniel Marcu, and Mary Ellen Okurowski. 2002. RST Discourse Treebank. Philadelphia: Linguistic Data Consortium.
- W.C. Mann and S.A. Thompson. 1987. Rhetorical structure theory: A theory of text organization. Technical Report ISI/RS-87-190, USC/ISI.
- Daniel Marcu. 1998. Improving summarization through rhetorical parsing tuning. In Sixth Workshop on Very Large Corpora.
- How much progress have we made on RST discourse parsing? a replication study of recent results on the RST-DT. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1319–1324, Copenhagen, Denmark. Association for Computational Linguistics.
- RST parsing from scratch. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1613–1625, Online. Association for Computational Linguistics.
- GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, Doha, Qatar. Association for Computational Linguistics.
- Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 2227–2237, New Orleans, Louisiana. Association for Computational Linguistics.
- Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.
- Kenji Sagae and Alon Lavie. 2005. A classifier-based parser with linear run-time complexity. In Proceedings of the Ninth International Workshop on Parsing Technology, pages 125–132, Vancouver, British Columbia. Association for Computational Linguistics.
- An end-to-end document-level neural discourse parser exploiting multi-granularity representations. CoRR, abs/2012.11169.
- MPNet: Masked and permuted pre-training for language understanding. In Advances in Neural Information Processing Systems, volume 33, pages 16857–16867. Curran Associates, Inc.
- Rajen Subba and Barbara Di Eugenio. 2009. An effective discourse parser that uses rich linguistic information. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 566–574, Boulder, Colorado. Association for Computational Linguistics.
- Towards discourse-aware document-level neural machine translation. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pages 4383–4389. International Joint Conferences on Artificial Intelligence Organization. Main Track.
- Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288.
- A two-stage parsing method for text-level discourse analysis. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 184–188, Vancouver, Canada. Association for Computational Linguistics.
- Finetuned language models are zero-shot learners. In International Conference on Learning Representations.
- Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online. Association for Computational Linguistics.
- Exploring the trade-offs: Unified large language models vs. local fine-tuned models for highly-specific radiology NLI task.
- Discourse-aware neural extractive text summarization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5021–5031, Online. Association for Computational Linguistics.
- XLNet: Generalized autoregressive pretraining for language understanding. In Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc.
- Transition-based neural RST parsing with implicit syntax features. In Proceedings of the 27th International Conference on Computational Linguistics, pages 559–570, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- RST discourse parsing with second-stage EDU-level pre-training. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4269–4280, Dublin, Ireland. Association for Computational Linguistics.
- Amir Zeldes. 2017. The GUM corpus: Creating multilayer resources in the classroom. Language Resources and Evaluation, 51(3):581–612.
- Adversarial learning for discourse rhetorical structure parsing. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 3946–3957, Online. Association for Computational Linguistics.
- A language model-based generative classifier for sentence-level discourse parsing. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 2432–2446, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.