
Towards Controlled Table-to-Text Generation with Scientific Reasoning

Published 8 Dec 2023 in cs.CL | (2312.05402v1)

Abstract: Scientific experimental results and complex technical statements are frequently presented in tabular form, and their sheer volume makes it difficult for readers to extract the information they care about. Scientific reasoning and preference-conditioned content generation each pose distinct challenges. In this work, we present a new task: generating fluent and logical descriptions of scientific tabular data that match user preferences, with the aim of automating scientific document analysis. To facilitate research in this direction, we construct a new challenging dataset, CTRLSciTab, consisting of table-description pairs extracted from the scientific literature, annotated with highlighted cells and an accompanying domain-specific knowledge base. We evaluate popular pre-trained LLMs to establish baselines and propose a novel architecture that outperforms competing approaches. The results show that large models still struggle to produce accurate content aligned with user preferences. As the first work of its kind, our work should motivate further research in scientific domains.
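To make the task setup concrete, the abstract describes examples that pair a table with user-highlighted cells, retrieved domain knowledge, and a reference description. The sketch below shows one plausible shape for such a record; the class name, field names, and all example values are illustrative assumptions, not taken from the paper or its released dataset format.

```python
from dataclasses import dataclass

# Hypothetical sketch of one CTRLSciTab-style example.
# Field names and values are assumptions for illustration only.
@dataclass
class TableToTextExample:
    table: list              # rows of cells, e.g. parsed from a results table
    highlighted_cells: list  # (row, col) indices marking the user's preferred content
    knowledge: list          # domain-specific sentences supporting the table
    description: str         # gold reference description to generate

example = TableToTextExample(
    table=[["Model", "BLEU"], ["BART", "21.3"], ["T5", "22.1"]],
    highlighted_cells=[(2, 1)],  # the user cares about T5's BLEU score
    knowledge=["BLEU measures n-gram overlap with a reference text."],
    description="T5 achieves the highest BLEU score of 22.1.",
)
```

Under this framing, a model conditions on `table`, `highlighted_cells`, and `knowledge` to produce `description`, which is what distinguishes the task from uncontrolled table-to-text generation.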

