
ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies

Published 2 Mar 2024 in cs.CL and cs.AI | arXiv:2403.01139v4

Abstract: Analogy-making is central to human cognition, allowing us to adapt to novel situations -- an ability that current AI systems still lack. Most analogy datasets today focus on simple analogies (e.g., word analogies); datasets including complex types of analogies are typically manually curated and very small. We believe that this holds back progress in computational analogy. In this work, we design a data generation pipeline, ParallelPARC (Parallel Paragraph Creator), leveraging state-of-the-art LLMs to create complex, paragraph-based analogies, as well as distractors, both simple and challenging. We demonstrate our pipeline and create ProPara-Logy, a dataset of analogies between scientific processes. We publish a gold-set, validated by humans, and a silver-set, generated automatically. We test LLMs' and humans' analogy recognition in binary and multiple-choice settings, and find that humans outperform the best models (~13% gap) after light supervision. We demonstrate that our silver-set is useful for training models. Lastly, we show that challenging distractors confuse LLMs, but not humans. We hope our pipeline will encourage research in this emerging field.


Summary

  • The paper presents a scalable pipeline, ParallelPARC, that uses advanced LLMs to generate complex, paragraph-scale analogies and distractors.
  • The methodology overcomes limitations of simplistic analogy datasets by introducing ProPara-Logy, a benchmark with both human-validated and automatically generated analogies.
  • Evaluation shows humans outperform state-of-the-art LLMs by roughly 13%, highlighting the need for enhanced AI reasoning in analogical tasks.

ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies

The paper "ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies" presents a robust method for generating complex natural-language analogies with large language models. Motivated by the view that analogy-making is fundamental to human cognition, the research confronts the current limitations of AI systems in understanding and generating such analogies. The authors introduce a data generation pipeline, ParallelPARC, which leverages LLMs to create sophisticated, paragraph-based analogies along with challenging distractors.
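
As a rough illustration of the pipeline's output structure (all function names and placeholder strings here are hypothetical, not the authors' implementation), each source process can be thought of as paired with one candidate analogy plus distractors at two difficulty levels:

```python
# Hypothetical sketch of ParallelPARC's output structure. In the real
# pipeline an LLM drafts the analogous paragraph and the distractors;
# placeholder strings stand in for model calls here.

def generate_analogy(source_process: str) -> str:
    # Placeholder for an LLM call that writes a structurally analogous paragraph.
    return f"paragraph analogous to: {source_process}"

def generate_distractors(source_process: str) -> dict:
    # The paper distinguishes simple and challenging distractors.
    return {
        "simple": f"unrelated paragraph paired with: {source_process}",
        "challenging": f"superficially similar, non-analogous paragraph for: {source_process}",
    }

def build_candidate_set(source_process: str) -> dict:
    return {
        "source": source_process,
        "analogy": generate_analogy(source_process),
        "distractors": generate_distractors(source_process),
    }

item = build_candidate_set("how rain forms in the water cycle")
print(sorted(item))  # ['analogy', 'distractors', 'source']
```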

Building on the ParallelPARC pipeline, the authors introduce ProPara-Logy, a benchmark dataset of analogies between scientific processes spanning multiple domains. The dataset is divided into a 'gold-set,' validated by human annotators, and a 'silver-set,' generated automatically. Beyond the analogous paragraphs themselves, the dataset includes both randomly sampled and deliberately challenging distractors, providing a robust environment for testing and training analogy recognition.
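
For illustration only (field names are hypothetical; the released dataset's actual schema may differ), a single benchmark entry could be organized like this:

```python
# Hypothetical sketch of one ProPara-Logy-style entry. Field names and
# values are illustrative, not the dataset's actual schema.
entry = {
    "source_paragraph": "Water evaporates, rises, cools, and condenses into rain.",
    "candidate_paragraph": "Moisture leaves wet clothes, drifts upward, and settles as dew.",
    "label": "analogy",           # or "not_analogy"
    "candidate_type": "analogy",  # "random_distractor" | "challenging_distractor"
    "split": "gold",              # "gold" = human-validated, "silver" = auto-generated
}

def is_positive(e: dict) -> bool:
    # True when the candidate paragraph is a genuine analogy of the source.
    return e["label"] == "analogy"

print(is_positive(entry))  # True
```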

The data generation approach overcomes barriers in existing analogy datasets, which typically focus on simplistic word analogies. By encompassing full paragraphs mapped through structurally analogous relationships, the dataset preserves the complexity of real-world analogical reasoning and more accurately reflects the multifaceted analogies encountered in human task performance.

Evaluating both humans and state-of-the-art LLMs on analogy recognition tasks over this dataset yields insightful results. Humans outperform the best models by a margin of approximately 13% after light supervision, while training on the automatically generated silver-set significantly improves model performance. This underscores the dataset's utility in advancing model accuracy on analogy-based reasoning.
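
The two evaluation settings mentioned above can be sketched as simple accuracy computations (a minimal illustration, not the authors' evaluation code):

```python
# Minimal sketch of the two evaluation settings described in the paper.
# Binary: classify each (source, candidate) pair as analogy / not analogy.
# Multiple-choice: pick the one true analogy among distractor candidates.

def binary_accuracy(predictions, gold_labels):
    # Fraction of pairs where the predicted label matches the gold label.
    return sum(p == g for p, g in zip(predictions, gold_labels)) / len(gold_labels)

def multiple_choice_accuracy(chosen_indices, answer_indices):
    # Fraction of questions where the chosen option is the correct one.
    return sum(c == a for c, a in zip(chosen_indices, answer_indices)) / len(answer_indices)

print(binary_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))     # 0.75
print(multiple_choice_accuracy([2, 0, 1], [2, 1, 1]))  # 2/3
```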

The research further exposes the limitations of current LLMs when facing challenging distractors, which reliably confuse the models but not human annotators. This result points to unresolved challenges in developing AI capable of processing complex analogical constructs.

The implications of this work extend into theoretical and practical realms. The ProPara-Logy benchmark can drive methodological advancements in ML models focused on higher-order reasoning capabilities, which are crucial for tasks that require understanding abstract relations and adapting to novel scenarios. Practically, the ability to efficiently generate complex analogies can stimulate innovation in domains reliant on analogical modeling, such as education, cognitive neuroscience, and computational creativity.

Future work in this field may extend the methodology to other domains, explore cross-domain analogical reasoning, and enhance models' ability to resolve challenging distractors. The work opens promising avenues for extending AI systems' comprehension and generation of human-like analogical reasoning, potentially leading to more resilient AI.

In conclusion, the research establishes a concrete method for analogy generation and lays the groundwork for future investigations into enhancing AI models' capacity for analogy-making, a domain previously less addressed due to data limitations. The ParallelPARC pipeline and ProPara-Logy dataset represent significant advancements in this field, offering tools for ongoing research and development in AI and natural language processing.
