
AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning

Published 16 May 2024 in cs.CL, cs.AI, cs.IR, and cs.LG (arXiv:2405.10385v2)

Abstract: The SemEval 2024 BRAINTEASER task represents a pioneering venture in NLP by focusing on lateral thinking, a dimension of cognitive reasoning that is often overlooked in traditional linguistic analyses. The challenge comprises Sentence Puzzle and Word Puzzle subtasks and aims to test LLMs' capacity for divergent thinking. In this paper, we present our approach to the BRAINTEASER task. We employ a holistic strategy by leveraging cutting-edge pre-trained models in a multiple-choice architecture and diversifying the training data with both the Sentence and Word Puzzle datasets. To gain further improvement, we fine-tuned the model on a synthetic humor/jokes dataset and the RiddleSense dataset, which helped augment the model's lateral thinking abilities. Empirical results show that our approach achieves 92.5% accuracy on the Sentence Puzzle subtask and 80.2% accuracy on the Word Puzzle subtask.
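The multiple-choice setup the abstract describes can be sketched as follows. The encoding mirrors the standard convention used with pre-trained models (e.g. Hugging Face's `AutoModelForMultipleChoice`): the puzzle is paired with each candidate answer, the model scores every pair, and the highest-scoring pair is the prediction. The helper names, example puzzle, and logit values below are illustrative assumptions, not taken from the paper.

```python
# Sketch of multiple-choice QA encoding and answer selection. In the real
# pipeline, a tokenizer encodes each (question, choice) pair into one row of
# a (num_choices, seq_len) tensor, and a fine-tuned model (e.g. DeBERTaV3)
# emits one logit per row; here we substitute dummy logits for illustration.
import math


def build_choice_pairs(question, choices):
    """Pair the puzzle text with every candidate answer."""
    return [(question, choice) for choice in choices]


def pick_answer(logits):
    """Softmax over per-choice logits; return (best index, probabilities)."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    return max(range(len(probs)), key=probs.__getitem__), probs


# Hypothetical lateral-thinking puzzle in the BRAINTEASER style.
question = "What can you hold in your right hand but never in your left?"
choices = ["Your left hand", "A glove", "A pen", "None of the above"]
pairs = build_choice_pairs(question, choices)

# Dummy per-choice logits standing in for the fine-tuned model's output.
best, probs = pick_answer([2.1, -0.3, 0.4, -1.0])
print(choices[best])
```

Fine-tuning on auxiliary datasets (humor, RiddleSense), as the paper does, changes only the model weights behind the logits; the pairing-and-argmax structure above stays the same.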

References (17)
  1. CODAH: An adversarially authored question-answer dataset for common sense. arXiv preprint arXiv:1904.04365.
  2. BERT: Pre-training of deep bidirectional transformers for language understanding.
  3. DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sharing.
  4. Cosmos QA: Machine reading comprehension with contextual commonsense reasoning. arXiv preprint arXiv:1909.00277.
  5. CSKG: The commonsense knowledge graph. In The Semantic Web: 18th International Conference, ESWC 2021, Virtual Event, June 6–10, 2021, Proceedings 18, pages 680–696. Springer.
  6. SemEval-2024 Task 9: BRAINTEASER: A novel task defying common sense. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1996–2010, Mexico City, Mexico. Association for Computational Linguistics.
  7. BRAINTEASER: Lateral thinking puzzles for large language models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 14317–14332, Singapore. Association for Computational Linguistics.
  8. RiddleSense: Reasoning about riddle questions featuring linguistic creativity and commonsense knowledge. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL-IJCNLP 2021): Findings. To appear.
  9. GPT-4 technical report.
  10. Scaling language models: Methods, analysis & insights from training Gopher. arXiv preprint arXiv:2112.11446.
  11. ConceptNet 5.5: An open multilingual graph of general knowledge.
  12. Inductive learning on commonsense knowledge graph completion. In 2021 International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE.
  13. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35:24824–24837.
  14. HuggingFace's Transformers: State-of-the-art natural language processing.
  15. OlaGPT: Empowering LLMs with human-like problem-solving abilities. arXiv preprint arXiv:2305.16334.
  16. SWAG: A large-scale adversarial dataset for grounded commonsense inference. arXiv preprint arXiv:1808.05326.
  17. Conditional prompt learning for vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16816–16825.