
Striking a Balance between Classical and Deep Learning Approaches in Natural Language Processing Pedagogy

Published 16 May 2024 in cs.CL (arXiv:2405.09854v2)

Abstract: While deep learning approaches represent the state-of-the-art of NLP today, classical algorithms and approaches still find a place in NLP textbooks and courses of recent years. This paper discusses the perspectives of conveners of two introductory NLP courses taught in Australia and India, and examines how classical and deep learning approaches can be balanced within the lecture plan and assessments of the courses. We also draw parallels with the objects-first and objects-later debate in CS1 education. We observe that teaching classical approaches adds value to student learning by building an intuitive understanding of NLP problems, potential solutions, and even deep learning models themselves. Despite classical approaches not being state-of-the-art, the paper makes a case for their inclusion in NLP courses today.

References (13)
  1. Pushpak Bhattacharyya and Aditya Joshi. 2023. Natural Language Processing. Wiley.
  2. Teaching objects-first in introductory computer science. ACM SIGCSE Bulletin, pages 191–195.
  3. Albrecht Ehlert and Carsten Schulte. 2009. Empirical comparison of objects-first and objects-later. ICER’09 - Proceedings of the 2009 ACM Workshop on International Computing Education Research, pages 15–26.
  4. Jennifer Foster and Joachim Wagner. 2021. Naive Bayes versus BERT: Jupyter notebook assignments for an introductory NLP course. In Proceedings of the Fifth Workshop on Teaching NLP, pages 112–114, Online. Association for Computational Linguistics.
  5. Interactive assignments for teaching structured neural NLP. In Proceedings of the Fifth Workshop on Teaching NLP, pages 104–107, Online. Association for Computational Linguistics.
  6. Towards a unified view of parameter-efficient transfer learning. In International Conference on Learning Representations.
  7. Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems, 33:9459–9474.
  8. Introductory programming: A systematic literature review. ACM.
  9. Measuring cognitive load in introductory CS: adaptation of an instrument. Proceedings of the Tenth Annual Conference on International Computing Education Research, pages 131–138.
  10. Barbara Plank. 2021. From back to the roots into the gated woods: Deep learning for NLP. In Proceedings of the Fifth Workshop on Teaching NLP, pages 59–61, Online. Association for Computational Linguistics.
  11. Objects from the beginning – with GUIs. Proceedings of the 7th Annual Conference on Innovation and Technology in Computer Science Education, pages 65–69.
  12. John Sweller. 2011. Cognitive load theory. Psychology of Learning and Motivation - Advances in Research and Theory, 55:37–76.
  13. A. E. Tew, W. M. McCracken, and M. Guzdial. 2005. Impact of alternative introductory courses on programming concept understanding. In Proceedings of the First International Workshop on Computing Education Research.

Summary

  • The paper demonstrates that combining classical pre-neural methods with deep learning techniques builds strong foundational intuition in NLP education.
  • It compares course strategies in Australia and India, emphasizing gradual concept layering to reduce cognitive overload while enhancing comprehension.
  • Empirical findings suggest that classical statistical models improve student motivation and understanding before transitioning to advanced neural approaches.

The Value of Pre-Neural Approaches in NLP Education

Introduction

NLP continues to be dramatically influenced by Transformer-based models and other neural approaches. With these innovations, a critical question arises: Are pre-neural methods irrelevant in today's introductory NLP education? A recent study explores this debate by examining NLP courses in Australia and India. Let's break down its findings and what they mean for current and future NLP educators and students.

Context of the Study

The study examines two NLP courses: one in Australia (NLP-UNSW) and one in India (NLP-IITB). NLP-UNSW represents a new course developed in 2024 with 60 students, while NLP-IITB has been running for 19 years with 150 students. Both courses target undergraduate and postgraduate students. However, a significant difference is that NLP-IITB has a follow-up course focusing on deep learning for NLP, unlike NLP-UNSW.

Parallels to Computer Science Education

The study draws a parallel with the "objects-first vs. objects-later" debate in introductory programming education. Just as starting CS1 with object-oriented programming can overwhelm novices, diving straight into neural approaches in NLP might overwhelm students. Incremental introduction through simpler, pre-neural methodologies can instead foster a more robust foundation:

  • Objects-first: Introduces complex concepts early (akin to neural-first in NLP).
  • Objects-later: Starts with foundational concepts before layering complexity (akin to pre-neural-first in NLP).

Empirical studies in programming have shown mixed results, advocating that the overall quality of instruction and the specific goals of a curriculum are more crucial than whether one starts with objects or procedural paradigms first. A similar perspective could benefit NLP education.

Textbooks and Course Content

A review of recent NLP textbooks shows a trend towards interleaving pre-neural and neural approaches, with nearly all texts discussing statistical models alongside neural methods. For example:

  • Jurafsky-Martin's "Speech and Language Processing" introduces fundamental algorithms, including statistical models, before diving into neural models like Transformers.
  • "Natural Language Processing" by Bhattacharyya and Joshi alternates between pre-neural and neural approaches, clearly demonstrating the evolution and significance of each.

Several university NLP courses also reflect this balanced approach. For instance, courses at UMass Amherst and NYU start with basic statistical models before introducing Transformers around the midway point.

Lecture Plans

Instructors at NLP-UNSW and NLP-IITB adopt varied strategies to balance both approaches:

  • NLP-UNSW: Uses a hybrid method where neural approaches are interwoven with pre-neural techniques. Early weeks focus on black-box models and probabilistic language modeling. Subsequent weeks introduce Transformers, followed by task-specific applications (e.g., sentiment analysis, named entity recognition) using both pre-neural and neural methods.
  • NLP-IITB: Begins with sequence labeling and probabilistic parsing using pre-neural methods before transitioning into neural approaches like Transformers and LLMs.

The interleaved teaching strategy at both institutions ensures that students grasp the complexity and nuances of NLP tasks, providing a solid foundation before diving into more advanced neural techniques.
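Probabilistic language modeling, covered in the early weeks of NLP-UNSW, is a good example of a pre-neural topic that builds intuition for later neural material. As a minimal sketch (not code from either course; the corpus and function names are invented for illustration), a bigram language model with add-one smoothing fits in a few lines:

```python
from collections import Counter

def train_bigram_lm(sentences):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def bigram_prob(unigrams, bigrams, prev, word):
    """P(word | prev) with add-one (Laplace) smoothing over the vocabulary."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
uni, bi = train_bigram_lm(corpus)
print(bigram_prob(uni, bi, "the", "cat"))  # seen bigram: 0.25
print(bigram_prob(uni, bi, "the", "sat"))  # unseen bigram: smoothed to 0.125
```

Working through smoothing by hand like this makes it easier for students to later appreciate why neural language models, which share parameters across contexts, handle sparsity so much better.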

Coding Assessments

Both courses incorporate individual and group coding assessments, but their approaches differ slightly:

  • NLP-UNSW: Emphasizes using pre-neural libraries (e.g., spaCy's Matcher) in individual assignments and then combines these with neural methods (e.g., embeddings). Group projects are designed to ensure a mix of techniques, with marks assigned for problem definition, dataset selection, modeling, and evaluation.
  • NLP-IITB: Includes multiple individual assignments focusing on tasks like POS tagging and statistical parsing. Group projects are more flexible in topic choice, emphasizing the 'right' algorithm for the task.
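To illustrate the kind of pre-neural baseline such assignments build on, here is a minimal Naive Bayes sentiment classifier in pure Python (the training data and function names are invented for this sketch and are not taken from either course's materials):

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label) pairs. Returns counts needed for prediction."""
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def predict_nb(tokens, label_counts, word_counts, vocab):
    """Pick the label maximizing log P(label) + sum of log P(word|label),
    with add-one smoothing; unseen words get a small smoothed probability."""
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokens:
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Tiny made-up training set for illustration only.
train = [(["great", "movie"], "pos"), (["great", "fun"], "pos"),
         (["boring", "movie"], "neg")]
labels, words, vocab = train_nb(train)
print(predict_nb(["great", "film"], labels, words, vocab))  # pos
```

This is exactly the style of baseline that reference 4 above (Foster and Wagner, 2021) has students contrast with BERT, making the gains of the neural model concrete rather than taken on faith.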

Making the Case for Pre-Neural Approaches

The study highlights several reasons for maintaining pre-neural approaches in NLP curricula:

  1. Intuition-Building: Understanding rule-based and statistical methods helps students appreciate the complexities and challenges in NLP.
  2. Student Motivation: Pre-neural methods serve as a bridge to understand the necessity and efficiency of neural approaches.
  3. Popular Classical Approaches: Methods like hidden Markov models (HMMs) and conditional random fields (CRFs) remain effective for certain tasks, emphasizing the need for a foundational understanding.
  4. Annotation: Linguistically sound pre-neural techniques (e.g., POS tagging) provide robust benchmarks for evaluating neural methods.
  5. Cognitive Load Theory: Gradual introduction of concepts reduces cognitive overload, allowing for better assimilation of neural approaches later.
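To make point 3 concrete, the Viterbi algorithm that underlies HMM-based POS tagging (a task assigned in NLP-IITB) can be sketched in a few lines. The tags, probabilities, and example sentence below are toy values chosen for illustration, not figures from the paper:

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Most likely tag sequence for `words` under an HMM with the given
    start, transition, and emission probabilities."""
    # Each column maps tag -> (best probability so far, backpointer).
    V = [{t: (start_p[t] * emit_p[t].get(words[0], 1e-6), None) for t in tags}]
    for w in words[1:]:
        V.append({})
        for t in tags:
            prob, prev = max(
                (V[-2][p][0] * trans_p[p][t] * emit_p[t].get(w, 1e-6), p)
                for p in tags
            )
            V[-1][t] = (prob, prev)
    # Backtrack from the highest-probability final tag.
    best = max(tags, key=lambda t: V[-1][t][0])
    path = [best]
    for column in reversed(V[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))

# Toy model: three tags with hand-set probabilities (purely illustrative).
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.8, "NOUN": 0.15, "VERB": 0.05}
trans_p = {
    "DET":  {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1,  "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.4,  "NOUN": 0.4, "VERB": 0.2},
}
emit_p = {
    "DET":  {"the": 0.9},
    "NOUN": {"dog": 0.4, "barks": 0.1},
    "VERB": {"barks": 0.5},
}
print(viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p))
# ['DET', 'NOUN', 'VERB']
```

Tracing dynamic programming over an explicit probability table like this is precisely the intuition-building exercise the paper argues for: the same idea reappears, with learned parameters, in neural sequence labelers.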

Conclusion

The blend of pre-neural and neural approaches in NLP education addresses diverse learning needs and prepares students comprehensively for both current practices and future innovations in the field. By balancing foundational and advanced methods, educators can cultivate a deeper understanding and allow students to navigate the complexities of NLP efficiently.
