- The paper introduces a novel wikiHow pretraining method that improves intent detection in dialogue systems.
- It fine-tunes transformer-based models (RoBERTa and XLM-RoBERTa) on English and multilingual benchmarks, achieving state-of-the-art accuracy in both standard and zero-shot settings.
- The study underscores the potential of instructional data for generalizing models across diverse and low-resource domains.
Overview
The paper "Intent Detection with WikiHow" (arXiv:2009.05781) proposes pretraining intent detection models for task-oriented dialogue systems on data derived from the instructional content of the wikiHow website. The aim is to improve how well intent detectors adapt to new domains and languages by leveraging wikiHow's wide-ranging instructional steps, which are available in multiple languages. With the proposed pretraining, the authors achieve state-of-the-art results on several benchmark datasets, highlighting wikiHow's potential as a data source for intent detection.
WikiHow Pretraining Methodology
The authors propose a pretraining task that involves creating a dataset where each wikiHow article's title is considered a goal or intent, and its instructional steps are treated as associated utterances. The pretraining task is formulated as a multiple-choice problem in which the model predicts the correct goal for a given step from a set of candidate goals. This approach allows models to benefit from the diversity and domain range of wikiHow articles, making them more generalizable to emerging services and uncommon tasks.
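The data construction described above can be sketched in plain Python. This is a minimal illustration of the idea, not the paper's actual pipeline: the toy articles, the function name, and the choice of four candidates are all assumptions made here for clarity.

```python
import random

def build_pretraining_examples(articles, num_candidates=4, seed=0):
    """Turn wikiHow-style articles into multiple-choice examples.

    Each article maps a title (treated as the goal/intent) to its
    instructional steps (treated as utterances). For every step, the
    true goal is mixed with distractor goals sampled from other
    articles, and the model must pick the correct one.
    """
    rng = random.Random(seed)
    goals = list(articles)
    examples = []
    for goal, steps in articles.items():
        distractor_pool = [g for g in goals if g != goal]
        for step in steps:
            distractors = rng.sample(distractor_pool, num_candidates - 1)
            candidates = distractors + [goal]
            rng.shuffle(candidates)
            examples.append({
                "step": step,
                "candidates": candidates,
                "label": candidates.index(goal),  # index of the true goal
            })
    return examples

# Toy articles standing in for wikiHow content (illustrative only).
articles = {
    "How to Book a Flight": ["Compare fares online.", "Select a departure date."],
    "How to Make Coffee": ["Grind the beans.", "Boil the water."],
    "How to Plant a Tree": ["Dig a hole twice the root width."],
    "How to Change a Tire": ["Loosen the lug nuts."],
}
examples = build_pretraining_examples(articles)
```

A model trained on such examples sees each instructional step paired with several plausible goals, which is what forces it to learn a general mapping from actions to intents.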
Experimental Setup
In their experiments, the authors fine-tune pretrained transformer language models: RoBERTa for English tasks and XLM-RoBERTa for multilingual tasks. These models, further pretrained on the generated wikiHow data, are evaluated on major intent detection benchmarks: Snips, Schema-Guided Dialogue (SGD), and Facebook's multilingual dialogue datasets in English, Spanish, and Thai. The results show significant performance improvements over baseline models in both standard and zero-shot settings.
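At inference time, a model trained on the multiple-choice objective scores each candidate intent against the utterance and selects the highest-scoring one. The sketch below illustrates only that selection step; the `score_fn` callable and the toy lexical-overlap scorer are hypothetical stand-ins for a fine-tuned RoBERTa head, not anything from the paper.

```python
import math

def predict_intent(score_fn, utterance, candidate_intents):
    """Pick the intent whose pairing with the utterance scores highest.

    `score_fn(utterance, intent) -> float` stands in for a fine-tuned
    multiple-choice model; softmax is applied only to make the scores
    readable as probabilities.
    """
    scores = [score_fn(utterance, c) for c in candidate_intents]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(candidate_intents)), key=probs.__getitem__)
    return candidate_intents[best], probs[best]

# Toy lexical-overlap scorer as a stand-in for the model (illustrative only).
def overlap_score(utterance, intent):
    return len(set(utterance.lower().split()) & set(intent.lower().split()))

intent, prob = predict_intent(
    overlap_score,
    "book me a flight to Paris",
    ["Book a Flight", "Make Coffee", "Plant a Tree"],
)
```

Because the candidate set is an argument rather than a fixed output layer, the same scoring scheme extends naturally to zero-shot settings, where unseen intents can simply be added to the candidate list.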
Results and Analysis
The models pretrained on wikiHow data achieve state-of-the-art performance on intent detection benchmarks, approaching 100% accuracy on some of them. Even with little or no in-domain training data, as in zero-shot conditions, the pretrained models maintain notable accuracy, suggesting that the pretraining effectively improves generalization. Through error analysis, the authors find that misclassifications often stem from mislabeled examples or genuinely ambiguous intents rather than from deficiencies in the models' understanding, suggesting that existing benchmarks may need to be strengthened to meaningfully challenge modern models.
Implications and Future Work
The research suggests that utilizing a diverse and well-structured dataset like wikiHow for pretraining offers significant advantages for intent detection, especially in rapidly evolving and multilingual environments. Given the models' high performance on existing benchmarks, the authors propose a shift towards more open-domain intent detection research that can better evaluate models across a broader range of intents and user scenarios. The work implies that future development in this field could focus on creating datasets with a vast array of intents from various domains, potentially using automated augmentation techniques to generate comprehensive benchmark data.
Conclusion
The paper shows that wikiHow-based pretraining substantially improves intent detection models across several benchmarks and languages, underlining the value of instructional websites as resources for data augmentation in natural language processing. The approach advances the state of the art in intent detection and opens new avenues for tackling the task in open-domain and low-resource contexts.