
e-SNLI: Natural Language Inference with Natural Language Explanations

Published 4 Dec 2018 in cs.CL (arXiv:1812.01193v2)

Abstract: In order for machine learning to garner widespread public adoption, models must be able to provide interpretable and robust explanations for their decisions, as well as learn from human-provided explanations at train time. In this work, we extend the Stanford Natural Language Inference dataset with an additional layer of human-annotated natural language explanations of the entailment relations. We further implement models that incorporate these explanations into their training process and output them at test time. We show how our corpus of explanations, which we call e-SNLI, can be used for various goals, such as obtaining full sentence justifications of a model's decisions, improving universal sentence representations and transferring to out-of-domain NLI datasets. Our dataset thus opens up a range of research directions for using natural language explanations, both for improving models and for asserting their trust.

Citations (579)

Summary

  • The paper introduces e-SNLI, which augments the SNLI dataset with human-annotated, natural language explanations to improve model interpretability.
  • It details a crowd-sourced methodology and templated filtering process that enhances data quality for training NLI models.
  • Experiments show that incorporating explanations boosts transfer performance to downstream tasks, though achieving full human-like reasoning remains challenging.

The paper "e-SNLI: Natural Language Inference with Natural Language Explanations" addresses a critical challenge in machine learning: enhancing model interpretability by extending the Stanford Natural Language Inference (SNLI) dataset with human-annotated natural language explanations. The authors introduce a novel data augmentation process that equips models with the capability to not only predict entailment relations but also generate textual justifications for their decisions.

Dataset and Methodology

The extension to the SNLI dataset, termed e-SNLI, provides a substantial collection of explanations articulated in natural language. These explanations serve dual purposes. Firstly, they aim to enhance model transparency by offering insights into the reasoning process behind a model's decision. Secondly, they act as an additional layer of supervision during training. The dataset collection was meticulous, employing precise annotation guidelines to ensure high-quality, informative explanations that are both human-comprehensible and machine-readable.

The authors collected explanations by crowd-sourcing on Amazon Mechanical Turk, designing in-browser checks that semi-automatically filtered out low-quality submissions. In addition, a set of templates was used to detect explanations that were trivial or overly generic, for example ones that merely restated the premise or hypothesis, which further improved dataset quality.
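The template-based check described above can be sketched as follows. This is a minimal illustration, not the authors' actual filter: the example templates and the `is_trivial` helper are assumptions, and the real guidelines used a richer set of patterns.

```python
import re

# Hypothetical templates resembling the kind of uninformative explanations
# the paper's filtering targets; the actual templates may differ.
TRIVIAL_TEMPLATES = [
    r"^(the )?(premise|hypothesis|sentence) (is|says) ",
    r"^(this|it) is (true|false|a contradiction)\b",
]

def is_trivial(explanation: str, premise: str, hypothesis: str) -> bool:
    """Flag explanations that match a trivial template or merely copy an input sentence."""
    text = explanation.strip().lower()
    # An explanation that just repeats the premise or hypothesis adds no information.
    if text in (premise.strip().lower(), hypothesis.strip().lower()):
        return True
    return any(re.match(pattern, text) for pattern in TRIVIAL_TEMPLATES)
```

In the actual collection pipeline, flagged submissions would be rejected or sent back to annotators for revision rather than silently dropped.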

Modeling Approaches and Experiments

The core of the experimentation involves integrating the e-SNLI dataset into existing SNLI model architectures. The authors employed a baseline architecture similar to InferSent, a well-known model framework for Natural Language Inference (NLI), augmented with a recurrent neural network (RNN) decoder that generates explanations for the predicted labels. Two main experimental setups were evaluated:

  1. PredictAndExplain: A model that generates a label accompanied by a textual explanation.
  2. ExplainThenPredict: A model that first generates an explanation and then predicts the label from it.

These experiments highlighted that successful generation of explanations could provide additional dimensions of transparency in model decision-making processes. Moreover, including natural language explanations within training procedures demonstrably enhanced the quality of universal sentence representations, measured via their transfer capabilities to downstream tasks.
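Including explanations in training amounts to a joint objective. A minimal sketch, assuming (as in the paper's PredictAndExplain setup) a weighted sum of the classification loss and the explanation-generation loss, with a trade-off weight here called `alpha`:

```python
def joint_loss(label_loss: float, explanation_loss: float, alpha: float = 0.5) -> float:
    """Weighted combination of the two objectives; alpha trades off
    label accuracy against explanation quality."""
    return alpha * label_loss + (1.0 - alpha) * explanation_loss
```

Tuning this weight matters: too much emphasis on explanation generation can hurt label accuracy, and vice versa.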

Results and Implications

The introduction of e-SNLI demonstrates measurable improvements in both interpretability and the utility of sentence embeddings in downstream tasks, as illustrated by enhanced performance across various benchmarks. However, fully human-like explanations remain a challenge: a substantial portion of the generated explanations are only partially correct.

Importantly, models leveraging e-SNLI explanations exhibited promising transfer capabilities to out-of-domain NLI datasets like MultiNLI and SICK-E without task-specific fine-tuning. Nevertheless, explanation quality degrades on domains far removed from SNLI, highlighting existing limitations and opportunities for further research.

Future Directions

The implications of e-SNLI are significant both in theory and practice. The natural language explanations foster the development of models that can better mimic human reasoning, crucial for integrative tasks involving intricate human-AI collaboration.

Future research might explore leveraging these explanations to refine attention-based models, particularly scrutinizing how the words annotators highlighted during the annotation phase align with model attention mechanisms. Additionally, exploring more advanced neural architectures could yield improvements in producing meaningful and fully coherent explanations, strengthening model robustness against adversarial inputs.

In conclusion, e-SNLI constitutes a valuable contribution to the NLI community, providing a comprehensive resource for developing machine learning models endowed with enhanced explanatory capabilities. Its potential to bridge the gap between model predictions and human-like reasoning marks a pivotal step towards interpretable AI systems.
