Unsupervised Layer-wise Score Aggregation for Textual OOD Detection

Published 20 Feb 2023 in cs.CL and cs.AI | (2302.09852v3)

Abstract: Out-of-distribution (OOD) detection is a rapidly growing field due to new robustness and security requirements driven by an increased number of AI-based systems. Existing textual OOD detectors often rely on an anomaly score (e.g., Mahalanobis distance) computed on the embedding output of the encoder's last layer. In this work, we observe that OOD detection performance varies greatly depending on the task and on which layer's output is used. More importantly, we show that the usual choice (the last layer) is rarely the best one for OOD detection and that far better results could be achieved if the best layer were picked. To leverage this observation, we propose a data-driven, unsupervised method to combine layer-wise anomaly scores. In addition, we extend classical textual OOD benchmarks by including classification tasks with a greater number of classes (up to 77), which reflects more realistic settings. On this augmented benchmark, we show that the proposed post-aggregation methods achieve robust and consistent results while removing manual feature selection altogether. Their performance approaches that of an oracle that selects the best layer.
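
To make the idea of layer-wise anomaly scores concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: it assumes one embedding per layer is already extracted for each input, fits a single Gaussian per layer (rather than class-conditional ones), and aggregates by averaging standardized per-layer Mahalanobis distances, which is a simple stand-in chosen for illustration and not the aggregation proposed in the paper. The function names `fit_layer_gaussians`, `layer_scores`, and `aggregate` are hypothetical.

```python
import numpy as np


def fit_layer_gaussians(train_feats):
    """Fit one Gaussian per layer on in-distribution embeddings.

    train_feats has shape (n_samples, n_layers, dim).
    Returns per-layer means (n_layers, dim) and precisions (n_layers, dim, dim).
    """
    _, n_layers, dim = train_feats.shape
    means = train_feats.mean(axis=0)
    precisions = np.empty((n_layers, dim, dim))
    for layer in range(n_layers):
        # A small ridge term keeps the covariance invertible (assumption).
        cov = np.cov(train_feats[:, layer, :], rowvar=False) + 1e-6 * np.eye(dim)
        precisions[layer] = np.linalg.inv(cov)
    return means, precisions


def layer_scores(feats, means, precisions):
    """Mahalanobis distance of each sample to each layer's Gaussian.

    Returns an array of shape (n_samples, n_layers); larger means more anomalous.
    """
    diff = feats - means  # broadcasts over the sample axis
    sq = np.einsum("nld,lde,nle->nl", diff, precisions, diff)
    return np.sqrt(np.maximum(sq, 0.0))


def aggregate(scores, ref_scores):
    """Average of per-layer scores, standardized with in-distribution statistics.

    This plain mean is an illustrative assumption, not the paper's aggregator.
    """
    mu = ref_scores.mean(axis=0)
    sigma = ref_scores.std(axis=0) + 1e-12
    return ((scores - mu) / sigma).mean(axis=1)


if __name__ == "__main__":
    # Synthetic stand-ins for encoder embeddings: 4 layers, 8 dimensions.
    rng = np.random.default_rng(0)
    train = rng.normal(size=(500, 4, 8))
    shifted = rng.normal(size=(100, 4, 8)) + 3.0  # simulated OOD inputs

    means, precisions = fit_layer_gaussians(train)
    ref = layer_scores(train, means, precisions)
    ood = aggregate(layer_scores(shifted, means, precisions), ref)
    print("mean aggregated score on shifted inputs:", ood.mean())
```

Because the aggregation only needs in-distribution reference scores, no OOD examples or labels are required to combine the layers, which is the sense in which such a scheme is unsupervised.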
