TeLeS: Temporal Lexeme Similarity Score to Estimate Confidence in End-to-End ASR

Published 6 Jan 2024 in eess.AS, cs.LG, cs.SD, and stat.ML | (2401.03251v1)

Abstract: Confidence estimation of predictions from an End-to-End (E2E) Automatic Speech Recognition (ASR) model benefits ASR's downstream and upstream tasks. Class-probability-based confidence scores do not accurately represent the quality of overconfident ASR predictions. An ancillary Confidence Estimation Model (CEM) calibrates the predictions. State-of-the-art (SOTA) solutions use binary target scores for CEM training. However, the binary labels do not reveal the granular information of predicted words, such as temporal alignment between reference and hypothesis and whether the predicted word is entirely incorrect or contains spelling errors. Addressing this issue, we propose a novel Temporal-Lexeme Similarity (TeLeS) confidence score to train CEM. To address the data imbalance of target scores while training CEM, we use shrinkage loss to focus on hard-to-learn data points and minimise the impact of easily learned data points. We conduct experiments with ASR models trained in three languages, namely Hindi, Tamil, and Kannada, with varying training data sizes. Experiments show that TeLeS generalises well across domains. To demonstrate the applicability of the proposed method, we formulate a TeLeS-based Acquisition (TeLeS-A) function for sampling uncertainty in active learning. We observe a significant reduction in the Word Error Rate (WER) as compared to SOTA methods.
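
The abstract does not spell out the shrinkage loss, so the following is only a minimal sketch of how such a loss can down-weight easily learned data points. It assumes the commonly used form from Lu et al. (2018), l^2 / (1 + exp(a * (c - l))) with l the absolute error. The function names and the values a = 10 and c = 0.2 are illustrative assumptions, not settings taken from this paper.

```python
import math


def shrinkage_loss(pred, target, a=10.0, c=0.2):
    """Shrinkage-style regression loss for one confidence score.

    Assumed form: l^2 / (1 + exp(a * (c - l))), where l = |pred - target|.
    `a` (shrinkage speed) and `c` (localisation) are illustrative defaults.
    """
    l = abs(pred - target)
    # The sigmoid-like factor is small when l < c (easy point) and
    # approaches 1 when l > c (hard point).
    return (l * l) / (1.0 + math.exp(a * (c - l)))


def mean_shrinkage_loss(preds, targets, a=10.0, c=0.2):
    """Average the per-word loss over a batch of confidence predictions."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    losses = [shrinkage_loss(p, t, a, c) for p, t in zip(preds, targets)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical target scores in [0, 1] and CEM predictions.
    targets = [1.0, 1.0, 0.4, 0.0]
    preds = [0.95, 0.10, 0.45, 0.80]
    for p, t in zip(preds, targets):
        squared = (p - t) ** 2
        print(f"pred={p:.2f} target={t:.2f} "
              f"squared={squared:.4f} shrinkage={shrinkage_loss(p, t):.4f}")
    print("batch mean:", mean_shrinkage_loss(preds, targets))
```

Under these assumed settings, a prediction that is already close to its target contributes only a fraction of its squared error, while a badly wrong prediction keeps almost all of it, which is the behaviour the abstract describes.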
