Papers
Topics
Authors
Recent
Search
2000 character limit reached

DQ-Whisper: Joint Distillation and Quantization for Efficient Multilingual Speech Recognition

Published 18 May 2023 in cs.SD, cs.CL, and eess.AS | (2305.10788v2)

Abstract: As a popular multilingual and multitask pre-trained speech model, Whisper has the problem of curse of multilinguality. To enhance multilingual capabilities in small Whisper models, we propose DQ-Whisper, a novel joint distillation and quantization framework to compress Whisper for efficient inference. Firstly, we propose a novel dynamic matching distillation strategy. Then, a quantization-aware distillation framework is introduced to integrate quantization with distillation. Experimental results on various multilingual datasets show that our suggested distillation approach can effectively enhance the multilingual capabilities of small Whisper models without increasing computational costs. Up to 5.18x reduction in model size is achieved with marginal performance degradation. In addition, quantization is compatible with distillation, which can result in a higher compression rate.

Citations (9)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.