A Principle Solution for Enroll-Test Mismatch in Speaker Recognition

Published 23 Dec 2020 in cs.SD and eess.AS | (2012.12471v2)

Abstract: Mismatch between enrollment and test conditions causes serious performance degradation in speaker recognition systems. This paper presents a statistics decomposition (SD) approach to solve this problem. The approach decomposes the PLDA score into three components that correspond to enrollment, prediction, and normalization, respectively. Provided that the correct statistics are used in each component, the resultant score is theoretically optimal. A comprehensive experimental study was conducted on three datasets with different types of mismatch: (1) physical channel mismatch, (2) speaking behavior mismatch, and (3) near-far recording mismatch. The results demonstrate that the proposed SD approach is highly effective and outperforms the ad-hoc multi-condition training approach, which is commonly adopted but not optimal in theory.
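
To make the three-way decomposition concrete, the following is a minimal sketch of a PLDA-style log-likelihood-ratio score split into enrollment, prediction, and normalization steps. It is not taken from the paper: it assumes a simplified isotropic model (centered speaker means drawn from N(0, between_var * I), observations drawn from N(speaker mean, within_var * I)), and the function names and the parameters `between_var` and `within_var` are hypothetical. It uses one set of statistics throughout and does not attempt to show how the SD approach selects statistics for each component.

```python
import numpy as np

def log_gauss_iso(x, mean, var):
    """Log density of an isotropic Gaussian N(mean, var * I) evaluated at x."""
    x = np.asarray(x, dtype=float)
    d = x.shape[-1]
    diff = x - mean
    return -0.5 * (d * np.log(2 * np.pi * var) + np.sum(diff ** 2, axis=-1) / var)

def enroll(enroll_vecs, between_var, within_var):
    """Enrollment: posterior mean and variance of the speaker mean."""
    enroll_vecs = np.atleast_2d(np.asarray(enroll_vecs, dtype=float))
    n = enroll_vecs.shape[0]
    xbar = enroll_vecs.mean(axis=0)
    denom = n * between_var + within_var
    post_mean = (n * between_var / denom) * xbar
    post_var = between_var * within_var / denom
    return post_mean, post_var

def predict(x, post_mean, post_var, within_var):
    """Prediction: log p(x | enrollment) under the posterior predictive."""
    return log_gauss_iso(x, post_mean, post_var + within_var)

def normalize(x, between_var, within_var):
    """Normalization: log p(x) under the marginal (any-speaker) distribution."""
    return log_gauss_iso(x, 0.0, between_var + within_var)

def plda_score(x, enroll_vecs, between_var, within_var):
    """Log-likelihood ratio: prediction term minus normalization term."""
    post_mean, post_var = enroll(enroll_vecs, between_var, within_var)
    return predict(x, post_mean, post_var, within_var) - normalize(x, between_var, within_var)

# Toy usage with made-up 2-D vectors and made-up variances.
enrolled = np.array([[1.0, 1.0]])
same = plda_score(np.array([1.0, 1.0]), enrolled, between_var=1.0, within_var=0.1)
diff = plda_score(np.array([-1.0, -1.0]), enrolled, between_var=1.0, within_var=0.1)
print(same > 0, diff < 0)
```

Because each step takes its statistics as arguments, a variant that passes different statistics to each step would be straightforward to write, but which statistics are correct for each component is specified by the paper, not by this sketch.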
