Papers
Topics
Authors
Recent
Search
2000 character limit reached

Neural Scoring: A Refreshed End-to-End Approach for Speaker Recognition in Complex Conditions

Published 21 Oct 2024 in cs.SD and eess.AS | (2410.16428v3)

Abstract: Modern speaker verification systems primarily rely on speaker embeddings, followed by verification based on cosine similarity between the embedding vectors of the enrollment and test utterances. While effective, these methods struggle with multi-talker speech due to the unidentifiability of embedding vectors. In this paper, we propose Neural Scoring (NS), a refreshed end-to-end framework that directly estimates verification posterior probabilities without relying on test-side embeddings, making it more robust to complex conditions, e.g., with multiple talkers. To make the training of such an end-to-end model more efficient, we introduce a large-scale trial e2e training (LtE2E) strategy, where each test utterance pairs with a set of enrolled speakers, thus enabling the processing of large-scale verification trials per batch. Experiments on the VoxCeleb dataset demonstrate that NS consistently outperforms both the baseline and competitive methods across various conditions, achieving an overall 70.36% reduction in EER compared to the baseline.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 1 tweet with 2 likes about this paper.