Privacy-preserving Representation Learning for Speech Understanding
Abstract: Existing privacy-preserving speech representation learning methods target a single application domain. In this paper, we present a novel framework to anonymize utterance-level speech embeddings generated by pre-trained encoders and show its effectiveness for a range of speech classification tasks. Specifically, given the representations from a pre-trained encoder, we train a Transformer to estimate the representations for the same utterances spoken by other speakers. During inference, the extracted representations can be converted to different speaker identities to preserve privacy. We compare the results with the voice anonymization baselines from the VoicePrivacy 2022 challenge. We evaluate privacy with speaker identification, and utility with emotion recognition, depression classification, and intent classification. Our method outperforms the baselines on both privacy and utility in the paralinguistic tasks and achieves comparable performance on intent classification.
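The training setup described in the abstract (estimating the representation of the same utterance as spoken by another speaker) implies some form of paired data. The sketch below is only an illustration of how such pairs might be assembled, assuming embeddings are already available for several speakers saying the same content. The function name `build_training_pairs`, the record layout, and the toy values are hypothetical and are not taken from the paper, which does not specify how its pairs are obtained.

```python
from collections import defaultdict
from itertools import permutations


def build_training_pairs(records):
    """Group embeddings by utterance and pair them across speakers.

    records: iterable of (utterance_id, speaker_id, embedding) tuples
             (hypothetical layout, assumed for illustration).
    Returns a list of (source_embedding, target_speaker_id, target_embedding).
    """
    by_utterance = defaultdict(list)
    for utt_id, spk_id, emb in records:
        by_utterance[utt_id].append((spk_id, emb))

    pairs = []
    for versions in by_utterance.values():
        # Every ordered pair of distinct recordings of the same utterance.
        for (src_spk, src_emb), (tgt_spk, tgt_emb) in permutations(versions, 2):
            if src_spk != tgt_spk:  # skip same-speaker repetitions
                pairs.append((src_emb, tgt_spk, tgt_emb))
    return pairs


# Toy data: "utt1" exists for two speakers, "utt2" for only one.
records = [
    ("utt1", "spkA", [0.1, 0.2]),
    ("utt1", "spkB", [0.3, 0.1]),
    ("utt2", "spkA", [0.5, 0.4]),
]
pairs = build_training_pairs(records)
```

With the toy data, only "utt1" yields pairs (one in each direction), since "utt2" has no second speaker. A conversion model such as the Transformer mentioned in the abstract could then be trained to map each source embedding toward its target embedding.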