
Developing Acoustic Models for Automatic Speech Recognition in Swedish

Published 25 Apr 2024 in eess.AS, cs.AI, and cs.SD | (2404.16547v1)

Abstract: This paper is concerned with automatic continuous speech recognition using trainable systems. The aim of this work is to build acoustic models for spoken Swedish. This is done by employing hidden Markov models and using the SpeechDat database to train their parameters. Acoustic modeling has been worked out at a phonetic level, allowing general speech recognition applications, even though a simplified task (digits and natural number recognition) has been considered for model evaluation. Different kinds of phone models have been tested, including context-independent models and two variations of context-dependent models. Furthermore, many experiments have been done with bigram language models to tune some of the system parameters. System performance over various speaker subsets with different sex, age and dialect has also been examined. Results are compared to previous similar studies, showing a remarkable improvement.
