
Morse Code-Enabled Speech Recognition for Individuals with Visual and Hearing Impairments

Published 7 Jul 2024 in cs.SD, cs.AI, cs.CL, cs.HC, and cs.LG | (2407.14525v1)

Abstract: The proposed model aims to provide speech recognition technology for people with hearing, speech, or cognitive disabilities. Existing speech recognition technology offers no communication interface for this group. In the proposed model, the user's speech is transmitted to a speech recognition layer, where it is converted into text; that text is then passed to a Morse code conversion layer, which outputs the Morse code corresponding to the original speech. Because the Morse code conversion is a fixed mapping, the model's accuracy depends entirely on the speech recognition stage. The model is tested on recorded audio files with varying parameters, and its word error rate (WER) and accuracy are determined to be 10.18% and 89.82%, respectively.
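The two deterministic pieces of the pipeline described in the abstract can be sketched in a few lines: the text-to-Morse layer is a lookup against the ITU Morse table, and WER is the word-level Levenshtein distance divided by the reference length. This is a minimal illustration, not the paper's implementation; the speech recognition layer itself is treated as a black box that produces the hypothesis text.

```python
# Sketch of the post-recognition stages: text -> Morse encoding, plus the
# WER metric used to evaluate the recognizer. The Morse alphabet and the
# edit-distance definition of WER are standard; everything else is assumed.

MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..", "0": "-----","1": ".----","2": "..---","3": "...--",
    "4": "....-","5": ".....","6": "-....","7": "--...","8": "---..",
    "9": "----.",
}

def text_to_morse(text: str) -> str:
    """Encode recognized text as Morse: letters separated by spaces,
    words separated by ' / '; characters outside the table are skipped."""
    words = text.upper().split()
    return " / ".join(
        " ".join(MORSE[ch] for ch in word if ch in MORSE) for word in words
    )

def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count,
    computed via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, `text_to_morse("SOS")` yields `"... --- ..."`, and a one-word substitution in a ten-word reference gives a WER of 0.10, mirroring the reported 10.18%.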

Authors (1)
