Enhancing Sign Language Teaching: A Mixed Reality Approach for Immersive Learning and Multi-Dimensional Feedback

Published 16 Apr 2024 in cs.CV (arXiv:2404.10490v2)

Abstract: Traditional sign language teaching methods face challenges such as limited feedback and diverse learning scenarios: 2D resources lack real-time feedback, classroom teaching is constrained by a scarcity of teachers, and existing VR- and AR-based methods offer relatively primitive interaction and feedback mechanisms. This study proposes an innovative teaching model built on real-time monocular vision and mixed reality technology. First, we introduce an improved hand-posture reconstruction method that preserves sign language semantics while enabling real-time feedback. Second, we propose a ternary system evaluation algorithm for comprehensive assessment, whose scores show good consistency with those of sign language experts. Furthermore, we use mixed reality technology to construct a scenario-based 3D sign language classroom and explore the user experience of scenario teaching. Overall, this paper presents a novel teaching method that combines an immersive learning experience, advanced posture reconstruction, and precise feedback, with user studies reporting positive results on both user experience and learning effectiveness.
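
The abstract does not spell out the ternary system evaluation algorithm, so the sketch below is only a plausible illustration of the kind of multi-dimensional feedback it describes: per-frame joint-angle features extracted from reconstructed 3D keypoints, aligned to a reference performance with dynamic time warping (DTW), and mapped to a score. All names here (joint_angles, dtw_cost, feedback_score) and the toy 5-point hand layout are hypothetical, not the paper's actual method or API.

import numpy as np

# Illustrative (parent, joint, child) index triples for a toy 5-point hand;
# a real system would use the full landmark layout of its pose estimator.
ANGLE_TRIPLES = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]

def joint_angles(frame, triples=ANGLE_TRIPLES):
    """Joint angle (radians) at the middle point of each (parent, joint,
    child) triple, for one frame of 3D keypoints of shape (num_points, 3)."""
    angles = []
    for a, b, c in triples:
        v1, v2 = frame[a] - frame[b], frame[c] - frame[b]
        cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.array(angles)

def dtw_cost(seq_a, seq_b):
    """Classic dynamic time warping between two sequences of feature
    vectors, returning the optimal alignment cost normalized by the
    combined sequence length."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            D[i, j] = step + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)

def feedback_score(reference, attempt, scale=0.5):
    """Map the angle-trajectory alignment cost to a 0-100 feedback score."""
    ref_feats = np.stack([joint_angles(f) for f in reference])
    att_feats = np.stack([joint_angles(f) for f in attempt])
    return 100.0 * np.exp(-dtw_cost(ref_feats, att_feats) / scale)

# Toy usage: a learner sequence that is a slightly perturbed, slower copy
# of the reference should still score highly thanks to the time alignment.
rng = np.random.default_rng(0)
reference = rng.standard_normal((20, 5, 3))
attempt = np.repeat(reference, 2, axis=0) + 0.05 * rng.standard_normal((40, 5, 3))
print(f"feedback score: {feedback_score(reference, attempt):.1f} / 100")

Because DTW aligns the two sequences in time, a learner who signs correctly but more slowly is not penalized for pace; a fuller system in the spirit of the paper would fold in additional dimensions, such as hand shape and movement trajectory, before reporting feedback.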
