
Learning to Score Sign Language with Two-stage Method

Published 16 Apr 2024 in cs.CV (arXiv:2404.10383v2)

Abstract: Human action recognition and performance assessment have been active research topics in recent years. Recognition problems have mature solutions in the field of sign language, but past research on performance analysis has focused on competitive sports and medical training, overlooking scoring assessment, which is an important part of digitalizing sign language teaching. In this paper, we analyze existing technologies for performance assessment and adopt methods that perform well on human pose reconstruction tasks, combined with motion-rotation embedding representations, to propose a two-stage sign language performance evaluation pipeline. Our analysis shows that choosing a reconstruction task in the first stage provides more expressive features, and that smoothing methods provide an effective reference for assessment. Experiments show that, compared with end-to-end evaluation, our method provides good score feedback and high consistency with professional assessments.
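The pipeline the abstract outlines — extract per-frame motion features, smooth them, then compare a learner's performance against a reference to produce a score — can be sketched minimally as follows. This is an illustrative assumption, not the paper's actual implementation: the smoothing, the Dynamic Time Warping alignment, and the cost-to-score mapping are all stand-ins for the learned components described in the paper, and every function name here is hypothetical.

```python
# Illustrative sketch: score a learner's (smoothed) per-frame feature sequence
# against a reference performance via DTW alignment cost. Not the paper's
# method; all names and the scoring mapping are assumptions for illustration.
import math

def moving_average(seq, window=3):
    """Simple temporal smoothing over a sequence of feature vectors."""
    half = window // 2
    out = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        frames = seq[lo:hi]
        out.append([sum(f[d] for f in frames) / len(frames)
                    for d in range(len(seq[0]))])
    return out

def dtw_cost(a, b):
    """Classic dynamic-time-warping alignment cost between two sequences."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            D[i][j] = d + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m] / (n + m)  # length-normalized alignment cost

def score(reference, performance, scale=1.0):
    """Map alignment cost to a 0-100 score (illustrative mapping)."""
    cost = dtw_cost(moving_average(reference), moving_average(performance))
    return 100.0 * math.exp(-cost / scale)

# A perfect imitation of the reference scores 100; deviations lower the score.
ref = [[0.0, 0.0], [0.5, 0.1], [1.0, 0.2], [1.5, 0.3]]
print(score(ref, ref))  # 100.0
```

DTW is a natural baseline here because learner and reference performances rarely share the same tempo; the alignment absorbs timing differences so the score reflects pose differences rather than speed.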
