Beat-Aligned Spectrogram-to-Sequence Generation of Rhythm-Game Charts

Published 22 Nov 2023 in cs.LG, cs.MM, cs.SD, and eess.AS | (2311.13687v1)

Abstract: At the heart of "rhythm games" (games in which players must perform actions in sync with a piece of music) are "charts", the directives given to players. We propose a new formulation of chart generation as a sequence generation task and train a Transformer on a large dataset. We also introduce tempo-informed preprocessing and training procedures, some of which appear to be integral to successful training. Our model outperforms the baselines on a large dataset and also benefits from pretraining and finetuning.
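
To make the idea of tempo-informed, sequence-style chart representation concrete, the following is a minimal sketch and not the authors' method: the abstract does not specify the token vocabulary or the preprocessing steps. It assumes a constant tempo, and the function name `beat_aligned_tokens`, the `BEAT_`/`SUB_`/`ACT_` token names, and the `subdivisions` parameter are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def beat_aligned_tokens(
    note_events: List[Tuple[float, str]],
    bpm: float,
    offset: float = 0.0,
    subdivisions: int = 4,
) -> List[str]:
    """Quantize (time_in_seconds, action) events onto a tempo grid.

    Hypothetical illustration: each event becomes three tokens giving
    its beat index, its position within the beat, and the action.
    """
    # Duration of one grid step, assuming a constant tempo.
    seconds_per_step = 60.0 / bpm / subdivisions
    tokens: List[str] = []
    for time_sec, action in sorted(note_events):
        # Snap the event to the nearest grid step after the first beat.
        step = round((time_sec - offset) / seconds_per_step)
        beat, sub = divmod(step, subdivisions)
        tokens += [f"BEAT_{beat}", f"SUB_{sub}", f"ACT_{action}"]
    return tokens


example = beat_aligned_tokens(
    [(0.0, "tap"), (0.51, "tap"), (0.74, "hold")], bpm=120.0
)
```

At 120 BPM with four subdivisions per beat, one grid step lasts 0.125 s, so slightly off-grid timestamps such as 0.51 s and 0.74 s snap to the nearest step. A sequence model could then be trained to emit such tokens conditioned on audio features, though the actual design in the paper may differ.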

Authors (3)
