WaveRoRA: Wavelet Rotary Route Attention for Multivariate Time Series Forecasting
Abstract: In recent years, Transformer-based models (Transformers) have achieved significant success in multivariate time series forecasting (MTSF). However, previous works extract features from either the time domain or the frequency domain alone, which inadequately captures trend and periodic characteristics. To address this issue, we propose a wavelet learning framework to model complex temporal dependencies in time series data. The wavelet domain integrates both time and frequency information, allowing the local characteristics of signals to be analyzed at different scales. Additionally, the Softmax self-attention mechanism used by Transformers has quadratic complexity, which leads to excessive computational cost when capturing long-term dependencies. We therefore propose a novel attention mechanism: Rotary Route Attention (RoRA). Unlike Softmax attention, RoRA uses rotary position embeddings to inject relative positional information into the sequence tokens and introduces a small number of routing tokens $r$ to aggregate information from the $KV$ matrices and redistribute it to the $Q$ matrix, yielding linear complexity. We further propose WaveRoRA, which leverages RoRA to capture inter-series dependencies in the wavelet domain. We conduct extensive experiments on eight real-world datasets. The results indicate that WaveRoRA outperforms existing state-of-the-art models while maintaining lower computational costs. Our code is available at https://github.com/Leopold2333/WaveRoRA.
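The two-stage routing idea described above can be illustrated with a minimal single-head NumPy sketch: $r$ routing tokens first attend over the rotary-embedded keys to summarize the values, then the queries attend over the routing tokens to read the summary back, so both stages cost $O(rL)$ rather than $O(L^2)$. The shapes, the per-stage softmax normalization, and the omission of learned projections and multi-head structure are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    """Apply rotary position embedding (RoPE) to a (L, d) sequence, d even.
    Each consecutive dim pair (2i, 2i+1) is rotated by angle pos * theta_i,
    so dot products between tokens depend on their relative positions."""
    L, d = x.shape
    pos = np.arange(L)[:, None]                       # (L, 1)
    theta = base ** (-np.arange(0, d, 2) / d)         # (d/2,) per-pair frequencies
    ang = pos * theta                                 # (L, d/2)
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)           # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def route_attention(Q, K, V, R):
    """Routed attention sketch: r routing tokens R aggregate the (K, V)
    sequence, then redistribute the summary to the queries Q.
    Both matmuls are (r x L), i.e. linear in sequence length L."""
    d = Q.shape[-1]
    Qr, Kr = rotary_embed(Q), rotary_embed(K)         # relative positions via RoPE
    agg = softmax(R @ Kr.T / np.sqrt(d)) @ V          # (r, d): routers read K/V
    return softmax(Qr @ R.T / np.sqrt(d)) @ agg       # (L, d): queries read routers

rng = np.random.default_rng(0)
L, d, r = 96, 16, 4                                   # hypothetical sizes
Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))
R = rng.standard_normal((r, d))                       # learnable routing tokens in practice
print(route_attention(Q, K, V, R).shape)              # (96, 16)
```

Because the routing tokens form a fixed-size bottleneck, doubling $L$ only doubles the work, which is the source of the linear complexity claimed for RoRA.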