ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation
Abstract: Missing data is a pervasive issue in both scientific and engineering tasks, especially in the modeling of spatiotemporal data. The problem has motivated many data-driven solutions. Existing imputation solutions fall mainly into two families: low-rank models and deep learning models. The former assume general structural priors but have limited model capacity. The latter are highly expressive but lack prior knowledge of the underlying spatiotemporal structures. Leveraging the strengths of both paradigms, we present a low rankness-induced Transformer that balances strong inductive bias with high model expressivity. Exploiting the inherent structures of spatiotemporal data enables our model to learn balanced signal-noise representations, making it generalizable to a variety of imputation problems. We demonstrate its superiority in terms of accuracy, efficiency, and versatility on heterogeneous datasets, including traffic flow, solar energy, smart meters, and air quality. Promising empirical results provide strong evidence that incorporating time series primitives, such as low-rankness, can substantially facilitate the development of a generalizable model for a wide range of spatiotemporal imputation problems.
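To make the low-rank prior mentioned in the abstract concrete, the following is a minimal sketch of classical low-rank imputation by iterative truncated SVD on a sensors-by-time matrix. It is not the ImputeFormer architecture; the function name `low_rank_impute`, the synthetic data, and the parameters `rank` and `n_iters` are assumptions chosen purely for illustration.

```python
import numpy as np


def low_rank_impute(x, mask, rank=2, n_iters=200):
    """Fill unobserved entries (mask == False) by iterative truncated SVD.

    Illustrative only: a classical low-rank baseline, not the paper's model.
    """
    x = np.asarray(x, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    # Initialise missing entries with the mean of the observed ones.
    filled = np.where(mask, x, x[mask].mean())
    for _ in range(n_iters):
        # Project the current estimate onto the set of rank-`rank` matrices.
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep observed values fixed; update only the missing entries.
        filled = np.where(mask, x, approx)
    return filled


# Hypothetical data: 20 sensors, 50 time steps, exact rank 2.
rng = np.random.default_rng(0)
truth = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 50))
mask = rng.random(truth.shape) > 0.3  # roughly 30% of entries missing

imputed = low_rank_impute(truth, mask, rank=2)
err = np.abs(imputed - truth)[~mask].mean()
baseline_err = np.abs(truth[mask].mean() - truth)[~mask].mean()
print(f"low-rank MAE: {err:.4f}, mean-fill MAE: {baseline_err:.4f}")
```

The sketch shows both sides of the trade-off described above: the rank constraint is a strong structural prior that recovers missing entries when the data are truly low-rank, but a fixed-rank linear model has limited capacity for the nonlinear, noisy patterns that deep models can capture.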