A Distributed Neural Network Architecture for Robust Non-Linear Spatio-Temporal Prediction

Published 23 Dec 2019 in cs.LG and cs.NE | (1912.11141v1)

Abstract: We introduce a distributed spatio-temporal artificial neural network architecture (DISTANA). It encodes mesh nodes using recurrent, neural prediction kernels (PKs), while neural transition kernels (TKs) transfer information between neighboring PKs, together modeling and predicting spatio-temporal time series dynamics. As a consequence, DISTANA assumes that generally applicable causes, which may be locally modified, generate the observed data. DISTANA learns in a parallel, spatially distributed manner, scales to large problem spaces, is capable of approximating complex dynamics, and is particularly robust to overfitting when compared to other competitive ANN models. Moreover, it is applicable to heterogeneously structured meshes.

Summary

The paper introduces DISTANA, a distributed neural network architecture that combines recurrent prediction kernels (PKs) with transition kernels (TKs) for robust spatio-temporal modeling.
DISTANA's distributed, weight-sharing design makes it robust to overfitting, and it outperforms benchmark models such as ConvLSTMs in predicting complex wave dynamics.
The architecture is applicable to heterogeneously structured meshes, which makes it a candidate for real-world applications such as short-range weather forecasting from sensor networks.

This paper introduces DISTANA, an artificial neural network architecture for modeling and predicting non-linear spatio-temporal dynamics. The architecture combines recurrent neural prediction kernels (PKs) with neural transition kernels (TKs) to model spatio-temporal processes efficiently and robustly. It targets tasks such as climate forecasting, traffic prediction, and other problems that involve dynamic spatial data.

The core innovation of DISTANA is a distributed, scalable design that applies the same learned kernels at every location and time step, which promotes generalization. A PK sits at each node of the data mesh and predicts from that node's local input and its own recurrent history. TKs transfer information between neighboring PKs, so that generally applicable dynamics can be modified locally. This distributed, weight-sharing setup supports parallel computation and also constrains the model, making it less prone to overfitting, a persistent issue in competing models such as CNNs, RNNs, and ConvLSTM networks.
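The PK/TK arrangement described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: this summary does not specify the kernels' internals, so a simple tanh recurrent cell stands in for the PK and a single tanh layer for the TK. The names `step`, `grid_neighbors`, `W_PK`, `W_OUT`, `W_LAT`, `W_TK` and the sizes `HIDDEN` and `LATERAL` are all hypothetical, and the weights are random rather than trained.

```python
import math
import random

random.seed(0)

HIDDEN = 4   # size of each PK's recurrent state (assumed)
LATERAL = 2  # size of the message exchanged between neighbors (assumed)

def rand_matrix(rows, cols, scale=0.5):
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def tanh_layer(matrix, vector):
    return [math.tanh(z) for z in matvec(matrix, vector)]

# One weight set shared by every node (PK) and every edge (TK).
W_PK = rand_matrix(HIDDEN, 1 + LATERAL + HIDDEN)  # input: [value, incoming messages, old state]
W_OUT = rand_matrix(1, HIDDEN)                    # reads the next-value prediction from the state
W_LAT = rand_matrix(LATERAL, HIDDEN)              # message a PK offers to its neighbors
W_TK = rand_matrix(LATERAL, LATERAL)              # TK: transforms a message in transit

def grid_neighbors(r, c, rows, cols):
    """Yield the 4-connected neighbors of (r, c) that lie inside the grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc

def step(values, states, messages):
    """Advance every node by one time step using the shared kernels."""
    rows, cols = len(values), len(values[0])
    preds = [[0.0] * cols for _ in range(rows)]
    new_states = [[None] * cols for _ in range(rows)]
    new_messages = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            incoming = [0.0] * LATERAL
            for nr, nc in grid_neighbors(r, c, rows, cols):
                passed = tanh_layer(W_TK, messages[nr][nc])  # TK on the edge
                incoming = [a + b for a, b in zip(incoming, passed)]
            # PK: combine local value, summed neighbor messages, and old state.
            state = tanh_layer(W_PK, [values[r][c]] + incoming + states[r][c])
            new_states[r][c] = state
            new_messages[r][c] = tanh_layer(W_LAT, state)
            preds[r][c] = matvec(W_OUT, state)[0]
    return preds, new_states, new_messages

# 3x3 grid with a single pulse in the centre; states and messages start at zero.
SIZE = 3
values = [[0.0] * SIZE for _ in range(SIZE)]
values[1][1] = 1.0
states = [[[0.0] * HIDDEN for _ in range(SIZE)] for _ in range(SIZE)]
messages = [[[0.0] * LATERAL for _ in range(SIZE)] for _ in range(SIZE)]

preds1, states, messages = step(values, states, messages)
preds2, states, messages = step(preds1, states, messages)  # closed loop: feed predictions back
```

Because the weights are shared, the number of parameters in this sketch does not depend on the grid size. Information also travels exactly one hop per step here: after two steps the nodes adjacent to the centre have reacted to the pulse, while the corners have not.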

The experiments reported in the paper demonstrate the robustness and efficiency of DISTANA. In tests on waves propagating in a mesh, DISTANA outperformed several benchmark models, including fully connected networks, CNNs, and ConvLSTMs, showing higher accuracy and less susceptibility to overfitting. The architecture also coped with a more demanding dataset in which waves reflect at the borders, increasing the complexity of the interactions within the data. Here, with minor modifications, DISTANA again outperformed the other architectures and maintained accurate long-term predictions of the wave dynamics.
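To make the wave benchmark concrete, the sketch below generates a toy version of such data. It is an illustrative assumption, not the paper's data generator, whose details this summary does not give. It uses a standard leapfrog finite-difference scheme for the 2D wave equation with border values fixed at zero, which makes waves reflect at the edges. The function name `simulate_wave`, the grid size, the initial Gaussian bump, and `courant_sq` (the squared Courant number, kept below 0.5 for stability in 2D) are all hypothetical choices.

```python
import math

def simulate_wave(size=16, steps=40, courant_sq=0.25):
    """Leapfrog finite-difference wave simulation with fixed (reflecting) borders."""
    center = size // 2
    # Initial condition: Gaussian bump in the middle, zero on the border.
    u = [[0.0] * size for _ in range(size)]
    for r in range(1, size - 1):
        for c in range(1, size - 1):
            dist_sq = (r - center) ** 2 + (c - center) ** 2
            u[r][c] = math.exp(-dist_sq / (2 * 1.5 ** 2))
    u_prev = [row[:] for row in u]  # same field one step earlier: zero initial velocity
    frames = [u]
    for _ in range(steps):
        u_next = [[0.0] * size for _ in range(size)]  # border cells stay at zero
        for r in range(1, size - 1):
            for c in range(1, size - 1):
                laplacian = (u[r - 1][c] + u[r + 1][c]
                             + u[r][c - 1] + u[r][c + 1] - 4 * u[r][c])
                u_next[r][c] = 2 * u[r][c] - u_prev[r][c] + courant_sq * laplacian
        u_prev, u = u, u_next
        frames.append(u)
    return frames

frames = simulate_wave()
# Next-frame prediction pairs: (input at time t, target at time t + 1).
pairs = list(zip(frames[:-1], frames[1:]))
```

Each pair gives a model the field at one time step as input and the field at the next step as target, which is the kind of next-step prediction task the benchmark description implies.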

Because DISTANA can operate on heterogeneously structured meshes, it is suited to real-world applications beyond regular grids. This capability is particularly relevant for domains like meteorology, where sensors are often distributed irregularly because of geographic constraints. The paper points to future applications, notably improving short-range weather forecasting by identifying coherent spatio-temporal patterns in data from distributed sensor networks.
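One way to see why shared kernels carry over to irregular meshes is sketched below. It assumes, purely for illustration, that a node sums the messages arriving from its neighbors, so the PK input has the same size whatever the node's degree. This summary does not say how DISTANA itself aggregates neighbor information. The mesh `MESH`, the scalar weights, and the functions `transition_kernel`, `prediction_kernel`, and `mesh_step` are hypothetical.

```python
import math

# Hypothetical irregular sensor mesh: node -> neighbors (degrees range from 1 to 3).
MESH = {
    "a": ["b"],
    "b": ["a", "c", "d"],
    "c": ["b", "d"],
    "d": ["b", "c"],
}

# Shared scalar weights with illustrative values (untrained).
W_IN, W_LAT, W_REC, W_TK = 0.8, 0.5, 0.3, 0.9

def transition_kernel(message):
    """TK: transform a neighbor's state as it crosses an edge."""
    return math.tanh(W_TK * message)

def prediction_kernel(value, lateral_in, state):
    """PK: combine local value, summed neighbor input, and previous state."""
    return math.tanh(W_IN * value + W_LAT * lateral_in + W_REC * state)

def mesh_step(values, states):
    """Update all nodes at once from the previous step's states."""
    new_states = {}
    for node, neighbors in MESH.items():
        lateral_in = sum(transition_kernel(states[n]) for n in neighbors)
        new_states[node] = prediction_kernel(values[node], lateral_in, states[node])
    return new_states

quiet = {node: 0.0 for node in MESH}
pulse = dict(quiet, a=1.0)           # a single reading at sensor "a"

s1 = mesh_step(pulse, dict(quiet))   # only "a" reacts
s2 = mesh_step(quiet, s1)            # the signal reaches "b"
s3 = mesh_step(quiet, s2)            # the signal reaches "c" and "d"
```

The same two functions serve a node with one neighbor and a node with three, so adding or removing sensors changes only `MESH`, not the kernels.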

Overall, this work is a notable contribution to spatio-temporal predictive modeling. By keeping predictions stable and accurate over extended periods, DISTANA addresses a critical limitation of existing neural network approaches to spatio-temporal data. Future work could include applying DISTANA to more complex, real-world datasets, testing its flexibility and scalability in operational environments, and exploring its efficacy on different types of irregular mesh structures. The research underscores the potential of distributed neural architectures for modeling and anticipating complex dynamic systems.
