Detecting Multivariate Time Series Anomalies with Zero Known Label

Published 3 Aug 2022 in cs.LG and cs.AI | (2208.02108v3)

Abstract: Multivariate time series anomaly detection has been extensively studied under the semi-supervised setting, where a training dataset consisting entirely of normal instances is required. However, preparing such a dataset is very laborious, since every single data instance must be guaranteed to be normal. It is therefore desirable to explore multivariate time series anomaly detection methods that work on datasets without any label knowledge. In this paper, we propose MTGFlow, an unsupervised anomaly detection approach for multivariate time series based on a dynamic graph and an entity-aware normalizing flow, leaning only on the widely accepted hypothesis that abnormal instances exhibit sparser densities than normal ones. However, the complex interdependencies among entities and the diverse inherent characteristics of each entity pose significant challenges for density estimation, let alone for detecting anomalies based on the estimated probability distribution. To tackle these problems, we propose to learn the mutual and dynamic relations among entities via a graph structure learning model, which helps to model the distribution of multivariate time series accurately. Moreover, taking into account the distinct characteristics of individual entities, an entity-aware normalizing flow is developed to map each entity to a parameterized normal distribution, thereby producing fine-grained density estimation. Incorporating these two strategies, MTGFlow achieves superior anomaly detection performance. In experiments on five public datasets with seven baselines, MTGFlow outperforms the SOTA methods by up to 5.0 AUROC%. Code will be released at https://github.com/zqhang/Detecting-Multivariate-Time-Series-Anomalies-with-Zero-Known-Label.

Summary

The paper introduces MTGFlow, an unsupervised multivariate time series (MTS) anomaly detection method that combines dynamic graph structure learning with an entity-aware normalizing flow.
A dynamic graph captures the evolving interdependencies among entities, while the entity-aware flow gives each entity its own target distribution for fine-grained density estimation.
Experiments on five public datasets show up to a 5.0 AUROC% improvement over state-of-the-art methods.

The paper presents MTGFlow, an unsupervised approach to anomaly detection in multivariate time series (MTS) that requires no labeled data. The method addresses two difficulties in MTS: the complex interdependencies among entities and the diverse characteristics of individual entities.

Problem and Methodology

Detecting anomalies in MTS without labeled data is challenging because the probability distribution of the data must be estimated accurately from a training set that may itself contain anomalies. Traditional one-class classification (OCC) methods assume that the training set consists entirely of normal data, an assumption that often fails in real-world scenarios. The authors instead adopt the hypothesis that anomalies lie in sparser-density regions than normal instances, and they design a model that estimates these densities using a dynamic graph and an entity-aware normalizing flow.
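The density hypothesis can be illustrated with a minimal sketch. It is not MTGFlow: a single Gaussian fitted with NumPy stands in for the learned flow, and the data, the function name `gaussian_log_density`, and the contamination level are all assumptions made for illustration. The point is only that a density model fitted on unlabeled, contaminated data can still assign the lowest likelihood to the rare outliers.

```python
import numpy as np

def gaussian_log_density(train, query):
    """Log-density of each query row under a Gaussian fitted to `train`.

    A stand-in density estimator for illustration, not a normalizing flow.
    """
    mean = train.mean(axis=0)
    d = train.shape[1]
    # A small ridge keeps the covariance matrix invertible.
    cov = np.cov(train, rowvar=False) + 1e-6 * np.eye(d)
    diff = query - mean
    # Squared Mahalanobis distance of each row.
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + d * np.log(2 * np.pi))

rng = np.random.default_rng(0)
# Synthetic, unlabeled data: mostly normal points plus a few outliers.
normal = rng.normal(0.0, 1.0, size=(500, 3))
outliers = rng.normal(8.0, 1.0, size=(5, 3))
data = np.vstack([normal, outliers])

# Fit on the contaminated data itself (no labels), then score every point.
# Lower log-density means a higher anomaly score.
anomaly_score = -gaussian_log_density(data, data)
top5 = set(np.argsort(anomaly_score)[-5:].tolist())
```

Here the five highest-scoring indices are the injected outliers (rows 500 to 504), even though the estimator never saw a label.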

Dynamic Graph Structure

The first key innovation is learning dynamic interdependencies among entities using a graph structure learning module. This aspect is crucial because static graphs, as used in previous works like GANF, do not capture the evolving nature of relationships among sensors or entities in real-world applications.

Figure 1: Dynamic graph structure in MTGFlow.
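One common way to obtain a window-dependent graph is attention over entities; the sketch below assumes that form. The projection matrices `w_q` and `w_k`, the dimensions, and the function name `dynamic_adjacency` are hypothetical, and the weights are random rather than trained, so this shows only the mechanism, not the paper's learned module.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_adjacency(window, w_q, w_k):
    """Attention-style soft adjacency for one time window.

    window: (K entities, T time steps); w_q, w_k: (T, d) projections.
    Returns a (K, K) matrix whose rows sum to 1 and which changes
    whenever the input window changes.
    """
    q = window @ w_q
    k = window @ w_k
    scores = q @ k.T / np.sqrt(w_q.shape[1])
    return softmax(scores, axis=-1)

rng = np.random.default_rng(0)
K, T, d = 4, 16, 8
# Untrained, randomly initialized projections (assumption for the demo).
w_q = rng.normal(size=(T, d)) / np.sqrt(T)
w_k = rng.normal(size=(T, d)) / np.sqrt(T)

window_a = rng.normal(size=(K, T))
window_b = rng.normal(size=(K, T))
adj_a = dynamic_adjacency(window_a, w_q, w_k)
adj_b = dynamic_adjacency(window_b, w_q, w_k)
```

With the same weights, `adj_a` and `adj_b` differ because the graph is a function of the window, which is the contrast with a single static adjacency matrix shared by all windows.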

Entity-Aware Normalizing Flow

The second key component is an entity-aware normalizing flow, which estimates the distribution of each entity separately rather than mapping all entities to the same target distribution. By assigning each entity its own target distribution, the model achieves fine-grained density estimation, which improves the accuracy of anomaly detection.

Figure 2: Transformed distributions of multiple entities.
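The entity-specific targets can be written with the standard change-of-variables formula for normalizing flows. The notation is assumed for illustration: $x_k$ is the input window of entity $k$, $c_k$ is a conditioning vector (for example, graph-derived features), $f_\theta$ is an invertible map, and $\mu_k$ is the mean of entity $k$'s target Gaussian, taken here with unit covariance for simplicity. The paper's exact parameterization may differ.

```latex
\log p(x_k \mid c_k)
  = \log \mathcal{N}\!\left( f_\theta(x_k; c_k);\ \mu_k,\ I \right)
  + \log \left| \det \frac{\partial f_\theta(x_k; c_k)}{\partial x_k} \right|
```

A flow with a shared target would use $\mathcal{N}(0, I)$ for every $k$; making $\mu_k$ depend on the entity is what lets the transformed distributions of different entities stay separated.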

Experimental Results

MTGFlow is evaluated on five public real-world datasets against seven baselines, including GANF, DeepSVDD, and USAD, and outperforms them. Notably, the method improves on the best existing model by up to 5.0 AUROC%, without requiring labeled samples for training.
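For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal instance. The sketch below computes it by direct pairwise comparison and expresses a gap between two detectors in percentage points, which is how "AUROC%" is read here (an assumption). The labels and scores are made-up toy values, not results from the paper.

```python
import numpy as np

def auroc(scores, labels):
    """AUROC as the fraction of (anomaly, normal) pairs ranked correctly.

    Ties between an anomaly and a normal score count as half a win.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]    # anomaly scores
    neg = scores[~labels]   # normal scores
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Toy labels (1 = anomaly) and scores from two hypothetical detectors.
labels = [0, 0, 0, 1, 0, 1]
scores_a = [0.1, 0.4, 0.35, 0.8, 0.2, 0.3]
scores_b = [0.1, 0.2, 0.35, 0.8, 0.2, 0.9]

auroc_a = auroc(scores_a, labels)  # 0.75
auroc_b = auroc(scores_b, labels)  # 1.0
gain_points = 100 * (auroc_b - auroc_a)  # gap in percentage points
```

Labels are used only for evaluation here; the detectors themselves would be trained without them.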

Anomaly Detection Performance

Log likelihood variations for anomalies (Figure 3) and comparisons of normalized anomaly scores (Figure 4) between MTGFlow and GANF indicate that MTGFlow shows greater discrimination between normal and abnormal sequences. This superior detection capability is attributed to the model's capacity to dynamically adjust to changing data patterns and capture unique entity behaviors.

Figure 3: Log likelihoods for anomalies.

Figure 4: Comparison on normalized anomaly scores between MTGFlow and GANF.
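Comparing anomaly scores across models requires putting them on a common scale. A minimal sketch, assuming that "normalized" means min-max scaling of the negative log-likelihood to the range [0, 1] (the summary does not state the exact normalization), with made-up log-likelihood values:

```python
import numpy as np

def normalized_anomaly_scores(log_likelihoods):
    """Min-max scale negative log-likelihoods to [0, 1].

    Higher output means more anomalous (lower likelihood).
    """
    scores = -np.asarray(log_likelihoods, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        # All windows equally likely: no window stands out.
        return np.zeros_like(scores)
    return (scores - scores.min()) / span

# Hypothetical per-window log-likelihoods; the fourth window is unlikely.
log_lik = [-3.0, -2.5, -3.2, -12.0, -2.8]
norm = normalized_anomaly_scores(log_lik)
```

After scaling, the low-likelihood window maps to 1 and the most likely window to 0, so score gaps between normal and abnormal windows can be compared between models on the same axis.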

Implications and Future Directions

MTGFlow's approach to MTS anomaly detection via dynamic modeling and entity-aware flows has significant implications for domains where labeled anomaly data is scarce or costly to obtain. By leveraging graph-based dependencies and tailored flow distributions, this work paves the way for more adaptive and robust unsupervised anomaly detection techniques.

The paper's methodology suggests promising directions for further research, such as enhancing the model's scalability and exploring its application to even more diverse types of time series data. Future work could also investigate integrating contextual information to improve interpretability and predictive accuracy.

Conclusion

MTGFlow is a framework for unsupervised MTS anomaly detection that addresses the limitations of existing methods by combining dynamic graph structure learning with an entity-aware normalizing flow. The experiments show superior performance on benchmark datasets, which suggests potential for real-world deployment in settings where labeled data is scarce.
