Unsupervised Incremental Learning with Dual Concept Drift Detection for Identifying Anomalous Sequences

Published 6 Mar 2024 in cs.CE | (2403.03576v3)

Abstract: Streaming data is now generated continuously and at scale across diverse domains. Yet a significant portion of this data remains unlabeled, which makes it difficult to identify infrequent events such as anomalies. The difficulty is amplified in non-stationary environments, where model performance can degrade over time due to concept drift. To address these challenges, this paper introduces a new method referred to as VAE4AS (Variational Autoencoder for Anomalous Sequences). VAE4AS integrates incremental learning with dual drift detection mechanisms, employing both a statistical test and a distance-based test. Anomaly detection is performed by a Variational Autoencoder. To gauge the effectiveness of VAE4AS, a comprehensive experimental study is conducted using real-world and synthetic datasets characterized by anomaly rates below 10% and recurrent drift. The results show that the proposed method surpasses both robust baselines and state-of-the-art techniques, providing compelling evidence of its efficacy in addressing some of the challenges associated with anomalous sequence detection in non-stationary streaming data.
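
As a rough illustration of what "dual drift detection" with one statistical and one distance-based test could look like, the sketch below compares a reference window against a recent window of scalar scores (for example, per-sequence reconstruction errors from an autoencoder). Everything specific here is an assumption made for illustration and is not taken from the paper: the choice of a two-sample Kolmogorov-Smirnov statistic as the statistical test, a normalized mean shift as the distance-based test, the thresholds, and all function names. The abstract does not say how VAE4AS combines the two signals, so the sketch simply reports both flags.

```python
import math
from bisect import bisect_right
from statistics import mean, pstdev


def ks_statistic(ref, cur):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref_sorted, cur_sorted = sorted(ref), sorted(cur)
    n, m = len(ref_sorted), len(cur_sorted)
    d = 0.0
    # The maximum gap is always attained at one of the observed values.
    for x in ref_sorted + cur_sorted:
        f_ref = bisect_right(ref_sorted, x) / n
        f_cur = bisect_right(cur_sorted, x) / m
        d = max(d, abs(f_ref - f_cur))
    return d


def statistical_drift(ref, cur, alpha=0.05):
    """Hypothetical statistical test: KS statistic vs. its large-sample critical value."""
    n, m = len(ref), len(cur)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(ref, cur) > critical


def distance_drift(ref, cur, threshold=3.0):
    """Hypothetical distance test: mean shift measured in reference standard deviations."""
    spread = pstdev(ref) or 1e-12  # guard against a constant reference window
    return abs(mean(cur) - mean(ref)) / spread > threshold


def dual_drift_check(ref, cur):
    """Run both tests and report each flag; the combination rule is left to the caller."""
    return {
        "statistical": statistical_drift(ref, cur),
        "distance": distance_drift(ref, cur),
    }


if __name__ == "__main__":
    reference = [i / 100 for i in range(100)]
    print(dual_drift_check(reference, [x + 0.005 for x in reference]))  # tiny shift
    print(dual_drift_check(reference, [x + 2.0 for x in reference]))    # large shift
```

In an incremental setting, a positive flag would typically trigger some adaptation step such as refreshing the reference window or retraining the model; which step VAE4AS takes is described in the paper itself, not in this sketch.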
